ASLP-lab

http://www.nwpu-aslp.org/

Recent Activity

updated a dataset 6 days ago

ASLP-lab/FMSU-Bench

updated a model 13 days ago

ASLP-lab/FM-Speech

updated a dataset 25 days ago

ASLP-lab/UrduSpeech

ASLP-lab's collections (15)

FMSU

Towards Fine-Grained Multi-Dimensional Speech Understanding: Data Pipeline, Benchmark, and Model

ASLP-lab/FMSU-Bench

Viewer • Updated 6 days ago • 24.1k • 3.69k • 1
ASLP-lab/FM-Speech

Audio Classification • 35B • Updated 13 days ago • 22 • 2

OmniCodec

Low Frame Rate Universal Audio Codec with Semantic–Acoustic Disentanglement

ASLP-lab/OmniCodec

Feature Extraction • Updated Apr 8 • 2

WenetSpeech-Wu

ASLP-lab/WenetSpeech-Wu-Speech-Understanding

Updated Feb 2 • 2
ASLP-lab/WenetSpeech-Wu-Bench

Viewer • Updated Feb 8 • 242 • 391 • 4
ASLP-lab/WenetSpeech-Wu-Speech-Generation

Text-to-Speech • Updated Feb 1 • 3
ASLP-lab/WenetSpeech-Wu

Updated Feb 5 • 117 • 3

SenSE

ASLP-lab/SenSE

Updated Oct 16, 2025 • 27 • 7

SongFormer

State-of-the-art music analysis with multi-scale datasets
ASLP-lab/SongFormer

0.7B • Updated May 14 • 508 • 17
ASLP-lab/SongFormDB

Updated May 15 • 5.09k • 9
ASLP-lab/SongFormBench

Viewer • Updated May 14 • 3.82k • 464 • 3

WenetSpeech-Yue

A Large-scale Cantonese Speech Corpus with Multi-dimensional Annotation

Large-Scale Cantonese Speech Corpus
ASLP-lab/WenetSpeech-Yue

Updated Feb 5 • 528 • 42
ASLP-lab/WSYue-ASR-eval

Viewer • Updated Sep 8, 2025 • 7.8k • 93 • 3
ASLP-lab/WSYue-TTS-eval

Viewer • Updated Sep 8, 2025 • 1.31k • 53 • 1

LLaSE

ASLP-lab/LLaSE-G1

Audio-to-Audio • Updated Mar 14, 2025 • 28

DiffRhythm

ASLP-lab/DiffRhythm-vae

Updated May 8, 2025 • 42
ASLP-lab/DiffRhythm-base

Updated Mar 26, 2025 • 40 • 171
ASLP-lab/DiffRhythm-full

Updated Mar 26, 2025 • 62 • 50
ASLP-lab/DiffRhythm-1_2

Updated May 8, 2025 • 68 • 17

Speaker-Reasoner

Speaker-Reasoner: Scaling Interaction Turns and Reasoning Patterns for Timestamped Speaker-Attributed ASR

ASLP-lab/Speaker-Reasoner-4194h

32B • Updated Apr 24 • 122 • 1
ASLP-lab/Speaker-Reasoner

32B • Updated Apr 24 • 42 • 2

YingMusic-Singer-Plus

YingMusic-Singer-Plus: Controllable Singing Voice Synthesis with Flexible Lyric Manipulation and Annotation-free Melody Guidance

Edit lyrics, keep the melody
ASLP-lab/YingMusic-Singer-Plus

0.7B • Updated Apr 9 • 2.29k • 7
ASLP-lab/LyricEditBench

Viewer • Updated Apr 9 • 7.2k • 301 • 2

VoiceSculptor

An instruct text-to-speech model developed by ASLP.

ASLP-lab/VoiceSculptor-VD

Text-to-Speech • 4B • Updated Feb 26 • 25 • 18

Easy Turn

ASLP-lab/Easy-Turn

Updated Oct 11, 2025 • 41 • 15
ASLP-lab/Easy-Turn-Trainset

Viewer • Updated Oct 18, 2025 • 1.91k • 789 • 9
ASLP-lab/Easy-Turn-Testset

Updated Sep 30, 2025 • 1.7k • 7

WenetSpeech-Chuan

A large-scale open-source corpus with a full processing pipeline and benchmarks for ASR and TTS

ASLP-lab/WSChuan-ASR

Automatic Speech Recognition • Updated Jan 9 • 6
ASLP-lab/WSChuan-TTS

Updated Sep 24, 2025 • 4
ASLP-lab/WSC-Train

Preview • Updated Apr 21 • 369 • 127
ASLP-lab/WSC-Eval

Viewer • Updated Dec 10, 2025 • 1.19k • 5.03k • 7

OSUM

OSUM: Open Speech Understanding Model, open-sourced by ASLP@NPU.

ASLP-lab/OSUM

Updated Feb 16, 2025 • 12

C2SER

ASLP-lab/Emotion2Vec-S

Updated Feb 27, 2025 • 4
ASLP-lab/C2SER-LLM

Updated Mar 3, 2025
ASLP-lab/Emo-Emilia

Viewer • Updated Feb 27, 2025 • 1.4k • 129 • 10
