This dataset is characterized by a high sampling rate of 48 kHz, recordings collected in controlled quiet environments, and contributions from a demographically diverse pool of speakers varying in region, age, and gender to ensure speech diversity. The dataset encompasses a wide array of topics across 20 domains, including daily life, leisure and entertainment, education and training, and healthcare.
Language: Chinese
Data Style: Conversational
Audio Format: PCM
Sampling Rate: 48 kHz
Bit Depth: 16 bits
Paralanguage: Cough, yawn, laughter, swallowing, etc.
Number of Speakers: 5,000
Total Audio Duration: 2,000 hours
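As a rough illustration of how the listed audio specifications could be checked on delivery, the sketch below reads a file header and compares it with the 48 kHz sampling rate and 16-bit sample size given above. It assumes the PCM audio is wrapped in WAV containers and that recordings are mono; the listing states neither. The function names are hypothetical and are not part of any Magic Data tooling.

```python
import io
import wave

# Values taken from the specification list above.
EXPECTED_RATE_HZ = 48_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16 bits per sample


def matches_listed_specs(wav_file) -> bool:
    """Return True if a WAV file (path or file-like object) is 48 kHz, 16-bit."""
    with wave.open(wav_file, "rb") as wav:
        return (
            wav.getframerate() == EXPECTED_RATE_HZ
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
        )


def uncompressed_size_bytes(hours: float, channels: int = 1) -> int:
    """Estimate raw PCM size; the mono default is an assumption."""
    seconds = int(hours * 3600)
    return seconds * EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * channels


def make_silent_wav(rate_hz: int, sample_width_bytes: int, seconds: float = 0.01):
    """Build a tiny in-memory mono WAV file, used here only for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setframerate(rate_hz)
        wav.setsampwidth(sample_width_bytes)
        frame_count = int(rate_hz * seconds)
        wav.writeframes(b"\x00" * (frame_count * sample_width_bytes))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(matches_listed_specs(make_silent_wav(48_000, 2)))
    print(uncompressed_size_bytes(2000))
```

Under the mono assumption, 2,000 hours of 48 kHz, 16-bit PCM comes to roughly 691 GB of uncompressed audio, which may help when planning storage.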
Compared with emotional expression, paralanguage varies continuously and depends on context throughout a conversation, which makes it inherently harder to describe. To help large-scale models learn human paralinguistic features effectively, Magic Data has developed a novel dataset, “High-Quality Spontaneous Speech Datasets of Expressive Paralinguistics”. The dataset is the result of a collaboration between the company's experienced product specialists and senior voice synthesis consultants, involving meticulous design, iterative refinement of the labeling framework, and production through a professional data pipeline.
ISO/IEC 27001 & ISO/IEC 27701:2019 compliant
Audio, text, image, and video multi-modal data
Conversational, scripted, and spontaneous data covering extensive domains
Expert-assured quality results