Open-Source Datasets
Provide extensive training data for AI research and improve model performance quickly
Voice Datasets

Artificial intelligence models require large volumes of data for training. For some AI companies and researchers, that data can be difficult and time-consuming to collect. Open-source data can help mitigate these challenges and accelerate AI development.

Chinese Read Speech Recognition Corpus

The MAGICDATA Mandarin Chinese Read Speech Corpus was developed by MAGIC DATA TECHNOLOGY Co., Ltd. and is freely published for non-commercial use. The corpus consists of 755 hours of scripted read speech from 1,000 native speakers of Mandarin Chinese as spoken in mainland China.

Japanese Read Speech Recognition Corpus

The Japanese Read Speech Recognition Corpus was developed by MAGICDATA TECHNOLOGY Co., Ltd. and totals 1,500 hours. A 30-hour subset of scripted read speech has been freely published for non-commercial use. The 37 native speakers come from different areas, including Tokyo, Osaka, and Hokkaido. The corpus is a test set, recorded indoors, and the audio is in PCM format. The recording scripts are drawn from daily conversation.

MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese

The MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese was recorded by a four-year-old Chinese girl born in Beijing, China. For this release, we have published 15 minutes of speech data from the corpus for non-commercial use.
