MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese

Date : 2020-09-18     View : 1474

The MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese was recorded by a four-year-old girl born in Beijing, China. With this release, we are publishing 15 minutes of speech data from the corpus for non-commercial use. This is the first time this voice has been published!

Contents and description of the corpus:

(1) The corpus contains 15 minutes of speech data, recorded in an NC-20 acoustic studio.

(2) The speaker is 4 years old and was born in Beijing.

(3) Detailed information, such as speech data coding and speaker information, is preserved in the metadata file.

(4) The speech is in a natural child's speaking style.

(5) Annotation includes four parts: pronunciation proofreading, prosody labeling, phone boundary labeling, and POS tagging.

(6) The annotation accuracy is higher than 99%.

(7) For phone labeling, the database annotates not only the boundaries of phonemes but also the boundaries of silent segments.
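
As an illustration of how phone boundary annotation that also marks silence might be consumed, here is a minimal Python sketch. The corpus's actual annotation file format is not described on this page, so the `Segment` structure, the `sil` label, and the example timings below are all hypothetical and are not taken from the corpus.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical label for silent segments; the real corpus may use another.
SILENCE = "sil"


@dataclass
class Segment:
    label: str    # phone label, or SILENCE for a silent stretch
    start: float  # start time in seconds
    end: float    # end time in seconds


def total_durations(segments: List[Segment]) -> Tuple[float, float]:
    """Return (speech_seconds, silence_seconds) summed over all segments."""
    speech = sum(s.end - s.start for s in segments if s.label != SILENCE)
    silence = sum(s.end - s.start for s in segments if s.label == SILENCE)
    return speech, silence


def is_contiguous(segments: List[Segment], tol: float = 1e-6) -> bool:
    """Check that each segment starts where the previous one ends."""
    return all(abs(a.end - b.start) <= tol
               for a, b in zip(segments, segments[1:]))


# Made-up timings for illustration only.
example = [
    Segment(SILENCE, 0.0, 0.25),
    Segment("n", 0.25, 0.5),
    Segment("i", 0.5, 1.0),
    Segment(SILENCE, 1.0, 1.25),
]

speech_s, silence_s = total_durations(example)
print(f"speech: {speech_s:.2f} s, silence: {silence_s:.2f} s")
print("contiguous:", is_contiguous(example))
```

Because silence is labeled explicitly, the segments tile the whole utterance, which makes it straightforward to trim leading and trailing silence or to model pauses when training a TTS system.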

The corpus aims to help researchers in the TTS field. It is part of a much larger dataset (the 2.3-hour MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese), which was recorded in the same environment.

Speaker intro: The speaker, NiuNiu, is lively and cheerful. When she first came to the studio, she couldn't wait to introduce herself: "My name is NiuNiu, I am 4 years old." As an outgoing child, she gets along with others quickly. NiuNiu's favorite cartoons are "Frozen" and "My Little Pony".

Please note that this corpus is published with the authorization of the speaker and her parents.

For more details or for commercial use, please contact us: E-mail:
