How to Get a Comprehensive Data Solution for Customer Service Models
Like other human-AI interaction models, an intelligent speech-interaction customer service model must recognize what a customer says, identify the customer's intent, and respond appropriately within a given customer service scenario. To do this, it needs to be trained on rich conversational speech data, together with the content of those conversations, from that scenario. The data should cover conversations recorded in different languages, dialects, and accents and over different transmission channels, with speakers of different ages and genders and with background noise typical of the scenario. The transcriptions should include the speech content as well as annotated sound events. High recognition accuracy and precise replies to customers require a large quantity of qualified data, and meeting this requirement is challenging in any application scenario.
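The coverage requirements described above can be tracked with simple per-recording metadata. The following is a minimal sketch, not Magic Data Tech's actual schema or tooling: the `RecordingMeta` fields, the `coverage_gaps` helper, and all sample values are hypothetical and chosen only to illustrate how gaps in language, channel, or noise coverage might be detected before training.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecordingMeta:
    """Hypothetical metadata attached to one recorded conversation."""
    language: str
    accent: str
    channel: str            # e.g. "telephone", "voip"
    speaker_age_group: str  # e.g. "18-30"
    speaker_gender: str
    background_noise: str   # e.g. "call_center", "street"


def coverage_gaps(recordings, required):
    """Return, per attribute, the required values that no recording covers."""
    gaps = {}
    for attribute, wanted in required.items():
        seen = {getattr(r, attribute) for r in recordings}
        missing = set(wanted) - seen
        if missing:
            gaps[attribute] = missing
    return gaps


# Illustrative data only (assumed values, not from a real dataset).
recordings = [
    RecordingMeta("zh", "beijing", "telephone", "18-30", "female", "call_center"),
    RecordingMeta("zh", "sichuan", "voip", "31-45", "male", "call_center"),
]
required = {
    "channel": ["telephone", "voip"],
    "speaker_gender": ["female", "male"],
    "background_noise": ["call_center", "street"],
}

print(coverage_gaps(recordings, required))
```

In this example only the `background_noise` requirement is unmet, because no sample recording was made with street noise. A check of this kind can be run per scenario to decide which additional recordings to commission.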
Magic Data Tech provides speech datasets for intelligent speech-interaction customer service models. The datasets consist of conversations in the target language, with or without accent variation, between service staff and customers of different ages and genders. The conversations are recorded with various background noises and over different transmission channels, and they cover a range of business scenarios. Magic Data Tech applies strict criteria to speakers, recording environments, devices, operating procedures, annotation norms, and acceptance inspection, which ensure the uniformity, stability, and reliable quality of its data products and services within and across scenarios and languages. Relying on its standard quality management system, Magic Data Tech meets a variety of expectations for data products in different scenarios and recording conditions, and in this way supports the development of intelligent customer service systems.
Collection of customer service conversational speech between two speakers in different scenarios; collection of customer service conversational speech between two speakers with various background noises; collection of customer service conversational speech between two speakers through various transmission channels; annotation of speech content according to customized norms.
For the past four years, Magic Data Tech has provided annotation services and data products to a large number of companies and research institutions, covering business scenarios in telecommunications, e-commerce, finance, education, tourism, and other fields. These services have been well received and help clients build scenario-oriented intelligent speech-interaction customer service models, improve their models' performance, and advance their research.
Capacity to cover more than 50 languages for audio recording and annotation services; a series of datasets available for immediate use in developing intelligent customer service systems in various commercial domains; rapid response to requirements, with more than 300,000 speakers and human annotators from around the world working on Magic Data Tech's platform; annotation norms that can be customized to meet actual needs; a strict quality management system that ensures a continuous output of high-quality data products.
At the INTERSPEECH 2020 industrial forum, Magic Data, the platinum sponsor of the conference, will hold a live session at its virtual booth from 20:15 to 20:45 on Oct 28th. The theme of the live broadcast is: “Data sets your model: Which data strategy should be adopted to achieve better performance?”
The MAGICDATA Kid Voice TTS Corpus in Mandarin Chinese was recorded by a four-year-old Chinese girl born in Beijing, China. We have now released 15 minutes of speech data from the corpus for non-commercial use. This is the first time this voice has been published!
We are honored to announce that our Chinese Mandarin Conversational Speech has been selected for the LDC Catalog! The catalog No. is