Social Networks

Optimize AI models with Magic Data's end-to-end AI data solution.

Magic Data's AI data solution supports social network scenarios such as sentiment analysis, recommendation systems, and virtual hosts, making platforms more intelligent.

Contact Sales

Scenarios

Video

Subtitle Generator / Review & Comment Analysis / Content Classification

Livestreaming

Voice Changer / Real-time Captioning / Review & Comment Analysis

Virtual Host

Anchor / E-commerce Livestreaming / Digital Singer

Challenges

Inaccurate speech recognition of specialized words and phrases and of accented speech
Large volume and great variety of comments and reviews
Stiff, unnatural responses from virtual humans
Wide range of video categories and cross-domain topics

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Live Show - Content annotation (ASR)
  • Virtual Host - Voice interaction annotation (ASR)
  • Virtual Host - Text segmentation, part-of-speech, and phoneme annotation (TTS)
  • Video - Video segmentation, content classification (Video)

Text Annotation
  • Live Show - Sentiment analysis of reviews (NLP)
  • Virtual Host - Sentiment analysis of reviews (NLP)

Image Annotation
  • Video - Video segmentation, content classification (Video)

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation service

Related Datasets

MDT-AG001 Yunan Dialect Conversational Speech Corpus

MDT-AG022 Chinese Duplex Conversation Training Dataset

Magic Data has launched the "Multi-stream Spontaneous Conversation Training Datasets_Chinese". This dataset comprises 10,000 hours of Chinese conversational data covering diverse voice scenarios. It helps AI models better understand contextual changes, tonal variations, and emotional shifts in conversations, so that they produce more natural and accurate responses.

Multi-Emotional Natural Speech Dataset

Magic Data has introduced the "Multi-Emotional Natural Speech Dataset", a collection of datasets designed to enhance expressiveness and naturalness in speech technology, enabling intelligent devices to exhibit a wide range of emotional expressions. Using it can improve the expressiveness and emotional authenticity of large speech models.

MDT-AI090 Spontaneous Speech Datasets of Expressive Paralinguistics

This dataset is characterized by a high sampling rate of 48 kHz, recordings collected in controlled quiet environments, and contributions from a demographically diverse pool of speakers varying in region, age, and gender to ensure speech diversity. The dataset encompasses a wide array of topics across 20 domains, including daily life, leisure and entertainment, education and training, and healthcare.

MDT-NF007 Chinese Malay Parallel Corpus

MDT-AG037 Swedish Spontaneous Speech Corpus

Contact us for the best practices

Get started today
