Social Networks

Optimize AI models with Magic Data's end-to-end AI data solution.

Magic Data's AI data solution supports social network scenarios such as sentiment analysis, recommendation systems, and virtual hosts, making the platform more intelligent.

Contact Sales

Scenarios

Video

Subtitle Generator / Review & Comment Analysis / Content Classification

Livestreaming

Voice Changer / Real-time Captioning / Review & Comment Analysis

Virtual Host

Anchor / E-commerce Livestreaming / Digital Singer

Challenges

Inaccurate speech recognition of specialized words and phrases and of accented speech
Large volume and wide variety of comments and reviews
Stiff, unnatural responses from virtual humans
Wide range of video categories and cross-domain topics

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Live Show - Content annotation (ASR)
  • Virtual Host - Voice interaction annotation (ASR)
  • Virtual Host - Text segmentation, part-of-speech, and phoneme annotation (TTS)
  • Video - Video segmentation, content classification (Video)

Text Annotation
  • Live Show - Sentiment analysis of reviews (NLP)
  • Virtual Host - Sentiment analysis of reviews (NLP)

Image Annotation
  • Video - Video segmentation, content classification (Video)

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation services

Related Datasets

[Open-Source] MDT-AF002 Mandarin Chinese Conversational Speech Corpus

MDT-AF029 Italian Scripted Speech Corpus

MDT-NF026 Mandarin Chinese Prosody Text Corpus

MDT-AI090 Spontaneous Speech Datasets of Expressive Paralinguistics

This dataset is characterized by a high sampling rate of 48 kHz, recordings collected in controlled quiet environments, and contributions from a demographically diverse pool of speakers varying in region, age, and gender to ensure speech diversity. The dataset encompasses a wide array of topics across 20 domains, including daily life, leisure and entertainment, education and training, and healthcare.

MDT-AG022 Multi-stream Spontaneous Conversation Training Datasets_Chinese

Magic Data has launched the "Multi-stream Spontaneous Conversation Training Datasets_Chinese". This dataset comprises 10,000 hours of Chinese conversational data covering diverse voice scenarios. It helps AI models better understand contextual changes, tonal variations, and emotional shifts in conversations, so that they produce more natural and accurate responses.

MDT-BD003 Chinese Female Voice Emotion TTS Dataset

Contact us for best practices

Get started today
