Social Networks

Optimizing AI models with Magic Data's total AI data solution.

Magic Data's AI data solution supports social network scenarios such as sentiment analysis, recommendation systems, and virtual hosts, making platforms more intelligent.

Scenarios

  • Video - Subtitle generator, review and comment analysis, content classification
  • Livestreaming - Voice changer, real-time captioning, review and comment analysis
  • Virtual Host - Anchor, e-commerce livestreaming, digital singer

Challenges

Imprecise speech recognition of specialized words and phrases and of accented speech
Large volume and wide variety of comments and reviews
Stiff, unnatural responses from virtual humans
Wide range of categories and cross-domain topics in video content

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Live Show - Content annotation (ASR)
  • Virtual Host - Voice interaction annotation (ASR)
  • Virtual Host - Text segmentation, part-of-speech, and phoneme annotation (TTS)
  • Video - Video segmentation, content classification (Video)

Text Annotation
  • Live Show - Sentiment analysis of reviews (NLP)
  • Virtual Host - Sentiment analysis of reviews (NLP)

Image Annotation
  • Video - Video segmentation, content classification (Video)

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation services

Related Datasets

MDT-RJ009 Spanish Spoken Speech Dataset

This dataset is designed to train AI models that better understand spoken Spanish, enhancing natural interaction in speech recognition. It includes diverse real-life dialogues with high transcription accuracy. Key features like liaison and elision are carefully annotated, and punctuation reflects Spanish rhythm. Complete sentences support learning of complex verb forms, improving recognition robustness.

48kHz Multi-Speaker Speech Dataset for Voice Cloning--English

MDT-AG031 Mandarin Heavy Accent (Jiangsu) Conversational Speech Corpus

MDT-LD009 Cantonese Lexicon

MDT-LG003 Nanchang Dialect Lexicon

MDT-RJ001 Japanese Spoken Speech Dataset

This dataset is designed to train AI models that better understand spoken Japanese, improving natural interaction in speech recognition. It includes diverse real-life conversations with high transcription accuracy. Pitch accents and special syllables like sokuon and nasal sounds are carefully annotated. Punctuation reflects Japanese pause rhythms, helping models learn ellipses and emotional particles for more natural responses.
