Smart Home

Optimize AI models with Magic Data's total AI data solution.

Make your state-of-the-art products more intelligent and competitive.

Scenarios

Household Appliance Automation

Household appliance wake-up, remote control, consumer robots, smart household appliances

Smart Device Control

Smartphone/tablet, wearable, remote control

Home Security

Security monitoring of the elderly and children, maintenance of household appliances, break-in monitoring, motion sensors

Virtual Assistant

Information queries, travel arrangements, phone calls, entertainment

Challenges

Imprecise voice recognition in residential usage scenarios
Inability to correctly understand ambiguous and long-tail queries
Stiff and unnatural responses
Limited data on safety monitoring

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Household Appliance Automation - Speech command and query annotation (ASR)
  • End-User Device Control - Speech command and query annotation (ASR)
  • Virtual Assistant - Speech command and query integration annotation (ASR)
  • Virtual Assistant - Rhythm, text segmentation, part-of-speech, and phoneme annotation (TTS)

Text Annotation
  • Household Appliance Automation - Command generalization (NLP)
  • End-User Device Control - Command generalization (NLP)
  • Virtual Assistant - Interaction query generalization (NLP)

Image Annotation
  • Home Security - Interior and exterior home image annotation (CV)

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation services

Related Datasets

MDT-AF069 English Duplex Conversation Training Dataset

Magic Data has introduced the "Multi-stream Spontaneous Conversation Training Datasets_English". This dataset comprises 5,000 hours of multi-accent English conversational data covering a wide range of vocal scenarios. It helps AI models better understand contextual changes, tonal variations, and emotional shifts in conversations, and thereby produce more natural and accurate responses.

MDT-LD011 Shanghai Dialect Lexicon

MDT-AF076 Yunnan Dialect Conversational Speech Corpus

MDT-AE067 Korean Conversational Speech Corpus

[Open-Source]

MDT-BD003 Chinese Female Voice Emotion TTS Dataset

MDT-AJ039 Japanese Duplex Conversation Training Dataset

This dataset uses high-fidelity independent audio tracks to comprehensively capture natural interaction features in daily conversations, such as interruptions, overlapping speech, intonation shifts, and emotional pauses. All conversations are annotated with multi-speaker labels and span diverse scenarios, providing robust training resources for AI models to comprehend the intricate Japanese honorific system, colloquial ellipses, and context-dependent logic.

Contact us for best practices

Get started today
