Smart Home

Optimize AI models with Magic Data's end-to-end AI data solution.

Make your state-of-the-art products more intelligent and competitive.

Contact Sales

Scenarios

Household Appliance Automation

Household appliance wake-up, remote control, consumer robots, smart household appliances

Smart Device Control

Smartphones and tablets, wearables, remote controls

Home Security

Safety monitoring of the elderly and children, household appliance maintenance, break-in monitoring, motion sensing

Virtual Assistant

Information queries, travel arrangements, phone calls, entertainment

Challenges

Imprecise voice recognition in home environments and usage scenarios
Inability to correctly understand ambiguous and long-tail queries
Stiff and unnatural responses
Limited data for safety monitoring

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Household Appliance Automation - Speech command and query annotation (ASR)
  • End-User Device Control - Speech command and query annotation (ASR)
  • Virtual Assistant - Speech command and query integration annotation (ASR)
  • Virtual Assistant - Rhythm, text segmentation, part-of-speech, and phoneme annotation (TTS)

Text Annotation
  • Household Appliance Automation - Command generalization (NLP)
  • End-User Device Control - Command generalization (NLP)
  • Virtual Assistant - Interaction query generalization (NLP)

Image Annotation
  • Home Security - Interior and exterior home image annotation (CV)

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation services

Related Datasets

MDT-AF076 Yunnan Dialect Conversational Speech Corpus

MDT-NF004 Chinese English Hindi Parallel Corpus

MDT-NF007 Chinese Malay Parallel Corpus

MDT-AG039 Hebei Dialect Spontaneous Speech Corpus

MDT-LE003 Filipino/Tagalog Lexicon

MDT-AI101 Spanish Duplex Conversation Training Dataset

Preserving features such as tonal jumps, spontaneous interruptions, and collaborative speech in fast-paced native conversations, this dataset uses independent channel recording for precise voice separation. Combined with multi-speaker labeling and scenario classification, it provides a solid training foundation for AI models to manage diverse speech rates and regional linguistic variations in Spanish.

Contact us for best practices

Get started today
