Industry Solutions

Smart Home

Optimize AI models with Magic Data's total AI data solution.

Make your state-of-the-art products more intelligent and competitive.

Contact Sales

Scenarios

Household Appliance Automation

Household appliance wake-up, remote control, consumer robots, smart household appliances

Smart Device Control

Smartphone/tablet, wearable, remote control

Home Security

Security monitoring of the elderly and children, maintenance of household appliances, break-in monitoring, motion sensors

Virtual Assistant

Information queries, travel arrangements, phone calls, entertainment

Challenges

  • Imprecise voice recognition in residential usage scenarios
  • Inability to correctly understand ambiguous and long-tail queries
  • Stiff and unnatural responses
  • Limited data for safety monitoring

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Household Appliance Automation - Speech command and query annotation (ASR)
  • End-User Device Control - Speech command and query annotation (ASR)
  • Virtual Assistant - Speech command and query integration annotation (ASR)
  • Virtual Assistant - Rhythm, text segmentation, part-of-speech, and phoneme annotation (TTS)

Text Annotation
  • Household Appliance Automation - Command generalization (NLP)
  • End-User Device Control - Command generalization (NLP)
  • Virtual Assistant - Interaction query generalization (NLP)

Image Annotation
  • Home Security - Interior and exterior home image annotation (CV)

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation service

Related Datasets

MDT-LG001 Brazilian Portuguese Lexicon

MDT-AF078 Spanish Conversational Speech Corpus

MDT-LF002 French Lexicon

MDT-NF004 Chinese English Hindi Parallel Corpus

MDT-AF069 Multi-stream Spontaneous Conversation Training Datasets_English

Magic Data has introduced the "Multi-stream Spontaneous Conversation Training Datasets_English". This dataset comprises 5,000 hours of multi-accent English conversational data covering a wide range of vocal scenarios. It helps AI models better understand contextual changes, tonal variations, and emotional shifts in conversations, and so produce more natural and accurate responses.

[Open-Source] MDT-AF002 Mandarin Chinese Conversational Speech Corpus

Contact us for best practices

Get started today
