Industry Solutions

Financial Services

Streamline your business processes with guaranteed enterprise-level security.

Compliance and security are our top priorities.

Revitalizing the financial industry with one-stop AI data solutions for customer service, virtual counters, virtual assistants, targeted marketing, and other AI applications.

Contact Sales

Scenarios

Customer Service

Inbound/Outbound Calls, Accounts Receivable Collection

Smart Meeting

Real-Time Captioning, Translation, Meeting Minutes Generation

Automated Invoice Processing

Identity Certification, Warranty OCR, Medical Record OCR

Virtual Human

Smart Shopping Guidance, Marketing

Challenges

Imprecise voice recognition in customer service scenarios
Commands and queries that are not understood correctly
Impersonal and unnatural communication
Inconsistent formats across invoices, warranties, and medical records

Annotator® AI-Assisted Annotation Platform

Audio Annotation
  • Customer Service - Customer service annotation
  • Virtual Human - Command and query annotation
  • Smart Meeting - Meeting-scenario voice annotation
  • Virtual Human - Rhythm, text segmentation, part-of-speech, and phoneme annotation

Text Annotation
  • Customer Service - User query relevance annotation
  • Virtual Human - User interaction content annotation

Image Annotation
  • Automated Invoice Processing - Invoice OCR annotation

MD Dataset Portfolio

Speech Recognition
Text-to-Speech
Natural Language Understanding
OCR

Contact us for data collection and annotation services

Related Datasets

MDT-AF063 Mandarin Chinese Scripted Speech Corpus

MDT-NG001 Chinese POI Text Corpus

MDT-AF069 English Duplex Conversation Training Dataset

Magic Data has introduced the "Multi-stream Spontaneous Conversation Training Datasets_English". This dataset comprises 5,000 hours of multi-accent English conversational data covering a wide range of speaking scenarios. It helps AI models better understand contextual changes, tonal variations, and emotional shifts in conversations, so that they produce more natural and accurate responses.

MDT-NB002 Chinese Named Entity Extraction (NEE) Corpus

MDT-RI002 Cantonese Spoken Speech Dataset

This dataset is built to train AI models that better understand spoken Cantonese, improving natural interaction in speech recognition. It covers diverse real-life dialogues with high transcription accuracy. Annotations are optimized for Cantonese-specific features such as the nine tones, lazy sounds, and slang, ensuring accurate audio-text alignment. Natural sentence structures and punctuation help models learn pause patterns and sentence-final particles, boosting dialect recognition performance.

MDT-AF012 Singaporean English Scripted Speech Corpus—Daily Use Sentence

Contact us for best practices

Get started today

Talk to Magic Data