  • 5
    Data processing centers
  • 10 +
    Years of experience in the data industry
  • 50 +
    Languages of expertise
  • 100 +
    Partners from the technology and AI industries
  • 100,000 +
    Hours of multi-lingual audio databases
  • 300,000 +
    Curated contributors


Providing valuable data for machine learning and helping improve the performance of AI systems

Data Solutions

Converting raw data into valuable data for machine learning through our professional collection and annotation services

One-stop AI Data Service Process

Our best service, for your satisfaction

  • Customized Requirements
  • Requirements Assessment
  • Sample Confirmation
  • Contract Formulation
  • Production
  • Delivery


Providing professional services guaranteed by our experienced experts

Zhang Qingqing

• Speech technology expert, technical director of AI services
• Ph.D. from the Institute of Acoustics (IOA), Chinese Academy of Sciences (CAS); associate researcher at IOA
• Postdoctoral researcher at CNRS-LIMSI
• CAS Distinguished Scientific Achievement Award

Daniel Povey
Principal Scientist Advisor

• Main developer and maintainer of Kaldi

Our Advantages

  • Scale

    100,000+ hours of multi-lingual audio databases collected across a variety of scenarios; expertise in over 50 languages
  • Quality

    Transcription accuracy of up to 99%, backed by experts in linguistics and phonetics
  • Efficiency

    Human-in-the-loop processing platform; intelligent project management, task segmentation, data transcription, and inspection; 300,000+ curated contributors
  • One-Stop Data Service

    Customized data consulting services; datasets with independent copyright; data transcription services
  • Multi-Field Service

    Broad coverage across audio, image, and text data, among others

Customer Reviews

“We used nearly 10,000 hours of Magic Data's conversational speech data to update our automatic speech recognition models for noisy and conversational speech. The final performance improvements were significant. For conversational speech recognition, the word error rate was reduced by 30% relative. At the same time, we were surprised to find that their data also helped us reduce the word error rate by 10% relative for noisy speech recognition. This shows that the spontaneity of their data not only helps models capture the naturalness of speech, but also improves their robustness to noise!”

“We tested Magic Data Technology’s Indonesian Vehicle-command corpus on the highway. During the test, speeding cars were passing by outside the window. To our surprise, word spotting still performed very well. We look forward to continuing our cooperation with Magic Data Technology!”
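
For readers less familiar with the metric, the "relative" word error rate (WER) reductions quoted in the first review can be illustrated with a short sketch. It computes WER as word-level edit distance divided by the number of reference words, then the relative reduction between two WER values. The function names and the sample figures (a WER falling from 20% to 14%) are illustrative assumptions, not numbers from the customer or from Magic Data Technology.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which the WER dropped, measured against the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical transcripts: one dropped word out of four reference words.
    print(word_error_rate("turn on the radio", "turn on radio"))
    # Hypothetical WERs: 20% before, 14% after adding training data.
    print(relative_reduction(0.20, 0.14))
```

Under these assumed figures, a drop from 20% to 14% is a 6-point absolute reduction but a 30% relative reduction, which is the sense in which the review uses the term.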

Guided by our clients’ needs, Magic Data Technology is always at your service.
