21
Oct
27
How to Improve Multilingual Speech Recognition Performance: An Acoustic Modeling View

With the development of modern technology, cross-cultural communication has become more frequent and code-mixing has become a common phenomenon. People are getting used to mingling different languages in a single sentence, sometimes without even thinking about it.

Code-mixing brings many challenges to the development of automatic speech recognition (ASR) systems. How to develop a reliable multilingual speech recognition system has become a hot topic within the industry.

Recognizing the embedded language is the crux of the problem, and it leaves researchers with two issues. The first is recognizing embedded-language speech that carries a matrix-language accent. The second is balancing cost and effectiveness when developing the multilingual speech recognition model, especially when the embedded language is a data-scarce language.

In terms of acoustic modeling, there are two main directions to consider: using multilingual datasets as training data, and applying transfer learning.

In this article, we use the development of a Mandarin-English bilingual speech recognition system for real-world music retrieval as an example to expand on these ideas.

Multilingual training data for acoustic modeling

For the training data, we recommend adding a Chinese-English code-mixing dataset to the monolingual Chinese and English datasets when training the ASR model. Research shows that, compared with using only monolingual Chinese and English training datasets, using Chinese-English mixed speech data for phoneme clustering and model training reduces the error rate of the baseline model on Chinese-English mixed speech recognition by 37.93%.
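As a rough illustration of what combining the three data sources can look like in practice, here is a minimal Python sketch that pools monolingual and code-mixed utterances into one shuffled training manifest and reports the share of each corpus. The `Utterance` type, the function names, the file paths, and the sample transcripts are all hypothetical and only stand in for a real data pipeline; the sketch says nothing about how the cited result was obtained.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Utterance:
    audio_path: str   # hypothetical path to a recording
    transcript: str
    corpus: str       # "zh", "en", or "zh-en" (code-mixed)


def build_training_manifest(zh, en, mixed, seed=0):
    """Pool monolingual and code-mixed utterances into one shuffled list."""
    pooled = list(zh) + list(en) + list(mixed)
    random.Random(seed).shuffle(pooled)  # fixed seed keeps the order reproducible
    return pooled


def corpus_share(manifest):
    """Return the fraction of utterances contributed by each corpus."""
    counts = {}
    for utt in manifest:
        counts[utt.corpus] = counts.get(utt.corpus, 0) + 1
    total = len(manifest)
    return {name: n / total for name, n in counts.items()}


# Hypothetical sample data for a music retrieval scenario.
zh_data = [
    Utterance("zh/0001.wav", "播放周杰伦的歌", "zh"),
    Utterance("zh/0002.wav", "下一首", "zh"),
]
en_data = [
    Utterance("en/0001.wav", "play the next song", "en"),
    Utterance("en/0002.wav", "turn the volume up", "en"),
]
mixed_data = [
    Utterance("mix/0001.wav", "我想听 Love Story", "zh-en"),
]

manifest = build_training_manifest(zh_data, en_data, mixed_data)
shares = corpus_share(manifest)
```

Checking `shares` before training is a cheap way to see whether the code-mixed portion is large enough to matter or is being swamped by the monolingual data.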

Global phone set lexicon

Compared with building and training an acoustic model from scratch, transfer learning can quickly achieve good results without a large investment of time and resources. Borrowing from transfer learning, we can adopt a global phone set lexicon when building an adapted acoustic model, which reduces the amount of embedded-language data required for training while lowering the word error rate.
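To make the idea of a global phone set lexicon more concrete, the following Python sketch maps language-specific phone labels onto one shared inventory and merges two small pronunciation lexicons. The phone mappings, the tiny lexicons, and the function names are hypothetical and not linguistically vetted; a real system would rely on a carefully designed phone mapping. The point is only that phones shared across languages can reuse the same acoustic units, so the embedded language needs new data mainly for the phones it does not share.

```python
# Hypothetical mappings from language-specific phone labels to shared symbols.
ZH_TO_GLOBAL = {"m": "m", "a": "a", "i": "i", "n": "n", "x": "x"}
EN_TO_GLOBAL = {"M": "m", "AA": "a", "IY": "i", "N": "n", "TH": "th"}


def to_global(lexicon, phone_map):
    """Rewrite every pronunciation in a lexicon with global phone symbols."""
    return {
        word: [phone_map[phone] for phone in phones]
        for word, phones in lexicon.items()
    }


def merge_lexicons(zh_lexicon, en_lexicon):
    """Build one lexicon over the global phone set.

    If the same word appears in both inputs, the English entry wins here;
    a real lexicon would keep pronunciation variants instead.
    """
    merged = dict(to_global(zh_lexicon, ZH_TO_GLOBAL))
    merged.update(to_global(en_lexicon, EN_TO_GLOBAL))
    return merged


def phone_inventory(lexicon):
    """Collect the set of phones used anywhere in a lexicon."""
    return {phone for phones in lexicon.values() for phone in phones}


# Hypothetical toy lexicons.
zh_lexicon = {"你": ["n", "i"], "妈": ["m", "a"], "西": ["x", "i"]}
en_lexicon = {"me": ["M", "IY"], "knee": ["N", "IY"], "thee": ["TH", "IY"]}

global_lexicon = merge_lexicons(zh_lexicon, en_lexicon)
zh_phones = phone_inventory(to_global(zh_lexicon, ZH_TO_GLOBAL))
en_phones = phone_inventory(to_global(en_lexicon, EN_TO_GLOBAL))

shared = zh_phones & en_phones    # units both languages can train jointly
en_only = en_phones - zh_phones   # units that need embedded-language data
```

In this toy setup, only the phones in `en_only` have no matrix-language counterpart, which is the intuition behind needing less embedded-language data.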

For more data insight, contact our data experts (business@magicdatatech.com).
