Environmental Adaptation of Voice Assistants

Date: 2022-07-15

Voice assistants play an increasingly important role in daily life. However, an assistant that works well indoors can become "dumb" when it is used outdoors. Like a person who grows nervous and cautious when communicating in an unfamiliar setting, a voice assistant struggles when it arrives in a new environment. If humans are uncomfortable with the unfamiliar, artificial intelligence algorithms are even more so. A voice assistant therefore needs to adapt to different domains. This is the problem addressed by transfer learning, an important but difficult topic in speech recognition, speech synthesis, speaker recognition, and other speech fields.

The artificial intelligence algorithm inside a voice assistant is usually learned from a large amount of data. Even a large dataset cannot cover every application scenario, so in new scenarios such as curved conference halls and open squares, the accuracy of speech recognition drops sharply. Similarly, models trained on large amounts of voice dialogue data recorded in studios cannot be used directly in specialized fields such as e-commerce customer service, intelligent financial customer service, and intelligent healthcare. Lacking domain knowledge, the model performs poorly in these new scenarios. The question of how to adapt a model to each vertical domain is generally approached from two angles: transfer learning algorithms and intra-domain data adaptation.

Transfer Learning Algorithms

Transfer learning means taking a model trained on data from one scenario and, by means of a transfer learning algorithm, applying it to other scenarios while keeping its performance as unaffected as possible by the change of environment or domain. Transfer learning relaxes the assumption that the training data must be independent and identically distributed (i.i.d.) with the test data, which makes it a natural tool for the problem of insufficient training data. Because models in the target domain do not need to be trained from scratch, the amount of training data and training time required in the target domain can be reduced significantly. In their survey "A Survey on Transfer Learning", S. J. Pan and Q. Yang divide transfer learning into categories according to the situation, including inductive, transductive, and unsupervised transfer learning.

Although many transfer learning algorithms exist, running any of them still depends on data from the target domain. Almost none of these algorithms can be applied without in-domain data.
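To make the i.i.d. point above concrete, here is a minimal Python sketch of what happens when that assumption breaks. Everything in it is an illustrative assumption rather than a real speech pipeline: the one-dimensional "feature", the constant offset of 1.5 standing in for a change of acoustic environment, and the helper names make_domain, fit_threshold, and accuracy are all hypothetical.

```python
import random

def make_domain(rng, offset, n_per_class=200, noise=0.3):
    """Toy 1-D features: class 0 is centered at offset, class 1 at offset + 2."""
    data = []
    for label in (0, 1):
        center = offset + 2.0 * label
        data += [(rng.gauss(center, noise), label) for _ in range(n_per_class)]
    return data

def fit_threshold(data):
    """Nearest-centroid classifier in 1-D: the midpoint of the two class means."""
    means = []
    for label in (0, 1):
        xs = [x for x, y in data if y == label]
        means.append(sum(xs) / len(xs))
    return (means[0] + means[1]) / 2.0

def accuracy(threshold, data):
    """Fraction of samples whose predicted class (x > threshold) matches the label."""
    return sum((x > threshold) == bool(y) for x, y in data) / len(data)

rng = random.Random(0)
source = make_domain(rng, offset=0.0)   # stands in for the training environment
target = make_domain(rng, offset=1.5)   # assumed shift from a new environment

threshold = fit_threshold(source)
source_acc = accuracy(threshold, source)
target_acc = accuracy(threshold, target)
print(f"source accuracy: {source_acc:.2f}, target accuracy: {target_acc:.2f}")
```

The classifier is unchanged between the two evaluations. Only the data distribution moves, and that alone is enough to push accuracy on the shifted domain toward chance.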

Intra-Domain Data Adaptation

The simplest and most effective transfer learning method is to fine-tune an existing model with a small amount of in-domain data so that it adapts to the current data scenario. The transfer learning algorithms above likewise depend on in-domain data, and applying voice assistants across vertical domains requires learning from data in each of those domains. This calls for a professional data company such as Magic Data to provide vertical-domain data to researchers in industry and universities, supporting both research on transfer learning algorithms and the application of voice assistants in various fields. Magic Data is a leading global AI data solutions provider, with over 400 speech datasets in more than 60 languages and dialects covering a wide variety of scenarios. Examples include:

MDT-ASR-E026 Mandarin Chinese Conversational Telephone Speech Corpus

MDT-ASR-E059 Turkish In-Vehicle Scripted Speech Corpus—Smart Mobility

For more information, check out www.magicdatatech.com/datasets
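The fine-tuning approach described in this section can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a speech model: it uses a one-dimensional logistic regression, synthetic data with an assumed offset of 1.5 between domains, and hypothetical helper names (make_domain, train, accuracy). The point it shows is that training continues from the pretrained parameters on only 20 labeled in-domain samples instead of starting from scratch.

```python
import math
import random

def make_domain(rng, offset, n_per_class, noise=0.3):
    """Toy 1-D features: class 0 is centered at offset, class 1 at offset + 2."""
    data = []
    for label in (0, 1):
        center = offset + 2.0 * label
        data += [(rng.gauss(center, noise), label) for _ in range(n_per_class)]
    return data

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def train(data, w, b, steps, lr=0.5):
    """Full-batch gradient descent on the logistic loss, starting from (w, b)."""
    n = len(data)
    for _ in range(steps):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = sigmoid(w * x + b) - y
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

def accuracy(w, b, data):
    """Fraction of samples whose predicted class (w*x + b > 0) matches the label."""
    return sum((w * x + b > 0) == bool(y) for x, y in data) / len(data)

rng = random.Random(0)
source_train = make_domain(rng, offset=0.0, n_per_class=200)  # large out-of-domain set
target_small = make_domain(rng, offset=1.5, n_per_class=10)   # small in-domain set
target_test = make_domain(rng, offset=1.5, n_per_class=200)   # held-out in-domain test set

# Pretrain on the source domain, starting from zero weights.
w0, b0 = train(source_train, 0.0, 0.0, steps=300)
base_acc = accuracy(w0, b0, target_test)

# Fine-tune: continue training from (w0, b0) on the small in-domain set.
w1, b1 = train(target_small, w0, b0, steps=200)
tuned_acc = accuracy(w1, b1, target_test)
print(f"before fine-tuning: {base_acc:.2f}, after fine-tuning: {tuned_acc:.2f}")
```

In this toy setting a handful of in-domain samples is enough to move the decision boundary to where the target data actually lies. Real speech models have far more parameters and usually call for smaller learning rates, or freezing some layers, to avoid overfitting the small in-domain set.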
