Does Your Voice Assistant Have Enough Vocabulary?

Date: 2022-11-09

"Hi Sir, please play Mozart's piano music." "Okay, which song do you want to play?" Voice assistants have now entered thousands of households: almost every home has a mobile phone, tablet or smart speaker with a voice assistant that is always on call. But how large is the vocabulary of these fluent voice assistants? Does their language learning need to start from ABC, just as it does for humans? The answer is no. They do not accumulate vocabulary gradually but rely on a pronunciation dictionary, which covers all the words that the voice assistant can recognize.

The pronunciation dictionary (lexicon) contains the mapping from words to phonemes (phones) and is used to connect the acoustic model and the language model. It holds the set of words that the system can handle, each marked with its pronunciation. Its relationship to the other modules of speech recognition is as follows: the pronunciation dictionary provides the mapping between the modeling units of the acoustic model and those of the language model, so that the two models can be combined into a search state space that the decoder uses for decoding. The recognition target is a word sequence (the result of sentence segmentation), and each word is converted through the pre-constructed pronunciation dictionary to its corresponding phoneme sequence (Chinese phonemes usually refer to the initials and finals in Pinyin). In other words, the word sequence is converted to a phoneme sequence.
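To make the word-to-phoneme mapping concrete, here is a minimal Python sketch. The toy lexicon, the ARPAbet-style phone symbols, and the function name `words_to_phones` are illustrative assumptions and are not taken from any Magic Data dictionary or product.

```python
# A toy lexicon: each word maps to one pronunciation, given as a list of phones.
# The entries and the ARPAbet-style symbols are illustrative assumptions.
LEXICON = {
    "play": ["P", "L", "EY"],
    "piano": ["P", "IY", "AE", "N", "OW"],
    "music": ["M", "Y", "UW", "Z", "IH", "K"],
}


def words_to_phones(words, lexicon):
    """Convert a word sequence to a phoneme sequence via dictionary lookup.

    Raises KeyError if a word is missing from the lexicon.
    """
    phones = []
    for word in words:
        # Lowercase the word so that lookup is case-insensitive.
        phones.extend(lexicon[word.lower()])
    return phones


if __name__ == "__main__":
    print(words_to_phones(["play", "piano", "music"], LEXICON))
```

Real lexicons often list several pronunciations for one word. This sketch keeps a single pronunciation per word to stay short.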

In a speech recognition system, the more words the pronunciation dictionary covers, the fewer out-of-vocabulary words the system encounters, which generally improves recognition accuracy. Pronunciation dictionaries are language-specific, so a separate dictionary needs to be prepared for each language. When new words appear, they can be added together with their phonetic transcriptions to keep expanding the dictionary. Vocabulary size, the accuracy of the phonetic transcription and the accuracy of proofreading are therefore important criteria for measuring the quality of a pronunciation dictionary.
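One simple way to ask whether a dictionary has "enough vocabulary" is to measure how many words in a sample of requests are missing from it, often called the out-of-vocabulary (OOV) rate. The sketch below is an illustration only: the helper names `oov_rate` and `add_entry`, the sample words and the phone symbols are all assumptions.

```python
def oov_rate(words, lexicon):
    """Return the fraction of word tokens that are missing from the lexicon."""
    if not words:
        return 0.0
    missing = sum(1 for w in words if w.lower() not in lexicon)
    return missing / len(words)


def add_entry(lexicon, word, phones):
    """Add a new word with its phone sequence; existing entries are left unchanged."""
    lexicon.setdefault(word.lower(), list(phones))


if __name__ == "__main__":
    # Illustrative entries with ARPAbet-style symbols (assumed, not from a real dictionary).
    lexicon = {
        "play": ["P", "L", "EY"],
        "music": ["M", "Y", "UW", "Z", "IH", "K"],
    }
    request = ["play", "podcast", "music", "podcast"]
    print(oov_rate(request, lexicon))  # coverage before the new word is added
    add_entry(lexicon, "podcast", ["P", "AA", "D", "K", "AE", "S", "T"])
    print(oov_rate(request, lexicon))  # coverage after the new word is added
```

Tracking this rate on real user requests is one way to decide which new words to transcribe and add first.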

At present, many pronunciation dictionaries are not very accurate because they are self-built rather than professionally curated, and this affects the performance of the speech recognition system. How to collect large, accurate and comprehensive pronunciation dictionaries has become another problem in the speech field. At the same time, because collecting, labeling and cleaning pronunciation dictionaries requires oversight by professional linguists and acousticians, very few open-source pronunciation dictionaries are available, and professional data companies are needed to collect and supply more pronunciation dictionary resources.

Magic Data has established a mature pronunciation dictionary construction process and has accumulated substantial basic research in phonetics and linguistics. It offers pronunciation dictionaries for a variety of languages and dialects, including Mandarin, French, Italian, Japanese and many others. Each dictionary has been comprehensively collected and meticulously annotated, and every word in it has been manually proofread. These high-quality dictionaries can be used to build larger, more comprehensive and more accurate pronunciation dictionaries, thereby improving the accuracy of speech recognition.

Try Magic Data's open-source pronunciation dictionaries at MagicHub:

Italian Lexicon

Indonesian Lexicon
