Whether in personalized short-video recommendations, optimal route planning for takeout delivery, or face recognition at payment, algorithm-driven AI technology is now widely applied across the consumer Internet industry.
In recent years, with the development of artificial intelligence technology, the performance of speech recognition applications has improved significantly. Many companies claim that the accuracy of their speech recognition technology exceeds 98%. Has speech recognition surpassed the human ear? There is more to discuss before we draw a final conclusion.
Recently, according to a Washington Post report, Blake Lemoine, a software engineer at Google, said that Google's artificial intelligence chatbot LaMDA (Language Model for Dialogue Applications) already has ‘consciousness’ and even a ‘soul’. One of the signs that distinguish humans from other species is that people believe they are conscious, and consciousness is a choice humans make about the world. If AI really has consciousness, then human beings could one day be taken over by AI robots.
Voice assistants play an increasingly important role in our lives. However, a voice assistant that is smart indoors may become "dumb" when used outdoors. Just like us humans, a voice assistant panics when it arrives in a new environment. People tend to be more nervous and cautious when communicating in a non-native environment. Humans are uncomfortable with things they are not familiar with, let alone artificial intelligence algorithms. Voice assistants need to adapt to different domains, a task known as transfer learning, which is an important but difficult problem in speech recognition, speech synthesis, speaker recognition and other speech fields.
On July 4, 2022, registration officially opened for the ISCSLP 2022 Conversational Short-phrase Speaker Diarization Challenge (CSSD), jointly sponsored by the Institute of Acoustics CAS, Northwestern Polytechnical University, Singapore A*STAR Institute of Information and Communication, Shanghai Jiao Tong University and Magic Data (Beijing Aishu Smart Technology Co., Ltd.). Groups and individuals from academia and industry are welcome to register for the competition.
Magic Data’s paper "Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset" has been accepted by INTERSPEECH 2022, the world's largest and most comprehensive conference on the science and technology of spoken language processing. Themed "Human and Humanizing Speech Technology", INTERSPEECH 2022 will take place September 18-22, virtually and in Incheon, Korea.
Since 2020, the COVID-19 pandemic has prevented many actors and actresses from filming, but AI technology has made virtual humans popular. In China, the most popular virtual human recently is Baidu's Du Xiaoxiao, a sweet and virtuous beauty who can sing "Every Minute, Every Day" with the AI character of Gong Jun, a Chinese TV celebrity. A painting that Du Xiaoxiao created in a few seconds sold for 170,000 RMB. Du Xiaoxiao also finished an 800-word college entrance examination essay in one second.
The 2022 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2022), one of the world’s leading NLP conferences, will take place July 10-15 in Seattle, Washington, and virtually. Magic Data is proud to be one of the sponsors of the event and to present our latest training data portfolio for artificial intelligence research and development to the estimated 3,000+ attendees.
Language is the foundation of communication, and barrier-free communication has always been a dream of mankind. However, differences in dialects, languages, and speaking styles create great communication barriers between people. Today, the development of artificial intelligence is breaking down these language barriers. Many multilingual speech recognition products have appeared on the market. In addition to Mandarin Chinese and English, these products support many other languages and dialects, removing national and regional barriers to communication between people. Achieving high-accuracy multilingual speech recognition remains challenging, however, because the system must understand multiple languages.