Magic Data Paper is Accepted by INTERSPEECH 2022

Date: 2022-07-04

Magic Data’s paper, "Open Source MagicData-RAMC: A Rich Annotated Mandarin Conversational (RAMC) Speech Dataset," has been accepted by INTERSPEECH 2022, the world's largest and most comprehensive conference on the science and technology of spoken language processing. Themed "Human and Humanizing Speech Technology", INTERSPEECH 2022 will take place September 18-22, both virtually and in Incheon, Korea.

To further enrich open source speech corpora and promote the development of spoken language processing technology and conversational AI, Magic Data officially released MagicData-RAMC, a 180-hour Chinese conversational speech dataset, in April of this year. The dataset is hosted on MagicHub, an open source community developed by Magic Data.

The paper introduces the MagicData-RAMC dataset and the research based on it, which Magic Data conducted together with the Institute of Acoustics of the Chinese Academy of Sciences, Shanghai Jiao Tong University, and Northwestern Polytechnical University.

As a collection of high-quality, richly annotated training data, MagicData-RAMC includes 351 sets of multi-turn Mandarin conversations recorded indoors on smartphones, with a total duration of 180 hours. The dataset supports research on speech recognition, speaker diarization, and keyword search.

For the automatic speech recognition (ASR) task, the research team uses a Conformer-based end-to-end (E2E) model implemented with the ESPnet2 toolkit. The model achieves a character error rate (CER) of 19.1.
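For readers unfamiliar with the metric, CER is commonly computed as the character-level edit distance between the reference transcript and the recognizer output, divided by the number of reference characters. The following is a minimal sketch of that common definition, not the scoring script used in the paper; the function names and the example sentences are illustrative assumptions.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Character-level Levenshtein distance (substitutions, insertions, deletions)."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion from ref
                            curr[j - 1] + 1,      # insertion into ref
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def cer(ref: str, hyp: str) -> float:
    """Character error rate: edit distance divided by reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Hypothetical example: one substituted and one deleted character
# against a six-character reference.
reference = "今天天气很好"
hypothesis = "今天天汽好"
print(f"CER = {cer(reference, hypothesis):.3f}")
```

Under this definition, a reported CER of 19.1 (read as a percentage) would correspond to roughly 19 character errors per 100 reference characters.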

For the speaker diarization task, the baseline system consists of three components: speaker activity detection (SAD), a speaker embedding extractor, and clustering. The experimental results show a Jaccard error rate (JER) of 47.49 on the test set.

The keyword search task follows the DTA Att-E2E-KWS approach, which relies on an attention-based E2E ASR framework and frame-synchronous phoneme alignments. The system achieves a precision of 0.8587.
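Precision here is the fraction of detected keyword occurrences that are correct. The sketch below shows the standard calculation on hypothetical data; the utterance IDs, keywords, and the simplification of matching detections by (utterance, keyword) pairs are assumptions for illustration and do not reproduce the paper's evaluation setup.

```python
def precision(detected: set, reference: set) -> float:
    """Fraction of detections that also appear in the reference annotations."""
    if not detected:
        raise ValueError("no detections to score")
    true_positives = len(detected & reference)
    return true_positives / len(detected)


# Hypothetical (utterance_id, keyword) pairs.
reference_hits = {("utt1", "天气"), ("utt2", "音乐"), ("utt3", "电影"), ("utt4", "天气")}
system_hits = {("utt1", "天气"), ("utt2", "音乐"), ("utt3", "电影"), ("utt5", "音乐")}

# Three of the four detections match the reference annotations.
print(f"precision = {precision(system_hits, reference_hits):.2f}")
```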

A preprint of the paper is available on arXiv:

https://arxiv.org/abs/2203.16844

About Magic Data

Magic Data is a global AI data solutions provider headquartered in Beijing. It provides professional data solutions to enterprises and academic institutions engaged in artificial intelligence R&D and application research in speech recognition (ASR), speech synthesis (TTS), natural language processing (NLP), and computer vision (CV).

About MagicHub

The MagicHub community is an open-source data platform developed by Magic Data. It is dedicated to assisting AI developers in model training and to promoting the development of an open-source ecosystem. For more information, contact open@magicdatatech.com.

