The Future of Virtual Companionship

Date: 2022-11-21

More and more young people are buying chat services on e-commerce platforms for virtual company, confiding in a "chat buddy" to talk and express their feelings. Prices vary with the degree of companionship, from tens of yuan to thousands of yuan for a customized "virtual lover". In recent years, virtual companionship services have become a fashionable way for young people to seek emotional comfort and make themselves heard on the Internet. Many stores on Taobao provide this service, offering personas such as the "gentle and cute little sweetheart" or the "overbearing president type". As long as you pay, you can find your favorite "buddy".

Virtual humans are springing up like mushrooms after rain. According to the consultancy EqualOcean, as of September 2022, investment and financing in China’s virtual digital human sector had already exceeded the previous year's total, reaching 2.49 billion yuan. In 2015, this figure was only 33 million yuan, giving a reported compound annual growth rate of 97.71%. With so much capital flowing into the sector, what makes virtual humans so fascinating?

Market Demand

The world that virtual characters offer is futuristic and borderless, a technological and artistic vision full of strange "images with nothing to see". People can build a good interactive relationship with virtual humans. The affection in that relationship is mutual and equal, and new imaginings arise through the interaction. People move constantly between being spectators and inhabiting the virtual characters themselves. So how is the powerful interactive ability of virtual humans achieved?

Interaction Ability

Interaction between virtual humans and people relies on understanding and generating text, voice, and visual signals, combined with action recognition and driving, environmental perception, and other methods. Multimodal human-computer interaction can closely simulate interaction between humans. Among these capabilities, speech recognition and speech synthesis are the core functions of virtual human interaction. Simply put, speech recognition is the technology that enables computers to recognize and understand human speech and transcribe it into text. It uses natural language processing (NLP) and machine learning to do so. The speech recognition flow chart of a virtual human is as follows:

The charming voice of a virtual human is synthesized from recordings of voice actors. Speech synthesis is the artificial production of human speech. A computer system used for this purpose is called a speech synthesizer, and it can be implemented in software or hardware. A text-to-speech (TTS) system converts written text into speech. Its process is as follows:
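To make the two stages concrete, here is a minimal Python sketch of a single interaction turn: speech is recognized as text, a reply is chosen, and the reply is synthesized back into audio. It is an illustration under stated assumptions, not the pipeline from the article's flow charts. The names `InteractionTurn`, `recognize`, `respond`, and `synthesize` are hypothetical, and the stub functions only stand in for real ASR, dialogue, and TTS models.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class InteractionTurn:
    """One user turn: speech in, speech out. Every stage is pluggable."""

    recognize: Callable[[bytes], str]   # ASR stage: audio -> text
    respond: Callable[[str], str]       # dialogue stage: text -> reply text
    synthesize: Callable[[str], bytes]  # TTS stage: reply text -> audio

    def run(self, audio: bytes) -> bytes:
        text = self.recognize(audio)
        reply = self.respond(text)
        return self.synthesize(reply)


# Placeholder stages so the sketch runs without any model (assumptions only).
def fake_asr(audio: bytes) -> str:
    # Pretend the audio bytes already hold a UTF-8 transcript.
    return audio.decode("utf-8")


def fake_dialogue(text: str) -> str:
    return f"You said: {text}"


def fake_tts(reply: str) -> bytes:
    # Pretend the encoded reply text is a waveform.
    return reply.encode("utf-8")


turn = InteractionTurn(fake_asr, fake_dialogue, fake_tts)
result = turn.run(b"hello")
```

Keeping each stage behind a plain callable means a trained recognizer or synthesizer could replace a stub without changing the turn logic.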

Both speech synthesis and speech recognition algorithms for virtual humans require large amounts of high-quality, precisely annotated corpora for training. The quality and quantity of data often determine how well a deep learning algorithm can be optimized. The larger the dataset and the more accurate the labeling, the smarter the trained virtual human will be: its communication and interaction with people will be smoother, and its synthesized speech will sound more human. Data is the cornerstone of all deep learning tasks. Since researchers need to focus most of their energy on developing new algorithms and models, data collection calls for the assistance of professional data companies. Magic Data is a professional AI data solutions provider with a large amount of ASR and TTS data. It provides speech recognition corpora in various languages and scenarios, as well as a large number of accurately annotated TTS corpora. Try the open-source TTS datasets at MagicHum.com, a data-centric open-source community released by Magic Data.

TTS-SCCUSSERFSC: A Scripted Chinese Customer Service Female Speech Corpus

TTS-SCFCHILSC: A Scripted Chinese Female Child’s Speech Corpus

For more information, contact business@magicdatatech.com.

