According to Reuters, Amazon will upgrade Alexa with a voice-cloning capability that can simulate a human voice from less than a minute of recorded audio. Rohit Prasad, Amazon's senior vice president, said that many people lost loved ones during the pandemic, and the new feature can help their memory live on. This is a benefit for people with speech impairments or those grieving a relative, but for those with ulterior motives it is a tool of deception enabled by AI, namely DeepFake. DeepFake technology has risen rapidly in recent years and is widely used in political smearing, military deception, economic crime, and even terrorism between countries, posing new threats to political, economic, social, and national security.
Discussion about responsible AI -- the practice of designing, developing, and deploying AI with good intentions, to empower employees and businesses and to impact customers and society fairly -- is widespread.
In general, DeepFake refers to replacing the face in an image or video, or the voice in a recording, through AI synthesis technology. For example, in 2017 a Reddit user named "deepfakes" posted fake videos that mapped the faces of actresses such as Scarlett Johansson onto porn performers. Similarly, a fraud gang recently exposed by CCTV used robots to place 17 million automated harassing calls for overseas fraud organizations, screened out more than 800,000 "valid" victims, and collected nearly 180 million yuan in recruitment commissions. Compared with synthetic forgery of images and videos, audio synthesis is easier, more widely used in daily life, and more likely to be exploited by fraud gangs. Within speech synthesis, voice conversion -- converting one person's timbre into another person's timbre -- is the mainstream AI voice-cloning technique.
A typical speech conversion scheme consists of speech analysis, mapping, and reconstruction modules, as shown in the figure below; this is known as the analysis-mapping-reconstruction pipeline. The model extracts speaker representations for both the source and target speakers, then replaces the source speaker's representation in the speech with that of the target speaker. The figure is from Berrak Sisman's paper "An Overview of Voice Conversion and its Challenges: From Statistical Modeling to Deep Learning". Acoustic models for this task have evolved from traditional statistical models to deep learning models.
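The analysis-mapping-reconstruction pipeline can be sketched in code. This is a deliberately simplified illustration, not a real system: actual models extract mel-spectrograms and F0, learn the speaker mapping with a neural acoustic model, and reconstruct audio with a neural vocoder. Here the three stages are stand-in functions, and speaker identity is modeled as a simple additive bias over magnitude spectra.

```python
import numpy as np

# Illustrative sketch of analysis-mapping-reconstruction.
# All three functions are simplified stand-ins for real modules.

def analyze(waveform: np.ndarray, frame_size: int = 256) -> np.ndarray:
    """Analysis: split the waveform into frames and extract spectral
    features (a real system would use mel-spectrograms, F0, etc.)."""
    n_frames = len(waveform) // frame_size
    frames = waveform[: n_frames * frame_size].reshape(n_frames, frame_size)
    return np.abs(np.fft.rfft(frames, axis=1))  # magnitude spectra

def map_speaker(features: np.ndarray,
                source_emb: np.ndarray,
                target_emb: np.ndarray) -> np.ndarray:
    """Mapping: swap speaker identity. Here identity is a simple
    additive bias; real systems learn this mapping with a neural
    acoustic model conditioned on speaker embeddings."""
    return features - source_emb + target_emb

def reconstruct(features: np.ndarray) -> np.ndarray:
    """Reconstruction: invert features back to a waveform (a real
    system would use a neural vocoder such as HiFi-GAN)."""
    frames = np.fft.irfft(features, axis=1)
    return frames.reshape(-1)

# Toy usage: "convert" a random source utterance.
rng = np.random.default_rng(0)
source_wave = rng.standard_normal(1024)
feats = analyze(source_wave)
src_emb = feats.mean(axis=0)   # stand-in source speaker embedding
tgt_emb = src_emb * 0.8        # stand-in target speaker embedding
converted = reconstruct(map_speaker(feats, src_emb, tgt_emb))
print(converted.shape)
```

The structure mirrors the figure: extract features, substitute the speaker representation, then synthesize a waveform from the modified features.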
The voice conversion technology above requires professional TTS/VC data for training. Depending on the design of the algorithm, the data can be parallel or non-parallel. Magic Data provides professional TTS data services covering dozens of languages, including English, Mandarin, Portuguese, and Korean.
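The difference between parallel and non-parallel corpora can be made concrete with a small sketch. The file names and layout below are hypothetical, invented for illustration only:

```python
# Parallel corpus: source and target speakers read the SAME sentences,
# so utterances can be paired one-to-one and time-aligned for training.
parallel_corpus = [
    {"text": "The weather is nice today.",
     "source": "spk_A/utt_001.wav", "target": "spk_B/utt_001.wav"},
    {"text": "Please close the door.",
     "source": "spk_A/utt_002.wav", "target": "spk_B/utt_002.wav"},
]

# Non-parallel corpus: each speaker reads DIFFERENT sentences; the model
# (e.g. CycleGAN-VC, or an auto-encoder with speaker embeddings) must
# learn the conversion without any paired utterances.
non_parallel_corpus = {
    "spk_A": ["spk_A/utt_101.wav", "spk_A/utt_102.wav"],
    "spk_B": ["spk_B/utt_201.wav", "spk_B/utt_202.wav"],
}

print(len(parallel_corpus), len(non_parallel_corpus["spk_A"]))
```

Parallel data is easier to train on but more expensive to record; non-parallel methods trade modeling difficulty for much cheaper data collection.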
Detection of Fake Voice
While voice conversion brings benefits to mass media, it may also give criminals opportunities for fraud. Fake-speech detection has therefore become a new direction in the field of deep learning. Most mainstream detection techniques are themselves based on deep learning models: a discriminator is added on top of the downstream task to judge whether the data fed to the model is genuine or synthetic. Training both the discriminative model and the synthesis model depends on the support of TTS data.
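The discriminator idea can be shown with a minimal sketch: a binary classifier over utterance-level features that labels input as genuine or fake. Real detection systems use deep networks over spectrograms (as in the ASVspoof challenges); the toy logistic-regression classifier and synthetic features below only illustrate the training principle.

```python
import numpy as np

# Minimal fake-voice discriminator sketch: logistic regression over
# toy utterance-level features. The data is synthetic for illustration:
# "genuine" features cluster around +1, "fake" features around -1.
rng = np.random.default_rng(42)
real = rng.normal(loc=1.0, scale=0.5, size=(100, 8))
fake = rng.normal(loc=-1.0, scale=0.5, size=(100, 8))
X = np.vstack([real, fake])
y = np.concatenate([np.ones(100), np.zeros(100)])  # 1 = genuine, 0 = fake

w = np.zeros(8)
b = 0.0
lr = 0.1
for _ in range(200):  # gradient descent on the logistic (log-loss) objective
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

pred = (1.0 / (1.0 + np.exp(-(X @ w + b)))) > 0.5
accuracy = np.mean(pred == y)
print(f"training accuracy: {accuracy:.2f}")
```

In a production detector, the features would come from a learned front-end and the classifier would be a deep network, but the structure is the same: a discriminator trained on labeled genuine and synthetic speech.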
Deep learning models bring infinite possibilities to the development of artificial intelligence. At the same time, the principles of making AI systems transparent, fair, secure, and inclusive should be borne in mind when building advanced AI. Only in this way can we trust AI and scale it up with confidence.