Voice cloning technology is rapidly improving, and it is attracting growing interest not only from actors but also from cybercriminals. Here is how the technology works, what risks it poses, and where it stands today.
Voice cloning means using software to create a synthetic, adaptable copy of a person’s voice. Given a recording of someone’s speech, the program can reproduce that person’s voice, pronouncing any words or sentences the operator types on the keyboard.
How does voice cloning work?
Recent advances have made computer-generated speech remarkably accurate. The software can capture not only a person’s accent but also their timbre, pitch, tempo, speech flow, and even breathing.
A cloned voice can also be adjusted to convey any desired emotion, such as anger, fear, happiness, love, or boredom.
This is made possible by advances in machine learning. Previously, creating a realistic synthetic voice required an actor in the studio: the recorded speech was split into its constituent sounds, which were then recombined to form new words.
Today, neural networks can be trained on unsorted recordings of the target voice: the data is loaded, the network is trained, and the program then speaks whatever text is required in the synthetic voice.
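The older concatenative approach described above can be illustrated with a toy sketch. Everything here is illustrative: the short lists of numbers stand in for real recorded audio fragments, and the phoneme labels are assumptions for the sake of the example.

```python
# Toy sketch of concatenative speech synthesis (illustrative only).
# Each "unit" is a short recorded fragment of the actor's voice;
# here plain lists of numbers stand in for audio samples.
unit_bank = {
    "HH": [0.1, 0.2, 0.1],
    "EH": [0.3, 0.4, 0.3],
    "L":  [0.2, 0.1, 0.2],
    "OW": [0.5, 0.4, 0.5],
}

def synthesize(phonemes):
    """Stitch recorded units together to form a new word's waveform."""
    waveform = []
    for p in phonemes:
        waveform.extend(unit_bank[p])
    return waveform

# "hello" approximated as a phoneme sequence
audio = synthesize(["HH", "EH", "L", "OW"])
print(len(audio))  # 12 samples in this toy example
```

A real system works on actual audio and must also smooth the joins between units; neural approaches instead learn to generate the waveform directly from the training data, which is why they no longer need carefully segmented recordings.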
How is the technology used today?
In an interview with the BBC, Tim Heller, a 29-year-old voice actor and sound engineer from Texas, described how he uses voice cloning in his work and how it has helped his career. A voice clone comes in handy, for example, when an actor is booked in two different studios at the same time, or when he has too many orders. To clone his voice, Heller turned to VocaliD, a company founded in 2014 by Rupal Patel, professor of communication sciences and disorders at Northeastern University. Patel started the business as an extension of her clinical work, creating artificial voices for patients who cannot speak without assistance, such as people who have lost their voice after surgery or illness.
Patel says AI software can learn and adapt on its own, and as a result the technology has advanced significantly over the past few years. In voice acting, cloning can also be used to translate an actor’s words into different languages. That is bad news for dubbing actors: American film production companies, for instance, would no longer need to hire actors to dub films for overseas distribution. Representatives of the Canadian company Resemble AI say their software can now convert cloned English voices into 15 other languages. Its CEO, Zohaib Ahmed, said that to create a quality copy of someone’s voice, the program requires only 10 minutes of recorded speech.
What is the danger of technology?
While voice cloning has obvious commercial potential, the technology raises concerns among security experts, since it can also be used by cybercriminals. Cybersecurity expert Eddie Bobritzky, for example, emphasizes that synthetic voices pose a “huge security risk.”
“People have become accustomed to being wary of messages on social networks or sent by e-mail. It has been known for many years that it is quite easy to impersonate others using these formats of communication,” explains the head of the Israeli firm Minerva Labs. “Until now, talking on the phone with someone you trust and know well has been one of the most reliable ways to make sure you are not being fooled or tricked.”
Indeed, when we receive a questionable message from a friend, such as an urgent request to transfer money, the most common precaution is to call back and make sure we are not being deceived.
However, Bobritzky says the situation is changing. “If a boss calls an employee asking for confidential information, and the employee recognizes the manager’s voice, he may comply immediately. This opens the door to a great deal of cybercrime.”
Scammers have already used voice clones to trick companies into transferring money to criminals’ accounts. Two years ago, The Wall Street Journal reported that the chief executive of a British energy company had been tricked into transferring €200,000 to a Hungarian supplier. He was confident that he was receiving instructions from his boss, but that was not the case. The company’s insurer, Euler Hermes Group SA, told the WSJ that the fraudster had used artificial intelligence software to mimic the executive’s voice.
“The program was able to mimic the voice, as well as the tonality, intonation, punctuation, and German accent,” a spokesman for Euler Hermes later told The Washington Post. The phone call was accompanied by an e-mail, and the CEO of the energy company did what was asked of him. The money disappeared irretrievably, routed through accounts in Hungary and Mexico.
What’s the bottom line?
One thing is for sure: in the future, anyone who wants to will be able to create their own AI voice clone. But the script for this chorus of digital voices has yet to be written. As with face deepfakes, law and ethics have yet to catch up with the technology.