Watch out, Alexa. Artificial voices are starting to sound just like humans

The ad, part of a demo reel on YouTube created by a new startup called WellSaid Labs, is short and slick. But something is a little different. While the model you see is a human being, the voice-over in the background only sounds like one.

The Seattle-based company uses voice actors and artificial intelligence to create synthetic voices that sound a lot like humans. The company claims that the text-to-speech software it has worked on for the past year can produce voices that sound more human than other synthetic voices. The reason, according to the company, is that it does not strictly control different variables of speech such as speed, pronunciation and volume when training its voice model.

“The voice we are trying to create here is super expressive and realistic in its end result,” WellSaid Labs CEO Matt Hocking told CNN Business.

Computerized voices seem to be everywhere these days, delivering news from a smart speaker in your living room or giving you turn-by-turn directions in the car. Yet Alexa, Siri, Google Assistant and the others you are likely to hear still tend to speak in stilted, robotic voices. (A notable exception, Google Duplex, can call some businesses to make reservations with an impressively human-sounding, AI-enabled voice. Google is making it increasingly available, but you have to be on the receiving end of a phone call, at a restaurant, for example, to hear it.)

However, WellSaid Labs does not plan to take over the voice assistant market. Rather, Hocking said, it would sell voices to companies that want to use them in advertising, marketing and e-learning courses.

The company says it is building a number of human-like voices that customers will be able to use, and it hopes to work with voice actors to create different data sets that can be used to generate all kinds of artificial voices.

You’ve probably heard of stock photos; you might think of these as stock voices.

To make the woman’s voice in the faux ads, WellSaid Labs first had a voice actor read articles from Wikipedia. These recordings formed a dataset that it used to train an artificial neural network – a computer system whose structure is loosely modeled on neurons in the brain.

Another online demo shows how similar the AI-generated voices can sound to the actors’ own, switching between two almost indistinguishable voices, one from a human voice-over actor and one generated by AI, that both sound like a middle-aged woman. You may occasionally notice some differences, but they are small; the emphasis you expect on a word might be slightly off, for example.

The start-up said that it does not have to pre-process or annotate the text given to the software for it to do things like emphasize words in a natural way – something that is difficult for an artificial voice to do without help (though companies like Google have been working on it). And if you fed the same text to its text-to-speech generator twice, you would get different results.

It currently takes about four seconds to generate speech for a string of text, said chief engineer Michael Petrochuk. The model is not designed to interpret long pieces of text. It can be used to speak several sentences, but the text of an entire CNN Business article, for example, would have to be cut into pieces before it could be analyzed and spoken by a WellSaid Labs voice.

It is difficult to make a synthetic voice sound consistently good. Alan Black, a professor of language technology at Carnegie Mellon University, said that the voices we know, like Amazon’s Alexa, sound robotic because it is tricky to make them sound natural in all situations. It is difficult, he said, to give a speech synthesizer the right amount of information so that it can respond with the right amount of feeling.

“We don’t have a little knob on our synthesizer to say, ‘Feel 87%,’” he said.

He listened to some of WellSaid Labs’ demo voices and thought they sounded “pretty good.”

But if artificial voices sound close to, or indistinguishable from, people, should listeners be told that they are not listening to a real person? After Google showed off Duplex in 2018 with a call that its human-sounding AI made to a Bay Area restaurant, the technology company was criticized for not having the AI disclose what it was.

Black does not believe that such disclosure is necessary, at least in the context of ads.

“I think that most people in general are relatively aware that what they see in video and hear in audio is somehow processed,” he said. “They know that when they watch ‘The Lord of the Rings’ there are not really as many orcs in New Zealand as are shown in the movie.”


Published by
Faela