In 2021, Yandex Browser received a major update that added automatic translation and voice-over for videos on popular platforms like YouTube.

At launch, only English was available. Two years have passed since then, and the list of available languages has grown to five:

• English
• German
• Spanish
• French
• Chinese

I discovered this powerful feature only recently and realized how convenient it is. I decided to watch WWDC 2023 with the neural voice-over so as not to miss anything important.

Yes, there are always subtitles, but I have a complicated relationship with them. They are convenient, but they literally chain you to the screen, which is a problem if you don’t know the language or need to do other things at the same time.

With WWDC, for example, I need to write a large number of articles one after another, and my English listening comprehension is far from perfect. That is why it is very convenient when someone translates the presentation simultaneously: you write the text and listen to what is being said at the same time.

At launch, the feature only voiced pre-recorded videos in Russian; now it has gone further and translates live broadcasts as well, which is very convenient. You can even tell one speaker from another: different synthesized voices are selected for them.

How does simultaneous translation work in general, and how do you use it? Let’s figure it out.

How automatic video translation works in Yandex Browser

To implement the feature, Yandex uses its own Translate service, speech technologies and voice biometrics. Notably, not one but six neural networks are involved in voicing a video.

First. Determines the language of the speaker. If it is a foreign language, it offers to translate the content into Russian.

Second. Converts speech to text. The model removes extraneous sounds from the audio track it receives. It also filters out filler words, so the final text is “cleaner”.

Third. Normalizes the text and places punctuation marks.

That is, it receives a set of recognized words, builds grammatical sentences from them and handles punctuation, while preserving the original meaning.

The neural network gathers context in order to better understand what the video is about, and then works out the meaning by itself. But sometimes this takes a little more time. In the case of streaming, it is a trade-off between quality and latency.

If we are not sure whether a sentence break belongs at a given point, we hold the words a little longer until new ones arrive. Then we either get a better idea of where the split should be, or we exceed the context limit and split where a break is most likely.

— Yandex

Fourth. Determines the number of speakers, their gender and the pronouns to use for them. The voice type (male or female) is determined from the sound frequencies: 80-150 Hz for men, 150-250 Hz for women.

Fifth. Translates the text into Russian.

Sixth. Synthesizes speech and synchronizes it with the video. It pauses at the same moments as the speaker and also follows the speaker’s pace, sometimes speeding up or slowing down.
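
The frequency ranges quoted for the fourth network can be illustrated with a small sketch. This is not Yandex's code: the function names are hypothetical, the pitch estimates are assumed to come from some earlier analysis step, and assigning the shared 150 Hz boundary to the female range is an arbitrary choice made here because the article does not say how it is handled.

```python
from statistics import median
from typing import Optional

# Ranges quoted in the article. Treating exactly 150 Hz as "female"
# is an assumption of this sketch, not a documented rule.
MALE_RANGE_HZ = (80.0, 150.0)
FEMALE_RANGE_HZ = (150.0, 250.0)


def voice_type(pitch_hz: float) -> Optional[str]:
    """Return 'male', 'female', or None if the pitch is outside both ranges."""
    if MALE_RANGE_HZ[0] <= pitch_hz < MALE_RANGE_HZ[1]:
        return "male"
    if FEMALE_RANGE_HZ[0] <= pitch_hz <= FEMALE_RANGE_HZ[1]:
        return "female"
    return None


def speaker_voice_type(pitches_hz: list[float]) -> Optional[str]:
    """Classify a speaker from several pitch estimates using their median."""
    if not pitches_hz:
        return None
    return voice_type(median(pitches_hz))
```

Using the median rather than a single measurement keeps one stray estimate from flipping the result; a real system would likely rely on a trained model rather than fixed thresholds.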

How auto translation of broadcasts works


The scheme of the translator’s work

Clearly, a stream is not a finished video. It cannot be processed in advance and have a voice-over laid on top.

In the first case the neural network receives the whole audio track up front and works with it; in the second there is no such time reserve. The system has to work like a simultaneous interpreter, on the fly.


For this reason, the technologies differ slightly, but the language models used are the same. In the case of broadcasts, the third neural network, the one responsible for normalization, comes to the rescue. It recognizes the beginning and end of a sentence, highlights introductory words, identifies compound sentences, and so on.

Once the neural network has placed all the punctuation marks, the system has sentences that express complete thoughts and sends them on for translation.
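
The "hold words until the sentence boundary is clear" behaviour described for streams can be sketched roughly as follows. Everything here is an assumption for illustration: the `SentenceBuffer` class, the `boundary_score` input (a hypothetical 0 to 1 estimate that a sentence ends after the given word) and the threshold and limit values are not taken from Yandex.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SentenceBuffer:
    """Holds recognized words until a sentence boundary is likely enough.

    threshold and max_words are illustrative values, not real settings.
    """
    threshold: float = 0.8
    max_words: int = 12
    words: list = field(default_factory=list)
    scores: list = field(default_factory=list)

    def push(self, word: str, boundary_score: float) -> Optional[str]:
        """Add one word; return a finished sentence, or None to keep waiting."""
        self.words.append(word)
        self.scores.append(boundary_score)
        if boundary_score >= self.threshold:
            # Confident that the sentence ends here: emit the whole buffer.
            return self._cut(len(self.words))
        if len(self.words) >= self.max_words:
            # Context limit reached: split where a break is most likely.
            best = max(range(len(self.scores)), key=self.scores.__getitem__)
            return self._cut(best + 1)
        return None

    def _cut(self, n: int) -> str:
        """Remove the first n words from the buffer and join them."""
        sentence = " ".join(self.words[:n])
        del self.words[:n]
        del self.scores[:n]
        return sentence
```

Each returned sentence would then be passed on to translation. Raising `max_words` gives the model more context at the cost of a longer delay, which is the quality versus latency trade-off mentioned above.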

The delay in translating live broadcasts can range from 20 to 50 seconds. Not a bad result: you do not fall far behind the live broadcast.

I was especially pleased that the system can use different voices. This seems to be a relatively recent addition: at launch only two voices were available, one male and one female. Now each of them has several variations.

How to enable automatic video translation

The feature is available on iOS, Android, Windows and macOS in the Yandex app or Yandex Browser.

To start, just open any video on popular platforms such as YouTube, Rutube, Vimeo and so on. Once the video starts, the voice-over button appears automatically. All that remains is to wait.

For regular videos, starting the translator takes a couple of seconds. For broadcasts it usually takes about 15-20 seconds, and if it is already running, the same couple of seconds.

Try it, it’s a very useful feature

Many foreign videos are not available in Russian. Only a few bloggers order dubbing for themselves.

With auto-translation in Yandex Browser, this problem is solved in no time. In a couple of clicks, I am already watching a video in Russian and do not even have to suffer through subtitles. The sound itself works well.

The only thing I personally miss is “liveliness” in the voice-over itself. I would like it to sound closer to the original intonations. Also, for now the feature works with far from all streams. I hope this gets fixed soon.






Source: Iphones RU
