Thus, Yandex.Browser users can now watch English-language videos with a multi-voice voice-over translation into Russian.

Initially the technology used two synthesized voices for the translation, one male and one female. There are now twelve voices, six of each gender.

The neural network reportedly “assigns” voices to the different speakers and then “remembers” them, using AI models developed at Yandex.

Moreover, it all works in several stages: first, one neural network converts speech into text, restores punctuation and marks sentence boundaries; then another analyzes the spectrogram of the voice and marks the parts spoken by different people.
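To make the two stages more concrete, here is a minimal Python sketch of how their outputs might be combined. It is an illustration only, not Yandex's implementation: the `Sentence` and `SpeakerSegment` types, the voice names, the assumption that the second model also reports a gender for each speaker, and the rule of picking the segment with the largest time overlap are all hypothetical. Only the pool size (six voices per gender) comes from the article.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Hypothetical voice pool: six synthesized voices per gender, as in the article.
VOICES = {
    "male": [f"male_{i}" for i in range(1, 7)],
    "female": [f"female_{i}" for i in range(1, 7)],
}


@dataclass
class Sentence:
    """Output of the first stage (assumed shape): text plus timestamps."""
    text: str
    start: float
    end: float


@dataclass
class SpeakerSegment:
    """Output of the second stage (assumed shape): who speaks when."""
    speaker_id: str
    gender: str  # "male" or "female"; assumed to be reported by the model
    start: float
    end: float


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length of the time overlap between two intervals, never negative."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def dominant_segment(
    sentence: Sentence, segments: List[SpeakerSegment]
) -> Optional[SpeakerSegment]:
    """Return the speaker segment that overlaps the sentence the most."""
    best: Optional[SpeakerSegment] = None
    best_overlap = 0.0
    for seg in segments:
        shared = overlap(sentence.start, sentence.end, seg.start, seg.end)
        if shared > best_overlap:
            best, best_overlap = seg, shared
    return best


class VoiceCaster:
    """Assigns a voice to each speaker once and remembers the choice."""

    def __init__(self) -> None:
        self._assigned: Dict[str, str] = {}
        self._next_index = {"male": 0, "female": 0}

    def voice_for(self, speaker_id: str, gender: str) -> str:
        if speaker_id not in self._assigned:
            pool = VOICES[gender]
            # Wrap around if a video has more than six speakers of one gender.
            index = self._next_index[gender] % len(pool)
            self._assigned[speaker_id] = pool[index]
            self._next_index[gender] += 1
        return self._assigned[speaker_id]


def cast_sentences(
    sentences: List[Sentence], segments: List[SpeakerSegment]
) -> List[Tuple[str, Optional[str]]]:
    """Pair every sentence with a voice, or None if no speaker overlaps it."""
    caster = VoiceCaster()
    result: List[Tuple[str, Optional[str]]] = []
    for sentence in sentences:
        seg = dominant_segment(sentence, segments)
        voice = caster.voice_for(seg.speaker_id, seg.gender) if seg else None
        result.append((sentence.text, voice))
    return result
```

In this sketch a speaker who returns later in the video keeps the voice they were given the first time, which is one simple reading of what “remembers” could mean here.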

Source: Ferra
