SberDevices has announced GigaAM, a family of open-source models.

The models are designed to accurately recognize Russian speech and emotions. They can also be used when writing scientific articles and dissertations.

The family consists of three neural models: GigaAM, GigaAM-CTC and GigaAM-Emo.

GigaAM is an audio foundation model pre-trained on Russian speech. It is meant to be adapted to various audio tasks, including speech and emotion recognition, speaker detection, and many others.

GigaAM-CTC is a model for recognizing Russian-language speech queries. According to the company, it makes 20–35% fewer word errors on queries than NeMo-Conformer-RNNT and Whisper-Large-v3.
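To make the "fewer word errors" claim concrete, here is a minimal sketch of how a word error rate (WER) and a relative reduction between two models are commonly computed. The article does not say how SberDevices measured errors, so treating the figure as a relative WER reduction is an assumption, and the function names and sample phrases below are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Share of the baseline's errors that the new model avoids."""
    return (baseline_wer - new_wer) / baseline_wer


# Illustrative transcripts only, not real model output.
wer = word_error_rate("включи свет на кухне", "включи свет кухне")
print(f"WER: {wer:.2f}")

# Hypothetical numbers: a baseline at 10% WER and a new model at 7% WER.
print(f"Relative reduction: {relative_reduction(0.10, 0.07):.0%}")
```

Under this reading, a model at 7% WER makes 30% fewer word errors than a baseline at 10% WER, which falls inside the 20–35% range the company quotes.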

GigaAM-Emo is an acoustic model for detecting emotions. It achieved the best results among known models on the largest dataset, Dusha ("Soul").


Comparison of GigaAM with comparable models

SberDevices notes that all of the new models are publicly available under a non-commercial license.

The new models are available on the SaluteSpeech API platform and in the SaluteSpeech App. Businesses will be able to integrate them into their own solutions, and the app can be used, for example, to test recognition at lectures or during meetings.






Source: Iphones RU
