SberDevices has released GigaAM, a family of open-source machine learning models for speech and emotion recognition, the company said in a statement.
The acoustic models can be used in academic work, such as dissertations and scientific articles. They were developed by the GigaChat and SaluteSpeech teams at SberDevices.
- GigaAM is an audio foundation model pre-trained on a wide variety of Russian speech. It can be adapted to various audio tasks, including speech recognition, emotion recognition, speaker identification, and others.
- GigaAM-CTC is an open model for recognizing spoken queries in Russian. An evaluation on seven data slices (ranging from smart-speaker requests to telephone-channel recordings) showed that the model makes 20% to 35% fewer word errors on short queries than solutions such as NeMo-Conformer-RNNT and Whisper-Large-v3.
- GigaAM-Emo is an acoustic model built to detect emotions. According to SberDevices, it achieved the best result among known models on Dusha, the largest such dataset.
All models are publicly available under a non-commercial license.
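
For readers unfamiliar with the "fewer word errors" figure quoted for GigaAM-CTC, the sketch below shows how word error rate (WER) is commonly computed and how a relative reduction between two systems can be derived from it. It assumes the 20% to 35% figure is a relative WER reduction, which the statement does not spell out. The function names, example sentences and numbers are hypothetical and are not taken from the SberDevices evaluation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def relative_error_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's word errors that the new system avoids."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical example: one dropped word out of four reference words.
print(word_error_rate("turn on the light", "turn on light"))  # 0.25

# Hypothetical numbers: a drop from 10% to 7% WER is a reduction of about 30%.
print(relative_error_reduction(0.10, 0.07))
```

Under this reading, a model with 30% fewer word errors than a baseline at 10% WER would sit at roughly 7% WER on the same data.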
Author: Anastasia Marina
Source: RB
