SberDevices has introduced GigaAM, a family of open-source machine learning models for speech and emotion recognition, the company said in a statement.


The acoustic models can be used in preparing dissertations and scientific articles. They were developed by the GigaChat and SaluteSpeech service teams at SberDevices.

  • GigaAM is an audio foundation model pre-trained on a variety of Russian speech. It can be adapted to various audio processing tasks, including speech and emotion recognition, speaker identification, and others.
  • GigaAM-CTC is an open model for recognizing spoken queries in Russian. A quality evaluation on 7 data slices (ranging from requests to smart speakers to telephone channel recordings) showed that the model makes 20% to 35% fewer word errors on short queries than solutions such as NeMo-Conformer-RNNT and Whisper-Large-v3.
  • GigaAM-Emo is an acoustic model created to detect emotions. According to SberDevices, it demonstrated the best result among known models on Dusha, the largest dataset.

All models are publicly available under a non-commercial license.


Anastasia Marina

Source: RB
