Inspired by diffusion models such as Stable Diffusion, this method first replaces the words with masked gaps and then gradually “fills in the gaps” to produce the final answer. This lets Mercury work in parallel, which makes it much faster than models that generate text sequentially, one token at a time.
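The gap-filling idea can be illustrated with a toy sketch. This is not Inception Labs' algorithm: the function name `toy_denoise`, the `MASK` symbol, and the `steps` parameter are hypothetical, and the "prediction" simply copies from a known target where a real model would predict each token. It only shows how several positions can be filled in a single pass.

```python
import random

MASK = "_"  # hypothetical placeholder for a masked-out word


def toy_denoise(target, steps=4, seed=0):
    """Start from a fully masked sequence and reveal several positions per step.

    `target` stands in for what a real model would predict; here we just copy it.
    Returns the list of intermediate sequences, from all-masked to complete.
    """
    rng = random.Random(seed)
    tokens = [MASK] * len(target)
    hidden = list(range(len(target)))
    rng.shuffle(hidden)  # positions are filled in no particular order
    history = [tokens.copy()]
    per_step = -(-len(target) // steps)  # ceiling division
    while hidden:
        batch, hidden = hidden[:per_step], hidden[per_step:]
        for i in batch:  # all positions in a batch are filled in the same pass
            tokens[i] = target[i]
        history.append(tokens.copy())
    return history


if __name__ == "__main__":
    sentence = "the quick brown fox jumps over the lazy dog".split()
    for state in toy_denoise(sentence, steps=3):
        print(" ".join(state))
```

In this sketch the nine-word sentence is completed in three passes, whereas a strictly sequential generator would need nine steps, one per word.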
According to Inception Labs, Mercury can generate more than 1,000 tokens per second on NVIDIA H100 GPUs, significantly faster than comparable models. For example, Mercury Coder Mini is said to run 19 times faster than GPT-4o Mini while maintaining similar accuracy on coding tasks.
Mercury Coder can be tried out on the startup's demo site.
Source: Ferra

I am a professional journalist and content creator with extensive experience writing for news websites. I currently work as an author at Gadget Onus, where I specialize in covering hot news topics. My written pieces have been published on some of the biggest media outlets around the world, including The Guardian and BBC News.