xAI, Elon Musk's OpenAI competitor, has unveiled its first multimodal model, Grok-1.5 Vision (Grok-1.5V). The model can process visual information in documents, diagrams, charts, tables, screenshots and photographs. It will soon be available to early testers and existing Grok users.

Elon Musk's xAI has presented the Grok-1.5V multimodal model

The announcement came a few weeks after xAI presented Grok-1.5, an updated version of its chatbot model. It is another step for Musk's company, which has set the creation of "useful public AI" as a key objective, VentureBeat notes.

The company cites several examples of what Grok-1.5V can do: turning a sketched flowchart into Python code, converting a table to a CSV file, writing a bedtime story based on a child's drawing, and explaining a meme.

xAI representatives claim that the model stands out from comparable models (GPT-4V, Claude 3 Sonnet, Claude 3 Opus and Gemini Pro 1.5) and surpasses them at understanding the physical world around it.

The company demonstrates this advantage on RealWorldQA, a benchmark it released under a Creative Commons license. The benchmark consists of more than 700 images, each accompanied by a question and an answer.

Author:

Ekaterina Alipova

Source: RB
