Apple kicks off the second quarter of the year with artificial intelligence announcements. Researchers from Cupertino have presented a new AI model capable of providing context about what is displayed on a device's screen. It goes by the name ReALM and, according to a published report, it is capable of outperforming GPT-4.

The goal of this new system is to improve communication with virtual assistants by making interaction more natural through visual context. ReALM adds the information displayed on the screen to the equation, so voice is no longer the only channel through which the user can communicate.
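To give a feel for the idea, here is a minimal sketch in Python of how on-screen elements might be serialized into plain text so a language model can resolve a reference like "call that number". The entity format, fields, and helper names are illustrative assumptions, not Apple's actual ReALM implementation; the published paper describes encoding the screen as text, but its exact format is its own.

```python
# Illustrative sketch: turning on-screen UI elements into a textual
# context that a language model could use for reference resolution.
# The data format and function names are assumptions, not Apple's
# actual ReALM implementation.

from dataclasses import dataclass

@dataclass
class ScreenEntity:
    text: str   # visible text of the UI element
    kind: str   # e.g. "phone_number", "address", "title"
    top: int    # vertical position, used to preserve reading order
    left: int   # horizontal position

def screen_to_text(entities: list[ScreenEntity]) -> str:
    """Serialize screen entities top-to-bottom, left-to-right,
    tagging each with an index the model can point back to."""
    ordered = sorted(entities, key=lambda e: (e.top, e.left))
    return "\n".join(
        f"[{i}] ({e.kind}) {e.text}" for i, e in enumerate(ordered)
    )

def build_prompt(entities: list[ScreenEntity], user_request: str) -> str:
    """Compose the prompt a resolver model would see: the textual
    screen plus the spoken request referring to something on it."""
    return (
        "Entities currently on screen:\n"
        f"{screen_to_text(entities)}\n\n"
        f"User request: {user_request}\n"
        "Which entity index does the request refer to?"
    )

# Example: a restaurant page with a phone number and an address.
screen = [
    ScreenEntity("Trattoria Roma", "title", top=10, left=10),
    ScreenEntity("+1 555 0134", "phone_number", top=40, left=10),
    ScreenEntity("12 Main St, Cupertino", "address", top=70, left=10),
]
print(build_prompt(screen, "call that number"))
```

The point of this framing is that once the screen is flattened into text, resolving "that number" becomes an ordinary language-modeling task rather than a computer-vision one.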

Moreover, the possibilities go beyond providing additional information: it would also be possible to ask specific questions about what the screen is showing. Apple's research is in its early stages, but it has already produced benchmark results showing that ReALM outperforms its direct competitors.

Apple's AI already outperforms GPT-4

Of course, the ability to see, read, and understand information on a screen is nothing new. Most language-model makers and companies are working on something similar, though with quite different goals. Apple is among the most interested because its catalog is full of devices with screens, and it could get a lot of mileage out of a system like this.

Image from arxiv.org

Even though Apple was not the first to announce this kind of research, it has managed to produce one of the best results. The researchers write that the most advanced version of ReALM is capable of surpassing GPT-4 at resolving visual references, that is, references to elements shown on the screen.

The report published by Apple includes a benchmark table showing that ReALM-3B, its most powerful version, outperforms MARRS, GPT-3.5, and GPT-4 on on-screen reference resolution. The company emphasizes that its system is much smaller and yet delivers better results.

The key to improving Siri

It's no secret that Siri, Apple's virtual assistant, is a step behind its competitors. ReALM offers a glimpse of a promising future for Siri across all of the company's devices. Visual screen context opens the door to a new way of interacting with artificial intelligence and virtual assistants.

In the future, if ReALM is integrated into Siri, it will allow the assistant to read the information displayed on the screen to resolve questions, or even surface relevant information without the user having to ask.

A good example: while browsing a restaurant's website on an iPhone, the system could detect the address on screen and offer a Maps notification with driving directions, as in the sketch below.
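As a rough illustration of that flow, this toy sketch (not any real Apple API) picks out an address-like line from recognized screen text and builds a maps.apple.com directions link from it. The regex and helper names are assumptions made for the example.

```python
# Toy illustration of the article's example: spot an address in
# recognized screen text and turn it into an Apple Maps directions
# link. The pattern and flow are illustrative assumptions, not how
# ReALM or Siri actually work.

import re
from urllib.parse import quote

# Very naive pattern: house number, street words, then a city-like tail.
ADDRESS_RE = re.compile(r"\d+\s+[\w .]+,\s*[\w .]+")

def find_address(screen_lines: list[str]) -> str | None:
    """Return the first line that looks like a street address."""
    for line in screen_lines:
        match = ADDRESS_RE.search(line)
        if match:
            return match.group(0)
    return None

def maps_directions_url(address: str) -> str:
    """Build a maps.apple.com link with the address as destination."""
    return f"https://maps.apple.com/?daddr={quote(address)}"

# Example: text recognized on a restaurant page.
screen_lines = [
    "Trattoria Roma",
    "Open today 12:00-23:00",
    "12 Main St, Cupertino",
]
address = find_address(screen_lines)
if address:
    print(maps_directions_url(address))
    # -> https://maps.apple.com/?daddr=12%20Main%20St%2C%20Cupertino
```

In a real assistant this detection would presumably come from the model's understanding of the screen rather than a hard-coded pattern; the sketch only shows the shape of the hand-off from "address on screen" to "directions offered".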

Finally, Apple also comments on the limitations of the system. According to the document, moving from plain text to images is a complex process that requires much more advanced systems. Siri could offer context when the screen shows text, but recognizing or analyzing images still seems a long way off.

Source: Hiper Textual
