Generative artificial intelligence is extremely useful for answering questions, helping with coding, drafting text and even playing video games. Researchers at the University of California, San Diego have created a method to evaluate how well AIs play Super Mario Bros.

This is not exactly the 1985 classic but a modified version of the game, run on an emulator and integrated with GamingAgent, a framework developed by the researchers. Through this tool, large language models (LLMs) can control Mario and guide him past obstacles.

In GamingAgent, the AIs are "instructed" to understand both the game elements (such as holes, tunnels, enemies and hidden blocks) and the commands needed to interact with the scene. From this information, the models generate Python code to control Mario and try to clear the stage.

As they run into difficulties in the game, the AIs develop strategies to optimize their progress and make their actions more efficient.

Which one plays Mario best?

Although it is extremely strong on traditional benchmarks, OpenAI's reasoning model o1 performed poorly in the Hao AI Lab tests, losing to simpler models such as Google's Gemini 1.5 Pro and Anthropic's Claude 3.5.

According to the researchers, this is because o1 is not as agile as its competitors. The model takes more time to process information and produce answers, which is an advantage for answering questions or writing code but not ideal for action games that demand fast decisions.
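A rough calculation helps show why response time matters so much here. The sketch below assumes the game advances at about 60 frames per second and uses made-up response times; neither figure comes from the Hao AI Lab tests.

```python
# Rough illustration: how many game frames pass while a model is "thinking".
# FRAMES_PER_SECOND and the latencies below are illustrative assumptions,
# not measurements from the Hao AI Lab experiments.
FRAMES_PER_SECOND = 60


def frames_elapsed(latency_seconds: float, fps: int = FRAMES_PER_SECOND) -> int:
    """Return the number of whole frames the game advances during one model response."""
    return int(latency_seconds * fps)


if __name__ == "__main__":
    for label, latency in [("fast model", 0.5), ("slow reasoning model", 5.0)]:
        print(f"{label}: {frames_elapsed(latency)} frames pass before it can act")
```

Under these assumptions, a model that answers ten times more slowly also lets ten times more of the game go by before its next action, which is plenty of time for an enemy to reach Mario.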

A similar limitation was observed in GPT-4.5, released last Thursday (the 27th). Despite its computing power, the model's high latency gets in the way of the game, causing it to die even to a simple Goomba. Gemini 2.0 Flash, presented in January 2025, performed much better.

In a post on X, the official Hao AI Lab profile said that Claude 3.7, released in February, outperformed Claude 3.5, Gemini 1.5 Pro and GPT-4o. In the attached video, you can see the AI getting further in the game than its competitors.

They also play Pokémon

In addition to Super Mario Bros., researchers have tested AI models on Pokémon Red, a game released in 1996. In this experiment, Claude 3.7 Sonnet has been made to play in real time, streamed live on Twitch since February.

Unlike the Hao AI Lab setup, the Pokémon experiment has the model play using virtual buttons. The model takes screenshots, analyzes the situation, decides the next step, and then issues a command, all very slowly. Nevertheless, it has managed to beat at least three gym leaders.
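The screenshot, analyze, decide, press cycle described above can be sketched as a simple loop. This is only an illustration: the function names (capture_screenshot, choose_button, press_button) and the button list are hypothetical stand-ins, not code from the actual experiment.

```python
# Sketch of a screenshot -> analyze -> decide -> press loop.
# All names here (capture_screenshot, choose_button, press_button) are
# hypothetical stand-ins, not functions from the actual experiment.
from typing import Callable, List

BUTTONS = ("up", "down", "left", "right", "a", "b", "start", "select")


def run_agent(
    capture_screenshot: Callable[[], str],
    choose_button: Callable[[str], str],
    press_button: Callable[[str], None],
    max_steps: int,
) -> List[str]:
    """Run the loop for max_steps and return the buttons that were pressed."""
    pressed: List[str] = []
    for _ in range(max_steps):
        screen = capture_screenshot()   # 1. look at the game
        button = choose_button(screen)  # 2. analyze and decide (the slow model call)
        if button not in BUTTONS:       # skip anything that is not a valid button
            continue
        press_button(button)            # 3. act through a virtual button
        pressed.append(button)
    return pressed


if __name__ == "__main__":
    # Toy stand-ins so the sketch runs without an emulator or a model.
    screens = iter(["wall ahead", "open path", "dialog box"])
    policy = {"wall ahead": "left", "open path": "up", "dialog box": "a"}
    log = run_agent(lambda: next(screens), lambda s: policy[s], lambda b: None, max_steps=3)
    print(log)
```

Because each pass through the loop waits on the model's decision, the game only advances one button press at a time, which is consistent with the slow pace seen on the stream.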

In the experiment, the AI's reasoning is displayed on the screen while the game appears on the right. However, progress is extremely slow, especially when moving around the map.

How do AIs deal with Minecraft's freedom?

Another test of AI reasoning takes place in the Mindcraft project, showcased on the Emergent Garden channel on YouTube. Max Robinson, the channel's creator, analyzes how generative models adapted for Minecraft interact with the game's internal systems.

https://www.youtube.com/watch?v=iexadwbvdie

In the videos, the more abstract and complex the commands, the greater the chance that the model gets lost and has to restart. For example, the model can craft an iron pickaxe with some help, but when challenged to "live forever" in the game it does little beyond gathering basic resources and dying.

Does this really represent a breakthrough?

Using artificial intelligence to play video games is not exactly new, and some question whether such tests can be considered a real performance indicator.

In an interview with VentureBeat, Richard Socher, founder of the search engine You.com, argued that playing games does not prove that an AI is really smart.

The reason is that, unlike the real world, games allow models to be trained with unlimited amounts of data. The executive cited the case of OpenAI Five, which was trained to play Dota 2 long before the generative AI boom and accumulated the equivalent of 180 years of gameplay every day.
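To put that figure in perspective, 180 years of play per day can be converted into a speed-up over a single human playing in real time. The short calculation below assumes a 365-day year and is only a back-of-the-envelope illustration.

```python
# Back-of-the-envelope: how much faster than real time is
# "180 years of gameplay every day"?
DAYS_PER_YEAR = 365  # assumption: leap years are ignored


def speedup_factor(years_of_play_per_day: int) -> int:
    """Days of gameplay accumulated per real day, i.e. the speed-up over real time."""
    return years_of_play_per_day * DAYS_PER_YEAR


if __name__ == "__main__":
    print(speedup_factor(180))  # prints 65700
```

In other words, the system gathered experience tens of thousands of times faster than any person could, something that is rarely possible outside a simulated environment.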

"Games have helped research advance with new ideas, but the problem is that people often believe that what is hard for people is also hard for machines," Socher said. "It wasn't smarter than the people who learned to play chess; it was just good at chess."

In any case, watching generative models play tends to be an interesting way to better understand the technology's limitations compared with real people.

Source: Tec Mundo
