A study by Harvard University, the University of Chicago and the Massachusetts Institute of Technology found that AI models do not understand what they say. The researchers found that almost all language models can generate correct answers, but cannot put that knowledge into practice.

In a preprint published on arXiv, the scientists describe this behavior as "potemkins," or the illusion of understanding in large language models. According to the study, most AI models appear to understand, but in fact they do not hold a coherent representation of the concepts involved.

The term comes from the fake "Potemkin villages" that General Grigory Potemkin is said to have built to impress Catherine II in 1787. The researchers use it for cases in which an AI lacks the kind of understanding of a concept that we would expect from a person.

The study ran 32 tests across three areas: literary techniques, game theory and psychological biases. The scientists evaluated whether an AI could define a concept correctly yet fail to use it in classification, generation and editing tasks. For each test they used models such as GPT-4o, Claude 3.5 Sonnet, DeepSeek-V3, DeepSeek-R1, Qwen2-VL and Llama 3.3. A minimal sketch of how such a define-then-apply check could be scored appears below.
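The sketch below is not the authors' actual harness; the `query_model` stub and the grading callbacks are hypothetical placeholders standing in for whatever model client and answer-checking logic an evaluator would use.

```python
# Minimal sketch of a define-then-apply ("potemkin") check.
# Not the paper's actual code: query_model is a stand-in for any LLM API,
# and the grading functions are hypothetical placeholders.

from dataclasses import dataclass
from typing import Callable


@dataclass
class ConceptTest:
    concept: str                          # e.g. "ABAB rhyme scheme"
    define_prompt: str                    # keystone question: "Define ..."
    apply_prompt: str                     # use task: classify / generate / edit
    grade_definition: Callable[[str], bool]
    grade_application: Callable[[str], bool]


def query_model(prompt: str) -> str:
    """Stand-in for a real model call (e.g. a request to an LLM API)."""
    raise NotImplementedError("plug in your model client here")


def potemkin_rate(tests: list[ConceptTest]) -> float:
    """Among tests where the model defines the concept correctly,
    return the fraction where it then fails to apply it."""
    defined_ok, applied_fail = 0, 0
    for t in tests:
        if not t.grade_definition(query_model(t.define_prompt)):
            continue                      # only count cases with a correct definition
        defined_ok += 1
        if not t.grade_application(query_model(t.apply_prompt)):
            applied_fail += 1
    return applied_fail / defined_ok if defined_ok else 0.0
```

A high value on a measure like this would correspond to the pattern the study reports: the concept is defined correctly but then not applied.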

Examples of potemkins. Image: Potemkin Understanding in Large Language Models

AI believes it understands, but it has no idea what it is talking about

After evaluating the results, the researchers found that the models correctly defined the concepts in 94.2% of cases, but failed more than 55% of the time when applying them. For example, GPT-4o could explain how an ABAB rhyme scheme works, yet could not apply it in a poem. In another test, Claude 3.5 accurately defined a psychological bias but could not identify biased texts.

The new study opens a Pandora's box in language model evaluation. Companies such as OpenAI, Anthropic and Google routinely publish tables with the scores their AI obtains on various benchmarks. However, those benchmarks were designed to evaluate people and are not always valid for models.

"The existence of potemkins means that behavior that signals understanding in people does not signal understanding in LLMs," said Keyon Vafa, one of the study's co-authors. "This means we either need new ways of testing LLMs beyond asking them the same questions used to test people, or we need to find ways to remove this behavior from them."

The authors will present the study at the next edition of the International Conference on Machine Learning. The results could help design tests that verify whether an AI is genuinely intelligent, and in the long run this could open the door to the development of artificial superintelligence.

Source: Hipertextual
