A study by Harvard University, the University of Chicago and the Massachusetts Institute of Technology found that AI models do not understand what they say. The researchers found that almost all language models can generate correct answers, but cannot put that knowledge into practice.

In a preprint published on arXiv, the scientists describe this behavior as "potemkins," or the illusion of understanding in large language models. According to the study, most AI models appear to understand, but in fact they do not hold a coherent representation of the concepts involved.

The term comes from the fake "Potemkin villages" that General Grigory Potemkin is said to have built to impress Catherine II in 1787. The researchers use it for cases in which an AI lacks the kind of understanding of a concept that we would expect from a person.

The study ran 32 tests across three areas: literary techniques, game theory and psychological biases. The scientists evaluated whether an AI could define a concept correctly yet fail to use it in classification, generation and editing tasks. For each test they used models such as GPT-4o, Claude 3.5 Sonnet, DeepSeek-V3, DeepSeek-R1, Qwen2-VL and Llama 3.3. A minimal sketch of how such a define-then-apply check could be scored appears below.
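The sketch below is not the authors' actual harness; the `query_model` stub and the grading callbacks are hypothetical placeholders standing in for whatever model client and answer-checking logic an evaluator would use.

```python
# Minimal sketch of a define-then-apply ("potemkin") check.
# Not the paper's actual code: query_model is a stand-in for any LLM API,
# and the grading functions are hypothetical placeholders.

from dataclasses import dataclass
from typing import Callable


@dataclass
class ConceptTest:
    concept: str                          # e.g. "ABAB rhyme scheme"
    define_prompt: str                    # keystone question: "Define ..."
    apply_prompt: str                     # use task: classify / generate / edit
    grade_definition: Callable[[str], bool]
    grade_application: Callable[[str], bool]


def query_model(prompt: str) -> str:
    """Stand-in for a real model call (e.g. a request to an LLM API)."""
    raise NotImplementedError("plug in your model client here")


def potemkin_rate(tests: list[ConceptTest]) -> float:
    """Among tests where the model defines the concept correctly,
    return the fraction where it then fails to apply it."""
    defined_ok, applied_fail = 0, 0
    for t in tests:
        if not t.grade_definition(query_model(t.define_prompt)):
            continue                      # only count cases with a correct definition
        defined_ok += 1
        if not t.grade_application(query_model(t.apply_prompt)):
            applied_fail += 1
    return applied_fail / defined_ok if defined_ok else 0.0
```

A high value on a measure like this would correspond to the pattern the study reports: the concept is defined correctly but then not applied.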

Examples of potemkins. Image: Potemkin Understanding in Large Language Models

AI believes it understands, but it has no idea what it is talking about

After evaluating the results, the researchers found that the models correctly defined the concepts in 94.2% of cases, but failed more than 55% of the time when applying them. For example, GPT-4o could explain how an ABAB rhyme scheme works, yet could not apply it in a poem. In another test, Claude 3.5 accurately defined a psychological bias but could not identify biased texts.

The new study opens a Pandora's box in language model evaluation. Companies such as OpenAI, Anthropic and Google routinely publish tables with the scores their AI obtains on various benchmarks. However, those benchmarks were designed to evaluate people and are not always valid for models.

"The existence of potemkins means that behavior that signals understanding in people does not signal understanding in LLMs," said Keyon Vafa, one of the study's co-authors. "This means we either need new ways of testing LLMs beyond asking them the same questions used to test people, or we need to find ways to remove this behavior from them."

The authors will present the study at the next edition of the International Conference on Machine Learning. The results could help design tests that verify whether an AI is genuinely intelligent, and in the long run this could open the door to the development of artificial superintelligence.

Source: Hipertextual
