The study used so-called Vision Transformer (ViT) models, which learn to process visual information. Instead of the classic method of training on labeled images, the researchers used an approach called DINO, a form of self-supervised learning in which the model examines images without any labels indicating exactly what is depicted.
When such models were trained on video, they unexpectedly began attending to the same regions as humans do: faces, human figures, and even the background. Remarkably, some attention heads specialized in background details while others focused on people, without any explicit instruction about what a person is. This behavior resembled the way visual perception is organized in humans.
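The "attention" described above comes from the self-attention mechanism inside a Vision Transformer: a special CLS token attends to every image patch, and each attention head produces its own weighting over those patches, which can be read out as a saliency-style map. The following is a minimal sketch of that mechanism with toy dimensions and random weights (not the actual DINO model or its trained parameters):

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

# Toy dimensions: 1 CLS token + 16 image patches, 2 attention heads.
n_patches, d_model, n_heads = 16, 32, 2
d_head = d_model // n_heads

# Token embeddings: index 0 is the CLS token, the rest are image patches.
tokens = rng.standard_normal((1 + n_patches, d_model))

# Random projections stand in for a trained ViT layer's query/key weights.
Wq = rng.standard_normal((d_model, d_model))
Wk = rng.standard_normal((d_model, d_model))

# Shape (heads, tokens, d_head) for per-head attention.
q = (tokens @ Wq).reshape(-1, n_heads, d_head).transpose(1, 0, 2)
k = (tokens @ Wk).reshape(-1, n_heads, d_head).transpose(1, 0, 2)

# Scaled dot-product attention: (heads, tokens, tokens), rows sum to 1.
attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head))

# CLS-to-patch attention per head, reshaped into a 4x4 spatial grid:
# this is the per-head "saliency map" researchers inspect.
cls_maps = attn[:, 0, 1:].reshape(n_heads, 4, 4)
print(cls_maps.shape)  # one 4x4 map per attention head
```

In a trained DINO model, different heads' `cls_maps` end up highlighting different things (faces, bodies, background), which is the specialization the study observed.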
The scientists compared the model's output with human eye-tracking data. It turned out that there was substantial overlap between the AI's attention and human gaze, especially in scenes containing people. The results led the researchers to suggest that the usual two-part "figure–ground" model of perception could be expanded to three components.
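Comparisons like the one above are typically quantified by correlating the model's attention map with a human fixation-density map of the same size, a standard saliency metric often called CC (correlation coefficient). A minimal sketch with hypothetical 4x4 maps (the real study's data and maps are not reproduced here):

```python
import numpy as np

def map_correlation(attn_map, fixation_map):
    """Pearson correlation between a model attention map and a human
    fixation-density map of the same shape (the saliency 'CC' metric)."""
    a = attn_map.ravel().astype(float)
    f = fixation_map.ravel().astype(float)
    a = (a - a.mean()) / a.std()
    f = (f - f.mean()) / f.std()
    return float((a * f).mean())

# Hypothetical maps: attention concentrated roughly where fixations land.
attn_map = np.array([[0, 0, 0, 0],
                     [0, 5, 4, 0],
                     [0, 4, 5, 0],
                     [0, 0, 0, 0]], dtype=float)
fix_map = np.array([[0, 0, 0, 0],
                    [0, 6, 3, 0],
                    [0, 3, 6, 0],
                    [0, 0, 0, 0]], dtype=float)

cc = map_correlation(attn_map, fix_map)
print(round(cc, 3))  # close to 1.0: the maps largely agree
```

A value near 1 means the model looks where people look; near 0 means no relationship.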
Such discoveries may prove useful for creating more interpretable, "human-like" robots and for projects that support cognitive development in children.
Source: Ferra

I am a professional journalist and content creator with extensive experience writing for news websites. I currently work as an author at Gadget Onus, where I specialize in covering hot news topics. My written pieces have been published on some of the biggest media outlets around the world, including The Guardian and BBC News.