The study was presented at ICML 2025, the International Conference on Machine Learning, in Vancouver. It is one of the largest forums in the field of artificial intelligence. The same team had previously developed a way to observe how semantic features “live” inside a model. Now they have taken the next step: they have learned to determine exactly where these features emerge and to adjust their behavior at different stages.
The new system builds a multi-level feature flow graph that lets researchers track how elements of meaning form, transform, and disappear inside the model. Unlike previous methods, the analysis now covers not only the transitions between layers but also the model's internal components, namely the attention modules and the logic modules. This helps to show whether the model is drawing on information from the context or on its internal knowledge.
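To make the idea of a feature flow graph concrete, here is a minimal sketch of how such a graph could be built. It is not the researchers' implementation. It assumes that each feature is represented by a direction vector and that a feature at one stage is linked to its closest match at the next stage by cosine similarity; the article specifies neither. The stage names, feature IDs, toy vectors, and the 0.8 threshold are all hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_flow_graph(stages, threshold=0.8):
    """Link each feature to its most similar feature in the next stage.

    stages: list of (stage_name, {feature_id: vector}) in processing order.
    Returns a list of edges: ((stage, feature), (next_stage, feature), similarity).
    A feature with no match above the threshold gets no edge, i.e. it "disappears".
    """
    edges = []
    for (src_name, src), (dst_name, dst) in zip(stages, stages[1:]):
        for fid, vec in src.items():
            best_id, best_sim = None, threshold
            for gid, other in dst.items():
                sim = cosine(vec, other)
                if sim >= best_sim:
                    best_id, best_sim = gid, sim
            if best_id is not None:
                edges.append(((src_name, fid), (dst_name, best_id), round(best_sim, 3)))
    return edges

# Toy data (assumed): two features after an attention module, two after a logic module.
stages = [
    ("layer1.attention", {"a": [1.0, 0.0, 0.0], "b": [0.0, 1.0, 0.0]}),
    ("layer1.logic", {"c": [0.9, 0.1, 0.0], "d": [0.0, 0.0, 1.0]}),
]
edges = build_flow_graph(stages)
print(edges)
```

In this toy example, feature "a" carries over into feature "c", while "b" has no successor and "d" has no predecessor, which loosely mirrors features being transformed, disappearing, and forming.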
Most importantly, the new approach makes it possible to influence these features, strengthening some and suppressing others. As a result, the style, subject, or tone of the generated text can be changed without altering the model's parameters. Such interventions can be applied at several levels at once, which makes control more precise and stable.
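The kind of intervention described here can be illustrated with a short sketch. It assumes that steering is done by shifting a hidden state along feature directions, which is a common technique but is not confirmed by the article as the team's exact method. The feature names, vectors, and strengths are hypothetical.

```python
def steer(hidden, directions, strengths):
    """Shift a hidden state along chosen feature directions.

    hidden: list of floats (one activation vector).
    directions: {feature_name: direction vector}.
    strengths: {feature_name: alpha}; positive strengthens, negative suppresses.
    The model's weights are never touched; only the activation is changed.
    """
    steered = list(hidden)  # copy so the original activation is left intact
    for name, alpha in strengths.items():
        direction = directions[name]
        steered = [h + alpha * d for h, d in zip(steered, direction)]
    return steered

# Toy example (assumed): strengthen a "formal_tone" feature, suppress "slang".
directions = {"formal_tone": [1.0, 0.0], "slang": [0.0, 1.0]}
hidden = [0.5, 0.5]
steered = steer(hidden, directions, {"formal_tone": 2.0, "slang": -0.5})
print(steered)
```

Applying the same function to the activations of several layers or modules at once would correspond to the multi-level interventions mentioned above.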
The method does not require additional data and works with already trained models. This is particularly valuable for research and commercial projects where resources are limited.
The development could help create more predictable and safer AI systems and filter unwanted content without having to rebuild the model from scratch.
Source: Ferra
