As machine learning and artificial intelligence see wider use, new challenges have emerged that call for an engineering approach. MLE-bench comprises 75 competitions from the Kaggle platform that evaluate how well AI agents can solve real-world machine learning engineering problems, such as deciphering ancient scrolls or developing new types of mRNA vaccines.
Although the benchmark does not address AI safety directly, it opens up the possibility of developing tools aimed at preventing potential negative outcomes. The results will help the OpenAI team track progress in AI research and evaluate models' capacity for autonomous engineering and innovation.
Source: Ferra
