The AI industry has hit another hurdle: the amount of data available for training is beginning to shrink, The New York Times reports.

AI is starting to “feel” a data shortage as restrictions tighten

A study by the MIT-led Data Provenance Initiative found that many of the web sources most important to AI training have begun restricting the use of their data, which makes it harder to train powerful systems.

The researchers analyzed 14,000 web domains whose content appears in three major AI training datasets. The results revealed what they describe as an “emerging crisis of consent.” Over a single year, an estimated 5% of all the data, and 25% of the data from the highest-quality sources, was placed under restrictions through the Robots Exclusion Protocol, a mechanism that lets site owners use a robots.txt file to tell automated crawlers what they may not collect.
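For readers unfamiliar with the Robots Exclusion Protocol, the following is a minimal sketch of how a well-behaved crawler might check a site's rules before collecting a page. It uses Python's standard `urllib.robotparser` module. The robots.txt contents, the crawler name `ExampleAIBot`, and the `example.com` URLs are hypothetical and are not taken from the study.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one crawler ("ExampleAIBot") is blocked from the
# whole site, while all other crawlers are only kept out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    article = "https://example.com/articles/1"
    print(is_allowed(ROBOTS_TXT, "ExampleAIBot", article))  # blocked site-wide
    print(is_allowed(ROBOTS_TXT, "OtherBot", article))      # allowed here
```

The protocol is advisory: the file only states the site owner's wishes, and it is up to each crawler to run a check like this and respect the answer.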

The study also found that nearly 45% of the data in the C4 dataset is now restricted by websites’ terms of service.

The new restrictions are expected to affect not only companies developing AI, but also researchers, scientists and non-profit organizations that use web data.

Author: Nikolai Tikhonov

Source: RB

