Artificial intelligence (AI) systems could devour all of the internet's free knowledge as soon as 2026, a new study has warned.
AI models such as GPT-4, which powers ChatGPT, or Claude 3 Opus rely on the many trillions of words shared online to get smarter, but new projections suggest they will exhaust the supply of publicly available data sometime between 2026 and 2032.
This means that, to build better models, tech companies will need to begin looking elsewhere for data. Options include producing synthetic data, turning to lower-quality sources or, more worryingly, tapping into private data on servers that store messages and emails. The researchers published their findings June 4 on the preprint server arXiv.
"If chatbots consume all of the available data, and there are no further advances in data efficiency, I would expect to see a relative stagnation in the field," study first author Pablo Villalobos, a researcher at the research institute Epoch AI, told Live Science. "Models [will] only improve slowly over time as new algorithmic insights are discovered and new data is naturally produced."
There is still an abundance of private information out there. The chatbot companies may just have to pay for it.