Could AI eat itself? - Steve Sammartino

Up until November last year, when the first useful chatbot, ChatGPT, was released to the general public, almost everything on the internet was essentially ‘us’: human-centric content. Articles, posts, podcasts, chat forums, videos and images were all created, posted, described and tagged by people.

These 30 or so years of deep and wide content that populated the internet have been the perfect training ground for Generative AI. All the breadth, nuance and insight of creative humans is what enabled the output to be so ‘human-like’, even if it comes off a little dry at times. And in less than a year, the internet has started to morph and change its shape.

Generative AI Ingredients

There are three key ingredients that have made the generative AI revolution we’ve just entered possible:

the AI models, most notably the arrival of Large Language Models, which are the neural networks;
the chips and processing units that can cope with such large compute loads across many billions of parameters; and
the data sets that populate the LLMs and get processed.

Of course, without all of it, there can be none of it, but the last ingredient, the data sets, is vital, given that it is what the AI models learn from and use to predict what we want and essentially provide the ‘generative content’.

The AI Internet

Just reading the web in recent months, we can already see the shift. Social media is increasingly besieged with newly AI-generated content, or content telling you how to create AI-generated content, while technology firms and digital media are cutting staff in a move to create automated content.

Both the demand and supply of AI-generated content are skyrocketing, with the most common job posts in the content realm requiring ‘AI work-withs’ to accelerate output to 100x of what a mere human could produce.

In mere months, the digital landscape has transformed itself. Sites once filled with human insight and opinion are now flooded with AI-generated text, audio, images, and video. Some AIs are even starting to quote and cite each other, creating echo chambers of misinformation. The internet is going through a hyper-scaled AI industrialization. In a meaningful way, the internet is becoming less human.

While much of this is anecdotal, some research is starting to emerge which demonstrates these changes.

Experiments in AI

A study from researchers at the University of Oxford, the University of Cambridge, the University of Toronto and Imperial College London found that the type of data in the models is all-important. They concluded that if you train an AI system on what they call ‘synthetic data’, that is, data generated by another AI system, the models degrade rapidly and ultimately collapse and fail to function. It may well be that data is a little bit like food. That which is generated naturally by humans, or ‘organically’, is different from the manufactured type.
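For the curious, here is a rough way to get a feel for that kind of degradation. This is a toy sketch of my own, not the researchers' experiment: the 'model' is just a bell curve fitted to 20 numbers, and each new generation is trained only on numbers sampled from the previous generation's fit. The function name, sample size, generation count and seed are all illustrative assumptions.

```python
import random
import statistics


def simulate_collapse(generations=500, sample_size=20, seed=0):
    """Toy illustration (hypothetical helper, not the study's method).

    Each generation fits a normal distribution to its data, then the next
    generation sees only samples drawn from that fit. Returns the fitted
    spread (standard deviation) recorded at every generation.
    """
    rng = random.Random(seed)
    # Generation 0: stand-in for 'human' data, drawn from a standard normal.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    spreads = []
    for _ in range(generations):
        # 'Train' the model: estimate mean and spread from the current data.
        mu = statistics.fmean(data)
        sigma = statistics.pstdev(data)
        spreads.append(sigma)
        # Next generation learns only from this model's synthetic output.
        data = [rng.gauss(mu, sigma) for _ in range(sample_size)]
    return spreads


if __name__ == "__main__":
    spreads = simulate_collapse()
    print("spread at first generation:", spreads[0])
    print("spread at last generation: ", spreads[-1])
```

In this toy setup the spread tends to shrink generation after generation, because every refit loses a little of the variety in the data it was given and nothing new comes in to replace it. Real language models are vastly more complicated, so treat this as an intuition pump rather than evidence.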

This is where things get interesting, even a little strange. Given that all LLMs are trained on huge bodies of human text, it seems logical that we’ll need to update that corpus or continue to add human content. And already, that requirement is being compromised by the AI era of the web.

This research is essentially saying that if enough of the internet is output from Generative AI models, then the models will stop working. AI could well eat itself. But we don’t know yet, because most of the training sets are not live and rely on pre-generative-AI internet data from 1-2 years ago, although Google Bard and Microsoft Bing are starting to add live data.

Dead Internet Theory

The Dead Internet Theory is a quasi-conspiracy that has been around for a few years. The general idea is that the internet has largely been taken over by bots, with Statista claiming they account for almost 50 per cent of web traffic. Given that generating attention and making money has become so algorithmically driven, a contest for SEO, likes, followers, and fans, one way to win the game is to release bots that generate content and populate feeds with your material or desired political message. Theorists posit that the internet will eventually be a battle of bots against bots, with humans mere bit players.

The Other World Internet

This, and the potential for Generative AI consuming itself, have dramatic implications for the internet. The power of the internet has been derived from human nuance and insight. If it becomes predictive AI giving average insight from a world of statistical averages, the internet could become a kind of other world, where nothing is really ‘human’ or really groundbreaking. If, and it is still a big ‘if’, AI becomes a circular reference tool with degrading data, any advice it provides would just become a loud echo chamber worth avoiding.

An alternative thought is that Generative AI starts to create its own emergent behaviour and ideas, thereafter developing deep insights humans would never arrive at.

It’s a live experiment – it will be worth watching this one closely.

—

Keep Thinking,

Steve.