TL;DR
- Generative AI has massively increased the amount of content on the web.
- Some people have exploited this opportunity to contribute to the sheer volume of content.
- Others have promoted these exploits, further increasing the reach of these methods.
- It’s 2008 repeating itself, when then-Google CEO Eric Schmidt called the internet a “cesspool”.
- Google cracked down with Panda and Penguin updates to limit the damage from thin content and link spam.
- Even in 2020 Google was detecting up to 40 BILLION spam web pages every single day.
- But now, on the 5th March 2024, Google has updated its web spam policies.
- The deluge of web spam is now being threatened again and the spammers don’t like it.
- We’re due to see the Google Webspam Report 2023 in April.
- The Google March 2024 update is a welcome purge to allow good quality to keep its well-earned visibility.
Ever since ChatGPT burst onto the scene in November 2022, the world has never been the same.
Obviously, for anyone working in digital, so-called Artificial Intelligence (AI) has been everywhere since then: in WordPress, Elementor, Yoast SEO, Edge browser, Bing search, the big announcements at Google I/O last year, when I last opened Excel, Google Sheets extensions, in the tray on my Windows 11, and everything in between.
Of course, every time a new technology appears, there’s always a rush of people trying to position themselves as the next “guru” and so the world wide web has been awash with the usual claims of “the only AI prompt cheat sheet you’ll ever need” and similar propositions.
However, the worst boast I saw went along the lines of:
“How to generate 300 blog posts in 5 minutes using ChatGPT”
I was utterly mortified when I saw this.
Slowly Does It
To explain my shock and horror at the claim that someone was showing the world, via YouTube, how to generate hundreds of pieces of website content in just a few minutes, let me go back in time…
I started building websites back in 1998. I started my own web design studio back in 2000. I landed my first professional role as a web designer in 2000.
During all of those twenty-six years I have been a consummate professional, pretty much doing everything “by the book” and really pushing myself and my limits by learning almost everything that I could – hosting, DNS, email servers, HTML, CSS, web design, content creation, business development, marketing, digital marketing, SEO, PPC, social media, and anything and everything that connects all these disciplines.
So, apart from an HNC in Graphics & Design that I completed in 1998, a two-day course in Drupal development, and an hour being taught the absolute basics of Google Analytics 4 (GA4), I am completely self-taught.
Now, focusing on content creation and management alone, someone saying they could create hundreds of blogs in just a few minutes was an appalling thing to hear.
Last year I probably wrote fewer than 50 blog posts – for my employer, for their clients, and for myself. The time it took to write these blog posts varied from an hour to three or four hours for something well-researched, detailed, and comprehensive enough to satisfy readers. Let’s say that, on average, they took a couple of hours each; that adds up to approximately 100 hours. I would have sat down for the best part of three full working weeks to create those well-crafted posts.
When AI appears on the scene and is suddenly leveraged to do what would be over 4 months’ worth of effort, you can see why humans feel put out.
Generate 300 Blog Posts in 5 Minutes Using ChatGPT
I am paraphrasing the YouTube video title, but I’m probably not far wrong. In fact there were a few videos like this, but you get the gist.
Basically, somebody was pushing the idea that anyone, without any skills or experience, could spend five minutes creating what would take me a few months to produce. In fact, it’s not just four months of my time, it’s the twenty-six years, and more, of experience that has allowed me to hone my expertise.
There are two main reactions to the boast of being able to do an expert’s long work in a short space of time:
- Great! I have no skills, no expertise, no experience, and I can compete with, and probably win against (in terms of volume at least), those who have spent their lives developing their talents.
- No! I am an expert and people with zero talent are supposedly able to do what I can do “at the press of a button”.
As someone who always advocates a “third way”, there’s also some middle ground – what can a tool like generative AI do in the hands of someone who is already an expert? Can it save them time and money?
I follow this third way. I did, and indeed still do, feel threatened by generative AI taking away some of the fire I have spent decades nurturing. However, tools in the hands of the inept are never as effective as they are in the hands of the skilful.
Flooding the Zone
If the notion of generating hundreds of articles in minutes wasn’t bad enough, the fact that this was a YouTube video just amplified the whole issue.
There are a few YouTube vids with similar titles and ideas, and I’m not going to link to them, but one of the most notorious “pimps” of these ideas has over 70,000 subscribers and 10,000 views of this particular tactic. With most of the comments being positive and supporting the idea that you can create a site populated with hundreds of blogs in minutes, you just have to wonder – did all the commenters copy this and go do the same? If not just the commenters then how many of the viewers thought “that’s a great idea” and went off and did it themselves?
Even if just 5% of the 10,000 viewers followed the steps in the video, would we have had 500 copycat websites spring up after watching the video?
Whatever the numbers, we may never know, but if people think that “flooding the zone” is a great strategy then that heralds a real “race to the bottom”.
I’ve always been about learning the skills, honing your craft, and putting in the hours. Now anyone without the experience or expertise is a potential competitor. This changes the playing field from one where skilful and talented experts could pride themselves on their work, to another where everyone is able to compete.
Now, I’m all for a “level playing field” having started at the bottom and worked my way up through sheer determination and years of hard work. But this sort of tactic really damages the quality of content on the web and encourages low-grade entrants into the arena.
The Internet is a Cesspool
When I discuss mass-generated AI content to compete in niche subjects by people who neither care nor have any experience in the matters, the one thing I cite is from the Google CEO back in 2008, Eric Schmidt. He was famously quoted as saying:
“The internet is a cesspool”
Strong words.
Now I’d been on the web for 10 years by the time he said this and I was initially taken aback. I had spent a decade pushing to get my content and my websites seen by as many people as I could. I spent a long time in the Digital Point and Web Pro World forums, watching and discussing strategy and tactics. This had been the time where I was most receptive to the more “grey hat” experimentation.
But what I saw repeatedly was tactic after tactic that initially reaped rewards but was soon shut down. We’re talking cross-linking, then triangulation, doorway pages, dynamically-generated content, content spinning, etc. I saw it all.
Yet Google wasn’t having it. Triangulation, for starters, was quickly nipped in the bud when Google disregarded links from the same class-C IP address. And then services popped up where you could buy hosting that was guaranteed to have multiple IPs to avoid this “trap”.
Then there were the infamous “content farms” and the days of Demand Media and people all across the world earning $5 an hour to churn out as many blog posts as they could about “how to boil an egg”.
Once again, Google cracked down. Link spam and thin content were tackled by the Penguin and Panda updates. Demand Media’s once-heralded IPO took a nosedive, and the internet became a better place once again…
Oh No It Didn’t
Schmidt’s cesspool comment was right. The internet was full of low-grade, low-quality content. It was everywhere. And the biggest challenge as an end user was to find quality content. As a writer and publisher, all I wanted to deal in was quality content.
Since 1998 I’ve always attempted to write quality, helpful content. Since 2008 that flame burned brighter. And by the end of 2022, that was all under threat.
You only have to go back to Google’s Webspam Report 2020 to read the staggering figures about the sheer volume of what they have to contend with and what they STOP from reaching the search results and flooding the SERPs:
“every day, we discover 40 billion spammy pages”
That was 40 billion, a day, and that was back in 2020, before the dawn of mass AI-generated content. That’s 14.6 trillion spam pages just detected in a year alone!
By the Google Webspam Report 2021, they dropped the number and only said:
“In 2021, SpamBrain identified nearly six times more spam sites than in 2020”
There’s no hard figure there, but if that’s 40 billion a day times six, then the amount of spam detected was a whopping 240 billion pages a day, or 87.6 trillion a year, right?
And then in the Webspam Report 2022, again Google don’t report a big figure, they only say:
“SpamBrain detected 5 times more spam sites compared to 2021”
So that’s a steady growth rate to potentially 1.2 trillion spam pages detected every day, or 438 trillion spam pages detected in a year!
I expect that number has absolutely exploded since November 2022 but I suspect that particular data will be more prevalent in the Webspam report for 2023 which is not published yet and should be out in April 2024.
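For anyone who wants to check my back-of-the-envelope sums, here is a minimal Python sketch of the extrapolation. It rests on a big assumption: the 2020 report talks about spam pages, while the 2021 and 2022 reports talk about spam sites, so multiplying the page figure by those factors is my guess, not something Google has published. The function and variable names are my own.

```python
# Rough extrapolation only. Assumption: the "six times" (2021) and
# "5 times" (2022) multipliers, which Google quotes for spam *sites*,
# are applied here to the 2020 figure for spam *pages* found per day.

DAYS_PER_YEAR = 365


def extrapolate_daily_spam(base_daily, multipliers):
    """Return the daily estimate for the base year and each following year."""
    estimates = [base_daily]
    for multiplier in multipliers:
        # Each year's estimate is the previous year's figure times the multiplier.
        estimates.append(estimates[-1] * multiplier)
    return estimates


daily_2020 = 40_000_000_000  # "every day, we discover 40 billion spammy pages"
daily = extrapolate_daily_spam(daily_2020, [6, 5])
yearly = [per_day * DAYS_PER_YEAR for per_day in daily]

for year, per_day, per_year in zip((2020, 2021, 2022), daily, yearly):
    print(f"{year}: {per_day / 1e9:,.0f} billion a day, "
          f"{per_year / 1e12:,.1f} trillion a year")
```

Running it prints the estimated daily and yearly totals for 2020, 2021 and 2022. Add another multiplier to the list to try out a guess for 2023.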
Google March 2024 Core & Spam Updates
However, web spam is probably of such a scale now that it dwarfs previous numbers. That’s why we’ve just had the early March announcement:
Today we announced improved quality ranking and new spam policies that we believe significantly enhance the quality and helpfulness of your search results. Learn more: https://t.co/AnStGjFkTW
— Google SearchLiaison (@searchliaison) March 5, 2024
In this piece, Google remind us that they are continuing to fight spam and low-quality content. The key points are their improved quality ranking and their new and improved spam policies.
The detail of the spam policies is particularly interesting with the real big points being:
- Scaled Content Abuse: AI-generated low-quality, unoriginal content produced at scale (300 blog posts in 5 minutes, anyone?)
- Site Reputation Abuse: Quality sites that also have a low-quality element from third-parties – is this guest posting getting a kicking again?
- Expired Domain Abuse: The purchasing of expired domains to create new content and try to leverage historic link juice.
I think these are great targets for Google to try and remove the low-quality content that has been infesting the internet.
Conclusion
When I was told that notorious spammers and purveyors of spammy tactics were being subject to manual removals from the Google index, I smiled.
For over 20 long years I have worked hard to create quality content. And for people to just “rock up” and use AI to flood the zone has been a disappointing development.
Google weren’t quick enough to deal with it, in my opinion. Or at least I don’t recall any communications really highlighting the enormity of the problem and the urgency of tackling it. Maybe, with their track record of no longer giving us the scary raw figures (40 billion, six times more, five times more…) Google have had a bit of a problem on their hands?
If, as I suspect, content spam has exploded since the public release of ChatGPT, then what is the multiplier for 2023 if it was “only” 6 times in 2021, and 5 times in 2022? What will the April 2024 release of the Google Webspam Report 2023 reveal? Ten times the amount of spam content detected daily? One hundred times? One thousand times?
I’m speculating, I really don’t know. But Google know and I’d love them to share that data.
So Google have probably had a bit of a PR nightmare, with having to first deal with the sheer uptick in volume of spam, and then having to explain it to the world.
Also, they didn’t really help, did they? At first Google said AI-generated content was a no-no, then they reversed their decision. This U-turn probably encouraged the spammers to continue to act with impunity.
If there were indeed, as we estimate, some 1.2 trillion spam pages detected every day in 2022, was it 10 trillion daily in 2023? Or 100 trillion? How do you realistically deal with that level of crap?
Oh yes, the Google March 2024 update.
If spammers are fearful or unhappy that they’ve lost the benefits of their exploits, then I do not shed a tear. It has been interesting to see their experiments, but simply that – interesting. These tactics have been allowed to go on for too long and spammers have become complacent. As has Google. Or did the Mountain View giant just have too much on its hands?
This chapter has come to an end and a new one opens. I just hope that the lid stays on the box for long enough to allow the professionals, the experts, and the experienced to continue with the good work they’ve been doing on a professional level for the past twenty years or more.