(SEO This Week) – Over the past year there have been many reports of Google using blocks of text, sentences, or word strings to filter pages out of its index.
For example, pages containing strings like “Page Not Found”, which typically appear on 404 pages, have been dropped from the index automatically. This does not happen every time, but instances have been documented on sites like seroundtable.com.
Today, business owners and webmasters have to contend with a new Google error message that is becoming more and more prevalent: the Soft 404.
Google defines Soft 404 errors as “a URL that returns a page telling the user that the page does not exist and also a 200 status code. In some cases, it might be a page with little or no content (for example, a sparsely populated or empty page).”
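To make that definition concrete, here is a minimal Python sketch of a check a site owner might run against their own pages. It is not Google's classifier: the function name, the phrase list, and the 50-word "thin content" threshold are all assumptions chosen for illustration.

```python
# Hypothetical helper. The phrase list and the 50-word threshold are
# illustrative assumptions, not values published by Google.
ERROR_PHRASES = ("page not found", "not found", "no results")
MIN_WORDS = 50


def looks_like_soft_404(status_code: int, body_text: str) -> bool:
    """Return True when a 200 response reads like an error page or an empty page."""
    if status_code != 200:
        # A real 404 or 410 is a hard error, not a soft one.
        return False
    text = body_text.lower()
    has_error_phrase = any(phrase in text for phrase in ERROR_PHRASES)
    is_thin = len(text.split()) < MIN_WORDS
    return has_error_phrase or is_thin
```

A page that returns 200 with the body "Sorry, page not found" would be flagged, while the same body served with a genuine 404 status would not. Note that a simple substring match like this will also flag legitimate pages that happen to contain one of the phrases.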
The error became more common when many SEOs were buying old domains and redirecting each entire domain to the home page of a new site in an effort to recapture the old domain's backlink power.
As this practice grew more popular, Google responded by surfacing Soft 404 errors more often in Search Console.
Today, it seems that Google is using text strings to try to identify pages that are essentially gone (404) but are being served as legit pages (200).
However, users report that the choice of strings, and the subsequent deindexing of pages containing them, is hurting websites across the web.
Google Search Console is reporting a spike in Soft 404 errors on Category pages.
Pages that are part of the category tree are incorrectly marked as Soft 404. These pages are getting removed from the index; stopped receiving any traffic.
Anyone else observing the same? /1 pic.twitter.com/wDuHl6qqyT
— Nikhil Raj. R (@nikhilrajr) December 23, 2021
This got fixed. Please see this thread https://t.co/nUVlSFDLS7
— Nikhil Raj. R (@nikhilrajr) December 31, 2021
We have a breakthrough in understating the problem (at least one issue) check for text similar to 'not found', 'no results' also in different languages. It seems obvious but it may be hidden in places that didn't trigger 'soft 404' in the past and it does now. I hope it helps!
— Shlomo Sasson (@onlinex) April 13, 2021
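The advice in that tweet, checking for error-like text that "may be hidden" in unexpected places, can be sketched as a small audit script. The example below uses Python's standard `html.parser` module to report which element each matching phrase sits in. The phrase list (including the non-English entries) and the names `PhraseFinder` and `find_error_phrases` are assumptions for illustration; the exact strings Google reacts to are not public.

```python
from html.parser import HTMLParser

# Illustrative phrase list (an assumption); extend it for the languages a site uses.
PHRASES = ("not found", "no results", "no encontrado", "nicht gefunden", "aucun résultat")

# Void elements never get a closing tag, so they are not tracked as parents.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PhraseFinder(HTMLParser):
    """Collect (enclosing tag, phrase, snippet) for each text node containing a phrase."""

    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tags, innermost last
        self.hits = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed tags in between.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        lowered = text.lower()
        for phrase in PHRASES:
            if phrase in lowered:
                parent = self.stack[-1] if self.stack else "(root)"
                self.hits.append((parent, phrase, text[:80]))


def find_error_phrases(html: str):
    """Return every phrase hit found in the given HTML string."""
    finder = PhraseFinder()
    finder.feed(html)
    finder.close()
    return finder.hits
```

Running `find_error_phrases` over a category page's HTML lists each hit along with its enclosing tag, which helps surface strings buried in filter widgets or inline scripts rather than in the visible body copy.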
Google representatives have said they have seen reports of this happening. However, because the issue supposedly affects so few sites, they do not comment on it as a matter of general practice. John Mueller also noted that it is something they know is happening.
Yeah, if you make your pages look like they might be error pages, we might treat them as such. It's similar to having a noindex on them, and using JavaScript to selectively remove it. Good job finding this!
— 🐄 John 🐄 (@JohnMu) December 30, 2021
Testing conducted by SEO testing groups has reported instances of this, as well as the use of text strings to identify the nature of backlinks, such as guest posts, in order to devalue those links.