Google Doesn’t Have A Limit On Long HTML Pages

by Clint Butler, Digitaleer

(SEO This Week) – With all the different SEO audit tools and webmaster tools that exist, there are bound to be some measurements that apply to one tool but not to others. After all, SEO is so subjective beyond the basic implementations that it’s bound to happen.

A Twitter user found one such difference while using the Bing Webmaster Tools site audit feature, which returned the message “HTML size too long.” Logically, he asked Google’s John Mueller whether the same limit applied to Googlebot.

John’s reply was as expected.

Judging by the text of John’s answer, he did the same Google search I had to do to see whether there was actually a documented limit on page size.

Turns out, the last study (term used loosely) was in 2015, when someone submitted a copy of Pride and Prejudice to Google and the search engine indexed only part of it. The same user ran the search again in 2017 and the status hadn’t changed.

This supports information from a deleted Google documentation page that placed a size limit at 30MB (anything bigger is completely ignored) and an HTML size limit of 2.5MB (which we know Google is not following anymore, as modern web development is producing bigger pages).
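To make those two figures concrete, here is a minimal Python sketch that measures the byte size of an HTML string and compares it against them. The thresholds are the numbers reported above from the deleted documentation page, not confirmed current Googlebot limits. The function and constant names are hypothetical, and treating 1MB as 1024 × 1024 bytes is an assumption.

```python
# Figures reported in the article from a deleted documentation page.
# They are NOT confirmed current limits; 1MB = 1024 * 1024 bytes is an assumption.
REPORTED_HTML_LIMIT_BYTES = int(2.5 * 1024 * 1024)
REPORTED_MAX_SIZE_BYTES = 30 * 1024 * 1024


def html_size_bytes(html: str) -> int:
    """Size of the HTML once encoded as UTF-8, which is what goes over the wire."""
    return len(html.encode("utf-8"))


def classify_size(size_bytes: int) -> str:
    """Compare a byte count against the two reported thresholds."""
    if size_bytes > REPORTED_MAX_SIZE_BYTES:
        return "over reported 30MB limit"
    if size_bytes > REPORTED_HTML_LIMIT_BYTES:
        return "over reported 2.5MB HTML limit"
    return "under both reported limits"


print(classify_size(html_size_bytes("<html><body><p>Hello</p></body></html>")))
```

A check like this only tells you how a page compares with the old published numbers; it says nothing about how much of the page actually gets indexed.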

So if there isn’t an official HTML size limit, but we know Google isn’t indexing full documents over a given size, what is that size?

Well, it turns out, it’s around 180,000 words.

Ted Kubaitis from seotoollab.com and Lee Witcher conducted a test by putting 1 million test keywords on a single page. Google indexed the top 180K of them, in order from the top down, and took 48 hours to do so. The test keywords ranged from 7 to 16 characters in length, with a white space in between.
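For a rough sense of scale, the sketch below builds a comparable body of text in Python and measures how many bytes the first 180,000 keywords occupy. It is not the script used in the test: the random lowercase keywords, the uniform spread of lengths between 7 and 16, and the function names are all assumptions made for illustration.

```python
import random
import string


def make_keywords(count: int, seed: int = 0) -> list[str]:
    """Generate random lowercase 'keywords' of 7 to 16 characters (assumed uniform)."""
    rng = random.Random(seed)
    return [
        "".join(rng.choices(string.ascii_lowercase, k=rng.randint(7, 16)))
        for _ in range(count)
    ]


def bytes_through(keywords: list[str], n: int) -> int:
    """Bytes occupied by the first n keywords joined by single spaces."""
    return len(" ".join(keywords[:n]).encode("utf-8"))


keywords = make_keywords(1_000_000)
total_bytes = bytes_through(keywords, len(keywords))
indexed_bytes = bytes_through(keywords, 180_000)
print(f"whole page body: {total_bytes:,} bytes")
print(f"first 180,000 keywords: {indexed_bytes:,} bytes")
```

Under these assumptions the average keyword plus its trailing space is about 12.5 bytes, so the first 180,000 keywords come to roughly 2.25 million bytes of text. That sits near the 2.5MB figure mentioned earlier, though the test itself does not establish a connection between the two.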

Kubaitis summarized that “There are likely multiple limits in play. There is a character limit which is how big a page can be because the HTML specification doesn’t set any limits so a web page can be any size and still be valid HTML. From an engineering point of view, you have to set practical limits. John is likely referring to this first engineering limit where the Page can be 10s of MB and still get fully loaded by GoogleBot. That doesn’t mean there aren’t other limits too further into the process. The indexing limit is probably a second limit being used.”

Clint Butler
https://www.seothisweek.com
With more than 15 years of agency owner experience working as an advanced SEO, I help companies scale their business with the best content strategies and digital marketing campaigns.
