Google Doesn’t Have A Limit On Long HTML Pages


by Clint Butler, Digitaleer

(SEO This Week) – With all the different SEO audit tools and webmaster tools out there, there are bound to be some measurements that apply to one tool but not to others. After all, beyond the basic implementations SEO is so subjective anyway that it's bound to happen.

A Twitter user found one such difference while using the Bing Webmaster Tools site audit feature, which flagged his page with an "HTML size too long" message, and so, logically, he asked Google's John Mueller whether the same limit applied to Googlebot.

John’s reply was as expected.

Based on the text of John's answer, he did the same Google search I had to do to see whether there was actually a documented limit on page size.

It turns out the last study (term used loosely) was in 2015, when someone submitted a copy of Pride and Prejudice to Google and the search engine only indexed part of it. The same user ran the search again in 2017 and the status hadn't changed.

This supports information from a now-deleted Google documentation page that placed the overall file size limit at 30MB (anything bigger is ignored completely) and the HTML size limit at 2.5MB (a limit we know they are no longer enforcing, since modern web development routinely produces bigger pages).
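If you want to see where a page falls relative to those old figures, checking raw HTML size is easy to script. The sketch below is just an illustration of that check, not anything Google publishes; the URL is a placeholder, and the thresholds are simply the numbers from the deleted documentation page.

```python
# A minimal sketch: fetch a URL and compare its raw HTML size against the
# 2.5MB and 30MB figures from the deleted Google documentation page.
# The URL is a placeholder; the thresholds are the old documented numbers.
import urllib.request

HTML_SOFT_LIMIT = int(2.5 * 1024 * 1024)  # 2.5MB HTML limit from the old docs
FILE_HARD_LIMIT = 30 * 1024 * 1024        # 30MB "ignored completely" threshold

def check_page_size(url: str) -> None:
    with urllib.request.urlopen(url) as resp:
        html = resp.read()
    size = len(html)
    print(f"{url}: {size / (1024 * 1024):.2f} MB of HTML")
    if size > FILE_HARD_LIMIT:
        print("Over 30 MB: the old docs said files this big were ignored entirely.")
    elif size > HTML_SOFT_LIMIT:
        print("Over 2.5 MB: above the old documented HTML limit.")
    else:
        print("Within both of the old documented limits.")

check_page_size("https://example.com/")  # placeholder URL
```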

So if there isn’t an official HTML size limit, but we know Google isn’t indexing full documents over a given size, what is that size?

Well, it turns out, it’s around 180,000 words.

Ted Kubaitis of seotoollab.com and Lee Witcher tested this by putting 1 million test keywords on a single page. Google indexed the top 180,000 of them, took 48 hours to do so, and indexed them in order from the top down. The test keywords ranged from 7 to 16 characters in length, separated by whitespace.
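For anyone who wants to reproduce that kind of experiment, a page like the one described is straightforward to generate. The sketch below is a rough reconstruction based only on the details above (7 to 16 character keywords separated by whitespace); the keyword count, file name, and random seed are illustrative assumptions, not the exact setup Kubaitis and Witcher used.

```python
# A rough sketch of how a test page like the one described could be generated.
# Keyword format follows the article; file name, count, and seed are assumptions.
import random
import string

def make_keyword(rng: random.Random) -> str:
    # One random lowercase keyword, 7 to 16 characters long
    length = rng.randint(7, 16)
    return "".join(rng.choices(string.ascii_lowercase, k=length))

def build_test_page(count: int = 1_000_000, path: str = "test-page.html") -> None:
    rng = random.Random(42)  # fixed seed so the page is reproducible
    keywords = " ".join(make_keyword(rng) for _ in range(count))
    html = (
        "<!doctype html><html><head><title>Index limit test</title></head>"
        f"<body><p>{keywords}</p></body></html>"
    )
    with open(path, "w", encoding="utf-8") as f:
        f.write(html)

build_test_page()
```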

Kubaitis summarized that “There are likely multiple limits in play. There is a character limit which is how big a page can be because the HTML specification doesn’t set any limits so a web page can be any size and still be valid HTML. From an engineering point of view, you have to set practical limits. John is likely referring to this first engineering limit where the Page can be 10s of MB and still get fully loaded by GoogleBot. That doesn’t mean there aren’t other limits too further into the process. The indexing limit is probably a second limit being used.”
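If you want a quick sense of whether one of your own pages is anywhere near the indexing cutoff observed in that test, counting whitespace-separated words is a reasonable first approximation. The sketch below is a crude check of my own, not Kubaitis's tooling; the regex-based tag stripping and the example URL are simplifying assumptions.

```python
# A crude word-count estimate against the ~180,000-word cutoff observed in the test above.
# The regex tag stripping is a simplification, not a real HTML-to-text extractor.
import re
import urllib.request

OBSERVED_INDEXING_CUTOFF = 180_000  # approximate figure from the Kubaitis/Witcher test

def estimate_word_count(url: str) -> int:
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    text = re.sub(r"<[^>]+>", " ", html)  # strip tags crudely
    return len(text.split())              # whitespace-separated tokens

words = estimate_word_count("https://example.com/")  # placeholder URL
print(f"~{words:,} words on page; observed cutoff was around {OBSERVED_INDEXING_CUTOFF:,}")
```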

Clint Butler
https://www.seothisweek.com
With more than 15 years of agency-owner experience working as an advanced SEO, I help companies scale their business with the best content strategies and digital marketing campaigns.


