Google: PDF File Content Is Not Indexed

by Clint Butler, Digitaleer

(SEO This Week) - Google's John Mueller dropped a proverbial dime on how the search engine works when dealing with PDF files and the content in them. Actually, he didn't, but some might claim he did, so let's get ahead of it.

In the tweet sequence, John was complaining about having to have two different technologies installed in order to fill out forms.

The story here is that John pointed out that the PDFs themselves were in the index. However, he went on to say that the content of the PDFs was not.

This assertion would lend credence to new testing observations showing that a web page can indeed be in Google's index and findable when you search directly for that asset. However, at the same time, the web page won't rank for any terms on that particular page or, in this case, PDF.

Upon further review of John's claim, a look at the search result he shared told a different story.

The PDFs he was referencing were in fact indexed, and so was the content. However, because the PDFs required Adobe Reader to open, there is a default message on all the documents. This default message is what was indexed.

A few things are going on here. First, the entities providing the PDF documents have technology in place to detect whether both Adobe and JavaScript are active in a user's browser. Second, if the browser doesn't report those two pieces of tech, users are given a PDF with the template message on it. Third, their SEO teams are allowing those blank versions of the PDFs to be indexed.

In the end, this is more of an issue for users who think they are clicking on a PDF download link in the search results and end up getting a 45-page PDF with a "Please wait..." message on every page.
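For site owners who want to catch this before Google does, here is a minimal sketch of a placeholder check. It assumes you have already extracted the text of each PDF page with whatever tool you prefer. The function names and the "please wait" prefix test are illustrative assumptions, not anything Google or Adobe publishes.

```python
from typing import Sequence

# Assumption: the template message starts with "Please wait...",
# as seen in the search result discussed above.
PLACEHOLDER_PREFIX = "please wait"


def is_placeholder_page(page_text: str) -> bool:
    """Return True if a page's extracted text begins with the template message."""
    # Collapse whitespace and lowercase so line breaks and casing don't matter.
    normalized = " ".join(page_text.split()).lower()
    return normalized.startswith(PLACEHOLDER_PREFIX)


def is_placeholder_pdf(page_texts: Sequence[str]) -> bool:
    """Return True only if the PDF has pages and every one is the template message."""
    return bool(page_texts) and all(is_placeholder_page(t) for t in page_texts)


if __name__ == "__main__":
    # Hypothetical extracted text for a 45-page form that never rendered.
    blank_form = ["Please wait...\nloading document"] * 45
    real_form = ["Application for Benefits", "Section 1: Personal details"]
    print(is_placeholder_pdf(blank_form))
    print(is_placeholder_pdf(real_form))
```

A PDF flagged by a check like this is a candidate for being kept out of the index until the real content is what crawlers receive.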

But on the bright side, this search result alone should put your mind at ease: Google is still converting PDF files into HTML, reading them, and ranking them based on the content inside them.
