Google Now Indexes Scanned Documents
Written by Nick Stamoulis on Friday, 31 of October, 2008 at 12:19 pm
Google can now index scanned documents - .pdf files and other images of printed text. Wow!
This is great news because, until now, humans - people like you and me - had no problem reading scanned documents online, but the search engines did. Now you can scan your entire library of technical manuals and possibly rank for search terms within them. At least, in theory.
From Google’s official blog:
Consider a circle. Should it be read as a zero, the letter ‘O’, just a circle, or the ring from my coffee cup? People learn to answer this kind of question very quickly, but for the computer it is a painstaking and error-prone process.
Check it out:
Here’s a link to the SERP.
Now view the .pdf document.
It's incredible that the huge title across the top of the page in the .pdf file is the actual title of the document in the SERP, just like on a web page. How can you make that work for your website? Any ideas?
Category: SEO
Comment by pattypat
Made Friday, 31 of October, 2008 at 1:03 pm
Personally, I don’t find this to be an improvement. Some of the documents on my site have been indexed by Google, but because the PDFs come from a poor-quality source (microfilm), whatever Google has interpreted is nothing like the real title. On top of that, I have a very well indexed website and I want my visitors to view it as a whole, not just a randomly picked PDF shown outside its normal page setup.
I have nothing against the idea; I just want Google to give webmasters a way to opt out of this indexing. I haven’t found anything in the FAQ that tells me how to stop it.
Comment by frde
Made Friday, 31 of October, 2008 at 10:31 pm
Why not just use robots.txt?
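To make the robots.txt suggestion concrete, here is a minimal sketch in Python. It assumes the scanned PDFs all live under a hypothetical /scans/ directory on a placeholder domain (example.com); neither comes from the post or the comments. It uses the standard library's robots.txt parser to check what a rule like this would allow Googlebot to fetch. Keep in mind that robots.txt controls crawling, so treat this as a starting point rather than a guaranteed fix.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: assumes the scanned PDFs sit under /scans/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /scans/
"""


def googlebot_may_fetch(url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules let Googlebot fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the robots.txt content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Googlebot", url)


if __name__ == "__main__":
    # example.com and both paths are placeholders for illustration only.
    for url in (
        "https://example.com/scans/manual.pdf",
        "https://example.com/index.html",
    ):
        print(url, googlebot_may_fetch(url))
```

Keeping the scans in one directory lets a single Disallow line cover them while the rest of the site stays crawlable.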
Pingback by » Can You Promote Scanned Documents With PPC? Pay Per Click Journal - Pay Per Click Advertising Blog
Made Sunday, 2 of November, 2008 at 7:22 am
[...] couple of days ago there was announcement on Search Engine Optimization Journal that Google can now crawl and index scanned documents. But can you drive traffic to those pages with PPC and expect your quality score to register the [...]