Google Instant Previews, OCR, Page Segmentation and SEO
Tags: google instant previews
Many SEO professionals are aware of the page and block segmentation algorithms Google uses to identify the content part of a web page. My guess is that OCR is part of the algo too. Why? Relying only on the HTML code structure would not be accurate, since one could use divs and CSS to put important text at the beginning of the source while visually that text appears lower on the page.
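To make the source order vs. visual order point concrete, here is a minimal sketch. The block names and pixel coordinates are made up for illustration; nothing here reflects how Google actually segments pages. It only shows that sorting the same blocks by position in the code and by rendered position can give two different "first" blocks.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    source_index: int  # position of the block in the HTML source
    top: int           # hypothetical rendered y coordinate, in pixels
    left: int          # hypothetical rendered x coordinate, in pixels


def source_order(blocks):
    """Order blocks the way a code-only parser would see them."""
    return [b.name for b in sorted(blocks, key=lambda b: b.source_index)]


def visual_order(blocks):
    """Order blocks top to bottom, then left to right, as a visitor sees them."""
    return [b.name for b in sorted(blocks, key=lambda b: (b.top, b.left))]


# Hypothetical layout: the main content is first in the source,
# but CSS positions it below the header and the sidebar.
blocks = [
    Block("main content", source_index=0, top=400, left=0),
    Block("sidebar", source_index=1, top=120, left=0),
    Block("header", source_index=2, top=0, left=0),
]

print("source order:", source_order(blocks))
print("visual order:", visual_order(blocks))
```

A segmentation algorithm that only reads the code would treat "main content" as the first block here, while one that looks at the rendered page would start with "header".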
Many argued that OCR is not scalable for Google. With the launch of Google Instant Previews, they now know they were wrong: Google does keep “screenshots” of the pages presented in search result pages. Going from screenshots to OCR is only a matter of computing power, which Google has. And here’s a document straight from the horse’s mouth, showing some research on OCR applied to microfilm (which has much lower quality than webpage screenshots).
So we know that Google indexes the text content of a page, keeps a screenshot of it, and segments the page based on code structure, and maybe also visually, based on OCR. How does this affect internet marketers like us?
For one, Google Instant Previews seems to be ignoring Flash for previews, no matter what Google just released. Then there is a potential problem with content inside tabbed navigation. A page can rank #1 for a term and the instant preview still fails to highlight the matching words on the screenshot.
Google can’t and won’t be able to highlight content inside tabs that are not displayed; at best, the content of the default tab will be highlighted. Depending on how Google users change their click-through behavior, this could influence CTRs even for #1 rankings. Some users will surely decide that if they can’t see their search query highlighted in the preview, the site is not good enough for them, and maybe they won’t click.
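If you want to check whether a keyword only lives inside a non-default tab, a rough sketch like the one below can help. It assumes hidden tabs are marked with an inline `style="display: none"`, which is only one of several ways tabs get hidden (CSS classes and JavaScript are not handled), and the sample page and ids are hypothetical. It says nothing about how Instant Previews works internally.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not affect the nesting stack.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class VisibleTextParser(HTMLParser):
    """Split page text into visible and hidden parts (inline display:none only)."""

    def __init__(self):
        super().__init__()
        self.stack = []    # one flag per open element: True if hidden
        self.visible = []
        self.hidden = []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = (dict(attrs).get("style") or "").replace(" ", "").lower()
        parent_hidden = self.stack[-1] if self.stack else False
        # An element is hidden if it hides itself or sits inside a hidden parent.
        self.stack.append(parent_hidden or "display:none" in style)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.stack:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        if self.stack and self.stack[-1]:
            self.hidden.append(text)
        else:
            self.visible.append(text)


def term_visibility(html, term):
    """Report whether a term appears in visible text, hidden text, or both."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    term = term.lower()
    return {
        "in_visible_text": any(term in t.lower() for t in parser.visible),
        "in_hidden_text": any(term in t.lower() for t in parser.hidden),
    }


# Hypothetical page with two tabs; the second one is not shown by default.
PAGE = """
<div id="tabs">
  <div id="tab-1"><p>Overview of our products</p></div>
  <div id="tab-2" style="display: none"><p>Blue widgets shipping rates</p></div>
</div>
"""

print(term_visibility(PAGE, "blue widgets"))
```

A term that shows up only under `in_hidden_text` is a candidate for moving into the default tab, if you care about it being visible in the preview.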
Time will tell, and I am sure someone will come up with a study on CTR and instant previews :)




