Google Updates Crawler Documentation To Fix A Typo


Google has fixed a typo in their crawler documentation that inadvertently misidentified one of their crawlers.

A typo like this is usually a minor issue, but it matters to SEOs and publishers who depend on the documentation to set firewall rules.

Relying on the wrong user agent string could cause a website to inadvertently block a legitimate Google crawler.

Google Inspection Tool

The typo is in the section of the documentation about the Google Inspection Tool.

This is an important crawler that Google sends to a website in response to two triggers.

1. URL inspection functionality in Search Console

When a user checks in Search Console whether a webpage is indexed, or requests indexing, Google’s system responds with the Google Inspection Tool crawler.

The URL inspection tool offers the following functionality:

  • See the status of a URL in the Google index
  • Inspect a live URL
  • Request indexing for a URL
  • View a rendered version of the page
  • View loaded resources, JavaScript output, and other information
  • Troubleshoot a missing page
  • Learn your canonical page

2. Rich results test

This test checks whether structured data is valid and whether it qualifies for an enhanced search result, also known as a rich result.

Using this test will trigger a specific crawler to fetch the webpage and analyze the structured data.

Why A Crawler User Agent Typo Is Problematic

A misidentified user agent can cause trouble for websites that are behind a paywall but whitelist specific robots, such as the Google-InspectionTool user agent.

Improper user agent identification can also be problematic if the CMS needs to block the crawler with robots.txt or a robots meta directive in order to keep Google from discovering pages it shouldn’t be looking at.

Some forum content management systems remove links to parts of the site like the user registration page, user profiles and the search function to keep bots from indexing those pages.
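To illustrate the robots.txt case, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt content, the forum.example domain and the blocked paths are assumptions made up for this example; they are not taken from Google's documentation. Note that robots.txt rules are keyed on the user agent token (Google-InspectionTool), not on the full user agent string.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a forum. The paths are assumptions
# chosen for illustration only.
ROBOTS_TXT = """\
User-agent: Google-InspectionTool
Disallow: /register
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(token: str, url: str) -> bool:
    """Return True if robots.txt lets the given user agent token fetch the URL."""
    return parser.can_fetch(token, url)


# The registration page is blocked for the inspection crawler,
# while ordinary thread pages are not.
print(allowed("Google-InspectionTool", "https://forum.example/register"))
print(allowed("Google-InspectionTool", "https://forum.example/thread/42"))
```

A rule written for a misspelled token would simply never apply to the crawler, which is why the exact name matters.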

Hard To Spot User Agent Typo

The issue involved a difficult-to-catch typo in the user agent description.

See if you can spot the difference:

[Image: user agent string comparison]

This is the answer:

Original version:

Mozilla/5.0 (compatible; Google-InspectionTool/1.0)

New version:

Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
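The one-character difference matters most for rules that compare the whole string. The following is a rough sketch, not a recommended implementation: the function names exact_match and token_match and the regular expression are hypothetical, and a real firewall or CMS will have its own way of expressing such rules.

```python
import re

# The two strings quoted in this article.
ORIGINAL = "Mozilla/5.0 (compatible; Google-InspectionTool/1.0)"
UPDATED = "Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)"


def exact_match(user_agent: str, documented: str) -> bool:
    """Hypothetical strict rule: the whole string must be identical."""
    return user_agent == documented


# Hypothetical tolerant rule: look only for the product token and a version.
TOKEN = re.compile(r"\bGoogle-InspectionTool/\d+(?:\.\d+)*", re.IGNORECASE)


def token_match(user_agent: str) -> bool:
    """Return True if the product token appears anywhere in the string."""
    return TOKEN.search(user_agent) is not None


# A strict rule built from the old documentation rejects the corrected string.
print(exact_match(UPDATED, ORIGINAL))
# A token-based rule accepts both versions.
print(token_match(ORIGINAL), token_match(UPDATED))
```

Matching on the token is less brittle than matching the full string, but user agent strings can be spoofed, so a match alone does not prove that a request comes from Google.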

Be sure to update relevant robots.txt, meta robots directives or CMS code if you or a client are whitelisting Google’s crawlers or blocking crawlers from certain webpages.

Compare the original version (on the Internet Archive Wayback Machine) with the updated version in Google’s crawler documentation.

It’s a small detail, but it can make a big difference.

Featured image by Shutterstock/Nicoleta Ionescu


