Google cracked down on web scrapers that harvest search results data, triggering global outages at many popular rank-tracking tools like SEMRush that depend on providing fresh data from search results pages.
What happens if Google’s SERPs are completely blocked? Some of the data provided by tracking services has long been extrapolated by algorithms from a variety of data sources, so one way around the current block may be to extrapolate the data from other sources.
SERP Scraping Prohibited By Google
Google’s guidelines have long prohibited automated rank checking in the search results, but apparently Google has also allowed many companies to scrape its search results and charge for access to ranking data for the purposes of tracking keywords and rankings.
According to Google’s guidelines:
“Machine-generated traffic (also called automated traffic) refers to the practice of sending automated queries to Google. This includes scraping results for rank-checking purposes or other types of automated access to Google Search conducted without express permission. Machine-generated traffic consumes resources and interferes with our ability to best serve users. Such activities violate our spam policies and the Google Terms of Service.”
Blocking Scrapers Is Complex
Blocking scrapers is highly resource intensive, especially because scrapers can respond to blocks by changing their IP address and user agent to get past them. Another way to block scrapers is to target specific behaviors, such as how many pages a user requests: an excessive volume of page requests can trigger a block. The problem with that approach is that keeping track of all the blocked IP addresses, which can quickly number in the millions, becomes resource intensive itself.
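The behavior-based approach described above can be sketched in a few lines. This is a minimal, hypothetical illustration of sliding-window rate limiting, not Google's actual mechanism; the class name, thresholds, and block policy are all assumptions for the example.

```python
from collections import defaultdict, deque
import time

class RequestRateLimiter:
    """Hypothetical sketch: an IP exceeding max_requests within
    window_seconds gets blocked. Thresholds are illustrative only."""

    def __init__(self, max_requests=100, window_seconds=60):
        self.max_requests = max_requests
        self.window = window_seconds
        self.hits = defaultdict(deque)  # ip -> timestamps of recent requests
        self.blocked = set()            # grows with every blocked IP (the memory cost)

    def allow(self, ip, now=None):
        """Return True if the request is allowed, False if the IP is blocked."""
        if ip in self.blocked:
            return False
        now = time.monotonic() if now is None else now
        q = self.hits[ip]
        # Drop timestamps that have fallen outside the sliding window.
        while q and now - q[0] > self.window:
            q.popleft()
        q.append(now)
        if len(q) > self.max_requests:
            # Tracking millions of entries like this is the resource cost
            # the paragraph above refers to.
            self.blocked.add(ip)
            return False
        return True
```

Note that the `blocked` set only ever grows in this sketch, which illustrates why large-scale blocking is expensive: real systems would need expiry or probabilistic structures to keep memory bounded.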
Reports On Social Media
A post in the private SEO Signals Lab Facebook Group announced that Google was striking hard against web scrapers, with one member commenting that the Scrape Owl tool wasn’t working for them, while others cited that SEMRush’s data had not updated.
Another post, this time on LinkedIn, noted multiple tools that weren’t refreshing their data, but also noted that the blocking hasn’t affected all data providers, with Sistrix and MonitorRank still working. Someone from a company called HaloScan reported that they had made adjustments and resumed scraping data from Google, and someone else reported that another tool called MyRankingMetrics was still reporting data.
So whatever Google is doing, it’s not currently affecting all scrapers. It may be that Google is targeting certain scraping behaviors, learning from the responses, and improving its blocking ability. The coming weeks may reveal whether Google is improving its ability to block scrapers across the board or only targeting the biggest ones.
Another post on LinkedIn speculated that the blocking may result in higher resource costs and fees charged to end users of SaaS SEO tools. They posted:
“This move from Google is making data extraction more challenging and costly. As a result, users may face higher subscription fees.”
Ryan Jones tweeted:
“Google seems to have made an update last night that blocks most scrapers and many APIs.
Google, just give us a paid API for search results. we’ll pay you instead.”
No Announcement By Google
So far there has not been any announcement by Google, but the chatter online may force someone at Google to consider making a statement.
Featured Image by Shutterstock/Krakenimages.com