Saturday, 08 April 2017 17:05

Does Google Use Chrome to Discover New URLs for Crawling?

Many people assume that Google leverages all of its vast information resources in every way possible. For example, most people who think about it probably assume that Google uses its Chrome browser as a data source, and in particular, that it uses Chrome to discover new URLs for crawling.

We decided to put that to the test.

Brief Methodology Note

The test methodology was pretty simple. In principle, all we needed to do was set up a couple of pages that Google didn’t know about, have a bunch of people visit those pages from a Chrome browser, and then wait and see if Googlebot came to visit them. In practice, there was a bit more discipline involved in the process, so here is what we did:

  1. Created four brand new articles as web pages. We used two of these as test pages, and two of them as control pages.
  2. Uploaded them to a web server by direct FTP. By that, I mean we didn’t use any Content Management System (e.g. WordPress, Drupal, …) to upload them, to make sure that nothing in that process made Google aware of the pages.
  3. Waited a week to make sure nothing had gone wrong that would cause Googlebot to visit. During that week, we checked the site log files every day to make sure that no Googlebot visits occurred.
  4. Enlisted 27 people to follow this process:
    • Open Chrome
    • Go into settings and disable all their extensions
    • Paste the URL of the first test page in their browser and then visit it
    • Paste the URL of the second test page in their browser and then visit it
    • Reenable their extensions after completing the steps above
    • Grab their IP address and send it to me so I could verify who followed all the steps and who didn’t
  5. Checked the log files every single day until a week after the last user completed their steps

Note that the control pages were never visited by humans, and that’s what makes them controls. If something had gone wrong in the upload process, they might have been visited anyway, but that never happened. Neither control page was ever viewed, by humans or by bots, which confirmed that our upload process worked.

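For readers who want to repeat the daily log checks, here is a minimal sketch of how they could be automated. It is not the script we used. It assumes the server writes the common "combined" access log format, and the page paths and the names `summarize`, `TEST_PAGES`, and `CONTROL_PAGES` are hypothetical. It also matches Googlebot by user-agent string only, which can be spoofed, so a stricter check would also verify the visiting IP address.

```python
import re
from collections import defaultdict

# Assumed "combined" log format:
# IP identity user [time] "METHOD path protocol" status size "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Hypothetical paths; the real test URLs were not published.
TEST_PAGES = {"/test-page-1.html", "/test-page-2.html"}
CONTROL_PAGES = {"/control-page-1.html", "/control-page-2.html"}


def summarize(log_lines, participant_ips):
    """Scan access log lines for the watched pages.

    Returns (googlebot_hits, control_hits, completed):
      googlebot_hits: (ip, path) pairs whose user agent mentions Googlebot
      control_hits:   (ip, path) pairs for any request to a control page
      completed:      participant IPs that requested every test page
    """
    googlebot_hits = []
    control_hits = []
    visitors = defaultdict(set)  # test page path -> participant IPs seen
    watched = TEST_PAGES | CONTROL_PAGES

    for line in log_lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that are not in the assumed format
        ip, path, agent = match["ip"], match["path"], match["agent"]
        if path not in watched:
            continue
        if "googlebot" in agent.lower():
            googlebot_hits.append((ip, path))
        if path in CONTROL_PAGES:
            control_hits.append((ip, path))
        elif ip in participant_ips:
            visitors[path].add(ip)

    completed = {
        ip for ip in participant_ips
        if all(ip in visitors[page] for page in TEST_PAGES)
    }
    return googlebot_hits, control_hits, completed
```

Run once a day over the current log file, a sketch like this would be expected to return empty lists for Googlebot hits and control hits, while the set of completed IPs grows as participants finish their steps.
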
What Do People Believe About This?

In anticipation of this test, Rand Fishkin of Moz put out a poll on Twitter to see what people believed about whether Google uses Chrome data for this purpose. Here’s his result:

[Image: Twitter poll, "Does Google use Chrome browser user data?"]

As you can see, a whopping 88% believe Google sniffs new URLs from Chrome data, and the majority of those are sure that it definitely does. Let’s see how that compares with the results of our test.

The Results

The results are pretty simple: Googlebot never came to visit either page in the test.

As it turns out, two people in the test did not actually disable their extensions. This resulted in visits from Open Site Explorer (one person had the MozBar installed and enabled) and from Yandex (due to a different person, though I’m not clear on which extension they had installed and enabled).

Summary

This is a remarkable result. Google has access to an enormous amount of data from Chrome, and it’s hard to believe that they don’t use it in some fashion.

However, bear in mind that this test looked only at whether Google uses Chrome to discover new URLs. In an earlier test, we showed that Google does not use smartphone clipboards to discover new URLs either. My guess is that Google is still of the mindset that if there is no web-based link path to a page, it doesn’t have enough value to rank for anything anyway.

This does not mean that Google does not use Chrome data in other ways, such as to collect aggregate user behavior data, or other metrics. Google’s Paul Haahr confirmed that they do use CTR data as a quality control in highly controlled testing to measure search quality.

Note: He did NOT say that it was a live ranking factor, but rather a way of validating that other ranking factors were doing their job well. That said, perhaps Chrome is a source of data in such testing. It could easily be made truly anonymous, and it could add a lot of insight into user satisfaction with search results.

In any event, this part of the conversation is all speculation, and for now, we’ve shown that Google does not appear to use simple Chrome visits to new web pages as a way to discover URLs for crawling.

Thanks to the IMEC Labs group for their assistance on this test, and to the IMEC board of Rand Fishkin, Mark Traphagen, Dan Petrovic, and Annie Cushing for their guidance. Please follow all of them and me, Eric Enge, for more info on IMEC tests in the future!

Source: stonetemple.com
