How Google Collects Data About You and the Internet

Google, perhaps more than any other company, has realized that information is power. Information about the Internet, information about innumerable trends, and information about its users: YOU.

So how much does Google know about you and your online habits? It’s only when you sit down and actually start listing all of the various Google services you use on a regular basis that you begin to realize how much information you’re handing over to Google.

Let’s have a look at how Google is gathering information from you, and about you.

Google’s Information-Gathering Channels

The stated mission of Google is “to organize the world’s information and make it universally accessible and useful,” and it’s making good on this promise. However, Google is gathering even more information than most of us realize.

 

    • Searches (web, images, news, blogs, etc.) – Google is, as you all know, the most popular search engine in the world, with a market share of around 92% as of February 2019 according to StatCounter (for example, 88% of searches in the U.S. are made on Google). Google tracks all searches, and now with search becoming more and more personalized, this information is bound to grow increasingly detailed and user-specific.

 

    • Clicks on search results – Not only does Google get information on what we search for, but it also gets to find out which search results we click on. Additionally, it knows the ratio of clicks to impressions (the click-through rate, or CTR) for all search results, and may use CTR for search rankings. The result is that often-clicked websites rank higher in the search results.

 

    • Web crawling – Googlebot, the Google web crawler, is a busy bee, continuously reading and indexing billions of web pages.

 

    • Website analytics – Google Analytics is by far the most popular website analytics package out there. Since it’s free and still supports a number of advanced features, it’s used by a large percentage of the world’s websites.

 

    • Ad serving – Google Ads (known as Google AdWords until July 2018) and Google AdSense are cornerstones of Google’s financial success, but they also provide Google with much valuable data. Which ads are people clicking on, which keywords are advertisers bidding on, and which ones are worth the most? All of this is useful information.

 

    • Email – Gmail is one of the three largest email services in the world, together with competing options from Microsoft (Outlook) and Yahoo. Interestingly, email content, both sent and received, is parsed and analyzed not just by Google itself, but also by third-party apps.

 

    • G Suite (Docs, Sheets, Slides, Calendar, Drive, etc.) – The Google office suite has many users and is, of course, a valuable data source to Google.

 

    • Google Public DNS – The Google DNS service doesn’t just help people get fast DNS lookups; it helps Google too, because it can get a ton of statistics from this, such as which websites people access.

 

    • Google Chrome – What is your web browsing behavior? What sites do you visit? Google has access to everything related to web browsing through Chrome’s massive user base. According to Statista, as of August 2018, the various versions of the Chrome browser (including mobile and desktop) had around 55% market share globally.

 

    • Android OS – The world’s most used mobile OS is an immense source of data that Google has access to.

 

    • Google Pixel – Google launched its line of Android-based smartphones back in 2016. The most recent model is the highly praised Google Pixel 3, released in October 2018.

 

 

    • Chrome OS – While not nearly as successful as the Android OS, Chrome OS-based laptops, aka Chromebooks, are available in many corners of the globe.

 

    • Google Finance – Aside from the finance data itself, what users search for and use on Google Finance is sure to be valuable data to Google.

 

    • YouTube – The world’s largest and most popular video site by far is, as you know, owned by Google. It gives Google a huge amount of information about its users’ viewing habits.

 

    • Google Translate – This helps Google perfect its natural language parsing and translation.

 

    • Google Books – Not huge for now, but it has the potential to help Google figure out what people are reading and want to read.

 

    • Google Flights – Launched in 2011, Google Flights is a search service where users can find and book airline tickets through third-party providers.

 

    • Google Maps and Google Earth – What parts of the world are you interested in?

 

    • Your contact network – Your contacts in Google Meet, Gmail, YouTube, etc., make up an intricate network of users. And if those contacts also use Google, the network can be mapped even further. We don’t know if Google does this, but the data is there for the taking.

 

And the list could go on since there are even more Google products out there, but we think by now you’ve gotten the gist of it.
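The click-through rate mentioned in the list above is simply clicks divided by impressions. As a minimal sketch of the arithmetic, the snippet below computes CTR for a few search results and sorts them by it. The URLs and numbers are made up for illustration, and the sorting step is only an assumption about how such a signal could be used, not a description of Google's actual ranking.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Return CTR as a fraction: clicks divided by impressions."""
    if impressions <= 0:
        return 0.0  # no impressions means there is nothing to measure
    return clicks / impressions


# Hypothetical (clicks, impressions) figures, not real data.
results = {
    "example.com/a": (120, 1000),
    "example.com/b": (45, 900),
    "example.com/c": (10, 50),
}

# Order the results from highest to lowest CTR.
ranked = sorted(
    results,
    key=lambda url: click_through_rate(*results[url]),
    reverse=True,
)

for url in ranked:
    print(f"{url}: {click_through_rate(*results[url]):.1%}")
```

Note that the result with the fewest impressions ends up first here, which hints at why a raw CTR on its own would be a noisy signal.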

Much of this data is anonymized, but not always right away. Advertising data in server logs is kept for nine months, and cookies (for services that use them) aren’t anonymized until after 18 months. Even after that, the sheer amount of generic user data Google has on its hands is a huge competitive advantage against most other companies—a veritable gold mine.

The Unstoppable Data Collection Machine

Google’s data collection has many different aspects. The IP addresses that requests come from are logged, cookies are used for settings and tracking purposes, and if you are logged into your Google account, what you do on Google-owned sites can often be tied to you personally, not just to your computer.
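To see why being logged in matters, consider the toy sketch below. It groups activity records by account when one is known and falls back to the browser cookie otherwise. The field names, the sample events, and the `build_profiles` helper are all hypothetical; this illustrates the general idea of linking records through a shared identifier and says nothing about how Google actually stores or joins its data.

```python
from collections import defaultdict

# Hypothetical activity records from different services (not real data).
events = [
    {"service": "search", "cookie": "c1", "account": None, "action": "query: running shoes"},
    {"service": "video", "cookie": "c1", "account": "alice", "action": "watched: marathon tips"},
    {"service": "mail", "cookie": "c2", "account": "alice", "action": "opened inbox"},
    {"service": "search", "cookie": "c3", "account": None, "action": "query: weather"},
]


def build_profiles(events):
    """Group events per account where possible, otherwise per cookie."""
    # First pass: remember which cookies were ever seen with a logged-in account.
    cookie_owner = {}
    for event in events:
        if event["account"] is not None:
            cookie_owner[event["cookie"]] = event["account"]

    # Second pass: attribute each event to an account if we can, else to the cookie.
    profiles = defaultdict(list)
    for event in events:
        owner = event["account"] or cookie_owner.get(event["cookie"])
        key = f"account:{owner}" if owner else f"cookie:{event['cookie']}"
        profiles[key].append((event["service"], event["action"]))
    return dict(profiles)


profiles = build_profiles(events)
for key, activity in profiles.items():
    print(key, activity)
```

In this sketch a single login is enough to attach the earlier, anonymous search on cookie `c1` to the same person, while the activity on cookie `c3` stays tied only to a browser.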

In short, if you use Google services, Google will know what you’re searching for, which websites you visit, what news and blog posts you read, and more. As Google adds more services and its presence gets increasingly widespread, the so-called Googlization (a term coined by John Battelle and Alex Salkever in 2003) of almost everything continues.

The information you give to any single one of the Google services wouldn’t be much to huff about. The really interesting dilemma comes when you use multiple Google services, and these days, who doesn’t?

Try using the internet for a week without touching a single Google service. This means no YouTube, no Gmail, no Google Docs, no Google search, and so on.

Why Does Google Do This?

As we stated in the very first sentence of this article, information is power.

With all this information at its fingertips, Google can group data together in very useful ways, and not just per user or visitor. Google can also examine trends and behaviors for entire cities or countries.
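Grouping by something other than the individual user can be sketched in a few lines. The example below counts search queries per city and picks the most frequent one for each. The cities, queries, and the `top_query_by_city` helper are invented for illustration, and real trend analysis would of course be far more involved.

```python
from collections import Counter, defaultdict

# Hypothetical (city, query) pairs, not real data.
searches = [
    ("Stockholm", "snow tires"),
    ("Stockholm", "snow tires"),
    ("Stockholm", "flights"),
    ("Austin", "flights"),
    ("Austin", "bbq"),
    ("Austin", "flights"),
]


def top_query_by_city(rows):
    """Return the most frequent query and its count for each city."""
    counts = defaultdict(Counter)
    for city, query in rows:
        counts[city][query] += 1
    return {city: counter.most_common(1)[0] for city, counter in counts.items()}


top = top_query_by_city(searches)
for city, (query, count) in top.items():
    print(f"{city}: {query} ({count} searches)")
```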

Google can put the information it collects to a wide array of uses. In every field where it is active, it can draw on this data to make market decisions, conduct research, and refine its products.

For example, if you can discover certain market trends early, you can react effectively to the market. You can discover what people are looking for, what people want, and make decisions based on those discoveries. This is, of course, extremely useful to a large company like Google.

And let’s not forget, Google earns much of its money serving ads. The more Google knows about you, the more effectively it will be able to serve ads to you, which has a direct effect on Google’s bottom line.

Accessing the Google Data Vault

To its credit, Google is making some of its enormous cache of data available to you as well via various services.

 

If Google can make that much data publicly available, just imagine the amount of data and level of detail Google can get access to internally. And ironically, these services give Google even more data, such as which trends we are interested in, what sites we are trying to find information about, and so on.

No Free Lunch

Did you ever wonder why almost all of Google’s services are free of charge? Well, now you know. That old saying, “there ain’t no such thing as a free lunch,” still holds true. You may not be paying Google with dollars (aside from clicking on those Google ads), but you are paying with information. That doesn’t have to be a bad thing, but you should be aware of it.

Note: This article first appeared on this blog back in 2010, and we have touched up the data since.
