Summary: Many people are concerned about website tracking and monitoring. While it's not something most need to be concerned about, it can get quite involved.
Other than using spyware and cookies which can be deleted from our PC (hopefully), how can websites or search engines continuously track and monitor our internet activities from our home PC? I read from one of your earlier articles that most people probably have a "dynamic" IP address. Assuming that is true for me, and my IP address constantly changes, how can an IP address be used to identify me for any significant length of time? (My IP address today could be yours tomorrow.) And even if the website/search engine knew my IP address at a single point in time, how can they connect that IP address to my name (if I don't register it) and physical location? I'm guessing that my ISP can make this connection, but I assume they won't provide that information to just anyone, right?
•
That question covers a lot of ground, from cookies to IP tracking. It also misses a couple of areas that are worth thinking about.
But I do have to point out one important thing for most people: you, as an individual, just aren't that interesting. Sorry to burst your bubble, but it's pretty likely no one really cares where you go or what you do.
Let's see what they might care about, and the ways that they can collect it.
•
Cookies are, by far, the most common way for web sites to preserve information about your usage. While cookies ostensibly only store information on your machine, that information can of course be used to access other information stored elsewhere.
Cookies typically just store a small bit of information on your machine, and then each time you go back to a page on the same site that stored the cookie originally, that information is sent along. That bit of information might be a login ID, so that as you access an online email account you don't have to log in for each and every page.
That's why completely disabling cookies can be such a pain. Many web sites simply rely on cookies to keep you logged in, and keep the experience of using them somewhat manageable. It would really suck if you actually had to log in each time you wanted to see the next email message in your inbox. Cookies solve that.
But yes, cookies are one way - the most obvious way - that sites, and particularly advertising services, can collect information about the sites you might visit on the web. They may not know it's you, but they can know that it's your machine that's visited these sites, this many times, over this period of time.
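To make the mechanics concrete, here's a minimal sketch of how a site might use a cookie to recognize a returning browser. It's an illustration only, not how any particular site works: the `visitor_id` cookie name, the `handle_request` function, and the in-memory `pages_seen` record are all assumptions made up for this example.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Optional, Tuple

# Hypothetical in-memory record: visitor ID -> pages requested with that ID.
pages_seen = {}


def handle_request(cookie_header: str, page: str) -> Tuple[str, Optional[str]]:
    """Return (visitor_id, cookie_to_set).

    cookie_to_set is None when the browser already sent a visitor_id cookie.
    """
    cookies = SimpleCookie()
    cookies.load(cookie_header)  # parse the value of the browser's Cookie header

    cookie_to_set = None
    if "visitor_id" in cookies:
        # Returning browser: the small bit of stored information comes back.
        visitor_id = cookies["visitor_id"].value
    else:
        # First visit: make up a random ID and ask the browser to store it.
        visitor_id = uuid.uuid4().hex
        fresh = SimpleCookie()
        fresh["visitor_id"] = visitor_id
        cookie_to_set = fresh["visitor_id"].OutputString()

    # The ID says nothing about who you are, only that it's the same browser.
    pages_seen.setdefault(visitor_id, []).append(page)
    return visitor_id, cookie_to_set


# First request carries no cookie; the second sends back what was stored.
first_id, cookie = handle_request("", "/inbox")
second_id, nothing_new = handle_request(cookie, "/message/1")
```

Notice that the site never learns a name here. It only learns that the same random ID showed up on both requests, which is exactly the "your machine, not you" distinction.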
•
Logging in is probably the least thought of way that web sites collect information. When you log in to a service, by definition you've identified yourself (and your IP address, but more on that below). The service then "knows" who you are - to the extent that you've provided that information - for as long as you're logged in.
The 'catch' is that logging in to one service could identify you with all services from the same provider. And we provide a lot of information to the various services we interact with.
Consider Google. Logging into GMail also identifies you for iGoogle, Google Calendar, Google News, and all Google services, including Google Web History, which keeps a history of all the sites you visit while logged in.
It's not uncommon. Log in to Hotmail, and you've actually logged in to all Windows Live services. Log in to Yahoo mail, and all Yahoo services may follow.
•
Flash and Javascript can also be used to collect information about how they're being used. Flash even has its own version of cookies that are not the same as browser cookies, and are not cleared by a browser's cookie management functions.
Javascript, when enabled, can also be used to send off some additional information to the sites you're visiting in ways that bypass cookies.
•
IP addresses are what typically get everyone all excited and concerned, and for no real good reason. As I've said over and over and over again here and elsewhere: IP addresses cannot be traced to your physical location without legal intervention.
They can, however, occasionally be used as a tracing mechanism. As you say, IP addresses can change, but unless you're on dial-up they actually don't change that often. While they're set they are a unique identifier - though not of your machine, since you may have any number of machines sharing an IP address behind a router. All the machines behind the router "look like" they all have the same IP address on the internet.
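Here's a small sketch of that last point, using Python's standard `ipaddress` module. The addresses are made up for illustration: `203.0.113.7` comes from a range reserved for documentation, and `address_seen_by_website` is a hypothetical stand-in for what a home router does, not a real networking API.

```python
import ipaddress

# Hypothetical public address assigned to the router by the ISP.
ROUTER_PUBLIC_IP = "203.0.113.7"


def address_seen_by_website(local_ip: str) -> str:
    """Return the address a remote site would see for a machine at local_ip."""
    if ipaddress.ip_address(local_ip).is_private:
        # Machines with private (home network) addresses go out through
        # the router, so the site sees only the router's public address.
        return ROUTER_PUBLIC_IP
    return local_ip


# Three different machines on the same home network.
home_machines = ["192.168.1.10", "192.168.1.11", "192.168.1.12"]
seen = {address_seen_by_website(ip) for ip in home_machines}
```

All three machines collapse into a single address from the website's point of view, which is why an IP address on its own identifies a connection rather than a computer or a person.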
•
Combinations of everything above are where, I believe, the transient nature of all those means of identifying you can often be mitigated.
When you log in, the service now knows your IP address.
When a cookie is uploaded as you visit a site, it might now be associated with your login, and/or your IP address.
If your IP address changes, but the same cookie is delivered, the service could know that it's still the same machine.
If your cookies are cleared, your IP address changes and you log out, but a Flash cookie or some Javascript happens to be used, the site you're visiting might still be able to determine that it's the same user or machine as before.
I'm not saying that any of this is happening on any particular site or set of sites. But as you can see, if sites are sufficiently motivated and technically astute they can collect a lot of information.
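The chain of "if this changes but that stays the same" reasoning above can be sketched in a few lines. This is a hypothetical illustration, not a description of what any real service does: the `IdentityLinker` class, its method names, and every identifier value below are invented for the example. It simply treats any two identifiers that show up in the same request as belonging to the same visitor.

```python
class IdentityLinker:
    """Group identifiers (login, cookie, IP, ...) that were seen together."""

    def __init__(self):
        self._parent = {}

    def _find(self, key):
        # Standard union-find lookup with path halving.
        self._parent.setdefault(key, key)
        while self._parent[key] != key:
            self._parent[key] = self._parent[self._parent[key]]
            key = self._parent[key]
        return key

    def observe(self, **identifiers):
        """Record one request; everything present in it gets linked."""
        keys = [(kind, value) for kind, value in identifiers.items() if value]
        if not keys:
            return
        root = self._find(keys[0])
        for other in keys[1:]:
            self._parent[self._find(other)] = root

    def same_visitor(self, a, b):
        return self._find(a) == self._find(b)


linker = IdentityLinker()

# Logged in: login, IP address, browser cookie and Flash cookie seen together.
linker.observe(login="reader@example.com", ip="198.51.100.4",
               cookie="c1", flash="f9")
# IP address changed, but the same browser cookie was delivered.
linker.observe(ip="198.51.100.77", cookie="c1")
# Logged out, cookies cleared, new IP address, but the Flash cookie survived.
linker.observe(ip="198.51.100.200", cookie="c2", flash="f9")
# Somebody else entirely.
linker.observe(ip="192.0.2.50", cookie="c3")

still_known = linker.same_visitor(("cookie", "c2"),
                                  ("login", "reader@example.com"))
stranger = linker.same_visitor(("cookie", "c3"),
                               ("login", "reader@example.com"))
```

One caveat about the sketch itself: linking on IP address this naively would also lump together everyone who shares an address behind the same router, so a real system would have to treat that signal as much weaker than a cookie or a login.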
•
DON'T PANIC
I'm always reluctant to write about this kind of topic, about what kinds of things are possible, because it so often simply feeds people's paranoia. Many folks will read the above and get very scared, thinking that their every move is somehow being tracked on line.
Folks, you're just not that interesting.
By far the vast majority of data collection that's happening is in aggregate - meaning that the habits of thousands if not millions of web users are collected en masse with all individual information being lost in the aggregation. Data like "people who visit Ask Leo! are 20% likely to shop at Amazon.com" is the level of information that's being used. Individual activity, like "Leo Notenboom shops at Amazon and Fry's and also visits CNN.com and somerandomservice.com and looked at these web pages and clicked on these links ..." - even if it is being collected - isn't being looked at by anyone. By and large, it can be used in either of two ways:
as input, without the individual identification, for the aggregate "people are likely to" kinds of calculations I mentioned
for you. For example, if you're a customer of Amazon (or any retailer) you can log in and see what you purchased. Perhaps you choose to use Google's Web History so you can see what sites you've visited.
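As a rough sketch of what "in aggregate" means, here's how a statistic like the 20% figure above could be computed. The visitor labels and site lists are invented sample data, and `share_also_visiting` is a hypothetical helper; the point is only that the result is a single number with no individual attached to it.

```python
def share_also_visiting(visits, site, other_site):
    """Fraction of visitors to `site` who also visited `other_site`."""
    visitors = [sites for sites in visits.values() if site in sites]
    if not visitors:
        return 0.0
    return sum(other_site in sites for sites in visitors) / len(visitors)


# Made-up sample data: anonymous visitor label -> set of sites visited.
visits = {
    "visitor-1": {"ask-leo.com", "amazon.com"},
    "visitor-2": {"ask-leo.com"},
    "visitor-3": {"ask-leo.com", "cnn.com"},
    "visitor-4": {"ask-leo.com"},
    "visitor-5": {"ask-leo.com", "cnn.com"},
    "visitor-6": {"cnn.com", "amazon.com"},
}

# One of the five ask-leo.com visitors also visited amazon.com.
rate = share_also_visiting(visits, "ask-leo.com", "amazon.com")
```

Once the calculation is done, the per-visitor rows aren't needed to use the result; only the ratio matters to whoever is deciding where to advertise.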
The Caveat
There are two scenarios where your paranoia might be somewhat justified.
You actually are a criminal, a suspected criminal, are under surveillance by law enforcement, or live in a country where law enforcement has been compromised. Depending on how big a fish you really are, "they" could be watching you. Most people just aren't that interesting, but I'm sure that there are a few that are.
Your account's been stolen or compromised. In this case, all the information normally available to you would be available to the person with access to your account. This is perhaps the most likely scenario, and the one you would want to protect against by keeping your account secure.
Related:
What are tracking cookies and should they concern me? Cookies are placed on your machine by websites, but often more websites than you realize. We'll review cookies and how third parties can use them.
What can a website I visit tell about me? Websites can collect a fair amount of information about you. In this first step we look at what every website sees no matter what it does.
What can people tell from my IP address? People can tell very little from your IP address. They cannot, for example, tell who or where you are. How much they can tell varies a great deal.
Article C3749 - June 1, 2009
Leo great job. Love your articles, and read them when I receive them. I'm not a novice so appreciate your candid solutions or suggestions.
Furthermore I save them so I can go back when needed.
I'm a senior in a senior and disabled building and constantly doing my volunteering helping others with their computers.
Keep them coming, and great job.
Roland
Posted by: Roland at June 2, 2009 8:20 AM
Leo,
Great article! I own a medium size ecommerce company (www.tylertool.com) and you're dead on. At best I can see which people used Google, Yahoo or MSN versus came directly to my site by typing the URL. Without the cookie that tells me it comes from Google I would have no idea how much money I should invest in paid advertising. Without this I don't know my return on ad dollars spent.
Amazon and the bigger companies may have elaborate data mining resources but the smaller companies really don't!
Posted by: Tom at June 2, 2009 8:48 AM
I really enjoy your articles, they are very informative. I have often wondered how companies track my information or how I use my online operations. Thanks
Posted by: Al at June 2, 2009 9:03 AM
Firefox has an add-on to delete Flash cookies! And for the truly paranoid, don't forget the index.dat files!
Posted by: sirpaul1 at June 2, 2009 9:09 AM
What about the MAC on the NIC card/device? All MAC addresses are unique in this world. Couldn't that be used to "know you" even if you had cookies turned off completely?
02-Jun-2009
@Suzy: Leo has an article clearing it up: http://ask-leo.com/can_a_mac_address_be_traced.html
Posted by: Mike at June 2, 2009 10:12 AM
Leo, sorry to contradict you (in terms), but although everything you said was true, there are scores of studies showing that with modern data mining techniques, one *can* trace individual information using aggregate data. Data mining is a technology that is so advanced now that it escapes comprehension to even most seasoned IT professionals.
Now, it is true that most corporations aren't interested in particular individuals. But a particularly aggressive mass-mailer might, totalitarian governments (or branches thereof) might - remember the Internet is worldwide, it's not only used in the U.S., and even there your government's record of late is not exactly flawless in that respect... And identity theft gangs might be VERY, VERY interested in that. Remember they are huge now, extremely rich and well-organized, and difficult to trace and frame because they are internationally based and spread through many countries and continents. The world has no borders, and if Big Brother isn't (yet) watching you, someone else might be...
Posted by: UrsoBR at June 2, 2009 1:26 PM
@Tom: It's actually not a cookie that tells you where people came from when they visit your site; it's a 'referer' record. This record is always present in a request for a web page, unless the user has gone to great lengths to disable it.
So even in a cookie-free world you would still get those stats you need.
Posted by: Ben D. at June 3, 2009 8:44 PM
Not being tracked huh? Well how about this? I go to a website and then a pop up window of a sexy girl comes up and says that she is available. She is in the same town as me or there are some that are 25 miles from me. Ok, I clear out my Internet cookies via CCleaner and restart my computer. All my history is cleared as well. I go back to the same website and again sexy girls in my particular area are showing up. My area not anything over 25 or 50 miles but only my area. How can this be?
12-Jun-2009
@Jon B: Geotargeting http://en.wikipedia.org/wiki/Geo_targeting
Posted by: Jim at June 14, 2009 6:47 AM