Summary: Many people are concerned about website tracking and monitoring. While it's not something most need to be concerned about, it can get quite involved.
Other than using spyware and cookies which can be deleted from our PC (hopefully), how can websites or search engines continuously track and monitor our internet activities from our home PC? I read from one of your earlier articles that most people probably have a "dynamic" IP address. Assuming that is true for me, and my IP address constantly changes, how can an IP address be used to identify me for any significant length of time? (My IP address today could be yours tomorrow.) And even if the website/search engine knew my IP address at a single point in time, how can they connect that IP address to my name (if I don't register it) and physical location? I'm guessing that my ISP can make this connection, but I assume they won't provide that information to just anyone, right?
•
That question covers a lot of ground, from cookies to IP tracking. It also misses a couple of areas that are worth thinking about.
But I do have to point out one important thing for most people: you, as an individual, just aren't that interesting. Sorry to burst your bubble, but it's pretty likely no one really cares where you go or what you do.
Let's see what web sites might care about, and the ways that they can collect it.
•
Cookies are, by far, the most common way for web sites to preserve information about your usage. While cookies ostensibly only store information on your machine, that information can of course be used to access other information stored elsewhere.
Cookies typically just store a small bit of information on your machine, and then each time you go back to a page on the same site that stored the cookie originally, that information is sent along. That bit of information might be a login ID, so that as you access an online email account you don't have to login for each and every page.
That's why completely disabling cookies can be such a pain. Many web sites simply rely on cookies to keep you logged in, and keep the experience of using them somewhat manageable. It would really suck if you actually had to login each time you wanted to see the next email message in your inbox. Cookies solve that.
But yes, cookies are one way - the most obvious way - that sites, and particularly advertising services, can collect information about the sites you might visit on the web. They may not know it's you, but they can know that it's your machine that's visited these sites, this many times over this period of time.
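To make the cookie round trip concrete, here is a minimal sketch of what a site might do on its end. Everything in it is an assumption for illustration: the `visitor_id` cookie name, the `visits` store, and the `handle_request` function are hypothetical, and real sites vary widely in how they do this.

```python
import uuid
from http.cookies import SimpleCookie

# Hypothetical server-side store: cookie value -> pages requested.
visits = {}


def handle_request(path, cookie_header=None):
    """Record one page request.

    Returns (visitor_id, set_cookie), where set_cookie is the value of a
    Set-Cookie header to send back, or None if the browser already
    presented a visitor_id cookie.
    """
    jar = SimpleCookie(cookie_header or "")
    if "visitor_id" in jar:
        # The browser sent our cookie back: same machine as before.
        visitor_id = jar["visitor_id"].value
        set_cookie = None
    else:
        # First visit (or cookies were cleared): hand out a new random ID.
        visitor_id = uuid.uuid4().hex
        out = SimpleCookie()
        out["visitor_id"] = visitor_id
        set_cookie = out["visitor_id"].OutputString()
    visits.setdefault(visitor_id, []).append(path)
    return visitor_id, set_cookie
```

Calling `handle_request("/inbox")` hands out a new ID, and passing the returned cookie string back in as `cookie_header` on the next request ties that request to the same ID. Note that the ID is just a random number: it identifies a browser on a machine, not a person.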
•
Logging In is probably the least thought of way that web sites collect information. When you login to a service, by definition you've identified yourself (and your IP address, but more on that below). The service then "knows" who you are - to the extent that you've provided that information - for as long as you're logged in.
The 'catch' is that logging in to one service could identify you with all services from the same provider. And we provide a lot of information to the various services we interact with.
Consider Google. Logging into GMail also identifies you for iGoogle, Google Calendar, Google News, and all Google services, including Google Web History, which keeps a history of all the sites you visit while logged in.
It's not uncommon. Login to Hotmail, and you've actually logged in to all Windows Live Services. Login to Yahoo mail, and all Yahoo services may follow.
•
Flash and Javascript can also be used to collect information about how they're being used. Flash even has its own version of cookies that are not the same as browser cookies, and are not cleared by a browser's cookie management functions.
Javascript, when enabled, can also be used to send off some additional information to the sites you're visiting in ways that bypass cookies.
•
IP addresses are what typically get everyone all excited and concerned, and for no real good reason. As I've said over and over and over again here and elsewhere: IP addresses cannot be traced to your physical location without legal intervention.
They can, however, occasionally be used as a tracing mechanism. As you say, IP addresses can change, but unless you're on dial-up they actually don't change that often. While they're set they are a unique identifier - though not of your machine, since you may have any number of machines sharing an IP address behind a router. All the machines behind the router "look like" they all have the same IP address on the internet.
•
Combinations of everything above are where, I believe, the transient nature of all those means of identifying you can often be mitigated.
When you login, the service now knows your IP address.
When a cookie is uploaded as you visit a site, it might now be associated with your login, and/or your IP address.
If your IP address changes, but the same cookie is delivered, the service could know that it's still the same machine.
If your cookies are cleared, your IP address changes and you logout, but a Flash cookie or some Javascript happens to be used, the site you're visiting might still be able to determine that it's the same user or machine as before.
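The chain of "if this changes but that stays the same" above can be sketched in a few lines. This is only an illustration of the idea, not a description of what any real service does: the `IdentityLinker` class and the identifier names (`login`, `ip`, `cookie`, `flash`) are hypothetical. It is also deliberately naive, since it treats a shared IP address as proof of the same visitor, which is not true for machines behind the same router.

```python
class IdentityLinker:
    """Group sightings that share at least one identifier into one profile."""

    def __init__(self):
        self._profile_of = {}   # (kind, value) -> profile number
        self._next_profile = 0

    def observe(self, **identifiers):
        """Record one request and return the profile it was linked to."""
        keys = [(kind, value) for kind, value in identifiers.items()
                if value is not None]
        known = {self._profile_of[k] for k in keys if k in self._profile_of}
        if known:
            # At least one identifier was seen before. If this request
            # bridges several profiles, fold them all into the oldest one.
            profile = min(known)
            for key, existing in list(self._profile_of.items()):
                if existing in known:
                    self._profile_of[key] = profile
        else:
            profile = self._next_profile
            self._next_profile += 1
        # Every identifier on this request now points at the same profile.
        for key in keys:
            self._profile_of[key] = profile
        return profile
```

For example, a first request carrying a login, an IP address, a cookie and a Flash cookie creates a profile. A later request from a new IP address with the same cookie lands in that profile, and so does a still later request where only the Flash cookie survives.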
I'm not saying that any of this is happening on any particular site or set of sites. But as you can see, if sites are sufficiently motivated and technically astute they can collect a lot of information.
•
DON'T PANIC
I'm always reluctant to write about this kind of topic, about what kinds of things are possible, because it so often simply feeds people's paranoia. Many folks will read the above and get very scared, thinking that their every move is somehow being tracked online.
Folks, you're just not that interesting.
By far the vast majority of data collection that's happening is in aggregate - meaning that the habits of thousands if not millions of web users are collected en masse with all individual information being lost in the aggregation. Data like "people who visit Ask Leo! are 20% likely to shop at Amazon.com" is the level of information that's being used. Individual activity, like "Leo Notenboom shops at Amazon and Fry's and also visits CNN.com and somerandomservice.com and looked at these web pages and clicked on these links ..." - even if it is being collected - isn't being looked at by anyone. By and large, it can be used in either of two ways:
as input, without the individual identification, for the aggregate "people are likely to" kinds of calculations I mentioned
for you. For example, if you're a customer of Amazon (or any retailer) you can login and see what you purchased. Perhaps you choose to use Google's Web History so you can see what sites you've visited.
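As a rough sketch of what "in aggregate" means, the calculation behind a statement like "people who visit Ask Leo! are 20% likely to shop at Amazon.com" needs only a pile of anonymous browsing histories. The function name and the site labels below are made up for illustration, and the numbers are not real data.

```python
def share_also_visiting(histories, site_a, site_b):
    """Fraction of visitors to site_a who also visited site_b.

    histories is an iterable of sets of site names, one set per visitor.
    No names or IDs are attached; only the totals matter.
    """
    visitors_a = [sites for sites in histories if site_a in sites]
    if not visitors_a:
        return 0.0
    both = sum(1 for sites in visitors_a if site_b in sites)
    return both / len(visitors_a)


# Made-up sample: five anonymous visitors to "ask-leo", one of whom
# also visited "amazon", plus one visitor who went to neither.
sample = [
    {"ask-leo", "amazon"},
    {"ask-leo", "cnn"},
    {"ask-leo"},
    {"ask-leo", "frys"},
    {"ask-leo", "cnn", "frys"},
    {"cnn"},
]
```

Here `share_also_visiting(sample, "ask-leo", "amazon")` works out to one visitor in five. Who that one visitor was never enters into the result.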
The Caveat
There are two scenarios where your paranoia might be somewhat justified.
You actually are a criminal, a suspected criminal, are under surveillance by law enforcement, or live in a country where law enforcement has been compromised. Depending on how big a fish you really are, "they" could be watching you. Most people just aren't that interesting, but I'm sure that there are a few that are.
Your account's been stolen or compromised. In this case, all the information normally available to you would be available to the person with access to your account. This is perhaps the most likely scenario, and the one you would want to protect against by keeping your account secure.
Article C3749 - June 1, 2009
Leo, sorry to contradict you (in terms), but although everything you said was true, there are scores of studies showing that with modern data mining techniques, one *can* trace individual information using aggregate data. Data mining is a technology that is so advanced now that it escapes comprehension to even most seasoned IT professionals.
Now, it is true that most corporations aren't interested in particular individuals. But a particularly aggressive mass-mailer might, totalitarian governments (or branches thereof) might - remember the Internet is worldwide, it's not only used in the U.S., and even there your government's record of late is not exactly flawless in that respect... And identity theft gangs might be VERY, VERY interested in that. Remember they are huge now, extremely rich and well-organized, and difficult to trace and frame because they are internationally based and spread through many countries and continents. The world has no borders, and if Big Brother isn't (yet) watching you, someone else might be...
Posted by: UrsoBR at June 2, 2009 1:26 PM
@Tom: It's actually not a cookie that tells you where people came from when they visit your site; it's a 'referer' record. This record is always present in a request for a web page, unless the user has gone to great lengths to disable it.
So even in a cookie-free world you would still get those stats you need.
Posted by: Ben D. at June 3, 2009 8:44 PM
Not being tracked huh? Well how about this? I go to a website and then a pop up window of a sexy girl comes up and says that she is available. She is in the same town as me or there are some that are 25 miles from me. Ok, I clear out my Internet cookies via CCleaner and restart my computer. All my history is cleared as well. I go back to the same website and again sexy girls in my particular area are showing up. My area not anything over 25 or 50 miles but only my area. How can this be?
12-Jun-2009
@Jon B: Geotargeting http://en.wikipedia.org/wiki/Geo_targeting
Posted by: Jim at June 14, 2009 6:47 AM
Hi..one more important thing here is that ..the user is also tracked based on the mac id of your computer.
Posted by: Bk at March 17, 2010 7:44 PM