Hacker News
Everyone's Trying to Track What You Do on the Web: Here's How to Stop Them (lifehacker.com)
58 points by ColinWright on Feb 23, 2012 | 50 comments


I work in the ad industry. We take pains to make sure that everything is hashed as early in the process as possible, so we're basically just correlating hashes with each other. There's nothing personally identifiable.

I understand that it sounds creepy, but it's worth noting that advertising is what pays for all the free content you enjoy on the internet.

EDIT: Downvoted for comments from the industry. Thanks! Let me know if there are any questions I can answer for you guys.


The problem I have with data collection, even if depersonalized, is that the data is practically taken without consent. I'm completely fine, for example, with giving detailed but anonymous usage statistics to Mozilla, because they nicely asked me whether they may collect them to improve Firefox. I'm not okay with people just "taking" those statistics from me.

Tracking should be opt-in, not opt-out. I shouldn't have to tell you to stop tracking me, you should need to get my consent to start tracking me.

>it's worth noting that advertising is what pays for all the free content you enjoy on the internet

Which is why I whitelist trustworthy domains I frequent in Adblock Plus. Again, opt-in.

Also, advertising need not necessarily be targeted. You can run generalized ads just as well. Sure, it might not be as profitable, but if the trade-off for more profit is less privacy, then I'm afraid I'll give priority to the latter.


Targeted advertising makes ad delivery efficient. If you somehow turned off all targeted ads today, advertisers would get less bang for their buck, and the price they would be willing to pay sites would drop dramatically. How do you compensate for that lost revenue? More ads? "Donate with PayPal" buttons?

Although there are legitimate concerns about everyone tracking everyone, the Internet is less annoying overall as a result of the advertising efficiency of targeted advertising.


I can understand your point, although I have a small beef with the consent point (you're visiting the website on your own consent, after all). More importantly though, you're basically pitting your privacy against someone else's money. Of course you're going to rank your privacy higher.

What if it was your privacy vs less content on the internet because more sites go out of business?


Gosh I don't know. Maybe sleazy sites getting pushed to the margins would have to develop non-sleazy business models or die trying. In either case, it's a win for the Internet.


Sleazy sites like the New York Times? Come up with a list of your favorite sites and I bet, aside from Y Combinator, all of them sell ads on the exchanges. And the only reason Y Combinator doesn't have ads is because it is an ad.

If you visit a website, I don't see how it's such a violation of your rights for them to take a note of your visit.


You don't need to track people to show them ads and in no case should you track people without their consent.


That's awfully sanctimonious. Do you cut personal checks to websites whose content you enjoy? If not, they need advertising dollars, and advertising can bring in way more money when anonymized tracking is used.


Yes, actually. If a web site has content worth consuming, and gives me the option, then I make sure they get money. I know I'm not alone - so why do so few sites make this option available? Probably because of the pervasive attitudes like yours that ads are the _only_ way to make money on the internet.


Well, paying for content sounds great but most websites that I visit don't even offer that choice.

Sorry, but you're wrong that tracking is required. TV, Radio and Newspaper have been making money hand over fist for decades without tracking.


Newspapers are making money hand over fist? Funny, I thought they were being massacred and increasing CPMs on their remnant inventory is one of the few bright spots in their finances.

If you visit someone's website, they're not violating your rights by noting that you visited. You're free to not visit, in fact. Or to register yourself on a do-not-track registry, enable ad block, and visit without contributing towards their bottom line. Whatever you want.


I said "TV, Radio and Newspaper" and I also said "for decades" not just the last decade.

> ...they're not violating your rights...

I didn't say squat about "rights". I simply disagreed that tracking is "required" for web sites to make more money on advertising because that is utter hogwash. The ability to track people is a relatively new thing.


I don't think hashing by itself means much. Especially when working with tightly-constrained values.

How about k-anonymity?


Can you explain how hashing is insufficient? You're saying someone could use a lookup table of hashcodes for all the URLs on the internet and deanonymize the URLs? The browser's identity is still unrecoverable.

EDIT: Ah, just looked up k anonymity. We don't store any of the information that they're protecting for, like age, sex, any other personal attributes.


I'm sure you are looking at more than just URLs.

Say you have a lead form with two fields, email and zip code. You would store a variety of data points besides those two. Referring URL, IP address, useragent, etc. If you just hash everything, and I gain access to your hashed values, it would be easy to make a lookup table reversing the one-way hash, at least for some of the data points.
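To make the lookup-table point concrete, here is a minimal sketch. It assumes the zip code is stored as an unsalted SHA-256 hash, which is my assumption and not something stated about any real system; `sha256_hex` and `reverse_zip_hash` are made-up helper names.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Hex digest of an unsalted SHA-256 hash (assumed storage format)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# There are only 100,000 possible 5-digit ZIP codes, so hashing every
# candidate up front is cheap.
lookup = {sha256_hex(f"{z:05d}"): f"{z:05d}" for z in range(100_000)}


def reverse_zip_hash(digest: str):
    """Return the ZIP code behind a digest, or None if it is not a ZIP hash."""
    return lookup.get(digest)


leaked = sha256_hex("94103")  # stand-in for a value taken from a leaked table
print(reverse_zip_hash(leaked))
```

The same trick gets harder as the input space grows (full URLs, email addresses), which is why I said it works "at least for some of the data points". A per-record secret salt or a keyed hash would defeat this particular table, though not necessarily other attacks.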

I haven't had a chance to read the k-anonymity or related papers, but from what I understand it's not specific to data points like age/sex/etc.
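As I understand the basic definition, a data set is k-anonymous if every record shares its combination of "quasi-identifier" values with at least k-1 other records. A rough sketch of measuring that, with hypothetical field names and made-up records:

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Size of the smallest group of records that share the same values
    for the given quasi-identifier fields (0 for an empty data set)."""
    groups = Counter(
        tuple(record[field] for field in quasi_identifiers)
        for record in records
    )
    return min(groups.values()) if groups else 0


# Made-up records; the field names are illustrative only.
records = [
    {"zip_hash": "a1", "useragent": "Firefox 10", "referrer": "news"},
    {"zip_hash": "a1", "useragent": "Firefox 10", "referrer": "news"},
    {"zip_hash": "a1", "useragent": "Chrome 17", "referrer": "search"},
]

print(k_anonymity(records, ["zip_hash"]))
print(k_anonymity(records, ["zip_hash", "useragent", "referrer"]))
```

Note that the fields are hashed here and it makes no difference: a record that is alone in its group (k = 1) stands out whether the values are hashed or not.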


If you obtain my gender, DOB and zip code (which is not hard - Google demonstrated that it had all of that data even though I never gave it to them directly), you can uniquely identify me 80% of the time.

That's insufficient, in my book.


I'm a little skeptical of this claim in light of the http://en.wikipedia.org/wiki/Birthday_problem

23 people means a 50% chance that 2 of them share a birthday. Assuming that your gender and zip code put you in a pool of people > 23, I'm not sure how your statistic holds up. Do you have a citation?
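For reference, the 23-person figure can be checked with a few lines of Python. This is a sketch assuming 365 equally likely birthdays and ignoring leap years; `p_shared_birthday` is just an illustrative name.

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday,
    assuming birthdays are independent and uniform over `days` days."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for n in (22, 23, 30):
    print(n, round(p_shared_birthday(n), 3))
```

With these assumptions, 23 is the smallest group size for which the probability crosses one half.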


I believe the previous commenter was referring to the work conducted by Latanya Sweeney:

http://dataprivacylab.org/projects/identifiability/paper1.pd...

"In this document, I report on experiments I conducted using 1990 U.S. Census summary data to determine how many individuals within geographically situated populations had combinations of demographic values that occurred infrequently. It was found that combinations of few characteristics often combine in populations to uniquely or nearly uniquely identify some individuals. Clearly, data released containing such information about these individuals should not be considered anonymous. Yet, health and other person-specific data are publicly available in this form. Here are some surprising results using only three fields of information, even though typical data releases contain many more fields. It was found that 87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}. About half of the U.S. population (132 million of 248 million or 53%) are likely to be uniquely identified by only {place, gender, date of birth}, where place is basically the city, town, or municipality in which the person resides. And even at the county level, {county, gender, date of birth} are likely to uniquely identify 18% of the U.S. population. In general, few characteristics are needed to uniquely identify a person."

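A rough back-of-the-envelope check of why this does not conflict with the birthday problem: the birthday problem asks whether any two people in a group share a day of the year, while the figures above concern whether a given person is the only one with their full date of birth (including the year) in their ZIP/gender pool. The sketch below assumes dates of birth are spread uniformly over about 80 years and uses a made-up pool size of 5,000. Neither number comes from the paper, and real distributions are not uniform.

```python
def p_unique(pool_size: int, distinct_values: int) -> float:
    """Probability that one given person shares their value with nobody
    else in the pool, assuming independent, uniformly distributed values."""
    return (1.0 - 1.0 / distinct_values) ** (pool_size - 1)


DOB_VALUES = 365 * 80  # assumed: about 80 birth years, leap days ignored
POOL = 5_000           # hypothetical count of same-gender people in one ZIP

print(f"unique on full date of birth: {p_unique(POOL, DOB_VALUES):.2f}")
print(f"unique on day of year only:   {p_unique(POOL, 365):.6f}")
```

Under these assumptions the first number comes out around 0.84 and the second is close to zero, so the year of birth is doing most of the work.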

> I understand that it sounds creepy, but it's worth noting that advertising is what pays for all the free content you enjoy on the internet.

I found your statement a bit cocky, but you may not see it the same way since you work "in the field".

And you are pretty much wrong. A large part of the web is free. Wikipedia? Sure, they ask for money, and every year they come up with just enough, since people appreciate free content and, when asked, are willing to donate. Heck, people even write free software that you can download from websites that are free of ads [Apache is the first thing that came to my mind].

I am sure the web would look different had Google not based its business model on advertising, etc., but I would never say that the internet's fundamentals are based on a paid, ad-displaying model. I don't believe Tim thought it would or should be, either.


Ok, allow me to amend my sentence:

"It's worth noting that, aside from a few laudable charitable organizations, advertising is what pays for all of the free content you enjoy on the internet".

Wikipedia and Apache are both one of a kind. 95% of the great content out there (and 100% of the shitty content) is supported by ads.


Maybe it's just to reinforce the point, but Ghostery prevents 9 scripts from executing on that page!


Indeed. I liked the bit where my browser did a lookup on "track.lifehacker.com".


As if to demonstrate the point, the page does not load any content unless javascript is turned on [Firefox], even though the server does not require it in order to provide the article text.

One of the questions in my mind is in regards to Chrome, which I have not used for some time. Does it still send keystrokes from the address/search bar to Google servers in order to provide suggestions?


I think Blogger does the same thing; it kills Instapaper functionality. (Net result: I don't read things published on Blogger.)


> We'd say Ghostery, AdBlock Plus, and Priv3 are the essentials here.

With Ghostery and ABP installed, why is Priv3 also required?


Good question. I tried Priv3 today, but it seems to be a subset of Ghostery.

BTW, is it necessary to use ABP when I have Ghostery installed? (I don't intend to block ads unless they actively track me, like AdSense, etc.)


>I don't intent to block ads

May I ask why? I can understand whitelisting domains that you regularly visit and use (and thus want to support with ad income) and that you trust to serve safe and unobtrusive ads, but I would be going insane without ABP on the web. It's excruciating to use a browser without ABP installed. Imagine trying to hold a conversation with someone while at least half a dozen people are screaming nonsense at you - that's how it feels to browse the web with ads to me.


I could go without Adblock, but I absolutely could not use the web without Flashblock.

If you run advertisements that automatically play sounds, you ought to be shot and dumped in a river.


Maybe it's the consumer in me, but I feel responsible to view the ads on websites I frequent. If I see something I may like to purchase, then I'd rather the middleman be a website I frequent, so I can show support. It may be trained behavior, but it is why I don't use ABP.


I hear people say this, but personally I've never used any ad blocker and I nearly never see ads. Unless something weird happens, like that one ad that seemed to reach up into another one, I really never notice them.


Yes, it is necessary. Think of Ghostery as filling the role of a black-list and ABP filling the role of a grey-list (and sometimes white-list).


AFAIK, Ghostery doesn't prevent the tracking scripts from running, it just checks for known bugs


You can switch blocking off, and that is, for some reason I don't understand, the default. Ghostery runs a simple wizard the first time you restart the browser after installing it where it asks you if you want to turn on blocking or not.


Are you sure?


FWIW, my Firefox privacy add-on stack is:

* Adblock Plus

* BetterPrivacy

* Ghostery

* NoScript

* PrivacyChoice TrackerBlock

* QuickJava

* User Agent Switcher

Yes, there is some overlap in functionality. And getting the configurations correct can be a minor pain (none of them are ready to go after install).

I have also found the Chromium version of Ghostery cannot be configured to block DoubleClick tracking. :-( So much for using Chrome-based browsers as my main browser.


Another good add-on to consider is RequestPolicy[1]. It controls cross-site requests and thus blocks a lot of stuff before it even loads. It's a nice complement to NoScript.

If you don't like all of the social media buttons (which are also trackers and are, in my opinion, an eyesore), consider subscribing to the Antisocial filter list[2] for ABP. It blocks all of these pesky, useless annoyances and removes them from sight.

Oh, and btw: Chrome is Chromium based, not the other way round. ;)

[1]: https://addons.mozilla.org/en-US/firefox/addon/requestpolicy...

[2]: http://adversity.uk.to/


Is there any add-on similar to TrackMeNot that can regularly send requests for random web pages and obfuscate my internet history? TrackMeNot does a similar thing for search engines.


I use the Ghostery and HTTPS Everywhere add-ons, plus startpage.com when I want to use the Google search engine.


I'm so tired of this.

If you're scared, get a dog.

Or just read the "Why Should You Care" section and really think about it. Is it so bad? Do they have your gramma's name and address and are they sending the Depends cops over to check if she's peed herself? No, they have a bunch of anonymous data that they never attach to you personally.

Everyone's trying to track YOU? NO, that's just some sort of weird narcissist paranoia on your behalf. Nobody is trying to track YOU. Everyone is trying to track EVERYONE. YOUR information is worthless on its own. NO marketer cares about YOU personally and YOUR data is not worth anything negotiable.

If you're scared, get a dog.


What could possibly go wrong with individuals being tracked to this level? Don't be naive.


Explain, and while you do, remember that "slippery slope" is a logical fallacy.


That's not correct. A slippery slope argument, given realistic risk and possibility, can be a valid argument. And that is exactly what all of the privacy concerns are about: risk.

I think we both know that breaches of sensitive and private data are a real danger and aren't some abstract, theoretical possibility that never happens. They happen all the time. Aggregation of personal data is dangerous by virtue of existence, without even touching the moral issue of whether it's ethical to try to spy into people's lives and habits for whatever reason.

Knowledge is power. And knowing almost everything about a person is one of the most dangerous weapons I can think of. Even more dangerous is that most people do not even realize how dangerous that is. They are either unaware of it, or worse, willfully ignorant.


My instinctual response to this debate is: if you have nothing to hide, why do you care? But you're right, there is a very fine line somewhere in this quest for data collection. As history shows again and again, the entities collecting it usually get too greedy, or slip, or look the other way, and the consumer ends up being taken advantage of and has to learn once more that these entities cannot be trusted.

If there were more of an "opt-in" mindset to doing such collecting, the situation would be a bit more balanced. I agree that visiting a site and consuming its content while demanding the site gets nothing out of it is not balanced. But why can't we meet somewhere in the middle, with more transparency about what is being taken and more asking for permission before taking it?


> My instinctual response to this debate is, if you having nothing to hide, why do you care.

My instinctual response to that is to ask if I can watch you the next time you go to the bathroom, or have sex.


While I am not nodata, I have a single acronym answer for you:

STASI


That's exactly what I'm talking about; we just magically went from "people are trying to target you with ads that are relevant to you" to "the secret police are going to make you disappear."

And that's the connection made when I specifically request an argument that's not slippery slope.

And I'm downvoted for specifically requesting a valid argument. Thank you, HN, or should I say, Reddit Lite.


> "the secret police are going to make you disappear."

Easily said, Tony, as a white male living in America. Some of us actually lived behind the Iron Curtain, and secret police and antagonistic governments aren't "magical" and hypothetical to us.


I know I'm not meant to feed the trolls, but I do think it's important to imagine what life would have been like in Germany all those years ago if Facebook had existed.


Everyone tracking everyone IS the problem.


> Everyone is trying to track EVERYONE

I used to think this, but the Target article in the NY Times changed my mind. I've started locking down all my computers.



