I work in the ad industry. We take pains to make sure that everything is hashed as early in the process as possible, so we're basically just correlating hashes with each other. There's nothing personally identifiable.
I understand that it sounds creepy, but it's worth noting that advertising is what pays for all the free content you enjoy on the internet.
EDIT: Downvoted for comments from the industry. Thanks! Let me know if there are any questions I can answer for you guys.
The thing I have with collecting data, even if depersonalized, is that it's practically taken without consent. I'm completely fine, for example, with giving detailed, but anonymous usage statistics to Mozilla, because they nicely asked me if they may do that, to improve Firefox. I'm not okay with people just "taking" those statistics from me.
Tracking should be opt-in, not opt-out. I shouldn't have to tell you to stop tracking me, you should need to get my consent to start tracking me.
>it's worth noting that advertising is what pays for all the free content you enjoy on the internet
Which is why I whitelist trustworthy domains I frequent in Adblock Plus. Again, opt-in.
Also, advertisement need not necessarily be targeted. You can run generalized ads just as well. Sure, it might not be as profitable, but if the trade-off for more profit is less privacy, then I'm afraid I'll give priority to the latter.
Targeted advertising makes ad delivery efficient. If you somehow turned off all targeted ads today, advertisers would get less bang for their buck, and the price they would be willing to pay sites would drop dramatically. How do you compensate for that lost revenue? More ads? "Donate with PayPal" buttons?
Although there are legitimate concerns about everyone tracking everyone, the Internet is less annoying overall as a result of the advertising efficiency of targeted advertising.
I can understand your point, although I have a small beef with the consent point (you're visiting the website on your own consent, after all). More importantly though, you're basically pitting your privacy against someone else's money. Of course you're going to rank your privacy higher.
What if it was your privacy vs less content on the internet because more sites go out of business?
Gosh I don't know. Maybe sleazy sites getting pushed to the margins would have to develop non-sleazy business models or die trying. In either case, it's a win for the Internet.
Sleazy sites like the New York Times? Come up with a list of your favorite sites and I bet, aside from Ycombinator, all of them sell ads on the exchanges. And the only reason Ycombinator doesn't have ads is because it is an ad.
If you visit a website, I don't see how it's such a violation of your rights for them to take a note of your visit.
That's awfully sanctimonious. Do you cut personal checks to websites whose content you enjoy? If not, they need advertising dollars, and advertising can bring in way more money when anonymized tracking is used.
Yes, actually. If a web site has content worth consuming, and gives me the option, then I make sure they get money. I know I'm not alone - so why do so few sites make this option available? Probably because of the pervasive attitudes like yours that ads are the _only_ way to make money on the internet.
Newspapers are making money hand over fist? Funny, I thought they were being massacred and increasing CPMs on their remnant inventory is one of the few bright spots in their finances.
If you visit someone's website, they're not violating your rights by noting that you visited. You're free to not visit, in fact. Or to register yourself on a do-not-track registry, enable ad block, and visit without contributing towards their bottom line. Whatever you want.
I said "TV, Radio and Newspaper" and I also said "for decades" not just the last decade.
> ...they're not violating your rights...
I didn't say squat about "rights". I simply disagreed that tracking is "required" for web sites to make more money on advertising because that is utter hogwash. The ability to track people is a relatively new thing.
Can you explain how hashing is insufficient? You're saying someone could use a lookup table of hashcodes for all the URLs on the internet and deanonymize the URLs? The browser's identity is still unrecoverable.
EDIT: Ah, just looked up k-anonymity. We don't store any of the information that it's meant to protect, like age, sex, or any other personal attributes.
Say you have a lead form with two fields, email and zip code. You would store a variety of data points besides those two. Referring URL, IP address, useragent, etc. If you just hash everything, and I gain access to your hashed values, it would be easy to make a lookup table reversing the one-way hash, at least for some of the data points.
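To make the lookup-table point concrete, here is a minimal sketch. It assumes the ZIP code field was hashed with plain, unsalted SHA-256; the helper name `sha256_hex` and the sample value are made up for illustration, and nothing here reflects what any particular ad company actually does.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Hash a string the way a naive 'hash everything' pipeline might (assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# There are only 100,000 possible 5-digit ZIP codes, so we can hash all of
# them up front and map each digest back to its input.
lookup = {sha256_hex(f"{z:05d}"): f"{z:05d}" for z in range(100_000)}

# A hashed value obtained from the leaked data set (hypothetical example).
leaked_hash = sha256_hex("90210")

# Reversing the "one-way" hash is now a dictionary lookup.
print(lookup[leaked_hash])
```

The same idea applies to any field with a small or guessable input space. Fields with a large, unpredictable input space are much harder to reverse this way, and a secret salt or key changes the picture too.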
I haven't had a chance to read the k-anonymity or related papers, but from what I understand it's not specific to data points like age/sex/etc.
If you obtain my gender, DOB and zip code (which is not hard - Google demonstrated that it had all of that data even though I never gave it to them directly), you can uniquely identify me 80% of the time.
23 people means a 50% chance that 2 of them share a birthday. Assuming that your gender and zip code put you in a pool of people > 23, I'm not sure how your statistic holds up. Do you have a citation?
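The 23-person figure can be checked with a few lines. This is a sketch that assumes 365 equally likely birthdays and ignores leap years; the function name is made up for illustration.

```python
def p_shared_birthday(n: int, days: int = 365) -> float:
    """Probability that at least two of n people share a birthday,
    assuming every day of the year is equally likely."""
    p_all_distinct = 1.0
    for i in range(n):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


print(round(p_shared_birthday(23), 3))  # 0.507
```

Note that this is the chance that some pair in the group shares a day of the year. It does not directly say how often one particular person's full date of birth, including the year, is shared by someone else in the pool.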
"In this document, I report on experiments I conducted using 1990 U.S. Census summary data to determine how many individuals within geographically situated populations had combinations of demographic values that occurred infrequently. It was found that combinations of few characteristics often combine in populations to uniquely or nearly uniquely identify some individuals. Clearly, data released containing such information about these individuals should not be considered anonymous. Yet, health and other person-specific data are publicly available in this form. Here are some surprising results using only three fields of information, even though typical data releases contain many more fields. It was found that 87% (216 million of 248 million) of the population in the United States had reported characteristics that likely made them unique based only on {5-digit ZIP, gender, date of birth}. About half of the U.S. population (132 million of 248 million or 53%) are likely to be uniquely identified by only {place, gender, date of birth}, where place is basically the city, town, or municipality in which the person resides. And even at the county level, {county, gender, date of birth} are likely to uniquely identify 18% of the U.S. population. In general, few characteristics are needed to uniquely identify a person."
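A rough back-of-the-envelope model shows why figures like these are plausible. It is only a sketch under loud assumptions: dates of birth are spread uniformly over about 80 years, and the pool size of 10,000 people (one gender within one ZIP code) is a hypothetical number, not taken from the quoted report.

```python
def p_unique(pool_size: int, n_values: int) -> float:
    """Probability that nobody else in the pool shares your value,
    assuming each of n_values is equally likely for every other person."""
    return (1 - 1 / n_values) ** (pool_size - 1)


# Assumption: full dates of birth spread evenly over roughly 80 years.
DOB_VALUES = 365 * 80

# Assumption: 10,000 people of the same gender in the same ZIP code.
print(round(p_unique(10_000, DOB_VALUES), 2))
```

Under these assumptions the result is roughly 0.7, so most people in such a pool would be the only one with their exact date of birth. Smaller pools push the number higher, and real birth dates are not uniform, so treat this as an order-of-magnitude check rather than a reproduction of the census analysis.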
> I understand that it sounds creepy, but it's worth noting that advertising is what pays for all the free content you enjoy on the internet.
I found your statement a bit cocky, but you may not see it the same way since you work "in the field".
And you are pretty much wrong. A large part of the web is free. Wikipedia? Sure, they ask for money, and every year they come up with just enough, since people appreciate free content and, when asked, are willing to donate. Heck, people even write free software that you can download from websites that are free of ads [Apache is the first thing that came to my mind].
I am sure the web would look different had Google not based its business model on advertising, etc., but I would never say that the internet's fundamentals are based on a paid, ad-displaying model. I don't believe Tim thought it would or should, either.
"It's worth noting that, aside from a few laudable charitable organizations, advertising is what pays for all of the free content you enjoy on the internet".
Wikipedia and Apache are both one of a kind. 95% of the great content out there (and 100% of the shitty content) is supported by ads.
As if to demonstrate the point, the page does not load any content unless javascript is turned on [Firefox], even though the server does not require it in order to provide the article text.
One of the questions in my mind is in regards to Chrome, which I have not used for some time. Does it still send keystrokes from the address/search bar to Google servers in order to provide suggestions?
May I ask why? I can understand whitelisting domains that you regularly visit and use (and thus want to support with ad income) and that you trust to serve safe and unobtrusive ads, but I would be going insane without ABP on the web. It's excruciating to use a browser without ABP installed. Imagine trying to hold a conversation with someone while at least half a dozen people are screaming nonsense at you - that's how it feels to browse the web with ads to me.
Maybe it's the consumer in me, but I feel responsible to view the ads on websites I frequent. If I see something I may like to purchase, then I'd rather the middleman be a website I frequent, so I can show support. It may be trained behavior, but it is why I don't use ABP.
I hear people say this, but personally I've never used any ad block and I nearly never see ads. Unless something weird happens, like that one ad that seemed to reach up into another one, I really never notice them.
You can switch blocking off, and that is, for some reason I don't understand, the default. Ghostery runs a simple wizard the first time you restart the browser after installing it where it asks you if you want to turn on blocking or not.
Yes, there is some overlap in functionality. And getting the configurations correct can be a minor pain (none of them are ready to go after install).
I have also found the Chromium version of Ghostery cannot be configured to block DoubleClick tracking. :-( So much for using Chrome-based browsers as my main browser.
Another good add-on to consider is RequestPolicy. It controls cross-site requests and thus blocks a lot of stuff before it even loads. It's a nice complement to NoScript.
If you don't like all of the social media buttons (which are also trackers and are, in my opinion, an eyesore), consider subscribing to the Antisocial filter list[2] for ABP. It blocks all of these pesky, useless annoyances and removes them from sight.
Oh, and btw: Chrome is Chromium based, not the other way round. ;)
Is there any add-on similar to TrackMeNot that can regularly send requests for random web pages and obfuscate my internet history? TrackMeNot does a similar thing for search engines.
Or just read the "Why Should You Care" section and really think about it. Is it so bad? Do they have your gramma's name and address and are they sending the Depends cops over to check if she's peed herself? No, they have a bunch of anonymous data that they never attach to you personally.
Everyone's trying to track YOU? NO, that's just some sort of weird narcissist paranoia on your behalf. Nobody is trying to track YOU. Everyone is trying to track EVERYONE. YOUR information is worthless on its own. NO marketer cares about YOU personally and YOUR data is not worth anything negotiable.
That's not correct. A slippery slope argument, given realistic risk and possibility, can be a valid argument. And that is exactly what all of the privacy concerns are about: risk.
I think we both know that breaches of sensitive and private data are a real danger and aren't some abstract, theoretical possibility that never happens. They happen all the time. Aggregation of personal data is dangerous by virtue of existence, without even touching the moral issue of whether it's ethical to try to spy into people's lives and habits for whatever reason.
Knowledge is power. And knowing almost everything about a person is one of the most dangerous weapons I can think of. Even more dangerous is that most people do not even realize how dangerous that is. They are either unaware of it, or worse, willfully ignorant.
My instinctual response to this debate is: if you have nothing to hide, why do you care? But you're right, there is a very fine line somewhere here in this quest for data collection, and as history shows again and again, the entities collecting it normally get too greedy, or slip, or look the other way, etc., and the consumer ends up being taken advantage of and having to learn once more that these entities cannot be trusted.
If there were more of an "opt-in" mindset to doing such collecting, the situation would be a bit more balanced. I agree that visiting a site and consuming its content while demanding the site gets nothing out of it is not balanced. But why can't we meet somewhere in the middle, with more transparency about what is being taken, and more asking of permission before taking it?
That's exactly what I'm talking about; we just magically went from "people are trying to target you with ads that are relevant to you" to "the secret police are going to make you disappear."
And that's the connection made when I specifically request an argument that's not slippery slope.
And I'm downvoted for specifically requesting a valid argument. Thank you, HN, or should I say, Reddit Lite.
> "the secret police are going to make you disappear."
Easily said, Tony, as a white male living in America. Some of us actually lived behind the Iron Curtain, and secret police and antagonistic governments aren't "magical" and hypothetical to us.
I know I'm not meant to feed the trolls, but I do think it's important to imagine what life would have been like in Germany all those years ago if Facebook had existed.