Facebook self-censorship: What happens to the posts you don’t publish? (slate.com)
99 points by ForHackernews on Dec 13, 2013 | 44 comments


There's no need for speculation here. Open Facebook, right click, select "Inspect Element" then the "Network" tab.

On a self-censored status update, they send a request to /ajax/bz that triggers their censorlogger, but don't pass the content itself.

On messages & chat, they send a request to /ajax/messaging/typ.php to show the other person you're typing, but again, they don't pass your message to their servers.

On search, they send a request to /ajax/typeahead/search/facebar/query/ which passes everything you type to their server to power their auto-complete functionality.
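For anyone who wants to repeat this check without eyeballing every request, here is a minimal sketch in Python. It assumes you exported the Network tab as a HAR file and searches each request's URL and body for a marker string you typed. The sample HAR data, the `value` query parameter, and the function name are hypothetical and made up for illustration; only the endpoint paths come from the comment above. It would not catch text that is encoded, compressed, or otherwise transformed before sending.

```python
from typing import List
from urllib.parse import unquote_plus

# Hypothetical HAR excerpt; a real export has many more fields per entry.
SAMPLE_HAR = {
    "log": {
        "entries": [
            {"request": {
                "url": "https://www.facebook.com/ajax/bz",
                "postData": {"text": "trigger=censorlogger"},
            }},
            {"request": {
                # "value" is an assumed parameter name, not taken from the thread.
                "url": "https://www.facebook.com/ajax/typeahead/search/facebar/query/?value=secret+draft",
            }},
        ]
    }
}


def requests_containing(har: dict, needle: str) -> List[str]:
    """Return the URLs of requests whose URL or body contains `needle`."""
    hits = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        url = request.get("url", "")
        # postData is optional in HAR, so guard against it being absent.
        body = (request.get("postData") or {}).get("text") or ""
        # Decode percent-escapes and "+" so form-encoded text is comparable.
        haystack = unquote_plus(url + "\n" + body).lower()
        if needle.lower() in haystack:
            hits.append(url)
    return hits
```

To use it on a real capture, load the exported file with `json.load` and pass the result in along with a distinctive phrase you typed into the box but never submitted.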


Yes, you could do that two seconds' worth of testing, in basically every browser these days even. But then you couldn't write a link bait article about the possibility that they are doing it.


This is Slate. I expect nothing better.


"I don't know for certain that this has never happened..."


I have done this before. I ran experiments where I meticulously monitored all HTTP requests sent, and I could've sworn I saw questionable activity. Doing it now, I get results that seem similar to your description.

I don't know what to say, other than that experience has taught me to always be on the skeptical side of things when we're talking about Facebook.


Well, if I was really paranoid I might argue that they could save that information locally and then upload it later when Facebook loses the browser's focus (i.e., when you switch tabs).


Alright. Go ahead and test that. Have your machine connect through a proxy, leave it open for a day, and see what you get.


I don't think /ajax/bz has anything to do with censorlogging. I'm seeing it hit just for scrolling down the page.


I think it may be their generic logging system. The JSON passed back to their server contained the phrase trigger:"censorlogging" when I bailed on the status update.


ah ok


Someone should fix the HN headline, currently "Facebook monitors and analyzes posts that users type but don't submit".

In their article, Das and Kramer claim to only send back information to Facebook that indicates /whether/ you self-censored, not /what/ you typed. The Facebook rep I spoke with agreed that the company isn’t collecting the text of self-censored posts.... "we have arrived at a better understanding of how and where self-censorship manifests on social media; next, we will need to better understand what and why." This implies that Facebook wants to know what you are typing in order to understand it.

Facebook is not analyzing posts you don't submit.


Title uses vague words that are not necessarily untrue. Facebook is analyzing self-censorship, just not the contents of self-censored posts. What is maybe interesting to users is that there are people at Facebook who think it isn't right for the user to self-censor, that it apparently robs Facebook and your "friends" of the value in that self-censored post. That seems mildly insane and I'm glad I do not use Facebook.


> What is maybe interesting to users is that there are people at Facebook who think it isn't right for the user to self-censor, that it apparently robs Facebook and your "friends" of the value in that self-censored post.

This is FUD. Facebook has a large data science team that publishes lots of studies on online behavior [1]. These are academic, peer-reviewed research papers. Insinuating that FB as an entity thinks that "it isn't right for the user to self-censor" is unfounded speculation.

[1] https://www.facebook.com/publications


>> there are people at Facebook who think

> FB as an entity thinks

There's your disconnect


> What is maybe interesting to users is that there are people at Facebook who think it isn't right for the user to self-censor, that it apparently robs Facebook and your "friends" of the value in that self-censored post.

This is exactly the mentality that causes me to barely visit my Facebook timeline these days. I use it less and less.


From the paper:

Furthermore, all instrumentation is done client-side. In other words, the content of self-censored posts was not sent back to Facebook's servers

https://autoblog.postblue.info/autoblogs/wwwinternetactunet_...

Of course, that doesn't mean they _couldn't_ do this, just that they say they don't currently.


But there's lots of things people and companies _could_ do, they just don't do it currently.


This should be more or less confirmable by using a traffic inspector, right? (tamper headers, web inspector, etc.) That is, is Facebook only checking that you've entered some text, then submitting that boolean (or text length) as a POST?


I wasn't sure how to write a title that was concise and captured the substance of the article.

It seems like you want to split hairs about whether "analyzing posts" means examining text or metadata. Do you think that "NSA is analyzing phone calls" would be an accurate description of their metadata dragnet?

At any rate, if they aren't looking at the content of self-censored posts yet, it certainly sounds like they're at least interested in doing so. According to the article, "This implies that Facebook wants to know what you are typing in order to understand it."


Normally, the answer to your question is "nothing". In this case -- a bit of short term research -- you should read what the paper actually says:

Content was then marked as “censored” if it was not shared within the subsequent ten minutes; using this threshold allowed us to record only the presence or absence of text entered, not the keystrokes or content.

If content entered were to go unposted for ten minutes and then be posted, we argue that it was indeed censored (albeit temporarily).

These analyses were conducted in an anonymous manner, so researchers were not privy to any specific user’s activity.

Furthermore, all instrumentation was done on the client side. In other words, the content of self-censored posts and comments was not sent back to Facebook’s servers: Only a binary value that content was entered at all.
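The rule quoted above can be sketched in a few lines of Python. This is only an illustration of the ten-minute threshold as the quoted passages describe it, not Facebook's code. The class name, method names, and payload shape are hypothetical, and starting the clock at the first text entry is my assumption, since the quotes do not say exactly when the window begins.

```python
from dataclasses import dataclass
from typing import Optional

CENSOR_WINDOW_SECONDS = 10 * 60  # the ten-minute threshold from the paper


@dataclass
class ComposerState:
    """Tracks timestamps only; the typed text is never stored."""
    first_typed_at: Optional[float] = None
    shared_at: Optional[float] = None

    def on_text_entered(self, now: float) -> None:
        # Assumption: the window starts at the first text entry.
        if self.first_typed_at is None:
            self.first_typed_at = now

    def on_shared(self, now: float) -> None:
        if self.shared_at is None:
            self.shared_at = now

    def is_censored(self, now: float) -> bool:
        if self.first_typed_at is None:
            return False  # nothing was typed, so nothing was censored
        deadline = self.first_typed_at + CENSOR_WINDOW_SECONDS
        if self.shared_at is not None and self.shared_at <= deadline:
            return False  # shared inside the window
        # Unshared past the deadline, or shared only after it
        # ("censored, albeit temporarily").
        return now > deadline

    def payload(self, now: float) -> dict:
        # Only a binary value leaves the client in this sketch.
        return {"censored": self.is_censored(now)}
```

The point of the design is that the state holds two timestamps and nothing else, so there is no content available to send even by accident.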


So you're trusting what a Facebook rep says as truth?


You can see for yourself what it does. Just look at web inspector as you type a post.


Thanks. I saw another post mentioning this too. I don't have time to test it. I'm not really worried about that at all, as I don't ever type "I'm going to kill you!" in a fit of rage, come to my senses / re-think it, and then delete the words I typed. :P


Over wild, baseless speculation? Yes.


Wouldn't it be better to trust your own investigation, rather than taking the accused party's word for it?


I'm not "trusting" anybody. The headline made up facts that do not appear anywhere in the article.


These demographics have the highest correlation with censoring: Average number of friends of friends 1.32, Group member count 1.29, Gender: Male 1.26, Gender: Male X Percentage male friends 1.11.

These have the lowest: Age 0.85, Percentage friends conservative 0.77, Percentage friends liberal 0.77.

Extrapolating from the results, the group that self-censors the most seems to be young male users with many male friends who are in a tight network.

The group that self-censors the least seems to be older users who are politically aligned.

Probably expected, but hilarious anyways.


This is why before I post to Facebook I write my post in vi, edit it repeatedly, consider what my life has become, and then :q!.


Why even bother reaching out to a Facebook representative? One can always monitor the HTTP requests that are sent.

To me it seems completely speculative and the image of an incomplete post at the beginning of the article enables people who don't read the article completely to draw uninformed conclusions.


This rather interesting but harmless paper is getting blown way out of proportion in this article. All they do is see who is returning information, and form hypotheses about the demographics of those people. Their methodology is very clearly explained in the article: "using this threshold allowed us to record only the presence or absence of text entered, not the keystrokes or content."

Furthermore, and somewhat amusingly, "a summer intern conducted this work." Thus, it was mostly a summer project, researching demographics about people who started to comment and then stopped.

They put forth 7 major hypotheses:

1) posts will be censored more than comments
2) men will self-censor more than women
3) users with more opposite-sex friends will self-censor more
4) younger users will self-censor less
5) users with older friends will censor more
6) users who more frequently used audience selection tools self-censor less
7) users with more diverse friends will self-censor more

All of these are interesting questions based in a large amount of literature that they could uniquely address and answer with the data that they had. Recording whether or not users enter and then remove data is honestly relatively unimportant. Yes, websites taking user data without asking is an invasion of privacy, but I think this sort of article represents a type of knee-jerk alarmism and exaggeration that can be incredibly hurtful to efforts to use data Facebook has available to answer interesting questions about human interaction. It also represents a purposeful misreading of the paper, which is journalistically questionable.

If you want to read the article for yourself you can find it at https://www.facebook.com/publications/493601774027388/.


In this age of the "post-page" Internet, it is best to assume that everything you do on a... "page" (cough) may be communicated, unless you take measures yourself to stop this.

This will continue to be a topic of concern and debate for at least some of us: Whose agent is the browser? The client's, or the host's?

Currently, design trends towards more and more "programmatic" content and behavior seem to be favoring the latter.

(I, for one, do not like this change.)


It would be interesting if the internet was born fully formed. If there was not an embedded base of people "thinking it was like it used to be". My guess is it may have never got off the ground. But now that there is sufficient momentum behind a paradigm that makes people predictable (ie, the "old way"), the confidence has been built up to perfectly suit people taking advantage of this very predictability. For the most part, this revolves around "free", "anonymous", "open", etc. The powers that be do not want these things to exist...for the very reason that money is to be made by selling the picks and shovels to those who dream they may one day achieve these things. Heaven forbid you provide this to them, they would then be outside the system of tribute to those who just happen to be in charge.


Disable JavaScript > The blue pill.


Facebook, as usual, simply gives no regard to personal privacy. They will do whatever it takes to exploit your personal info, including your deepest and darkest thoughts, in order to make money. Mark Zuckerberg is a shameless person. Just look at how he copies every single competitor shamelessly. What can we expect from a company with a leader like that? Shameless exploitation of course.


They appear to log both when you hover over the input box without clicking and when you click it and enter text, in addition to the aforementioned time-delay trigger (which I didn't test).

Here is some of the data they are transmitting:

[{"user": "BLANKED","page_id":"BLANKED","posts":[["censorlogger",{"cl_impid":"BLANKED","clearcounter":0,"instrument":"composer","elementid":"BLANKED","parent_fbid":BLANKED,"version":"x"}, BLANKED,BLANKED]],"trigger":"censorlogger"}]

BLANKED is inserted where some of the data could potentially be personally identifiable information. Of note is the "clearcounter". They apparently evaluate how many times you clear the data. The ajax call is apparently a more general function as they must supply the trigger.
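A quick way to see what a payload of this shape carries is to parse it and list the field names. The sketch below is Python and uses a hypothetical sample: the values that were BLANKED above are replaced with made-up placeholders so the string is valid JSON, and the function name is my own. It only illustrates the structure shown in this comment, not a documented format.

```python
import json
from typing import List

# Hypothetical sample with placeholder values substituted for BLANKED.
SAMPLE_PAYLOAD = json.dumps([{
    "user": "0",
    "page_id": "0",
    "posts": [[
        "censorlogger",
        {"cl_impid": "x", "clearcounter": 2, "instrument": "composer",
         "elementid": "x", "parent_fbid": 0, "version": "x"},
        0,
        0,
    ]],
    "trigger": "censorlogger",
}])


def summarize_log_payload(raw: str) -> List[dict]:
    """List the trigger, logger name, and field names of each logged post."""
    summaries = []
    for batch in json.loads(raw):
        for post in batch.get("posts", []):
            # Each post looks like [logger_name, fields, extra, extra].
            logger_name, fields = post[0], post[1]
            summaries.append({
                "trigger": batch.get("trigger"),
                "logger": logger_name,
                "instrument": fields.get("instrument"),
                "clearcounter": fields.get("clearcounter"),
                "field_names": sorted(fields),
            })
    return summaries
```

In the sample, none of the field names looks like a free-text field, which matches the observation that the typed content itself is not in this request.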


Seems like a scare story conflating metadata (which they collect) with data (which they do not). While the surreptitious and opaque tracking of metadata is certainly unwanted by many, the idea that they might, maybe, want to track 'actual data' is a latent premise. That they don't is the reality. But there is no neutralizing this fear here, because in fact, they can.

So, in essence, the TL;DR = FUD.

The Facebook rep I spoke with agreed that the company isn’t collecting the text of self-censored posts. But it’s certainly technologically possible, and it’s clear that Facebook is interested in the content of your self-censored posts. Das and Kramer’s article closes with the following: "we have arrived at a better understanding of how and where self-censorship manifests on social media; next, we will need to better understand what and why."


I assumed they did this and just don't enter anything private on such websites, even without hitting enter. I never actually checked, but what better way to learn thoughts than by unspoken words?


This is almost as creepy as lie-detection by analysis of subvocalisation. Almost.


I have always assumed they saved the text as a whole. This comes more as a relief than a revelation for me.


That was a pretty long article to give the answer to the question in the title: "nothing".


I'm not sure what is worse: that Facebook allows data collection for these types of studies, or that there are engineers and data scientists willing to study it.

Does Facebook wonder why we don't trust them - or even care?


I checked the network and it's not sending any packets to Facebook while typing, on release, or on delete... :) wtf? Just check the info before writing such a big article about a lie :)


Why not show us a Wireshark screenshot?


So they're running E2E tests on DOM states?

I see a whole new class of submission:

    {{Software company}} uses robust architecture patterns on Webpages



