A few weeks ago there was a similar discussion, and I commented the following:
If you think there is no problem, you are wrong. The blog post does not show all the information leaks that this implies. Example: I can modify the script to monitor all the numbers I have in my phone, so that based on the online/offline status, within a few weeks I would be able to guess who is having conversations with whom, uncovering cheating, work affairs, ...
EDIT: Practical example. After collecting enough data about user X, I create a table of the probability of this user being online in each few-minute time range. Then I check the online frequency of that user against the online statuses of another user Y. If the difference from the expected probability is significant, then I can suspect the two are chatting.
Another thing I can use is the activation delay of the online status, since often X sends a message to Y and, a few seconds later, Y comes online, and then the other way around.
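A minimal sketch of the table idea above, in Python. It assumes presence has already been logged as a set of (day, slot) pairs per user, where a slot is a few-minute bucket within the day. The names `overlap_vs_baseline`, `x_online`, `y_online` and `num_days` are made up for illustration, and nothing here talks to WhatsApp.

```python
from collections import defaultdict


def overlap_vs_baseline(x_online, y_online, num_days):
    """Compare observed co-online buckets with what independence would predict.

    x_online, y_online: sets of (day, slot) pairs in which each user was
    seen online (hypothetical log format).
    num_days: number of days covered by the log.
    Returns (observed, expected) counts of buckets where both were online.
    """
    def per_slot_counts(observations):
        counts = defaultdict(int)
        for _day, slot in observations:
            counts[slot] += 1
        return counts

    x_counts = per_slot_counts(x_online)
    y_counts = per_slot_counts(y_online)

    # For one slot: P(X) = x_counts/num_days and P(Y) = y_counts/num_days, so
    # the expected number of co-online days is num_days * P(X) * P(Y).
    expected = sum(
        x_counts[slot] * y_counts[slot] / num_days
        for slot in x_counts.keys() & y_counts.keys()
    )
    observed = len(x_online & y_online)
    return observed, expected
```

If `observed` stays far above `expected` over a few weeks of data, that is the "suspect the two are chatting" signal. Where to put the threshold is a judgment call, and a proper significance test would be one of the refinements.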
[then an HN user said she/he was not sure this was serious because maybe the users casually had similar patterns, so I replied:]
If you check the model I described in my comment, it should filter out the "bus problem", since it will detect a chat only if user A is chatting more than their standard "bus time" probability when B is also chatting in the same time range. If you add to this that people on Whatsapp usually do not talk at exactly the same minutes every day, it is definitely possible to create a robust system for guessing, with good probability, whether two people often have conversations. Also note that the phone numbers in input are not random: they are the ones of a connected circle of persons. Add to this the fact that we can split the ranges, potentially, down to a few minutes, and you can even detect interesting stuff for people having continuous chats with multiple persons, like teenagers. Another thing that is probably possible is "group detection", since at new messages a set of users will activate at the same time.
[And the attack can be refined a lot with more powerful mathematical approaches]
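One way to read the "bus problem" filter, as a rough Python sketch: for a single time slot, compare how often A is online on days when B is online in that slot against days when B is not. The function name and the (day, slot) log format are assumptions made for the example.

```python
def online_rate_given_other(a_online, b_online, num_days, slot):
    """Compare how often A is online in one slot on days B is / isn't online in it.

    a_online, b_online: sets of (day, slot) pairs (hypothetical log format).
    Returns (rate_when_b_online, rate_when_b_offline); a rate is None when
    there are no days of that kind to measure it on.
    """
    with_b = without_b = a_with_b = a_without_b = 0
    for day in range(num_days):
        a_on = (day, slot) in a_online
        if (day, slot) in b_online:
            with_b += 1
            a_with_b += a_on
        else:
            without_b += 1
            a_without_b += a_on
    rate_with = a_with_b / with_b if with_b else None
    rate_without = a_without_b / without_b if without_b else None
    return rate_with, rate_without
```

Two commuters who are both online on the 8:00 bus every day give roughly equal rates, so nothing is flagged. A only showing up in that slot when B does gives a large gap between the two rates.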
The main objective, however, was not to stalk innocent users but to catch an anonymous IRC troll who was using an identless shell server in order to hide their real account name. Every time the troll wrote to IRC, the activity logger program showed typing activity from a certain user. After a few message exchanges during quiet night hours I was able to reliably pinpoint them.
That'll be the boring part of the story. I just /msged her primary nick and asked nicely if it was possible to stop. Apparently the threat of losing anonymity is enough to turn trolls back to normal people.
I'm using an xposed mod to be continually online on WhatsApp, even when my phone's screen is off. Would this thwart such attacks? I mean it would be an anomaly, but other than that I don't know what other information you could get, except the last Online/Offline never changing.
It confuses my friends though, who write to me more at night than they used to; luckily I have a DnD mode that saves me from waking up. (confuses gf too)
This isn’t necessarily just a problem with WhatsApp. The same applies to IRC, if you set away states.
Even if you don’t set away states, one can simply monitor every channel you’re in, every message you send, and then quickly determine what timezone you’re in, when you sleep, when you’re on vacation, etc.
Here’s an example graph of a user, every dot is a message: https://i.imgur.com/DrgVvVw.png and here one from a user with more regular sleep patterns: https://i.imgur.com/a1xdSqR.png (notice the timezone transition when daylight savings time starts? And notice how the user takes about 2 weeks to adjust?)
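A rough sketch of how the sleep window could be guessed from nothing but message timestamps, in Python. It assumes the timestamps are Unix seconds and that an 8-hour window is a reasonable stand-in for "asleep"; `quietest_window` is a made-up name, not part of any IRC tooling.

```python
from datetime import datetime, timezone


def quietest_window(timestamps, window_hours=8):
    """Find the start hour (UTC) of the window with the fewest messages.

    timestamps: iterable of Unix timestamps in seconds (assumed format).
    Returns (start_hour_utc, per_hour_counts).
    """
    counts = [0] * 24
    for ts in timestamps:
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).hour
        counts[hour] += 1

    def messages_in_window(start):
        # The window wraps around midnight, hence the modulo.
        return sum(counts[(start + i) % 24] for i in range(window_hours))

    start_hour = min(range(24), key=messages_in_window)
    return start_hour, counts
```

The start hour of that quiet window, compared against a typical bedtime, gives a crude timezone guess. Running it month by month would show shifts like the daylight saving transition in the second graph.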
In chat applications, those features were the first to get disabled. As I recall, one of the MSN Messenger features required you to sign in before you disabled it.
Anyhow, I'd disable showing online status, typing status, or automatically changing status based on activity.
This was a decade and a half ago, probably longer. The principle remains the same. No, no I don't want you to know when I'm in front of my computer, typing, or otherwise. If I want to appear online, I'll manually do so.
This is much more interesting because pretty much everyone only participates in discussions when posts are on the front page - it would be tough to schedule/delay a post and stay relevant. Also, the lock-in after 1 hour (or a reply) preventing deletion is huge.
Some HN participants are now kind of "whales" in the startup community - at the very least, this info could be used to schedule cold-pitch emails! (And this is across the entire archive of past users, not just current users. These habits need not necessarily change much.)
Timestamp metadata is all over the place - GitHub activity graph, blog post comments, etc. -- merging timestamps for the same person across their accounts on all the different services offers amazing insights.
Correct, but this doesn't appear to be group chat. This appears to be individual chat and the timing attacks based on their online presence. Group chat, by default, is going to seriously hinder one's ability to remain private.
Waybackwhen in 1998, when ICQ was a thing, I had an ICQ client on my Amiga that was scriptable. It was fairly trivial to write a quick program to tell it to change status at random times, to confuse people as to my whereabouts.
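Something in that spirit, as a small Python sketch: it only generates a randomized schedule of status flips and leaves the actual "set status" call to whatever client you script. The exponential gaps, the two status names and the function name are all assumptions for illustration, not how the Amiga script worked.

```python
import random


def random_status_schedule(duration_s, mean_gap_s, seed=None):
    """Build a list of (seconds_from_start, status) flips at random times.

    Gaps between flips are drawn from an exponential distribution with the
    given mean, so the flips follow no fixed rhythm.
    """
    rng = random.Random(seed)
    statuses = ("online", "away")
    schedule = []
    t = rng.expovariate(1.0 / mean_gap_s)
    while t < duration_s:
        # Alternate between the two statuses, starting with "online".
        schedule.append((t, statuses[len(schedule) % 2]))
        t += rng.expovariate(1.0 / mean_gap_s)
    return schedule
```

For example, `random_status_schedule(24 * 3600, 3600)` gives roughly one flip per hour over a day, on average.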
Actually, most normal working people are similar to the second person.
Only a few people have sleep patterns like mine (the first, erratic graph), and I have them because I often spend my nights working on projects, trying to build new products, and once I've started one, it's hard to stop.
Most Tor busts follow a similar pattern, watching both ends of the connection.
There is a real need for a "tor delay" metadata-disruption-as-a-service, where random strangers invoke one another's web callbacks and report back the result in exchange for Bitcoin (Strangers on a Train -style). Someone put it on the block chain and start an ICO!
Same risk as hosting an exit node. It could take a while to convince the police that you obfuscated traffic patterns without any involvement in the crimes they're after. I think most people wouldn't want to take that risk.
As I understand it, random strangers are logged on to tor, invoke each other's callbacks and give back the results. Since all of them are anonymized, this is not at all similar to an exit node.
The only purpose of this is to make tor packet traffic patterns hard to follow :)
This could be a service, but I'm not sure whether the snooper could filter it out. These will be one-off requests from random nodes and will not affect your tor traffic pattern much, because I posit the signal-to-noise ratio of your main activity will be pretty high. Hmm, perhaps if we jack up this random traffic, that would hide your main traffic?
Anyone who knows such analyses want to chime in? :)
The thing is, this method works pretty well if people are chatting in real time; if you wait like 10 minutes to answer messages, it is much more difficult to create the links.
Moreover, if people are using Whatsapp all the time, it is again much more difficult to do.
But I agree with you, there are many situations where these could work
Unfortunately, even under much more noise than the Whatsapp activation patterns, we have seen timing attacks work in incredibly reliable ways, with the network in the middle adding random delays, and even when the task at hand was to measure very small differences in time. So I guess that if this attack already seems feasible in certain contexts, it can only get much better using more advanced techniques.
A similar indirect way can be used to extract information out of Google's database. For example, launch an ad-campaign for any product, directed at people who love cats. Now if people click on the ad and buy the product, you know they must love cats.
This might seem like a brilliant idea, but you'll run out of money before you map all people for all things.
If you have that much money (spitballing here), you could buy Google itself, I think.
There are easier ways to get data on people like their social profiles, and other online breadcrumbs like yelp reviews, any digital footprint really.
Another way is to buy databases of people. People have databases of HNIs, etc that you can purchase. This of course doesn't lend itself to much analysis but if the main purpose was to market to them or something like that, then databases work best :)
I tried to do something like that over 2 years ago. I never got to work on the analysis part. I still have a 2GB database of online/offline, status (text status thingy) and profile picture changes. Someday I'll get back to that data and analyze it.
If you trust those """services""" to be secure and trust that they care about your privacy, then you will be betrayed sooner or later, in ways you can't think of -- just like in the article.
Fun fact, years ago I accidentally found out that my girlfriend at the time cheated on me on Snapchat, without me actually exploiting anything. She told me to join it with her, telling me it was going to be fun. Snapchat kept track of users' activity and gamified it to incentivize you by scoring your activity. Each person has a public activity score when you tap on their profile. One day, I noticed that her Snapchat had more than twice the score that I had. So I clicked on her profile and there it was: some strange dude having a score higher than me. It turned out that was her """"ex"""" (I actually never asked her even for his name before, I found out only after that). I never consciously looked for anything, I trusted her 100%, the score was just there on my screen.
Thanks Snapchat for their stupid gamification efforts, otherwise I would have wasted more time on her. But since that accident, I never trust proprietary shit that has money to make, ads to sell, governments to please, and incentives to grow, even if it says its selling point is to protect your privacy, like Snapchat. It's not about the "end to end encryption" or "finer privacy control" or "only allow when app is in foreground" or "restricted sharing" or "MIT open sauce license" or "export your data" or "only listening to hotwords" or "open APIs," it's about the intent. If the intent was to expand and make money, then all those techs won't be the magic pill that suddenly cures the ill intent. Anyway, privacy my ass, man.
Wait, when you view her profile (as a friend), it shows who has the highest 'score' in terms of contact with her? Wow, that IS a lot of data if they break it down by contact pairs.
Only log me, but don't let my friends know: You know your privacy is respected jack shit when the least intrusive setting is letting the service know and log you, but not letting your friends know.
The real question isn't what it sets by default; the question is why that chat app needs to know and log your location in the first place. Why does it not get it and send it only when you choose to share? What kind of enhancement does it give to your fucking """experience""" when it logs your location like that?
@cassowary, geotagging your photos can be done without logging your location on the server. It can be done locally. Plus, I thought that Snapchat does not keep the pictures you've taken? (I have been out of that since then, so I don't know.)
I'm not by any means a Snapchat power user. But I really like the way my phone tells me where my normal photos are taken — it makes it very easy to find the photos. Also, any chat app would benefit because it's very easy for me to remember "i was talking about foo with Bar when I was over in Baz", but much harder for me to remember when that happened. So to me location tracking would offer many useful features. (They may not be worth it, and even the useful ones might not be available, but those aren't answers to the question you asked.)
In Snapchat your location is used for the geofilters, I guess that's why they have to get your location. I do think they should have an option to turn off geofilters and not use your location at all though.
If you use the map it asks you if you want to turn it on, most people click through it looks like. Now someone write a PoC to keep a log of where everyone goes.
I loved this article. It is beautifully written, given both the hacking curiosity on display as well as the real-world privacy impact it demonstrates. Most of my family use whats-app and would be mortified if they actually understood most of this. Not saying they would stop using it, as the trade-off is a great social app, but it would make them think more broadly about how the world is changing.
It takes a real turn towards developer centered humor with the opening line "With even more time on your hands than ever before, you go just a bit mad and start...". Great Deus ex Machina type segue into all out yummy tech craziness he relishes out.
It sort of reminds me of gonzo-style journalism. I took a look at their other articles, well some of them, and like their style. I'm not sure if it would appeal to a larger audience, but I liked it.
Wrong, if you deactivate the feature 'last seen at' it doesn't change anything because you can still get the same information with the feature 'is online now' and this feature can't be deactivated
What counts as "online"? Using the app? Does the web app also track that? I don't think this is disclosed by Facebook; it would be nice to experiment to check it.
Nevermind the clever writing but the issue has been known for years—and beautifully exploited with the selfhostable ready-made solution WhatsSpy Public since Feb 2015: https://gitlab.maikel.pro/maikeldus/WhatsSpy-Public/ It's not actively maintained anymore but Maikel deserves some credit for it.
Wanted to post the same. Note that this project used its own client instead of scraping the web interface, which is far superior, because you don't need an active, charged phone and it scales much better. yowsup is still around and working.
Probably not, it used Chat-API [0], but the developer is kind of an asshole. But I admit, people just post stupid issues all the time. However I don't share the developers opinion that this was abused. My friends and I haven't received spam messages on Whatsapp. I admit that may be a small sample size, but still.
I did the same around mid 2015 using yowsup (Python API to Whatsapp). But it was a private project because of legality concerns about hoarding so much data.
Of course, the elephant in the room is that all this info and much more is with WhatsApp, Facebook, Google and what ever garbage app is installed on your phone. I agree that the article is more about targeted surveillance towards certain users but that is where NSA and secret letters come in :).
Very well written article - and I love your drawings!
I did a similar story a while back on how you can track your friends' sleep patterns using Facebook Messenger [1]. I'm sure there are lots of other services that have this problem, and most users are blissfully unaware.
When stuff like this happens I wonder if we can try to trick the system, overloading it with information, faking things. Couldn't we just somehow make sure we are online all the time (some script pinging the app)? Then the data would become meaningless.
Just to clarify as a non-user: there's an online status, and a 'last seen' data point, and both can be queried by any user for any user given their telephone number, as often as the querying party likes? And the online status is when the app is open on the phone?
AFAIK if you have them in your contacts and they haven't blocked you, you can access both those data points. If they have disabled last seen, you can still get the 'online' and 'typing' status.
I guess it really depends where. I believe here, where WhatsApp is pretty much the _only_ method of communication, people most definitely check it every few minutes, and especially before they go to bed.
The notification and icon badges help hide when you're sleeping, but they advertise when you're interacting with someone. You're "unseen" for twenty hours. The guy you're cheating with logs in and sends you a message. Five minutes later, you log in and read it. You wait two hours, read it again and send another message. He doesn't check for an hour. You're so longing for his response, so every five minutes you're logging in "how many ticks? what color? has he read it yet??". Once he's out of his meeting, he (finally noticing the notification) logs in and sends a message. Your activity drops off now that you have your reply, but you nevertheless send yours...
I think there are more than 1.3 billion users on WhatsApp. It's massive. I am personally checking it constantly (> once an hour).
It's certainly a more popular app outside of the USA. They initially gained traction because they were willing to make apps for things other than iphones and androids - which gave them a huge following in the developing world where people may still use 10+ year old candy bars.
This is so true. Where I live, most people started using Whatsapp on old Nokia phones running SymbianOS. It was one of the few decent apps available.
It's probably a combination of high cost of texts at the time when Whatsapp became popular, no limit (or much larger limit) to the size of texts, a reasonable probability of texts not arriving or arriving late and a "fuck telcos for squeezing millions of euros from their users for no other reason than to turn massive profits from texting" attitude.
Soon was June 2017. But I doubt it has anything to do with roaming. Maybe more people paid per SMS for a longer time than in the US? I know I still do; I could add unlimited messages to my monthly contract for 1 EUR or so, but what's the point.
Interesting. My assumption was that Europe was much more okay with pay-per-use than the US was. It was always strange to someone in the US that a European would pay different amounts for a call depending on what kind of phone you were calling, where in the US both parties simply paid for their airtime if they wanted to use mobile phones.
SMS took off faster in Europe than in the US, but we've had bundled packages for so long that the individual cost per text wasn't such an issue, and now on many plans they're unlimited.
I guess the differing cost structure depending on who you're texting and from where may have spurred the adoption of WhatsApp, whereas in the US, even if you WERE paying per text, it was the same across a territory of many thousands of miles and hundreds of millions of people. And, the same way that many folks in the US do not even have a passport, they tend also not to have a reason to text internationally. The size and homogeneity of the country benefits the adoptions of some technologies, but hinders the adoption of others.
I don't know why this was downvoted, because this is absolutely why I started to use WhatsApp. Though the main problem is very high international SMS/call charges. I was in an international long-distance relationship a couple of years ago, and doing anything over cellular would have bankrupted me.
It is true that international texting is expensive in Europe, whereas inter-state texting is free in US. But while Americans usually have circles of friends spread over several states, an international circle of friends is less common in Europe.
I think the reason is that a typical cell phone plan in Europe was like 5€ per month, plus 0.07 cents per text (or call minute), whereas a typical American plan was $50 a month with unlimited free texts and calls. So people who text a lot didn't want to pay even the tiny amounts for individual text messages, and migrated to using apps.
The all-inclusive fixed price monthly plans are only now getting more popular in Europe.
It's also because sending photos and live recordings is much easier with whatsapp than with any other app out there (FB is too clunky, and the rest of the apps don't have critical mass in most of Europe).
That's even worse, because it makes it easier to correlate when two people are Whatsapping with each other. If they both happen to be online at the same time a lot...
I suspect the opposite - given that whatsapp dominates texting in europe, and twice as many people live in europe as the USA (which is upon what i suspect you base your assumption here)
Oh come on, "whatsapp dominates texting in europe"? You do not know every country ;) Based on my experience it would actually be opposite, but I am not going to extrapolate to whole continent.
Particularly for those living in Europe, or those that have a lot of international friends - all with phone numbers from different countries - it's a godsend. My phone bill would be ridiculous if I were texting my friends in Sweden or Brazil from my Dutch SIM. iMessage for similar reasons.
Also the group messages are great. My housemates and I all talk via a WhatsApp group. It makes it far easier to hold a coherent group conversation when some of us aren't at home. SMS would be a ballache.
Oh and GIFs, voice messages, and videos can be sent in messages. Free calling too. I can call my friends in Australia for nothing, and it's not a bullshit experience like Skype.
Yes, they do. iPhones don't seem to do that, but old GSM phones did it (like my first phone, Ericsson T20, which got released in 2000). Androids have read reports in the default message app, if I'm not mistaken.
No, they don't. The protocol supports delivery report only. And the meaning of that report isn't necessarily what you believe or wish it to be:
...the exact meaning of confirmations varies from reaching the network, to being queued for sending, to being sent, to receiving a confirmation of receipt from the target device...(https://en.wikipedia.org/wiki/SMS)
Nor does it test the recipient understanding of the content. But that's missing the point.
The app lets you know that your message delivered to the recipient and then lets you know that it was opened (not merely viewed as a notification). This is a useful feature that SMS lacks.
The point was that being displayed on my screen doesn't mean I read it. For instance, I have a problem where sometimes I'll log in to my phone and an app will be active. I don't want that app; I need to do a bank transaction. But now someone thinks their message has been read, when so far from reading it I don't even know it exists! It was just displayed on my screen for half a moment when my eyes and my attention were somewhere else. (Another common problem I have is when I send a message, and then they reply so fast that the reply arrives at about the same time as I'm trying to return to the homescreen to ensure I get a notification when it arrives. Well, unfortunately the message arrived first, got marked as read, and no notification exists. After twenty minutes I realise what happened, but maybe I've already offended someone by "reading" their message and ignoring it.)
Delivered vs read is only accurate if you have an eye tracker.
I find services like WeChat or Line to be superior based entirely on the fact that you can have an actual username. I'm still not sure why whatsapp forces you to use and exchange long sets of numbers to get someone's contact.
Obviously WeChat is not secure in any way, though ;)
It's free (unlike SMS or MMS) and back in the day it was the only service that worked reliably on all mobile platforms and didn't use PINs or usernames--just the phone numbers in your contact list so it was plug&play: just install it and you can talk to everybody.
People avoid thinking too much about things that are working as advertised. How many people wonder about how exactly their cars work or the global financial system works yet they are impacted by both of these. They may reserve curiosity for other things depending on their interests.
And here the problem begins, a lot of software engineers seem to conflate this disinterest to stupidity and think this gives them a right to do whatever they want with other people's data.
There is a fundamental lack of understanding of and respect for other people's rights and privacy, and an easy dehumanization that is disconnected from human society and the evolution of fundamental rights like the right to privacy. Regulation will catch up and eventually address this as more people become aware, but it is a troubling reflection of a large part of the software ecosystem.
Huh; why on Earth does WhatsApp make the default visibility of your "last seen" to "everyone"?! Also, speaking of 'tracking', I'd love to be able to track the sources of fake news forwards, but I assume such a technique would not work for anything like that.
I think I did almost the same thing three years ago. See: https://www.v2ex.com/t/121272 (in Chinese only, sorry. I should translate it to English when I'm free)
Always wondered what would happen if someone was to happen to have every valid US/CAN number in their contact list (all 3-4 billion), since WhatsApp doesn't validate you actually know the contact just that you have their phone number.
The idea being you incentivize WhatsApp users to install your app, which then harvests all their contacts and collates the "last seen" info on all of them. If they delete your app, you set up a proxy to imitate their device and continue the monitoring. Have a privacy policy that is super strong but has a couple of "loopholes" that one can drive a truck through.
Is that the idea? Seems doable if you're not too risk averse, have no family and live in a country with weak extradition laws. Kidding, there's nothing illegal about any of this stuff or FB, Google and lots of other companies would not be in business.
FB would have a civil claim against you -- they paid several billion dollars for the legal right to all that user data!
You wouldn't need an app or other WhatsApp users beyond your distributed proxy accounts. You'd be running the monitoring through these proxies.
Creating an app with the sole purpose of backdooring WhatsApp on a user's phone seems like it'd open you up to a lot of lawsuits. Ethically it's a mite more questionable, but the original article is still unethical in that you're monitoring people without consent.
Like I said above, I'd do this just so that they'd crack down on it. It's still an "ends justify the means" argument, however, so you have to be quite comfortable with moral relativism.
I don't see why people suddenly panic about it. That's not a new thing. I wrote my own tracking app over 2 years ago. I still have the code and database lying around.
I was using https://github.com/tgalal/yowsup back then.
Back then you could even see when people requested your online status, meaning you could see when they opened your chat. I used that to see if my messages had been read, because the message-read notification didn't exist yet.
Similar "online status tracking" has been used for Facebook messenger in the past. I know Facebook removed send-location by default, but I'm not sure if the API still allows pulling online status.
The only issue here is that WhatsApp lets you see the status of people who don't have you as a contact. The rest is utterly underwhelming.
One thing I loved about ICQ-esque IM services was that you could clearly see whether a contact was online or not. I still feel weird starting a conversation on WhatsApp because of the lack of clear visual cues of the contact's status.
I might be wrong here, but what if I change my settings to "not show the last seen status"? I guess in that case this doesn't work. Yes, I believe checking "Online" status frequently does give some information about my activity. Correct me if I'm wrong here.