Facebook machine learning technology improves; Redditors alarmed.

DanielBMarkham · on Dec 21, 2011

Years ago I worked with a guy who used to sell computer equipment to businesses.

He told me that when he went to different divisions in each company, they all had org/structure charts somewhere around the office. And in every one of them, their section was in the middle and everybody else was serviced by them. No matter what you did, you were the center of the universe to you.

Likewise, I've had experience building and working on many computer systems for businesses. Each of them, if successful, wants to handle not just their thing but everything for their customers. No matter what they did, they wanted to be the center of the universe for their customers.

You can laugh at these examples, and we all know most software companies have much greater ambition than they do traction, but Facebook and Google are actually doing this. They are becoming the center of the information universe to their customers.

This is not a good thing.

safetyscissors · on Dec 21, 2011

It's also not a good thing that we don't know who Facebook or Google is selling or giving this information to. The advertising agencies, meh, kinda. The organisations with acronyms are more worrisome. I also wonder how much information they have about the average user.

xavi_ · on Dec 21, 2011

BTW, Google seems to be pretty open with what they know about you. Here are two links where they display an explicit list: https://www.google.com/dashboard/ and http://www.google.com/ads/preferences/

deno · on Dec 21, 2011

This is certainly not everything they know about you. This is part of their initiative to be not-evil, it’s called Data Liberation Front[1]. In their own words:

> Loyalty, not lock-in. We firmly believe you control your data, so we have a team of engineers whose only goal is to help you take your information with you. [2]

It’s absolutely terrific that they have something like this. Many others had to be dragged there kicking and screaming.

However, this is just not relevant to the topic at hand. This is “your data”, the data that you deliberately created, rather than revealed accidentally. The point of the project is to avoid vendor lock-in, rather than to protect your privacy.

If you’re an EU resident, you should be able to just write to them and they have to reveal everything they have on you. And that list will be entirely different from this.

[1] http://www.dataliberation.org/

[2] http://googleblog.blogspot.com/2011/06/supporting-choice-ens...

rhizome · on Dec 21, 2011

FYI: "The Carlyle Group" = CIA

jemfinch · on Dec 21, 2011

> They are becoming the center of the information universe to their customers. This is not a good thing.

Why isn't it?

pilgrim689 · on Dec 21, 2011

I believe the argument is that having all your information in one service (Google, Facebook,..) is like having all your eggs in one basket.

Then, if the company decides to sell your information, lose it, etc., it's too easy, and you're left empty-handed.

However, I think that if that 1 service makes it possible for you to backup all your information and allows you to remove all the information from their service if you want to, then it shouldn't be a problem... but I might be underplaying their argument...

jemfinch · on Dec 21, 2011

> However, I think that if that 1 service makes it possible for you to backup all your information and allows you to remove all the information from their service if you want to, then it shouldn't be a problem...

Something like Google's Data Liberation Front? http://www.dataliberation.org/

I put all my eggs in one basket every time I take the family to the grocery store. It's unclear to me why storing all my information with Google is any worse than piling my wife and kids into the Odyssey.

boredguy8 · on Dec 21, 2011

When you drive, your goals and the goals of everyone else on the road are roughly aligned insofar as you all want to travel from one place to another place quickly and safely. There are pretty rough consequences for those whose goals aren't similarly aligned. So yes, you could be killed by someone else driving drunk, but it's relatively unlikely.

Google's goals and mine aren't aligned at all, except instrumentally. So Google cares about my privacy up to and until it costs them less to not care than to care.

I'm fully in the Google camp, but let's be honest about the fact that we've made a deal with some supernatural being. We just hope it turns out not to be the devil.

cafard · on Dec 22, 2011

Or "earns them more not to care".

biturd · on Dec 29, 2011

Imagine if google died. You have your email with them. Now you can't login to any other service that you have forgotten your login and password to, because you don't have access to email.

Say you use Google Apps for Domains, thinking it is safer because you own your own domain. Arguments that you don't really own that domain aside, if Google dies, you won't be able to transfer that domain away from Google.

Email is the backbone of an approval system for just about everything.

I think you can pretty safely put all your eggs in one basket, as long as you take one egg out, that being your email access. At the very least, forward all inbound emails to some other service so you have access to those in an emergency.

Ideally, we would all have our own mini email servers that we control and manage. However, managing an email server is perhaps one of the harder chunks of tech to manage. From anti-spam to just configuring all the pieces, it is much more than most are willing to do.

falling · on Dec 21, 2011

Facebook lets you download your data as well: https://www.facebook.com/help/?topic=download

Just wanted to balance the “Facebook evil / Google not evil” diatribe, as when discussing these topics only Google is ever mentioned.

patrickaljord · on Dec 22, 2011

Google allows you to download your data, which includes the emails of your contacts, your icals etc. In Facebook you get your list of contacts consisting of a list of full names in pure text, no emails. The only way to get in contact with them is to connect with them on Facebook, making the data completely useless. With Google, you can export everything and use it somewhere else and still be in touch with your contacts, calendars etc _without_ ever using Google again. Big difference.

falling · on Dec 22, 2011

Man, you really want Facebook to be evil, don't you?

Google gives you the email addresses because you provided them with that data, it's yours to begin with. Facebook will give you the email addresses, but only for the users that allowed it to share that data with you.

Will Google give you the email addresses of your Google+ contacts, if they decided not to share their address with you? I doubt it, and rightly so.

“Data Liberation” means that you own your data, not that using some social network gives you the right to be able to communicate with people outside of that service. If you want to contact them outside of Facebook, ask them their email. It's your problem, not Facebook's.

Your argument is like saying that Google should give me my Gmail contacts' phone numbers, so I can contact them without using email.

jemfinch · on Dec 21, 2011

How long has this existed? I don't recall seeing such an option when I deactivated my account almost a year ago, and it might be worth trying to reactivate my account just to get this data.

falling · on Dec 22, 2011

Most of the news articles talking about this are from early October 2010, about the same time the Google Data Liberation Front was created.

zombielifestyle · on Dec 21, 2011

i can't manage fb to share location information from friend accounts. if 2 accounts upload an (similar) image including one active check-in fb only suggests locations based on the check-ins of that active account. this is even true for public images.

until someone provides a POC that states the opposite it's just YOUR data that YOU freely give to fb. no magic behind, just FUD.

responded to top post for karmawhoring. http://news.ycombinator.com/item?id=3377267

rhizome · on Dec 21, 2011

I've been working for some time (solo) on a concept that could manage to alleviate some of your concerns. There are ways around them today (API), but who knows if they'll be foreclosed upon in the future.

Like many old farts, I imagine, I see your post in terms of the thick-client/thin-client battles of years past. I don't think FB will ever see it in their interests to completely close their garden, which would really be remaking the mistakes of the past, but stranger things have happened.

JonnieCache · on Dec 21, 2011

The important thing to take away from this:

If they can make guesses this accurate about your photos, imagine the guesses they're making behind the scenes about your life, your personality and your innermost thoughts.

If there was a page on fb where it showed all the inferences they had made about you, sexuality, income, religion, philosophical viewpoints, mental health, etc. then people would run screaming. Of course, a lot of people have already told fb that info voluntarily, and that's why it's possible to guess it for everyone else.

Also potentially terrifying: a facebook fortune telling engine. I bet they can predict your future with frightening accuracy, or they will be able to after another decade or so of data anyway.

drats · on Dec 21, 2011

Indeed, with all the Facebook sharing buttons around the web they have more than your bookmarks too, they have the equivalent of uploading your browser history each day to them. With current machine learning technology, let alone improvements over the coming years, they'd know more or less what someone would know if they stood behind you every time you were browsing for the last few years.

Even if you were to cancel your account you'd turn up in photos of parties on other people's accounts and they'd follow you pretty well after you'd left too. With their dodgy history, Zuckerberg's contempt for his own users as evidenced by his quotes, the idea that this won't be abused is laughable. Given national security letters (NSL) it's almost certain the intelligence community has access. Hell you could hit a mid-level admin at one of their data-centers with an NSL and the top level management and legal wouldn't even be aware that there are outside entities with a connection to their database. That's to say nothing of the extra-legal capacities of these people.

While people like to think that it's more or less anonymized clusters and demographics that are being passed to advertisers to target ads better - not a terribly scary thought, just more relevant ads - I think we need to be far more cautious.

nitrogen · on Dec 21, 2011

> While people like to think that it's more or less anonymized clusters and demographics that are being passed to advertisers to target ads better - not a terribly scary thought, just more relevant ads - I think we need to be far more cautious.

How do you go about convincing the "I have nothing to hide" majority of the population that giving away such vast quantities of information is not in their best interests? I'm thinking of friends, family members, etc. who honestly don't care whether the TSA sees them naked at the airport, or how much the other agencies know about them. Where does such extreme deference to authority come from in the first place?

guygurari · on Dec 21, 2011

I think this article, "Why Privacy Matters Even if You Have 'Nothing to Hide'" [1], makes a compelling case. There are essentially two reasons why privacy is important. The first is that privacy can be used to 'hide bad things'. This is what the 'nothing to hide' argument addresses: if you don't do bad things, you should not worry about privacy. The problem is that not everyone agrees on which things are bad. For example, do you agree that the OWS protesters are terrorists?

But privacy is not just about hiding bad things. A multitude of innocuous facts about many people, collected in a database, are themselves a powerful tool. The data can be mined by governments to look for 'suspicious' behavior, by insurance companies to deny insurance, by criminals to commit fraud, etc.

I think the basic problem is that violations of privacy put power in the hands of governments and corporations, and these entities tend to abuse whatever power you give them.

[1] http://chronicle.com/article/Why-Privacy-Matters-Even-if/127...

rhizome · on Dec 21, 2011

> Where does such extreme deference to authority come from in the first place?

Gilles Deleuze said it comes from a desire to be led.

r00fus · on Dec 21, 2011

> How do you go about convincing the "I have nothing to hide" majority of the population that giving away such vast quantities of information is not in their best interests?

Continuing and possibly gruesome examples where such data mining ruins lives.

I spoke with a recruiter who prided himself on their HR department's willingness to deny entry based on FB profile (unsure if it was public or obtained through some "secret friends" policy, or through a backdoor).

Furthermore, it's becoming almost a certainty that some foolish judge will set precedent somewhere that essentially makes FB account data public (or forces publicity of formerly private data).

mkjones · on Dec 21, 2011

You can ask to watch them having sex. They have nothing to hide, right?

MrRage · on Dec 21, 2011

I'm not sure how compelling my argument is... but if I have nothing to hide then why does anybody need to look? For example does the TSA seeing everybody naked actually improve security?

rhizome · on Dec 22, 2011

Everybody has something to hide. If you don't understand what I mean by this, post your credit card number in a reply.

johnyzee · on Dec 21, 2011

That is true. Mere statistical analysis on the word count on articles you prefer to read will give a very accurate insight into your interests and mindset. Facebook has access to this data through the "Like" button trojans that exist on many websites. If anything, they haven't been very impressive in putting this data to work yet (or I would be seeing a hell of a lot better targeted ads than currently).

gresrun · on Dec 21, 2011

Statistical analysis can give you a lot more than just what I like to read. The power of statistics never ceases to amaze: http://en.wikipedia.org/wiki/German_tank_problem
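
For a feel of how little data that takes, here is a minimal sketch of the textbook frequentist estimate from that article: given k serial numbers sampled without replacement from 1..N, with largest observed value m, the estimate of N is m + m/k - 1. The function name and the sample serial numbers are made up for illustration.

```python
def estimate_total(serials):
    """Estimate the population size N from observed serial numbers,
    assuming they were drawn without replacement from 1..N."""
    k = len(serials)
    if k == 0:
        raise ValueError("need at least one observed serial number")
    m = max(serials)
    # Largest observation plus the average gap between observations.
    return m + m / k - 1


# Hypothetical sample: four captured serial numbers.
observed = [19, 40, 42, 60]
print(estimate_total(observed))  # 60 + 60/4 - 1 = 74.0
```

Four observations are enough to produce a usable guess at the total, which is the unsettling part.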

johnyzee · on Dec 21, 2011

Funny you should mention it. I am in the middle of Cryptonomicon - a thoroughly enjoyable read with a heavy focus on this area of study.

jordan0day · on Dec 21, 2011

I read Cryptonomicon a month or two ago and really enjoyed it. Seemingly serendipitously, a post about encrypting email turned up on Slashdot the other day (http://yro.slashdot.org/story/11/12/20/0158227/do-slashdotte...) and I realized that with everyone basically using webmail services now, the chance for something like Ordo to take hold at-large is even less (given that fewer and fewer people actually use email clients anymore).

I'd like to fantasize that there could be a way to use a social network like Facebook with a tool like Ordo -- that is, most of the data (photos, in this case) you upload to FB is encrypted, and only people who you've approved can decrypt it. Using steganography this is probably already possible with FB, probably not practical, though.

reinhardt · on Dec 21, 2011

As someone with just a fake-info FB account that interacts with it maybe once or twice a week, I am more concerned by the fact that not having an (active|real) account is also a signal that says (or may say at some point in the near future) a lot about me.

colkassad · on Dec 21, 2011

From the link - "Did you login with the facebook app on a smartphone while you were at the hospital? If so, they would have your GPS location at a certain time, when you later upload the photos from your camera, it would have a timestamp, and they could just look check where you had logged in from at that time."

If you are a home owner and this is true, you could possibly be identified -- say for instance posting from home 75% of the time and cross-referencing with certain demographic statistics collected from your browsing history and property records. Would that be legal?

iradik · on Dec 21, 2011

Why does FB need to make guesses about this stuff? People are supplying all this information in form fields in triplicate.

Also they are a fortune telling machine. That's a good pun actually. They are hoping to make a fortune by telling you what ads you'll click.

tomjen3 · on Dec 21, 2011

I found Google's page where they tell you which categories of things they believe about you (basically age and interests).

Some of them were right, but more than half of them were wrong (I am not, for the record, interested in Paleontology).

pnathan · on Dec 21, 2011

Can you provide a link? I'm curious now!

davux · on Dec 21, 2011

http://www.google.com/ads/preferences

See "Your categories"

Edit - Sorry, I copied the link from the Google results page, it actually shows up as adspreferences in the ad, but ads/preferences below. Kind of strange.

kang · on Dec 21, 2011

gives me a 404

waffle_ss · on Dec 21, 2011

I think he meant http://www.google.com/ads/preferences

omegaworks · on Dec 21, 2011

google.com/ads/preferences

sk5t · on Dec 21, 2011

Here you go: https://www.google.com/settings/ads/onweb/

tomjen3 · on Dec 21, 2011

Can't remember.

_csoz · on Dec 21, 2011

I thought google has more interesting data about all those.

joelthelion · on Dec 21, 2011

"It seems you've been watching a lot of porn lately. Do you want to post an update about it?"

fl3tch · on Dec 21, 2011

That's funny, but I've been told some porn sites have social widgets. I don't know who would want to share that, but there they are.

freshhawk · on Dec 22, 2011

They already sell that data to credit agencies who use it to produce more accurate credit scores (and get the answers to those questions they are not legally allowed to ask).

kokey · on Dec 21, 2011

Yeah it's scary that a site, where you and your friends share the details of your lives online, uses those details to... oh wait that's the point.

berntb · on Dec 21, 2011

I hope they don't lose this info. What a boon for researchers in some distant future!

(The problem, of course, is that a "distant future" today is 10-20 years... Probably even in my lifetime.)

sandGorgon · on Dec 21, 2011

or build a self-aware version/cylon of you.

jhferris3 · on Dec 21, 2011

(Disclosure: FB employee, but nowhere near the photos/locations teams, just personal experience)

In all likelihood, there's no magic. It's just comparing 'dumb' manual album labels with places pages and trying to match them up.

I've had similar experiences with the location-suggestion feature where I was totally bewildered by how it was getting the data to recognize the locations. As it turns out, all of the albums they've done this for were pre-location tagging and so I'd manually put in a location (like 'Bowery Ballroom'). So there wasn't any particular magic in how they seem to be doing it. This would also explain one of the other comments on here about the location suggestions being England, Arkansas (if she just labeled the album England and the location suggester goofed). I also had it goof when I had an album labeled "Rhode Island and Massachusetts" and it tried to suggest a real estate agency with that in its name.

brlewis · on Dec 21, 2011

This was just answered on Quora: http://www.quora.com/How-does-Facebook-predict-my-photo-albu...

Henry F. Bridge, Product Manager, Facebook

Photo albums on Facebook have long had a text input field for location, but until recently, there was no way to put in structured data in this field (like a Facebook page). We added that ability earlier this year, and the "add a location" feature just performs a search on whatever text the album owner put in originally (ranked by place popularity etc) and suggests the first result as the location of the album.
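
A minimal sketch of what that description amounts to, assuming a simple word-containment search and a made-up popularity score. The `Place` type, `suggest_location`, and the sample data are hypothetical and are not Facebook's actual code or API.

```python
from dataclasses import dataclass


@dataclass
class Place:
    name: str
    city: str
    popularity: int  # hypothetical ranking signal, e.g. number of check-ins


def suggest_location(album_text, places):
    """Return the most popular place whose name contains every word of the
    free-text album location, or None if nothing matches."""
    words = album_text.lower().split()
    if not words:
        return None
    matches = [p for p in places if all(w in p.name.lower() for w in words)]
    if not matches:
        return None
    return max(matches, key=lambda p: p.popularity)


# Made-up place pages and scores, for illustration only.
PLACES = [
    Place("England", "Arkansas", 50),
    Place("Bowery Ballroom", "New York", 900),
    Place("Ballroom Dance Studio", "Boston", 120),
]

print(suggest_location("bowery ballroom", PLACES))
print(suggest_location("England", PLACES))
```

The second call shows how a plain text search can goof in the way described above: an album labeled "England" matches a place page for a town of that name.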

iamandrus · on Dec 21, 2011

I uploaded pictures from a class trip on Facebook back in 2009 and had no captions or names that suggested that the pictures were taken in California. My camera at the time was not capable of GPS. Facebook still managed to get the location right.

Scary stuff.

henrybridge · on Dec 22, 2011

Like I said on Quora, we use the location field that you put into the album when you uploaded it. The album location isn't displayed in the caption for individual photos and isn't displayed that prominently on the album page, so you're probably just not seeing that you set it when it was uploaded in 2009.

iamandrus · on Dec 22, 2011

I never entered anything into the location field (I would have remembered doing it), yet it got the exact city and state where I took the photos.

modeless · on Dec 22, 2011

You specifically remember leaving a form field blank on a web page three years ago?

iamandrus · on Dec 22, 2011

I would have remembered when the message to tag the location of my photos came up.

seaucre · on Dec 22, 2011

It didn't work like that. It was a plain blank text field among a couple other blank text fields. No search was brought up upon data entry. Very innocuous; perhaps that's why you don't remember it.

iamandrus · on Dec 23, 2011

Believe me, I would have remembered doing it. I wouldn't have cared about a location tagging message if I had entered the location manually. Not to mention I couldn't even remember the names of some of the cities where the photos were taken until Facebook suggested them to me.

devonrt · on Dec 21, 2011

Very sophisticated. Almost like magic :)

drats · on Dec 21, 2011

http://en.wikipedia.org/wiki/Scale-invariant_feature_transfo...

Facebook almost certainly has more photo information than TinEye or Flickr, and of indoor environments probably more than Google (which has reverse image search too). Across any given bar or hospital Facebook would have maybe 5-10 other people with albums tagged with the name/gps/check-in. They'd only need one other album though.

SIFT more or less turns every image into a bag-of-words. Your single photo, even at different angles, is going to have a heavy match with photos they have. If you upload a whole album they are going to have tons of matches and they can be more or less certain of the location. To say nothing of adding even the most basic geoip-to-city lookups that would narrow you down to at least five cities that you and your social network inhabit. But the extra information they have is beside the point, SIFT is enough; hospital rooms look alike to us, to SIFT they don't.
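
A toy sketch of the bag-of-words idea, to make it concrete. Everything here is an assumption for illustration: real SIFT descriptors are 128-dimensional and the vocabulary is normally learned by clustering, whereas this uses hand-picked 2-D points and a three-word vocabulary. It shows the comparison step only, not feature extraction.

```python
import math
from collections import Counter


def nearest_word(descriptor, vocabulary):
    """Index of the vocabulary entry closest to the descriptor (Euclidean)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))


def bag_of_words(descriptors, vocabulary):
    """Histogram of visual-word indices for one image."""
    return Counter(nearest_word(d, vocabulary) for d in descriptors)


def cosine_similarity(a, b):
    """Cosine similarity between two word histograms."""
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical vocabulary and descriptors (2-D stand-ins for SIFT vectors).
vocabulary = [(0, 0), (10, 0), (0, 10)]
photo_a = [(1, 0), (9, 1), (0, 9), (1, 1)]
photo_b = [(0, 1), (10, 1), (1, 10), (0, 0)]   # same scene, shifted a little
photo_c = [(9, 0), (10, 1), (11, 0)]           # a different scene

hist_a = bag_of_words(photo_a, vocabulary)
hist_b = bag_of_words(photo_b, vocabulary)
hist_c = bag_of_words(photo_c, vocabulary)

print(cosine_similarity(hist_a, hist_b))  # close to 1
print(cosine_similarity(hist_a, hist_c))  # noticeably lower
```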

iradik · on Dec 21, 2011

It's not that sophisticated. It's a picture of a baby being taken at a hospital.

There are probably hundreds of photos of babies being taken with GPS from that hospital.

FB image recognition probably thinks all these babies are the same "person".

Then a separate system comes along and takes photos that do not have location info and tries to match them up with ones that do. It finds a match with the "hospital baby" and then asks for verification. Person says YES and it adds that baby to the "hospital baby" pool as well.

tibbon · on Dec 21, 2011

One other even simpler possibility, Facebook is looking at where you checked in or made status updates from and comparing the timestamps?

"I just had a baby" - updated from hospital (Upload photo with timestamp near that status time) Match the two

jsaxton86 · on Dec 21, 2011

If I were to take a picture of a baby at a hospital, wouldn't the majority of the features be of the baby, not the hospital? I suppose if there's at least one picture in the album that's of the actual hospital, that's probably all you would need to infer the rest of the set was taken at the same hospital.

T-hawk · on Dec 21, 2011

There's plenty of ways to infer location information from a single picture focusing on a baby. The hospital could be identified from something as small as one piece of paper in the background with letterhead or another identifier. Or the face of a nurse in the background that was previously known to be at this hospital. Quite possibly the room layout or particular pieces of equipment in certain arrangements. Landmarks outside a window.

Pictures leak a ton of side channel information outside of their subject matter.

apu · on Dec 22, 2011

SIFT is not enough for this task. There are many computer vision researchers working on this application (automatically inferring location of an image), and accuracies are pretty low right now, except for popular landmarks.

The problem is that single SIFT features are not very distinctive, and so you need many of them in common between two images to get reliable matches. So you essentially need images taken from very similar locations. You can ameliorate this requirement slightly by using fancy tricks in how you aggregate SIFT features together, but the fundamental constraint remains.

This precondition is satisfied at popular landmarks, since there is a good chance that someone's taken a photo from the same location you're standing at, but in general, this condition is not that easy to meet, and hence the poor accuracy of current approaches.

Finally, SIFT is great only for roughly-2d, distinctively textured areas. However, a lot (maybe most?) of the world does NOT fall into this category -- many things are either not textured, not distinctive, or are 3d. Something other than local descriptors (of which SIFT is the best known example) will be needed to understand these kind of scenes.

If you're curious about this line of research, a good place to start is the IM2GPS paper, which was the first major work (that I'm aware of) to look at this problem: http://graphics.cs.cmu.edu/projects/im2gps/

yogrish · on Dec 21, 2011

But why that whole technology for just making a Suggestion? There is something else they are working on...something really BIG and a complete game changer I guess.

drats · on Dec 21, 2011

Well if the accuracy is 95% why not turn all Facebook users into one huge mechanical turk batch process to get to 99.9%? And it's just convenient for people not to have to tag.

Then you have the data sitting around for mapping the insides of all buildings like Google has it for street view. Sure your Asimo-style robot butler/"something really BIG" will be that much more efficient with internal mappings of most public spaces when you release it in 2030, but the convenience is sufficient. Remember that Facebook beat Myspace more or less on interface (combined with a few other factors like exclusivity), they aren't going to let Google or some other competitor get ahead of them by having an interface convenience edge of 5-10% which might cause a Myspace-to-Facebook style exodus. Kings who have committed regicide on their predecessor are all too aware of how they got into power.

drats · on Dec 21, 2011

Also consider this, use SIFT on video stills and collect all the handheld video of a particular concert. Use machine learning to combine all the audio tracks and clean them up into high quality audio. Stitch the video together, using textures from the higher resolution stills people take, to allow people to relive the concert with a massive panoramic video (or 3D) with high quality sound. Use it to launch a music competitor to Google or destroy Ticketmaster (well overdue..) as bands and venues won't have to hire video production companies to record concerts if they sell tickets through FB. All that technology is in current research papers and prototypes at the moment, it would probably only take two years to put it together at worst if they aren't working on it already.

dbarlett · on Dec 21, 2011

Previously: http://news.ycombinator.com/item?id=3293324

rmc · on Dec 21, 2011

Or it could just be a way to get users to add more metadata to their data, which allows them to increase engagement.

ArbitraryLimits · on Dec 21, 2011

> hospital rooms look alike to us, to SIFT they don't

Well, to SIFT they'll all look different. Except for the hospital room in Portland, OR and the hospital room in Portland, ME that each happen to have a sign with "Portland Hospital" visible in the picture. While I'm alarmed about the potential of computer vision to compromise my privacy, I have yet to see anything in actual use even be competent, let alone alarming.

obtu · on Dec 21, 2011

They might be combining some low-accuracy image similarity to match on images from his friends list, and infer the location from those other images.

pirate_is_back · on Dec 21, 2011

Just checked this.

I took some pics on a regular digital camera (no GPS) in Indore. I uploaded them from New Delhi a few weeks later. And now FB is asking me "Were these pics taken in Indore?". Crazy shit.

Update - I dug through my FB updates. Just before leaving for the airport, I updated my status to "Off to Indore" and after coming back to New Delhi, I had some status updates about my office and a local park. Facebook is probably using the timestamps from the images and relating them to locations using some heuristics like status updates, IP addresses, image recognition etc.

kokey · on Dec 21, 2011

I think you are actually getting to the bottom of it. A combination of timestamps, and status updates, and probably that of friends tagged in the same album. Could be quite effective, as we can see.

forensic · on Dec 21, 2011

yeah this is by far most likely. The guy's wife or whatever had her phone turned on while in labour, facebook knows her GPS coords, facebook knows she's his wife, and so on

doesnt take much to throw out a guess based on a single GPS location + timestamp. Even if the guess is only right 5% of the time it is still a profitable guess to make.

freedompeace · on Dec 21, 2011

How could Facebook do this (technologically)?

According to a redditor,

"

- It's not IP address. Facebook successfully identified a number of specific locations (bars, theaters, etc) even though I had uploaded the photos from my home

- It's not geo-tagging. All of my photos were taken with a camera that does not geo-tag (Nikon d700).

- It's not contextual tagging. There were no people tagged in the photos, no comments in a lot of them, no words or phrases or names in the captions that could have given clues

- It's not image recognition. One set of photos was taken at Cafe du Nord in SF, CA and every single shot was of the performer onstage, with no identifying characteristics or clues to be had.

"

I would really like to know as this is very interesting and none of the reddit comments (as of now, 12 hours after submission) really answer this question. What technology or methods are they using to suggest (accurate?) locations where pictures have been taken from?

Even more strangely, I have never used my mobile phone with Facebook, but when I uploaded a photo just now of a place from my childhood to which I haven't ever been since using Facebook, Facebook correctly suggested the location.

What the heck?!

loopdoend · on Dec 21, 2011

Purely speculation but perhaps he is carrying his phone with him and it is correlating his location with the timestamps on the images, or perhaps he marked himself as attending an event which is at the venue.

From an image recognition standpoint, if anyone else was at the event and took similar sets of photos which they then tagged, that could also be used.

iradik · on Dec 21, 2011

This is how it can be done without any EXIF/GPS data with a fairly dumb algorithm.

It's a picture of a baby at a hospital.

There are probably hundreds of baby photos being taken per day at that hospital being posted to FB. Some have GPS and some don't.

The images that do have GPS, get recognized and lumped together as the same "entity" since they all look the same.

FB then looks up where she lives, then looks for any photos that don't have a location (like her baby photo), then tries to match it with any photos within a 50 mi proximity to her hometown. Bingo finds 90% match on the baby pic in the hospital, and asks user for verification. She says YES, then this pic gets lumped together with all the other entities. FB also asks more often since it's "right".

hyperbovine · on Dec 21, 2011

A slightly more sophisticated approach would be to use all the /other/ data they have on her to predict which hospital she is likely to choose. Why stop with just her hometown?

iradik · on Dec 21, 2011

Sure let's speculate they have a function that returns a guessed gps coordinate at time t. Again there are dumb implementations for this. Can take a time decaying and time-of-day/day-of-week weighted mode of your recognized locations. I imagine it'd be fairly accurate.
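One way such a "dumb implementation" could look, as a sketch under stated assumptions: each past sighting votes for its place, and the vote shrinks with age (exponential decay), with distance in hour-of-day, and when the day of the week differs. The function name `guess_location`, the 30-day half-life, and the specific weights are made up for illustration.

```python
from collections import defaultdict
from datetime import datetime

def guess_location(sightings, t, half_life_days=30.0):
    """Weighted mode of past locations, evaluated for time t.

    sightings: list of (datetime, place) pairs where the user was recognized.
    Returns the place with the highest total weight, or None if there is no data.
    """
    scores = defaultdict(float)
    for seen_at, place in sightings:
        # Time decay: a sighting loses half its weight every half_life_days.
        age_days = abs((t - seen_at).total_seconds()) / 86400.0
        decay = 0.5 ** (age_days / half_life_days)
        # Time-of-day weight: circular gap in hours (0..12), closer is better.
        hour_gap = abs(t.hour - seen_at.hour)
        hour_gap = min(hour_gap, 24 - hour_gap)
        hour_weight = 1.0 / (1.0 + hour_gap)
        # Day-of-week weight: same weekday counts fully, others count half.
        day_weight = 1.0 if t.weekday() == seen_at.weekday() else 0.5
        scores[place] += decay * hour_weight * day_weight
    return max(scores, key=scores.get) if scores else None
```

With a log of weekday-morning sightings at an office and weekend-evening sightings at home, a Monday 10:30 query would favor the office and a Sunday 22:00 query would favor home.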

missing_cipher · on Dec 21, 2011

Could be as simple as figuring out that the picture is a baby, get the user's location and then find near-by hospitals. Then ask if the guess is right.

ddw · on Dec 21, 2011

But you can take a baby anywhere. Unless FB deduced "this baby was born today" and figured it must be at a hospital.

If so, we're in trouble...

phpnode · on Dec 21, 2011

Here's my guess at how it's done:

Redditor has facebook app installed on their smartphone (or just uses the website), sets status to "OMG wife is going into labour, at the hospital now". Facebook now knows roughly where redditor was at the specified time based on the ip, they can narrow this down further by looking at keywords in the status message and check it against a list of addresses in the local area and select the best match.

When the redditor comes to upload their photos days or weeks later, facebook just checks the photo timestamp against the user's location +/- X hours and makes a guess at where the photos were taken.
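The timestamp check in this guess can be sketched in a few lines. This assumes a per-user location log already exists (for example, built from IP geolocation at each login, which is not shown here) and that the photo's EXIF capture time is available. `location_at` and the 6-hour window are hypothetical choices standing in for the "+/- X hours" above.

```python
from bisect import bisect_left
from datetime import datetime, timedelta

def location_at(photo_time, location_log, window=timedelta(hours=6)):
    """Return the logged place nearest in time to photo_time, or None.

    location_log: list of (datetime, place) pairs sorted by time.
    A match counts only if it falls within +/- window of the photo timestamp.
    """
    if not location_log:
        return None
    times = [ts for ts, _ in location_log]
    i = bisect_left(times, photo_time)
    # The nearest entry is either just before or just after the insertion point.
    candidates = location_log[max(i - 1, 0):i + 1]
    ts, place = min(candidates, key=lambda entry: abs(entry[0] - photo_time))
    return place if abs(ts - photo_time) <= window else None
```

Note that the upload time never enters the calculation; only the capture time and the historical log matter.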

swalsh · on Dec 21, 2011

It's probably a combination of techniques to be honest. They do have some unprecedented access to contextually heavy data, as well as an unusually large base of free labor to supervise the learning...

My creepy moment came when it correctly pointed out the exact location of this photo:

http://www.flickr.com/photos/40127665@N03/4788700749/in/set-...

I was in the middle of Costa Rica. For fear of roaming charges my phone was not on, and I never made a status update. The photo was taken on a D40 at the time, so no location EXIF data, and frankly the picture is kind of generic.

My best guess is it used the other photos in the album to gain contextual information. For example this photo was in the same album:

http://www.flickr.com/photos/40127665@N03/4789329722/in/set-...

this, to me, would be extremely easy to recognize.

DJN · on Dec 21, 2011

Both photos are in a public folder on Flickr called "Costa Rica".

swalsh · on Dec 21, 2011

The Facebook album was named "D40 Pics", but it would be curious if it is cross-referencing photos with Flickr.

__alexs · on Dec 21, 2011

Simply not making a status update isn't enough. You'd have to never visit a page with a Facebook like button from a computer which you'd ever used to log in to Facebook.

I'm yet to be convinced that this feature requires actually analysing image content at all. GeoIP and time stamps alone can provide a huge amount of context before you even start on things like mobile clients providing location data.

waterlesscloud · on Dec 21, 2011

Or that species of insect only lives in Costa Rica.

Kidding! If it was that good, I really would be shocked.

Peroni · on Dec 21, 2011

The logic doesn't hold up. I uploaded a bunch of photos taken in Christchurch, New Zealand. The pics were over two years old and I now live in the UK and uploaded them from here. The photos were also taken on a cheap, point & shoot camera, not a smartphone.

terhechte · on Dec 21, 2011

It doesn't need to be taken on a smartphone. It's sufficient to open facebook on a smartphone only once (app or web) in roughly the same timeframe that the picture was taken. It can then look where one has been in the timeframe when the picture was taken by comparing the picture timestamp with the list of checkin ip addresses, and do a geolookup of the ip address against a location database.

Peroni · on Dec 21, 2011

Again, that doesn't align with my experience. The first smartphone I owned was when I arrived in the UK. In NZ, I was restricted to an old Nokia work phone.

terhechte · on Dec 21, 2011

Maybe you accessed facebook in a browser while in NZ? On a regular computer?

JonnieCache · on Dec 21, 2011

Guess: your friends also took pictures of the same events from similar angles and uploaded them, fb worked out their location from one of the other suggested methods, and then matched it to your photos through simple computer vision algorithms.

Or, you used facebook at some point while in that city, facebook recognised that your photo was of a baby just born, by comparing it with the millions of other photos tagged with "OMG MY BABY IS BEING BORN!," it knows that babies are born in hospitals, and it knows that there is only one hospital in that city, so it guesses that the photo was taken in that hospital.

phpnode · on Dec 21, 2011

Also, does the location info purely say "Christchurch, New Zealand" or is it more granular? I'm guessing that when facebook only has sparse information about your location, as in your case, it "zooms out" to the city, state, country etc, but the more information it can collate, the more accurate the suggestion it gives is.

phpnode · on Dec 21, 2011

but did you use facebook at all when you were in new zealand?

Peroni · on Dec 21, 2011

Indeed but when I posted the pics it was 2 years later from a totally different continent and no reference to the location on the photos.

phpnode · on Dec 21, 2011

right, but assuming the timestamp is on the photos, facebook still knows where you were at the time they were taken. The time you upload them is immaterial

pud · on Dec 21, 2011

In your opinion, how would Facebook know your location if you never logged in from a GPS-enabled smart phone? (and never approved an HTML5 geo permission request on your laptop).

That's the interesting question here.

msbarnett · on Dec 21, 2011

> In your opinion, how would Facebook know your location if you never logged in from a GPS-enabled smart phone? (and never approved an HTML5 geo permission request on your laptop).

No real trick to that; if the user signs in from an IP leased to a Christchurch, NZ ISP on March 5th, 2007, then it's a safe bet that any photos taken that day were taken near Christchurch (and you get date taken from the EXIF data of virtually every camera on the planet).

phpnode · on Dec 21, 2011

well they can do a best guess based on the location of your IP, even if you only sign into facebook once while you're travelling. The really interesting question is: can they do this even if you have not signed into facebook at all on your travels, or for totally new accounts with no location history at all.

If they can, then they must be doing image recognition in some way. Perhaps someone who has more time / energy than me could try this by creating a new account and uploading some photos of recognisable objects (but without exif geo data).

pbhjpbhj · on Dec 21, 2011

Facebook knows more about me than I do. I'd be interested in their history file of my location.

jfoster · on Dec 21, 2011

When you were in NZ: [hh:mm dd/mm/yyyy, nz.ip.address] => [hh:mm dd/mm/yyyy, location]

When you uploaded the photos: [hh:mm dd/mm/yyyy, photo] + above data => [photo, location]

ErrantX · on Dec 21, 2011

I reckon they use image recognition too. I have a gash digital camera that gets a lot of use; no exif data beyond camera make/model.

Anyway; in 2005 I went on a skiing trip to Chamonix. Pics were uploaded about a month later, no Facebook activity at the time (or while I was there). The other day FB asked me if they had been taken in Chamonix. So... something else used there at the very least.

nixy · on Dec 21, 2011

How does Facebook use image recognition to recognize that an image of a baby on a blanket with no surroundings visible was taken inside a certain hospital? I doubt it.

Regarding your Chamonix images, perhaps your friends uploaded photos taken at roughly the same time, with geo-tags? Perhaps you are tagged in some of them? That would make for pretty simple logic:

- Facebook knows that you were at a certain place a certain time through the geo-tagging of those photos, as your user is tagged in one of them

- You upload photos taken at the same time as the photos you were tagged in

Conclusion: You must have taken those photos where the photos you were tagged in were taken.

ErrantX · on Dec 21, 2011

> Regarding your Chamonix images, perhaps your friends uploaded photos taken at roughly the same time, with geo-tags?

Nope. Family holiday. Of those there I am the only one with photos on Facebook - and in fact at the time was the only family member with a Facebook account :)

I am intrigued because if it is image recognition there isn't much for them to have gone on. But I am stumped for what else they could have used.

nixy · on Dec 21, 2011

Wow. Maybe they have a bunch of people looking at pictures, manually making suggestions? Google "similar images" search-like tool? It does sound scary if they really had nothing to go on other than the image data.

benmmurphy · on Dec 21, 2011

maybe Facebook just randomly guesses locations. those who get matches write comments on reddit wondering how the hell it works and those that get weird suggestions ignore them :)

amouat · on Dec 21, 2011

My photo locations seemed to be mainly based on the album title - I was impressed that it got both locations from "St Catharines and Niagara" correct.

I haven't checked to see if the individual photos are correct. However, at the time I didn't have a smart phone and I'm pretty sure my camera wouldn't provide much useful info.

UPDATE: I take it back: fb thinks I was at a football camp called St Catharines and Niagara! http://www.facebook.com/pages/St-Catharines-and-Niagara/1821...

I think that shows that the title is the most important indicator to fb.

darklajid · on Dec 21, 2011

Immature, but I really loved this comment/idea and wonder about the feasibility of such attacks:

"You know what this means:

Time to rewrite your EXIF info and location bomb the hell out of popular attractions. Eiffel Tower in Paris? Nope, it's in Iowa ..." [1]

Possible? Google bomb with a twist?

1: http://www.reddit.com/r/WTF/comments/nkktm/facebook_is_reall...

yaix · on Dec 21, 2011

Yep, it would be good to show that this kind of data is not reliable and easily manipulated. But probably it will not happen.

rufibarbatus · on Dec 21, 2011

There's anecdotal evidence, a little bit further down, of this actually happening: someone mentions Facebook thinking that their pictures of a trip to London were taken in England, Arkansas!

http://www.reddit.com/r/WTF/comments/nkktm/facebook_is_reall...

ChaitanyaSai · on Dec 21, 2011

Highly unlikely that there is anything sophisticated here. It would be too many computing cycles thrown at something with very small RoI. Most likely some straightforward text, date, and location matching. Many of the previous commenters are assuming you need sophisticated pattern sifting to get any good insights. Not true. Large numbers of facebook users are including this data voluntarily or with their unexpurgated photos. Going after the sliver who don't is just not a worthy investment -- yet.

nbm · on Dec 21, 2011

Karan from the locations team at Facebook posted a response on Reddit. He said:

""" I am an engineer at Facebook working on the Locations team. We use the text entered in the location field of the album by the album owner and match it to the best guess for an existing Facebook Page using text matching. The popularity of the place is also used to rank the suggested places. """

http://www.reddit.com/r/WTF/comments/nkktm/facebook_is_reall...

(I work at Facebook, not on locations/photos.)
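To make the quoted description concrete, here is a rough sketch of text matching ranked by popularity. It is not Facebook's implementation: the fuzzy matcher is Python's standard `difflib.SequenceMatcher`, and the function `suggest_page`, the 0.6 cutoff, and the popularity numbers in the usage note are invented for illustration.

```python
from difflib import SequenceMatcher

def suggest_page(location_text, pages, min_similarity=0.6):
    """Pick the Page whose name best matches the album's location text.

    pages: list of (page_name, popularity) pairs; popularity is any number
    where larger means more popular. Returns a page name or None.
    """
    query = location_text.strip().lower()
    scored = []
    for name, popularity in pages:
        # Compare against the part of the Page name before the first comma,
        # so "Cambridge" can match "Cambridge, Massachusetts".
        head = name.split(",")[0].strip().lower()
        similarity = SequenceMatcher(None, query, head).ratio()
        if similarity >= min_similarity:
            scored.append((similarity, popularity, name))
    if not scored:
        return None
    # Best text match wins; popularity breaks ties between equal matches.
    return max(scored)[2]
```

With made-up pages `[("Cambridge, Massachusetts", 900), ("Cambridge, United Kingdom", 500)]`, the text "Cambridge" matches both equally and the more popular Page wins, which is consistent with the same-name, wrong-country suggestions reported elsewhere in this thread.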

zombielifestyle · on Dec 21, 2011

I've tried it. I don't use fb for pics. I don't use wifi on my smartphone and I only briefly had the fb app installed. I also wasn't on any trips outside my city.

I've uploaded 4 screenshotted landscape pics (including one of a building) from Flickr with absolutely no metadata. At least 2 of them should be very recognizable. The only thing that fb suggested was the single check-in that I had, at least 500km off.

i'm pretty sure that there is only minimal (if any) intelligence on image recognition and that they guess locations by information that "leak" from you and your hardware.

zombielifestyle · on Dec 21, 2011

fb doesn't even match the check-in of my fake wife at the Hollywood Walk of Fame to a pic of it uploaded by me.

uptown · on Dec 21, 2011

Since Facebook knows your social connections, isn't it also possible they check the location of the people to whom you're connected, and use their location as a guide as to where your photos may have been taken? If a cousin checked Facebook on a phone near a hospital while you posted a hospital photo, they know there's a chance the cousin was visiting you and that's where your photo was taken. They use your social grid for everything ... why wouldn't they use it for this too?

myared · on Dec 21, 2011

I think this is a highly likely scenario. The reddit user isn't taking into account that much of what Facebook knows about us comes from our connections.

grifaton · on Dec 21, 2011

It's not perfect yet -- Facebook knows that I went to university in Cambridge (UK) but keeps asking me if photos from my undergraduate years were taken in Cambridge (Massachusetts).

sondh · on Dec 21, 2011

I don't think this is machine learning.

Facebook suggested locations for a few of my albums today (2 hours ago). Some of them are from 2+ years ago, others are recent ones. The suggestions are quite accurate (89%, 8 out of 9 albums). For the incorrect album, Facebook suggested another city with the same name but in another country!

So Facebook probably just combines as much data as possible, and when the matching rate is larger than some threshold, it temporarily tags the album and confirms with the user. Data may include:

1. Album info (yeah, most of my albums include the place in their name or description)

2. Comments group. 3 of my albums have a large number of comments, and they are all from my high school friends (I grouped them all in the same group), so it's quite accurate I guess. Also, if Facebook uses the information in their smart lists, that makes sense too.

3. EXIF data. Some people say Facebook can read the data and cross-check the date with its database of check-ins. This doesn't happen to me but I think it's possible. And of course, if the EXIF is geo-tagged, that info can be picked up.
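The "combine everything and apply a threshold" idea in the list above can be sketched as follows. This is one possible combination rule (a noisy-OR, which treats the signals as independent), chosen for brevity; the function name `combine_signals`, the signal labels, and the 0.7 threshold are all assumptions, not a known Facebook design.

```python
def combine_signals(signals, threshold=0.7):
    """Merge per-source location guesses and keep the best one above a threshold.

    signals: list of (source, place, confidence) with confidence in [0, 1].
    Returns (place, combined_score) or None if nothing is confident enough.
    """
    miss = {}  # place -> probability that every signal for it is wrong
    for _source, place, confidence in signals:
        miss[place] = miss.get(place, 1.0) * (1.0 - confidence)
    if not miss:
        return None
    place = min(miss, key=miss.get)
    score = 1.0 - miss[place]
    # Only suggest a tag (to be confirmed by the user) above the threshold.
    return (place, score) if score >= threshold else None
```

For example, three moderately confident signals (0.5, 0.4, 0.3) that agree on one place combine to about 0.79 and would trigger a suggestion, while a lone 0.5 signal would not.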

Thoughts?

emw · on Dec 21, 2011

I second the idea that this doesn't sound like machine learning. I wouldn't be surprised if there were no machine learning or data mining algorithms involved in Facebook mapping EXIF-embedded geographic coordinates in digital photographs and suggesting locations for an album based on points-of-interest near those coordinates.

Maybe some neat work with GIS enabled this, but I don't see anything that strongly indicates machine learning.

nandemo · on Dec 22, 2011

The reddit poster claimed they didn't have GPS on their camera. What happened is probably (3) above:

http://www.reddit.com/r/WTF/comments/nkktm/facebook_is_reall...

dhx · on Dec 21, 2011

LinkedIn appears to build "social graphs" based on profile views. If person A visits the profile of person B and C it is quite likely that some form of relationship exists (perhaps an inverse relationship too). It doesn't just have to be profile views -- LinkedIn most likely keeps track of search terms too. If a random unknown visitor comes along and searches for both "Person A" and "Person C" then it is likely that a connection exists between these two persons.

It would be easy to train this system by displaying a "guess" (a friendship recommendation for person A) to persons B and C. If person B or C show interest in the recommendation then perhaps a relationship exists. Guesses could also be formed by comparing keywords/metadata found on profiles in two different social circles.
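The co-search idea is easy to sketch: treat each visitor's set of looked-up profiles as a session and count how often two names appear together. Whether LinkedIn does anything like this is pure speculation, as above; `co_search_edges` and the `min_count` cutoff are hypothetical.

```python
from collections import Counter
from itertools import combinations

def co_search_edges(sessions, min_count=2):
    """Infer candidate relationships from profiles looked up together.

    sessions: iterable of lists of profile names, one list per visitor.
    Returns {(name_a, name_b): count} for pairs seen in at least min_count
    sessions, with each pair stored in sorted order.
    """
    counts = Counter()
    for looked_up in sessions:
        # Deduplicate within a session and sort so (A, C) and (C, A) coincide.
        for pair in combinations(sorted(set(looked_up)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_count}
```

The surviving pairs would be the "guesses" shown to users, and their responses to the recommendation would supply the training signal described above.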

The reason I mentioned LinkedIn is that I think they do a better job of recommending/guessing who your acquaintances are than Facebook.

reinhardt · on Dec 21, 2011

I was about to post about LinkedIn too. I am convinced it uses, among others, reverse view lookups. I have had suggestions for people I know whose LinkedIn profile I had never viewed or searched for myself; the only explanation is they had looked for me. I don't know if it's because I interact with it more frequently and using real personal info but its "people you may know" predictions are eerily more accurate than FB, or anything else for that matter.

finnw · on Dec 21, 2011

I've found the opposite. LinkedIn has been right only once for me. The other 23 people it suggested I might know were people I have never heard of.

Facebook usually suggests people I have met but do not want to be "friends" with.

archangel_one · on Dec 21, 2011

I've seen both extremes on my photos. One set it suggested the nearby town, which was creepily accurate; another it asked "were these taken in England, Arkansas" which was a hilariously poor guess. Weirdly, they managed the difficult part (figuring out the photos were taken in "England" without any obvious clues) but then failed to sensibly geolocate it.

simondlr · on Dec 21, 2011

Facebook has twice accurately suggested places I've visited, but I pretty much specified it in the album (Singapore and Australia).

There is an album now with random photos, with the location "Plekke" (which is Afrikaans for "places"). They are suggesting "Plekke, San Juan, Puerto Rico". So, currently I wouldn't suggest machine learning, although it is not entirely dismissible. They have the largest photo db in the world, so they could eventually learn and then seed it with stuff like checkins, status updates, etc.

EDIT: I have another album called "Kuala Lumpur and Thailand", which have pretty iconic pictures in it (Wat Pho, KL Towers, Batu Caves, etc) and they haven't asked me about that album at all.

nodata · on Dec 21, 2011

I vote cell tower info.

I have a Nokia phone with no GPS. Many programs use the cell tower info added by the phone to determine where the photo was taken.

jfoster · on Dec 21, 2011

A test that might fool this. You will need: 1. A facebook status update/checkin/access from a particular location. 2. A photo taken without EXIF data at around the same time as the update/checkin/access above, but from a completely different location.

Upload the photo using your facebook account. Check whether they get the location right. If yes, the mystery continues. If no, but the location is somewhere other than where you updated your status/checked in from, the mystery deepens. Else, they're pairing the times together.

finnw · on Dec 21, 2011

I think I'll set the date on my camera to 1911. A human seeing the timestamp on the picture will just assume it's a Y2K bug and will guess the correct date. Since that's extremely unlikely IRL, the Facebook algorithm probably won't know to make this correction.

barrkel · on Dec 21, 2011

Maybe it infers from the locations of your friends. It looks like more than one technique is used, though; probability-weighted heuristics.

ErrantX · on Dec 21, 2011

I first got these notices about... oh, best part of 2 months ago. I assumed it was everywhere but perhaps not!

chinmoy · on Dec 21, 2011

I googled Zooey Deschanel several hours ago. I was logged into Facebook at that time. I log into Facebook again and Zooey Deschanel is now on my 'People to Subscribe To' list. I came to Hacker News to post an Ask HN thread about whether fb tracks my browsing history... and bummer, I find this thread.

sliverstorm · on Dec 21, 2011

I dunno about everybody else, but I'm pretty impressed. It guessed some obvious ones- but I also had an album of photos of a car at a dealership. All you could see was the car and the service bay, but it knew what dealership it was.

Turing_Machine · on Dec 21, 2011

It keeps asking me if some screenshots from a virtual world that I'd location-tagged with "Cyberspace" were taken in some net cafe with that name in British Columbia. I'm really tempted to answer yes.

ananthrk · on Dec 21, 2011

Yup. I had a similar experience couple of days back when FB listed three of my albums and correctly predicted the place names for confirmation. I was shocked when it _correctly_ predicted the location of one of my private albums which is an obscure town in southern India.

Given that most of my pictures were taken with cameras with no geo identification mechanism, I can only conclude that they identify the location based on any comments and similar photos available in _my_ network.

yread · on Dec 21, 2011

I just tried to upload some photos that facebook hasn't seen yet with or without EXIF info (obviously no or invalid geotagging) and tried geo-tagging some photos I already have uploaded and Facebook hasn't made any suggestion whatsoever. Perhaps I found a way to switch it off but I've just checked the settings and none of the options seems to be concerned with suggestions for geotagging

tlrobinson · on Dec 21, 2011

The outrage expressed by some of the commenters on Reddit and here is curious. Clearly Facebook and other companies have the technology to make these connections. Would you rather they keep it hidden to sell to advertisers and other potentially unscrupulous buyers, or expose it to you so that you can either make use of the features, or decide it's not worth playing their game?

orionsbelt · on Dec 21, 2011

This had been freaking me out for months!

But I think I finally figured it out in my case. I had used Picasa to upload the pics years ago. In Picasa, I had entered a caption with enough detail for Facebook to guess the location. The caption had never been uploaded (i.e. set as the photo's caption on Facebook), but I'm guessing they did somehow capture that info on upload.

brown9-2 · on Dec 21, 2011

Ignoring the how of this new feature, what is the upside to me as a user of confirming to Facebook the hospital in which my child was born?

Why do they think people would generally be interested in clicking "Yes" on these suggestions? That they are helpfully filling in the blanks on photos I "forgot" to location-tag?

devonrt · on Dec 21, 2011

The upside to you is that your photographs are tagged and categorized in more and more useful ways. Without this suggestion what is the likelihood that you would go back to old albums and tag them in such a way? I know for me, it's almost zero. When it's presented in such a frictionless manner (and is eerily accurate) it becomes a hell of a lot more likely. I think that the typical HN user is either too cynical or too tech-savvy to really find this useful. Most users have huge collections of photos that aren't necessarily categorized in useful ways.

Facebook wants its users to catalog their lives. For them, this data is marketing gold. For end users, it means their content (photos, status updates, life events, etc) are categorized more thoroughly and in a more meaningful fashion (geographically, chronologically, etc).

henrybridge · on Dec 21, 2011

Disclosure: I work on location products at Facebook.

I think that real world places are an important part of who you are. Where you're from, where you've lived, and where you've travelled to, for instance, all help tell a story about you as a person.

The way that's represented on Facebook today is the Map on your Timeline. These suggestions help you put more photos on your map so you can share experiences with friends. That trip you took to Disney World 5 years ago may be far down in timeline, but it's going to be easy for friends to find it on your Map.

Hope that helps explain the "why" a bit.

brown9-2 · on Dec 21, 2011

Thanks for the answer.

ddw · on Dec 21, 2011

FB might say that it enhances the timeline, which is essentially their way of organizing your life for you.

shalmanese · on Dec 21, 2011

I asked this question on Quora a few days ago: http://www.quora.com/Facebook-1/Why-is-the-new-Facebook-Add-... but did not get any responses.

steve8918 · on Dec 21, 2011

I noticed this yesterday as well. I have photos uploaded from a Canon camera from 2007. There is simply not enough information to determine the location, but they correctly identified the bar that the photos were taken at. I have no idea how they did it, but it really does bother me.

sajidnizami · on Dec 22, 2011

There was an old vision that machines would do the stuff for you somewhere back in the day.

Sad that we've got to a point where we can make this possible but won't because we can't trust pie in the sky.

Torn between the two options and can't decide!

silentscope · on Dec 21, 2011

I actually think fb needs to own up to how they do it. If it's geolocation with a correlated timestamp I'd like to know.

If it's just album name suggestions, I still don't like it, but I suppose I'll live with it.

This is creepy.

idunno246 · on Dec 21, 2011

Anyone else find it amusing he is complaining about Facebook knowing the location of a photo by posting it to a more public place and confirming the location?

randysavage25 · on Dec 21, 2011

Encouragement for the creation of a facebook replacement

yogrish · on Dec 21, 2011

Wow! A serious competitor to SIRI in place. My guess is they are using a combination of techniques: semantics of your status, as phpnode mentioned with his example (I heard they have plans to get into semantic search to beat Google), EXIF info, IP address, and also your friends' replies (When is the due date? Which hospital or gynecologist?). If the hospital name is not mentioned, then the gynecologist's details and her hospital location. Do you think all this is used just to make a SUGGESTION?? It's a Billion$?

pbhjpbhj · on Dec 21, 2011

If image info that users wish to enter can be guessed well this reduces friction substantially. Having this info makes FB a more valuable marketing tool. Why not make such suggestions.

paraschopra · on Dec 21, 2011

I was freaked out too, but I had relevant names for those albums like Turkey, Shillong, etc.

richardburton · on Dec 21, 2011

Eventually someone evil will do something truly monstrous on Facebook and it will make this WTF look like a drop in the ocean. I fear for my friends who are still Facehooked.

giulivo · on Dec 21, 2011

I think it's just GPS positioning by the smart phone. I wouldn't be fooled by a liar.

nmunson · on Dec 21, 2011

Quite a few people mention the pictures were taken with a camera and uploaded at a later date, so GPS wouldn't have been a factor. With the amount of data Facebook has I wouldn't be surprised that they could do more than just use GPS information to place locations.

giulivo · on Dec 21, 2011

I'm sure there are other technologies which can be used for geolocalization and can achieve similar "performances" (I'm thinking about WIFI networks or IP, in some cases).

Still, I wouldn't expect a digital camera to put that information in the EXIF (assuming it does have WIFI and/or an IP address).

Also, to me Facebook doesn't seem to be using any magical machine learning algorithm capable of recognizing the location of a PIC from a face. It never did on my PICs, not even when there were landscapes in them.

The guy could instead have uploaded the photo using a smartphone which added the GPS coordinates.

But what do YOU think instead? How does that work? Is it working also for your uploads?

mhartl · on Dec 22, 2011

s/alaramed/alarmed/ in the article title