Meta outage (metastatus.com)
698 points by geocrasher 11 months ago | 845 comments



Seeing such a large web property go down like this is fascinating.

It's like a power plant grid failure, except for attention instead of energy.

When Meta is down, a horde of internet users desperately seek somewhere else to place their attention... But the system is designed with the expectation that Meta will take all that traffic... And boom! Everything starts falling over. Wild.


Classic failure cascade.


And rather than an indication that it is not working, I was just told to login again; since the attempts were unsuccessful, I was led to reset my password. Now I have no clue if the password has been reset properly or an old one is used, or login with google is used, or if I will continue to be logged out after they fix stuff.

If your service is down, please say "my service is down".


One weekend I was on-call, we had a system we ran for the government, but authentication was handled by another government owned service, hosted and managed by another company.

A user called in early Saturday evening, saying that the system was down. After a bit of debugging, wondering where our alerting had failed, I concluded that the system was perfectly fine, but the authentication service had been returning 500 errors for around an hour. When I called the user back he made a comment that rather changed how I think about monitoring and systems. He says: Well, from my point of view it really doesn't matter which part isn't working, it's just down.


I've learned the same lesson, and why things like ping checks aren't enough. Even just a WGET of the front page isn't enough. You need the monitor to login and replicate at least part of the user experience if you have a complex setup. It's very embarrassing when someone tells a user "shows it's up from my side." and they're relying on a naïve monitor.
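
A minimal sketch of a monitor that logs in rather than just pinging, assuming a made-up site layout. The host, the `/login` and `/dashboard` paths, the form field names, the "Welcome" marker, and the `fetch` callable are all hypothetical; in practice `fetch` would wrap whatever HTTP client you use and keep a session between calls.

```python
from typing import Callable, Optional, Tuple

# Hypothetical signature: fetch(method, url, form_data) -> (status_code, body_text)
Fetch = Callable[[str, str, Optional[dict]], Tuple[int, str]]

BASE = "https://example.invalid"  # placeholder host, not a real service


def synthetic_login_check(fetch: Fetch, user: str, password: str) -> Tuple[bool, str]:
    """Walk the same path a user would and return (ok, reason)."""
    # Step 1: the naive check, is the front page reachable at all?
    status, _ = fetch("GET", BASE + "/", None)
    if status != 200:
        return False, f"front page returned {status}"

    # Step 2: log in, which exercises the authentication dependency.
    status, _ = fetch("POST", BASE + "/login", {"user": user, "password": password})
    if status != 200:
        return False, f"login returned {status}"

    # Step 3: load a page that only renders correctly for a logged-in user.
    status, body = fetch("GET", BASE + "/dashboard", None)
    if status != 200 or "Welcome" not in body:
        return False, "logged-in page did not render as expected"

    return True, "ok"
```

With a check like this, an authentication service returning 500s shows up as "login returned 500" even while the front page still answers 200.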


I lament the fact error messages are rarely, if ever, displayed anymore. Just a generic message in its place, usually not even indicating whether it's a 'you' problem or a 'them' problem :(


It amazes me how most smartphone apps- which operate in a wireless environment of likely flaky connections, don't seem to be tested or designed at all to handle loss of connections. They tend to just become non-responsive or do erroneous things like play the first second of an audio file and then auto-pause, without reporting any errors.


Sooo like the Spotify app for the past 10 years.


Spotify is in offline mode...

This bug has been there since the first time I installed Spotify on a smartphone and it's still there almost 15 years later.


These 1 percent of edge cases take so much effort to test and handle though.


A smartphone having a flakey internet connection isn't an edge case.


Exactly, it's a daily thing for users- it's only an edge case for the developers, because they are likely developing and testing with hard wired high speed connections.


Good chance most of the phones are virtualized as well to test across OS builds and that the actual deploy to real phones aren't tested nearly as well.


Personally, I despise cutesy error messages that try to make the program sound more human. "Oops, we're sorry this happened! Would you like a nice cup of hot cocoa while we try and figure out what happened? Or should I read you Goodnight Moon again?"


1 percent is extremely frequent though. Like literally thousands of times per second on prod if it's a b2c application


Yeah, but at least it's a bit understandable that it's changed in that direction, and usually you can get a developer-friendly error message if you look at the HTTP responses in the devtools.

I remember one time a company I worked at received a support message where the title was (paraphrased) "DONT HURT MY CHILDREN" from some person who saw an error message saying something about "couldn't dispose of child" or something similar, when the frontend broke. "Child/children" being kind of common in programming, I'm sure others faced similar scenarios.

I'm not sure if they were genuinely scared or just decided to have some fun with us while reporting the issue, but after that we made the error messages even more generic, so nothing could be misunderstood.

I guess it is this quest to not alienate the lowest common denominator that is the reason behind the stupidification of error messages. On one hand, people won't get scared, on the other hand, people get less context about why the error happened in the first place.

Wonder how many of the average users actually care?


The original Mac would pop up a dialog with a threatening icon of a bomb with a lit fuse, whenever it crashed! https://en.wikipedia.org/wiki/Bomb_(icon)

>The Bomb icon is a symbol designed by Susan Kare that was displayed inside the System Error alert box when the "classic" Macintosh operating system (pre-Mac OS X) had a crash which the system decided was unrecoverable. It was similar to a dialog box in Windows 9x that said "This program has performed an illegal operation and will be shut down." Since the classic Mac OS offered little memory protection, an application crash would often take down the entire system.

Unfortunately, the Mac's bomb dialog could cause naive users to jump up out of their seat and run away from the computer in terror, because they thought it was going to explode!

https://www.youtube.com/watch?v=zQGX3J6DAGw

And Windows' "This program has performed an illegal operation and will be shut down" error message was just as bad: it could cause naive users to fear they might get arrested for accidentally doing something illegal!


Generally error messages get watered down to nothingness because it gives news reporters less meat to chew on. Not knowing if it's a "me or you" problem means that small problems might actually get missed and not need any PR response. Details won't leak and produce all kinds of speculation over the causes, etc.


You have to be careful with error messages. It's possible to give too much away and enable security vulnerabilities.


I'd assume the security vulnerability is there no matter what the error message says, but I guess very explicit and verbose error messages might expose details to make it easier to find said vulnerability.


There are so many microservices everywhere with different teams responsible, that if just one team doesn't forward errors properly it is messed up.


Integration tests are a thing.


You mean e2e from the mobile app? Because they would have to be able to trigger those failure situations from all layers of microservices over which they have no control.


I thought I got hacked and tried to retrieve my password.

Got a weird email with the same code every time from facebookmail.com


My first thought was that I'd been hacked as well...

I have gotten 4-6 "here's your facebook reset" type emails over the last month. All unrequested. I've always assumed they're casual attacks (mostly interested in seeing if my password can be stuffed into another, more valuable account)


I really hate that companies do this. PayPal does it sometimes, too.


It sounds like you are struggling with withdrawal symptoms. I recommend a detox and getting your life in order.


It has nothing to do with Facebook; it's the trend of sending email from alternative domains like facebookmail.com that train users to trust unknown domain names.

It's particularly egregious with critical stuff like banking.


Password works on THREE other machines, fails completely on ONE.

I was logged in when the outage happened on machine A and phone B - immediately tried to reset password, which took me into the hellish abyss many of us are experiencing now...

This evening - about 18hrs after the event, I fire up machine C - it logs STRAIGHT INTO Messenger. OK... This is something, too scared to open a browser in case that triggers the session disconnect...

So I fire up laptop D - STRAIGHT INTO MESSENGER... OK, open browser, STRAIGHT INTO FB. Log into Google password manager, VISUALLY CHECK password, and it is my last known good password (in use at time of outage).

Fire up iPad E - Straight into Messenger!!! All these machines are on the same network!

Back to Machine A - clear cookies and try to log in, no joy; different browser and try to log in, no joy; try to reset password on this different browser (I might add, I did get a new Change password Token number off this attempt), but no joy; clear cookies and restart, then attempt to log in, no joy!

WHAT THE F?!?!?!?

I initially thought it may have been a 24hr block due to password change attempts? But now not so sure... I've also tried logging in via Machine A in a VM from a different O/S to see if it may have something to do with it - but again no joy - this environment had NOT been logged in to FB before...

Thoughts????


MAC Address being flagged?

But on the flip side, I was able to create a back up profile in a different VM on this machine.....


I'm just waiting for all my relatives to breathlessly call me saying "I THINK MY FACEBOOK WAS HACKED".


Already got a call from someone who regularly gets caught in unprompted xfinity password reset loops saying “it’s happening again.”

Inexcusable to catch people in a reset loop during a login outage, and not confirm which password is the current one.


Same... This was literally the first time I wanted to use my Facebook account in years, so I changed my password twice before I realized that there must be an issue on their end.

I just wanted to use their oauth login to buy a jonsbo n3 case from AliExpress. Sadge


Logging everyone out across all platforms and telling them to just try logging back in sounds like an excellent way to DDOS yourself.

This should be a fun postmortem.


Yeah, and they've dug themselves a hole with having near permanent login persistence. I can't remember the last time on a computer I use frequently that I've had to log in.


Yeah, same thing happened to me so will likely need to reset my password again once Facebook is up again.


That was my cue to check hacker news


Yup - same here...

I fear my initial attempts to 'reset my password' and getting that same security(at)facebookmail.com email with the SAME reset code, have not helped me - if only I'd been asleep, I probably could have slept through it.

At least Insty came back up - though to be honest I can't remember that password either and am now terrified to try and recover or change that too!


Me too, after the first couple of "is it down" websites I tried failed to load.

HN never fails us when a big website is down.

I was also almost ready to reset my password.


Fwiw I tried to reset my pw and it didn't work either. Because FB web app said I input wrong pw.


I have exactly this issue. Been frantically jumping around this morning dealing with perceived security issues and have no idea where my many FB resets led to.


MY SERVICE IS DOWN.... I did exactly as you.. I have no clue now


It logged me out and told me that my credentials were incorrect; I thought my credentials had been stolen, so I'm kinda personally glad that it seems to be happening to a lot of other people too. I know that's a bit selfish, but :shrug:


A much better UX would be clear error messaging informing users that the service is down and there is no problem with their individual account.

This would prevent people from panicking they've been hacked and/or unnecessarily resetting their password.


There are quite some harsh comments here below. You can't plan for every possible failure point; who knows what part of a system/infra out of everything that they have went down and triggered this behaviour. Some things you just can't catch/predict. Especially in huge systems like theirs. I would expect people here to understand things like these and not just call people names for something like this. We all know things seem simple/clear from the outside, but the job of debugging and fixing something like this takes quite some effort.


This is a company with one of the largest digital infrastructures in the world. An outage is understandable, inability to tell they're having an outage and inform users appropriately is not. Stop making excuses for people who are literally awash in resources.


> Stop making excuses for people who are literally awash in resources.

This is a pretty weird outlook to have - look at any group awash with resources, whether it be governments or other companies, and you can clearly see that even with those resources, failures still happen.

You can jump up and down and pretend that this is solvable, or you can look at reality, look at all the evidence of this happening over and over to almost everyone, and conclude with some humility that these things just happen to everyone.

(Looking this reality in the face is one of the things motivating my beliefs around e.g. AI safety, climate change, etc.)


> An outage is understandable


It is always better for the company's rep for the issue to have been on your end. Admitting fault comes with a potential liability. It's gaslighting written as an SLA


You can't plan for every contingency, but you can reserve potentially scary messages for situations where you know they are correct. An unexpected error state should NOT result in an "invalid credentials" error.


This is the nature of credentials errors. The more information you give, the more you're telling an untrusted and therefore assumed-hostile agent.

I hate it because it's bad UX, but that's the thinking behind it.


Pushing people to unnecessarily reset credentials increases risk. Not only does it increase acute risk, but it also decreases the value of the signal by crying wolf.

The argument here is the kind of nonsense cargo cult security that pervades the industry.


I think this argument falls flat on two axes:

- in general, if the system is broken enough to be giving false-negatives on valid credentials, it's broken enough that there isn't much planning to be done here because the system's not supposed to break. So if they give me "Sorry, backend offline" instead of "invalid credential," they've now turned their system into an oracle for scanning it for queries-of-death. That's useful for an attacker.

- in the specifics of this situation, (a) credential reset was offline too so nobody could immediately rotate them anyway and (b) as a cohort, Facebook users could stand to rotate their credentials more often than the "never" that they tend to rotate them, so if this outage shook their faith enough that they changed their passwords after system health was restored... Good? I think "accidentally making everyone wonder if their Facebook password is secure enough" was a net-positive side-effect of this outage.


So your approach to security is to never admit that an application had an error to a user, but to instead gaslight that user with incorrect error messages that blame them?

This is security by obscurity of the worst kind, the kind that actively harms users and makes software worse.


No. My approach to security is to never admit that an application had an error to an unauthenticated user.

That information is accessible to two cohorts:

- authenticated users (sometimes; not even authenticated users get access to errors as low-level as "The app's BigTable quota was exceeded because the developers fucked up" if it's closed source cloud software)

- admins, who have an audit log somewhere of actual system errors, monitoring on system health, etc.

Unfortunately, I can't tell if the third cohort (unauthenticated users) is my customers or actively-hostile parties trying to make the operation of my system worse for my customers, so my best course of action is to refrain from providing them information they can use to hurt my customers. That means, among other things, I 403 their requests to missing resources instead of 404ing them, I intentionally obfuscate the amount of time it takes to process their credentials so they can't use timing attacks to guess whether they're on the right track, I never tell them if I couldn't auth them because I don't recognize their email address (because now I've given them an oracle to find the email addresses of customers), and if my auth engine flounders I give them the same answer as if their credentials were bad (and I fix it fast, because that's impacting my real users too).

To be clear: I say all this as a UX guy who hates all this. UX on auth systems is the worst and a constant foil to system usability. But I understand why.
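
Two of the measures listed above (not revealing whether an email is known, and not leaking timing) can be sketched roughly as follows. This is an illustration under stated assumptions, not anyone's production auth code: the in-memory `USERS` store, the PBKDF2 parameters, and the `login` function are hypothetical, and it does not by itself guarantee constant response time.

```python
import hashlib
import hmac
import os


def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2 from the standard library; the iteration count is illustrative.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical in-memory user store: email -> (salt, password_hash)
_SALT = os.urandom(16)
USERS = {"alice@example.com": (_SALT, hash_password("correct horse", _SALT))}

# Dummy record so unknown emails cost the same hashing work as known ones.
_DUMMY_SALT = os.urandom(16)
_DUMMY_HASH = hash_password("dummy", _DUMMY_SALT)

GENERIC_FAILURE = "Invalid credentials"


def login(email: str, password: str) -> str:
    # Unknown emails fall back to the dummy record instead of returning early.
    salt, stored = USERS.get(email, (_DUMMY_SALT, _DUMMY_HASH))
    candidate = hash_password(password, salt)
    # compare_digest avoids leaking how many leading bytes matched.
    matches = hmac.compare_digest(candidate, stored)
    if matches and email in USERS:
        return "ok"
    # Same answer for "no such email" and "wrong password".
    return GENERIC_FAILURE
```

An unknown email and a wrong password both do one full hash and both return the same string, so neither the message nor the amount of hashing work tells the caller which case applied.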


You are absolutely correct. That would be a much better experience.

That said, getting there strikes me as pretty challenging. Automatically detecting a down state is difficult and any detection is inevitably both error-prone and only works for things people have thought of to check for. The more complex the systems in question, the greater the odds of things going haywire. At Meta's scale, that is likely to be nearly a daily event.

The obvious way to avoid those issues is a manual process. Problem there tends to be that the same service disruptions also tend to disrupt manual processes.

So you're right, but also I strongly suspect it's a much more difficult problem than it sounds like on the surface.


> That said, getting there strikes me as pretty challenging. Automatically detecting a down state is difficult and any detection is inevitably both error-prone and only works for things people have thought of to check for. The more complex the systems in question, the greater the odds of things going haywire. At Meta's scale, that is likely to be nearly a daily event.

Well, in principle, the frontend just has to distinguish between HTTP status 500 (something broken in the backend, not the fault of the user) and some HTTP status code 4xx (the user did something wrong).
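
As a rough sketch of that distinction on the frontend side: the message strings and the `login_message` function are made up for illustration, and it assumes the backend's status codes are accurate in the first place.

```python
from typing import Optional

SERVER_SIDE = (
    "Something went wrong on our side. "
    "This is not a problem with your account; please try again later."
)
BAD_CREDENTIALS = "The username or password is incorrect."
BAD_REQUEST = "The request could not be processed."


def login_message(status: Optional[int]) -> str:
    """Pick a user-facing message for a failed login response."""
    # No response at all (timeout, connection reset) is not the user's fault either.
    if status is None or 500 <= status <= 599:
        return SERVER_SIDE
    if status in (401, 403):
        return BAD_CREDENTIALS
    if 400 <= status <= 499:
        return BAD_REQUEST
    # Anything else is unexpected for a failed login; do not blame the user.
    return SERVER_SIDE
```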


Yes, assuming the responses are usefully different, accurate, and you get responses in a timely manner.


The "your username/password is wrong" message came in a timely manner. So someone transformed "some unforeseen error" into a clear but wrong error message.

And this caused a lot of extra trouble on top of the incident.


But there's something off here. I wouldn't expect to be shown as logged out when the services are down. I'd expect calls to fail with something like a 500 and an error saying "something happened on our side". Not all the apps going haywire.


At the scale of Meta, "down" is a nuanced concept. You are very unlikely to get every piece of functionality seizing up at once. What you are likely to get is some services ceasing to function and other services doing error-handling.

For example, if the service that authenticates a user stops working but the service that shows the login form works, then you get a complex interaction. The resulting messaging - and thus user experience - depend entirely on how the login page service was coded to handle whatever failure the authentication service offered up. If that happens to be indistinguishable from a failure to authenticate due to incorrect credentials from the perspective of the login form service, well, here we are.

At Meta's scale, there's likely quite a few underlying services. Which means we could be getting something a dozen or more complex interactions away from wherever the failures are happening.


Isn't this just the standard problem of reporting useful error messages? Like, yes, there are academic situations where you can't distinguish between two possible error sources, but the vast majority of insufficiently informative error messages in the real world arise because low effort was applied to doing so.


Yes and no.

Yes, with the additions of sheer scale, a vast number of services, multiple layers, and the difficulty of defining "down" added in. I think the difficulty of reporting useful error messages is proportional to the number of places an error can reasonably happen and the number of connections it can happen over, and by any metric Meta's got a lot of those.

No, in that detecting when you should be reporting a useful error message is itself a complex problem. If a service you call gives you a nonsense response, what do you surface to the user? If a service times out, what do you report? How do you do all this without confusing, intimidating, and terrifying users to whom the phrase "service timeout" is technobabble?


> If a service you call gives you a nonsense response, what do you surface to the user?

If this occurred during the authentication process, I think I would tell the user "Sorry, the authentication process isn't working. Try again later." rather than "Invalid credentials". And you could include a "[technical details]" button that the user could click if they were curious or were in the process of troubleshooting.
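
Something along these lines, as a sketch only. `AuthBackendError`, `LoginResult`, and `attempt_login` are hypothetical names, and `verify` stands in for whatever call reaches the authentication service. Whether the technical details are safe to expose to an unauthenticated user is a separate decision.

```python
from dataclasses import dataclass
from typing import Callable, Optional


class AuthBackendError(Exception):
    """Hypothetical: raised when the auth service times out or returns nonsense."""


@dataclass
class LoginResult:
    ok: bool
    user_message: str
    # Only shown if the user clicks a "[technical details]" button.
    technical_details: Optional[str] = None


def attempt_login(verify: Callable[[str, str], bool], user: str, password: str) -> LoginResult:
    try:
        valid = verify(user, password)
    except AuthBackendError as exc:
        # The backend failed, so we do not know whether the credentials are valid.
        return LoginResult(
            ok=False,
            user_message="Sorry, the authentication process isn't working. Try again later.",
            technical_details=repr(exc),
        )
    if not valid:
        # Only claim bad credentials when the backend actually said so.
        return LoginResult(ok=False, user_message="Invalid credentials")
    return LoginResult(ok=True, user_message="Logged in")
```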


Slightly unrelated question, but just how "Big" is Meta? I know it's vast, but as an outsider I have trouble grokking the scale of it.


When most people talk about serving thousands and maybe millions of requests per second, Meta talks about billions of requests per second.

https://read.engineerscodex.com/p/how-facebook-scaled-memcac...


> If that happens to be indistinguishable from a failure to authenticate due to incorrect credentials from the perspective of the login form service, well, here we are.

If you can't distinguish those, then that is bad software design.


Come on use a little imagination. DNS lookup for the db holding the shard with the user credentials disappears. Code isn’t expecting this, throws a generic 4xx because security instead of a generic 5xx (plenty of people writing auth code will take the stance all failures are presented the same as a bad password or non-existing username); caller interprets this as a login failure.

Same auth system used to validate logins to the bastions that have access to DNS. Voilà.


> plenty of people writing auth code will take the stance all failures are presented the same as a bad password or non-existing username

Those people would be wrong. You can take all unexpected errors and stick them behind a generic error message like "something went wrong" but you should not lie to your users with your error message.


It's about not leaking sensitive information.

If you have different messages for invalid username vs invalid password, you can exploit that to determine if a user has an account at a particular service.

"Invalid credentials" for either case solves this problem.

But sure, let's report infra failures differently as "unexpected error"

Now, what happens if the unexpected error is only when checking passwords, but not usernames?

Do you report "invalid credentials" when given an invalid username, but "unexpected error" when given a valid name but invalid password?

If so, you're leaking information again and I can determine valid usernames.

So, safe approach is to report "invalid credentials" for either invalid data or partial unexpected errors.

Only time you could safely report "unexpected error" is if both username check and password check are failing, which is so rare that it's almost not worth handling. Esp. at the risk of doing wrong and leaking info again.


If you really want to hide whether a username is in use, then you also have to obscure the actual duration of the authentication process among other things. The amount of hoops you need to jump through to properly hide username usage are sufficient that you need to actually consider if this is a requirement or not. Otherwise, it is just a cargo cult security practice like password character requirements or mandated password reset periods.

In this case, Facebook does not treat hiding username usage as a requirement. Their password reset mechanism not only exposes username / phonenumber usage, but ties it to a name and picture. So yes, Facebook returning an error that says credentials are incorrect when it has infrastructure problems is absolutely a defect.


what if, if one service doesnt respond at all or responds with something that doesnt fit an expected format that it would if working correctly, the whole thing just says "sorry, we had an error, try again later"? if it has to check both at the same time, and cant check them independently, wouldn't that solve the vulnerability? or am i missing something? totally understandable if i am, i just want to learn /gen
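
A sketch of that idea, with made-up names (`BackendDown`, `authenticate`, and the two callables are hypothetical). The point is that the password check is called for every request, known username or not, so a broken password backend produces the same answer either way. It assumes the failure does not depend on which user is being looked up; a failure limited to one shard of users could still correlate with valid usernames.

```python
from typing import Callable, Optional


class BackendDown(Exception):
    """Hypothetical marker for an infrastructure failure."""


def authenticate(
    lookup_user: Callable[[str], Optional[str]],            # user id, or None if unknown
    check_password: Callable[[Optional[str], str], bool],   # must tolerate a None user id
    username: str,
    password: str,
) -> str:
    try:
        user_id = lookup_user(username)
        # Always call the password check, even for unknown users, so a broken
        # password backend fails the same way for valid and invalid usernames.
        password_ok = check_password(user_id, password)
    except BackendDown:
        return "Sorry, we had an error. Try again later."
    if user_id is None or not password_ok:
        return "Invalid credentials"
    return "ok"
```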


Well you can't expect to hire engineers with half a brain for the pitiful compensation Meta offers, can you?


There was for a brief moment. I got that once


It would be a better UX, but, depending on the outage, that might be a really hard behavior to guarantee.


Not the worst thing that a bunch of Facebook users are resetting their passwords.


That's the sound of millions of "password" becoming "passwordnew"


Yea, the wife came to me in a bit of a panic that her Facebook account got hacked. I tried logging in to FB to check if I had been unfriended, and I also got errors indicating my password was incorrect. My FB password is 96 bits from /dev/urandom in a GPG-based password manager I wrote for myself a couple decades ago. So, no my password wasn't wrong, and I'm not a big enough target for someone to put enough effort into figuring out how to snarf up my password data and crack my GPG passphrase.

Anyway, when FB thought my password was wrong I calmed way down. I thought maybe FB corrupted their password DB or something, so I just tried to reset my password, got into an odd workflow loop, and then quacked "downdetector facebook".


> My FB password is 96 bits from /dev/urandom in a GPG-based password manager I wrote for myself a couple decades ago.

We have the same approach to password management!


that's actually really cool, i hadnt considered writing my own password manager but i feel like it'd be a fun and fairly useful project, did it take you particularly long to do? i'm interested in giving it a go :D


The heavy lifting is done by GPG in a subprocess, taking information on stdin or outputting the decrypted data on stdout. The rest is just generating the passwords, organizing the encrypted files, and perhaps interacting with the clipboard.

Have a look at https://www.passwordstore.org/ and also https://github.com/kmag/store_password_gpg
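
For the "generating the passwords" part, a minimal sketch in Python. This is not the code from either project above; `generate_password` is a hypothetical helper, and the base64 encoding is just one assumed way to turn random bytes into something typeable. `secrets.token_bytes` draws from the operating system's random source.

```python
import base64
import secrets


def generate_password(num_bits: int = 96) -> str:
    """Return a random password carrying num_bits of entropy."""
    if num_bits % 8 != 0:
        raise ValueError("num_bits must be a multiple of 8")
    raw = secrets.token_bytes(num_bits // 8)
    # URL-safe base64 keeps the result printable; strip any "=" padding.
    return base64.urlsafe_b64encode(raw).decode("ascii").rstrip("=")
```

At 96 bits that is 12 random bytes, which encode to 16 base64 characters. Encryption and storage would still be handed off to GPG as described above.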


Yes. My spidey sense went off and I told my work I'll be off for an hour while I redo all my passwords... might still do that but glad to know it's not necessarily me getting hacked.


don't bother. fb's forgot password flow is broken too.


As was the account hijack process. It just loops.


I called out some comment for being racist a little earlier (yeah I know, just report and move on...) and figured they'd managed to pwn my account somehow. Good to know it's not just me.


Strictly speaking, just because there's an outage does not mean you're not pwn'ed.


Maybe the outage happened because they used a 0-day to pwn smcl


Crazier things have happened.


In an "anything's possible" sense then yeah. But the fact that FB was not letting me login with the credentials I knew to be correct was directly attributed to a global outage, rather than a me-specific issue. Which I can now verify by checking the devices that are authorised to my account.


[flagged]


Much better defense if someone accuses you of racism - you still have not shown that I am wrong.


So you're saying you do own your racism. Well good for you, one of the brave racists -- now we know what kind of a person you really are. But it doesn't mean you're right, it just means your opinion is worthless and you're not worth debating because you're an intellectually dishonest bigot, even worse for believing in scientific racism.

Edit:

Your beloved scientific racism is not reality, it's a pseudoscience, as foolish and wrong as Astrology and Phrenology and Homeopathic Medicine. You're still an intellectually dishonest bigot.

If you're so intellectually honest and sure of yourself, then why don't you state right now unequivocally for the record that you're an unrepentant racist bigot?

https://en.wikipedia.org/wiki/Scientific_racism

Scientific racism, sometimes termed biological racism, is the pseudoscientific belief that the human species can be subdivided into biologically distinct taxa called "races", and that empirical evidence exists to support or justify racism (racial discrimination), racial inferiority, or racial superiority. Before the mid-20th century, scientific racism was accepted throughout the scientific community, but it is no longer considered scientific. The division of humankind into biologically separate groups, along with the assignment of particular physical and mental characteristics to these groups through constructing and applying corresponding explanatory models, is referred to as racialism, race realism, or race science by those who support these ideas. Modern scientific consensus rejects this view as being irreconcilable with modern genetic research.

Scientific racism misapplies, misconstrues, or distorts anthropology (notably physical anthropology), craniometry, evolutionary biology, and other disciplines or pseudo-disciplines through proposing anthropological typologies to classify human populations into physically discrete human races, some of which might be asserted to be superior or inferior to others. Scientific racism was common during the period from the 1600s to the end of World War II, and was particularly prominent in European and American academic writings from the mid-19th century through the early-20th century. Since the second half of the 20th century, scientific racism has been discredited and criticized as obsolete, yet has persistently been used to support or validate racist world-views based upon belief in the existence and significance of racial categories and a hierarchy of superior and inferior races.


If the opinion is based on facts that are true but inconvenient - the intellectual dishonesty is dismissing them.

>reality must take precedence over public relations, for nature cannot be fooled


I agree!

And if your grandmother had wheels, then she'd be a bicycle.

Since you just don't get it, and you're such an intellectually dishonest unrepentant racist whose opinions are so worthless they should be dismissed, I will explain it for you:

A statement in this form is always true:

"If <something that is false>, then <anything in the world you want to make up, true or false, no matter how stupid or implausible>."

Because <something that is false> like "If the opinion is based on facts that are true but inconvenient" means that you can say anything you like after that, such as "the intellectual dishonesty is dismissing them", and the entire statement is true, because the condition is false.

I know that's going to whoosh right over your head, but in other words, it's false that your opinion is based on facts that are true but inconvenient. Your opinion is based on lies and pseudoscience, and it is false, which is inconvenient for you.

Gino D'Acampo "If my Grandmother had wheels she would have been a bike" -18th May 2010:

https://www.youtube.com/watch?v=A-RfHC91Ewc


Same same. I went through the password reset flow (I was overdue anyways), it never sent anything to my SMS, so I did it again with email, reset the password and went to log in with the new password, "Incorrect password" error. Old password, also incorrect.

Didn't help that I had just posted a lukewarm spicy take on how linguistic prescriptivism is BS.

All the while the website felt like it was unstable, hard to describe, but it felt like it was bouncing around between URLs too much and reloading a lot.

Definitely feels like a botched update on their end.

E: Instagram is misbehaving as well, banner loads but big "Something is wrong" error on the feed.

E: now youtube has "Something went wrong" - WTF. I can't believe I'm saying this, but thank goodness for reddit and X[itter]???

E: interesting, seeing a big spike across multiple platforms on downdetector, including AWS: https://downdetector.com/status/aws-amazon-web-services/ I'm not able to log in right now, but that could be PEBCAK, I have too many saved IDs and I don't want to fail2ban myself


The password reset flow was broken too. And I got logged out on every device. My friend who works there said they can't login either.


Downdetector reports have gone down, but for me it's still bugged out. I've been catching a livestream on YouTube all along though. Meta stock is back up from the dip, so I take it some regions are restored to normality.


Discord is also having issues.


I panicked the same. Even when I tried to recover my password, it said my email address wasn't associated with any account. I thought I'd lost it forever.


That's the natural endgame of the "user-facing services must not stop, if something they depend upon stops, they must only degrade" philosophy.


I heard for a while Netflix would fail open if auth was unavailable. Like it’s just movies just let em see it.

Facebook data is more sensitive. Not so much the data people go there to see, cool memes that their friends liked, but the list of friends and interests.

Other places I worked had the ability for Ops to push out a change saying the site was down for maintenance. After a while we stopped using it and just took the hit of a bunch of 5xx errors. Basically when the planned down times became shorter than the time to propagate the down setting.


Failing open is maybe ok. Telling everybody on the world their account doesn't exist anymore isn't.


Likewise, started a password reset process that won't complete, and asked my wife to double check my account wasn't compromised and posting cryptocurrency crap or somesuch.


Yeah your product should NEVER confuse an auth service being down with a failed auth. This is really terrible by FB.
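
A minimal sketch of that rule, assuming a hypothetical auth backend that answers with plain HTTP status codes (`LoginOutcome` and `interpret_auth_response` are made-up names for illustration, not anything Facebook uses): only an explicit rejection counts as a failed login, and everything else is reported as an outage.

```python
from dataclasses import dataclass
from enum import Enum


class LoginOutcome(Enum):
    OK = "ok"
    BAD_CREDENTIALS = "bad_credentials"
    SERVICE_UNAVAILABLE = "service_unavailable"


@dataclass
class LoginResult:
    outcome: LoginOutcome
    message: str


def interpret_auth_response(status_code: int) -> LoginResult:
    """Map a (hypothetical) auth backend's HTTP status to what the user is told."""
    if status_code == 200:
        return LoginResult(LoginOutcome.OK, "Welcome back.")
    if status_code in (401, 403):
        # The backend was reachable and explicitly rejected the credentials.
        return LoginResult(
            LoginOutcome.BAD_CREDENTIALS, "Incorrect username or password."
        )
    # 5xx or anything unexpected: the backend did not give a real answer,
    # so do not claim the password is wrong or suggest a reset.
    return LoginResult(
        LoginOutcome.SERVICE_UNAVAILABLE,
        "Login is temporarily unavailable. Your password has not changed; "
        "please try again later.",
    )


if __name__ == "__main__":
    for code in (200, 401, 500, 503):
        print(code, interpret_auth_response(code).outcome.value)
```

The design choice here is that an unknown status falls through to "service unavailable", so a broken backend never pushes users into a password reset.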


me too, also Instagram, could be that Facebook got hacked ?


Pro tip: in case Meta actually got hacked, it would be a good idea to stop using that password on any other websites and change them immediately.


Same, thought all of my friends were getting spammed and i'll look like a "boomer" who got phished.

Then i remembered i have a very long and secure password, then immediately panicked about someone having access to my Gmail.

The sense of security is more brittle than i thought.


On a psychological note, I think the threat detection part of our brain doesn't always notify our conscious thought that it's actively monitoring for threats. I've often noticed that when I'm carefully handling a hot frying pan then my ringing phone is more likely to startle me than usual.


When you've already got one threat on your hands you're less prepared for anything else.


That makes sense. I've noticed too that my brain seems to have a threat pre-emption module as well as a threat reaction module. For example, I'll sometimes be walking and texting at the same time, only to stop in my tracks and suddenly realize that there's a hidden stair in front of me.


I have an active Instagram account. Today, coincidentally was the first time I thought of promoting my best few posts as an ad to improve reach to fellow photographers. Bad luck! The app now seems to be back online but my ads which were supposed to run for 48-72 hours, show <Disabled>. And there is a new "Pay Now" link, even though it was paid for and seems ad-spend is already showing it used some of that payment.

As an individual, this is pretty confusing. I don't have much to lose. I am glad I spent 10-15 bucks on favorite 2-3 posts only. I can imagine many others to be more affected. What is normally to be expected for SME users? Does Meta resume the ads automatically? Do they make good for the lost time since the clock seems to be ticking -- although no one saw any impression.

Edit: I submitted a ticket to Instagram Help and they responded by asking for a screencast video. The first time I sent, the video bounced. I have re-sent this by trimming the video.

Out of curiosity I want to know firsthand how Meta handles the small customers.


Special bonus: today is Super Tuesday, which is when a large number of US states have their primary elections.


Neither party's primary is in the slightest way contested, so what do you think the "bonus" is here?


If we get a huge surge of "Uncommitted" votes against Biden as we did in Michigan, you can argue this can cause significant pressure on him and might even lead to his ousting...


I've seen this posted everywhere like it's some grand conspiracy.

What is the impact of it being Super Tuesday? Are people worried they can't vote without social media?


Social Media is very successful in Get Out the Vote campaigns for younger people - who typically vote for one party disproportionately.


I thought Facebook was being used by old people nowadays


This would be a point for sure, except this is a primary and the only people who can vote in it are registered party members. Young people actually registered already care enough to know it's the primary today.

This one is already decided though. Biden on the Democrat side, Trump on the Republican. If anything it might hurt the Trump side a bit, as people may not realize that the Supreme Court only yesterday ruled states can not block him.


There’s all the down-ballot elections, and states like Texas have completely open primaries.

And states like Texas often have nearly one-party rule, so the primary pretty much is the election.

Because of the party that currently owns the non-urban parts of Texas, I usually vote in that primary, despite not voting for its candidates in the general elections.


> This would be a point for sure, except this is a primary and the only people who can vote in it are registered party members

Can't you register the same day as the voting happens? Seems utterly stupid that you have to register to vote to begin with, but if it's a requirement, you should at least be able to register the day of the voting.


I used to switch my party back and forth to vote for who I thought mattered most when I cared about voting.


Facebook and other social media platforms have played somewhat of a role in the 2016 election, for example. Meta has been under a lot of pressure for that, compare e.g. this recent Instagram change: https://about.instagram.com/blog/announcements/continuing-ou...


I can't get into the US-local conspiracy-mongering, but I'm not sure how you've missed social media becoming a hot-spot for election-day information/misinformation?

Shutting down social media has gone to the top of the list for regimes either "attempting to fabricate positive election results" or "attempting to combat the spread of misinformation about elections".

More sympathetically, for better or for worse (definitely the latter) there will be people trying to look up election information ("what are my local polling hours", "who is on my ballot") on social media websites, who will now not be able to be guided to the correct information.


Messenger is often used to coordinate logistics among friends. I would not be surprised if a Meta disruption led to at least momentary confusion if someone was planning on carpooling. Not to mention it messes with "get out the vote" posting.

I'm not very conspiracy-minded but this does smell a little weird.

At the very least, "there's a big event in the country of one of our biggest userbases, maybe hold off on risky deploys until tomorrow"


The primaries are pretty well decided, so I suppose the conspiracy would be "trial run".


Actually I can see this morphing into a Russia interference conspiracy.

If the main social media used by liberals goes down, while the one used by Trump (forgot what it's called) stays up, surely that is an advantage for him?


In an election between the two parties I think that would be a very likely narrative but in this case it's the primaries. And (as others have mentioned) neither primary has really had anything resembling a competition at this point so it's more or less irrelevant to the result (most likely).


Conspiracy theories get more imaginative.

For example, you can then pose the conspiracy that the Russian interference conspiracy was itself a conspiracy to justify more social media policing during the real election.


Perhaps, but why would downtime increase policing? Something like a large-scale fake news campaign (i.e. basic deep fakes or something) is easier (generally speaking) to pull off and would be more likely to cause increased policing than a short outage.


There are probably more MAGA's on Facebook platforms than TFG's Truth Social. Either way it's going to start a few conspiracy theories.


I'm sure people will be able to vote even if their vacation pictures are unavailable


Not relevant to the primary election in any way, but this has been more than annoying to me already.

We have a small livestock operation, and won an online auction late last night for a pig about four hours away. Facebook was the only listed means of contacting the person, and we were planning on driving to pick it up this morning.

Now I get to re-arrange my day today to deal with that, and will probably have to take a PTO day from work to drive there later in the week.

Real businesses are in fact impacted by Facebook being down - including those not based around Facebook and that you might never expect.


Hi, wanted to try on an old thread first, to avoid disrupting a current one.

I really enjoyed reading a few of your comments, and would like to get in touch.

If you're open, drop a mode of contact in your bio/here?


nominallyanonymous@protonmail.com


Did they imply anything as ridiculous as needing facebook to vote? The implication is merely that everyone would otherwise have been talking about voting and the results, ie, simply higher than normal traffic. (I'm not sure it would be that much but that was the reasonable interpretation of the comment.)


I agree with the first part but the second is taking it too far. Plenty of people use Facebook Messenger for communication about semi-important things. Not seeing how that would affect the primaries, but it is not just for vacation pictures.


Which one this time - BGP or DNS?

Either way, global GDP should be up :)


My sense is that it's BGP.


Given the range. Doesn't look like DNS


So... No way it could be DNS?


Narrator: it was DNS


I'm betting memory.


Sweden, logged out from all devices, Google authentication to reset password results in error. Authorization codes sent via SMS to reset passwords seem non-responsive. Authorization codes to reset password sent by e-mail work, but setting a password results in the set password page two times in a row and, after the second try, "An unexpected error occurred. Please try logging in again."


Facebook is throwing up the sign in page, as if your sign in has failed or someone's logged you out. Instagram and Messenger also seem to be affected.


Yes, just checked Instagram and it says “Couldn’t refresh feed” and is generally not loading. Messenger has also logged me out.


For some reason I couldn't browse the home page but I could still access my messages (website).


And Threads.


And the Quest headsets.


I can confirm that IRC still works! Pfewwwww..... /me takes his coat.


buserror has quit (*.net *.split)


/me put on my wizard robe


unless your bouncer is offline!


It seems to not just be Meta - sites like Downdetector[0] are showing a spike in reported issues from AWS, Google services, and X/Twitter as well. I noticed issues with Google myself.

[0] https://downdetector.com/


The sparklines on Downdetector's homepage can't be compared to each other. Spikes that look similar can actually have a difference of several orders of magnitude. Only meta's services have truly large spikes.
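
To illustrate with made-up numbers (this assumes each sparkline is scaled to its own maximum, which is how per-chart sparklines generally behave; it is not Downdetector's actual code), two services whose report counts differ by a factor of 100 can draw exactly the same shape:

```python
def sparkline(counts):
    """Scale a series to its own peak, as a self-normalizing sparkline would."""
    peak = max(counts)
    if peak == 0:
        return [0.0 for _ in counts]
    return [c / peak for c in counts]


# Hypothetical report counts per time bucket, invented for this example.
service_a = [10, 12, 11, 500, 40]
service_b = [1000, 1200, 1100, 50000, 4000]  # 100x more reports

same_shape = sparkline(service_a) == sparkline(service_b)
print("identical sparkline shape:", same_shape)
print("peak reports:", max(service_a), "vs", max(service_b))
```

Only the raw peaks tell you which spike is actually large.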


That's true and an excellent point. I commented about the reported issues elsewhere mainly because I experienced them myself (google.com and drive.google.com not loading or being extremely slow to load content). That could be entirely sympathetic though - people having issues with Meta flooding other services and briefly overwhelming them.


Cloudflare reporting that they were implementing a fix for something SSO-related at 16:02.

https://www.cloudflarestatus.com/


Everyone panicked that they got hacked. I had a slim hope that I also got hacked, so that I wouldn't bother to recover my account and would just roll without FB :) I'm too weak to quit by myself.


Maybe post your password to Twitter or something. Let it escape into the wild, where it can be observed in its natural state.


I know your intention is to help, but please don't share your FB password (if that wasn't obvious already lol). Letting randoms log into your FB account will just have massive consequences with your friends and family thinking it's you talking to them etc.


tZ7TuQ)cznNhnimgJVwUuys(uy


Nice pw bro, mind if I use it? Looks pretty secure


Send me your password and I'll log in and change it to nonsense. Get out of that heck hole.


I tried logging in, I couldn't and was like "nice, I guess I won't have to look at garbage today".

I really would like a social network just for friends without all the garbage bloat.


That's what Facebook used to be. I think they really lost their ways when they went from "useful tool" to "let's try to get users to spend as much time as possible".

I dream of a "facebook-like" app where you can only add someone as friend via a bluetooth protocol, forcing you to only add people that you've met in real life. Then text only, or with very limited image options.


What would take Google and Facebook auth down at the same time? Coordinated 0day patch?

Or is the report of Google auth being down actually tied only to meta logins?


I was wondering if Google was just melting down under a tidal wave of password reset emails from FB/IG..

My gmail seems to be working fine, both personal and work.


That's my best guess. And for them to log out every user in the world makes me incredibly curious about what would have happened if they'd chosen a different course.


I think I’ve experienced 2 or 3 global session resets on fb in my life. Usually followed by some kind of reason they had to protect everyone. This probably isn’t great, hopefully a precaution not an active exploit.


Note that this web page only covers the status of Meta's offerings for business users. This doesn't track the downtime of Facebook, Instagram, WhatsApp, Messenger, etc. as normal users experience them.


Curiously, WhatsApp did not suffer an outage in Brazil.


Discord is having issues too! https://discordstatus.com/


I think the downtime associated with other services could just be people choosing alternative sites for their social media time.

Facebook + Insta make up a huge share of the social media market, and when they go offline, it'd be natural for their competitors to receive large sudden upticks in traffic they're not immediately prepared for on a Tuesday morning.


Scary idea but could indeed be. Who let the zombies out!


Could this be related to Google's log in page change? Seemed cosmetic only, but funny timing that Google's page update happens the same time all these log in issues pop up.


They've been previewing that change for weeks, and I wondered: why do they need to change it at all? Is it some product manager justifying his need to exist?


Based on their hype banners, I was ready for a major overhaul, too. Or at least something obviously different about the login flow. It sure looks like somebody just clicked the left-align button and spit-shined the typeface a tiny bit.

Which, I guess, is the best possible redesign: one that freshens up without rocking the boat.


> Could this be related to Google's log in page change? Seemed cosmetic only, but funny timing that Google's page update happens the same time all these log in issues pop up.

That rollout is staggered over time, so not all users receive it at the same time. It's unlikely to be related.


+


Anyone have a good guesstimate of how much money is "lost" when Youtube/Facebook/Instagram/Whatsapp (more?) are down each minute?


I can’t answer your question, but when I was at Google I made a mistake that caused ads serving on Google results to become unclickable. For the postmortem they had me calculate (I don’t think a dollar amount but) the number of ad clicks that would have happened during the time it was down. Of course I looked up average cost per click rates. Not sure if I could share even if I remembered, but it really put things in perspective.

Overall it was a good learning experience. I didn’t get reprimanded; several months later I got a promotion.


I can't believe you typed all that and didn't include a number, what a tease


Something like $X million an hour.


This is rough napkin math, no need to downvote if anyone knows the real number and this is way off :)

Meta 2023 ad revenue was $131 billion. To make it easy, let's assume an even spread for # of users and ad revenue generation per hour/minute of the day and day of the year (which I'm sure is not the case).

This would be:

$358 million per day

$15 million per hour

$249k per minute

This also assumes a minute down won't be somewhat or totally offset by a spike in users when it comes back online.
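
The napkin math above can be reproduced in a few lines. This is only a sketch under the same even-spread assumption; the $131 billion figure is the one quoted in this comment, and `outage_cost` is a hypothetical helper. Unrounded, the per-day figure comes out at about $358.9 million.

```python
ANNUAL_AD_REVENUE_USD = 131e9  # 2023 figure quoted above; not independently verified

# Assume revenue is spread evenly over every minute of the year.
PER_DAY = ANNUAL_AD_REVENUE_USD / 365
PER_HOUR = PER_DAY / 24
PER_MINUTE = PER_HOUR / 60


def outage_cost(minutes: float) -> float:
    """Naive revenue 'lost' during an outage, ignoring any catch-up afterwards."""
    return PER_MINUTE * minutes


if __name__ == "__main__":
    print(f"per day:    ${PER_DAY / 1e6:,.1f} million")
    print(f"per hour:   ${PER_HOUR / 1e6:,.2f} million")
    print(f"per minute: ${PER_MINUTE / 1e3:,.0f}k")
    print(f"2h outage:  ${outage_cost(120) / 1e6:,.1f} million")
```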


I didn't check your math, but your last number is probably per minute, not per hour.


Thanks, you're right. Edited it.


I assume they will run all those campaign views after the outage, so very little will actually be lost.


A hell of a lot less than when they are up.


Naively, divide ad revenue by time to get a dollars-per-time.

But that’s naive because ad serving isn’t totally sold out, so they can make up for it by increasing the density of ads in the next time window. If the outage is short, then the impact is small.

But some markets are totally sold out and there’s no making up for lost impressions.
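
A toy model of that argument, with invented numbers and a hypothetical `unrecovered_impressions` helper (real ad pacing is far more complicated than this): lost impressions can only be made up out of whatever spare capacity later windows have.

```python
def unrecovered_impressions(lost: int, spare_per_window: int, windows: int) -> int:
    """Impressions still missing after using spare capacity in later windows."""
    recoverable = spare_per_window * windows
    return max(0, lost - recoverable)


if __name__ == "__main__":
    # Market with headroom: 1000 lost, 300 spare slots in each of 2 windows.
    print(unrecovered_impressions(1000, 300, 2))
    # Sold-out market: no spare slots, so nothing can be made up.
    print(unrecovered_impressions(1000, 0, 2))
```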


This is something I've thought about a while back. Like Facebook probably has a "maximum number of ads shown to users per post" value. So theoretically, they have a ceiling for how many ads can be bought in a specific time frame before having to increase the ceiling/find new users.

Do they share data about this?


> before having to increase the ceiling/find new users.

Or raise the prices of ads.


> they can make up for it by increasing the density of ads in the next time window

Not only that but the bigger spenders will have more budget so the bidding after a large outage should return higher bids on average leading to increased profit per ad slot.


Possibly, but increasing ad density is usually negative for performance. So it’ll probably be kind of bad for Meta as a whole, as people spend more but don’t get more value out.


> they can make up for it by increasing the density of ads in the next time window

no need to do that, for google search, people will come back later to make the search


Yes, but, they’ll need to serve X ads in a smaller amount of time


They also do not refund spend during this time typically so wouldn't be a 1:1


They don't? I've worked in ad tech at some smaller places and we absolutely refunded spend during outages.


From my experience you'll receive a partial refund - and in some instances like inexplicable overspending, etc. - you won't receive anything. This may be an exception given a full sitewide outage, though


How is money even being spent if users aren't viewing the ads?


Not all ad spend is impression based; there are things like takeovers. And sometimes impression counts aren't guaranteed, they're estimated.


That's because they were some smaller places.

When you've more-or-less monopolized a lot of the web's content sharing you get to tell your clients to pound sand. Where else they gonna go? Twitter? The incel white supremacist dollar is not what advertisers call "the good dollar".


Thank you for posting, I'm glad you can generally find service status better here than status pages that never show downtime.


For me it happened when my long-dormant messenger lighted up (for a perfectly valid reason, no surprise there) and made me go through those unbundling options presumably mandated by EU. Certainly a coincidence, but it does irrationally strengthen that satisfying feeling I get whenever I see the ad giant stumble.


Session timed out. I'm in Macedonia. In the quick login menu there was another person with name and profile photo beside my name and profile photo. It seems the girl was from my country, don't know her. I thought I was hacked. This is messed up.

IG is not working as well. Feeds, profiles, messages are all blank.


This is so funny because the status page itself actually shows 500 Internal Server Error to me in the API calls. So the status page clearly isn't isolated from the FB network itself. I highly suspect it is either a BGP convergence issue, or their OIDC service hits the dirt.


I had immediately started the account hijack process which, funnily enough, is also down. I was kicked off Messenger and Facebook and then my password was rejected by the login page. This whole flow made it seem like an account takeover and not an authentication outage.

Bad UX, Meta. Bad!


What's the account hijack process btw?


Ironically, Meta employees are also affected by this


Just like last time they had an outage.


Gonna have my Angle Grinder controller ready for the new Sysadmin Simulator release


My Meta Ray-Ban Stories glasses are working just fine FWIW


https://metastatus.com/ seems to be suffering a hug of death now...


It really should be a static page.


Built on React, though.


React can be used fine for static pages. That's orthogonal.


React does have server components which are completely rendered server side, à la PHP or similar. Even SSG/pre-rendering support.


I see Corporate Memphis has even infected status pages now.


Tangent to this but I saw a link to metastatus last week and thought it would be a status page for other services' status pages. This makes it sound like a useful thing now, too bad the name is taken.


Could call it metastasis, apropos when one outage cascades into others.


Such irony...


It could save others from clicking on downed sites, but not itself.


During the outage it was cheerily reporting no problems


On Super Tuesday in the US today, too. Very interesting timing for how much political nonsense is on social media.


I wonder if this is related to 3 Red Sea data cables cut as Houthis launch more attacks in the vital waterway https://www.washingtonpost.com/business/2024/03/04/red-sea-u...


The cable cut is only going to affect the Arab world, or maybe India, but Europe-to-India communication doesn't depend on those cables that much, so it's unlikely to be related.


This will be great for productivity!


I thought the average facebook user was of retirement age.


In a lot of developing countries, Facebook is often the only app that a significant part of the population uses. And that includes young adults as well.


Not sure about globally, but middle aged, 45 - 65, slightly more women.

I would bet those who are not working in full time jobs are a lot more active though.


Instagram is also affected.


My password didn't work, and I was like, "oh man, it's finally happened - I've clicked on the wrong link somewhere and my account has been owned".

Whew.


Anyone having luck resetting passwords yet?

My Messenger and FB on my phone magically logged back in a day or so ago, but on my PC, no luck - and that PC doesn't seem to recognise the password I thought it was before the outage.


Has a Meta developer used ChatGPT again?


Mastodon is up!


Just checked again to be sure, even lemmy.world is up.


Instagram has stopped working for me and I've also noticed other sites having issues like YouTube?


Yes, both YouTube and Instagram seem to be down for me too.


Instagram is down but Youtube seems to be fine for me


YouTube is sporadic for me: about 1/3 of the time I get a "Something went wrong" error on the front page.


Maybe because of the sudden increase of traffic?


YouTube seems fine


Interesting how such large-scale outages can ONLY happen because of human errors, in this case poorly thought-out heuristics. Like, a literal explosion could hardly achieve this level of disruption when there are physical failover protections everywhere.


What about cyber attack? It's super Tuesday...


Neither is the only Facebook status page I found, interestingly. https://metastatus.com/ At least, it shows everything green (except for WhatsApp Business API).


Being logged out of every app is so insane. No automatic recovery from that. Big fuckup.


I guess there are a lot of unsuspecting victims being logged in as dormant users and tracked.

A lot of people probably also won't bother to log back in if there's a password prompt?


It looks like Meta acknowledged a problem on their status page [0]

[0] https://metastatus.com/whatsapp-business-api at "5:32 PM GMT+2"


Facebook login is also not working but status page is not updated.

https://metastatus.com/facebook-login

Platform Status : No known issues Mar 5 2024 4:38 PM GMT+1

The service is up and running with no known issues.


They can't log in to update the status page about facebook login because they use facebook login for auth on the status page!

Joking aside, I wouldn't be surprised if this was the case, because during their last major outage (something related to DNS iirc) they were having issues pushing fixes because they couldn't login because their DNS was down.

EDIT: seems like the status page was recently updated.


iirc they had to have a guy with a hatchet open the server cabinet because their login cards were tied to the failed infrastructure.


I assumed it was a "me" issue. They really ought to put something on the login page. Then I found out my wife had been logged out too.

I started to go through the password reset process and that failed as well. Then I got here.


Looking at the Downdetector home page [1], it looks like many more services are having outages, not just the ones owned by Meta, including:

- Google

- YouTube

- Google Play

- T-Mobile

- X (Twitter)

- Discord

- TikTok

- Pokemon Go

- Snapchat

It looks like they all have the same failure point.

[1] https://downdetector.com


Or people are using Facebook Auth for them. I don't really trust Down Detector, which despite the claims is really People Whinging on Twitter Detector.


> I don't really trust Down Detector, which despite the claims is really People Whinging on Twitter Detector.

I'm confused. Isn't listening for spikes in complaints about outages a great way to detect them? I know for a fact some service companies monitor social media channels for this purpose (among others). I'd be surprised if that wasn't more or less standard practice.

I've checked Down Detector for ISP outages in my area many times now. It's always confirmed them before my ISP did.


> Isn't listening for spikes in complaints about outages a great way to detect them?

When there's a major ISP outage, people report problems with all the major sites. When Facebook's down, people report problems with any site that has "Login with Facebook" as an option.

It's almost never actually an outage impacting all of FAANG at once.


> It's almost never actually an outage impacting all of FAANG at once.

Exactly. If you click through down detector when things are _up_ you'll see people still complaining that $site is down. Could be a local power outage or even a flaky connection in their own home.

Down Detector is one of many signal sources and should have a "credibility" score associated with it that's proportional to the number of people complaining that something's down.
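
One rough way to sketch such a score (an assumption for illustration, not how Down Detector actually works; `spike_score` is a made-up name): instead of the raw complaint count, compare the current count with that service's own typical background noise, so the usual trickle of "is it down?" reports scores near 1 and a real outage scores far higher.

```python
from statistics import median


def spike_score(history, current):
    """Ratio of current reports to the typical (median) background level."""
    # Floor the baseline at 1 so quiet services do not divide by zero.
    baseline = max(median(history), 1)
    return current / baseline


if __name__ == "__main__":
    usual_noise = [4, 6, 5, 7, 5]  # hypothetical reports per interval on a normal day
    print(spike_score(usual_noise, 6))     # ordinary grumbling
    print(spike_score(usual_noise, 5000))  # something is really wrong
```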


[flagged]


I can guarantee you with 100% confidence from experience that the call centers for AT&T, T-Mobile, Comcast, etc. are all blowing up right now because of users who assume that if the Instagram app isn’t loading it means the “wifi” is broken. Also keep in mind “wifi” doesn’t mean 802.11, it means “anything related to the internet” up to and including 4g/5g and Ethernet.


Heh, as soon as I saw Instagram failing to load, I immediately assumed it was Rogers’ fault. They just suck when it comes to reliability and Instagram has a much better track record.


Opened devconsole, saw it was a server error then went to HN for confirmation.


Ok great. How does that equate to having down detector filter reports?


The important step is to filter downdetector from your consciousness. It only exists as rage/cable news bait and nothing more. It is not a useful tool, it’s just a clever way to serve AdWords iFrames.



We get calls at work when people can't reach our services while their power is out, so yes, I do.


> do you really think there are masses of people who can’t tell the difference between a single sign on service being down and individual sites being down and reporting it to downdetector?

Absolutely without a doubt.

99.9% of people don’t know what single sign on means or how it works.

Sometimes I wonder what world HN lives in.


A world where people on HN are statistically extremely likely to be at least in the top 5% of the population in terms of skill at using a computer.

See also https://xkcd.com/2501/


> do you really think there are masses of people who can’t tell the difference between a single sign on service being down and individual sites being down and reporting it to downdetector?

Have you never seen The Website Is Down? https://www.youtube.com/watch?v=uRGljemfwUE

The answer is: way more people than a software developer might think. Ask anyone in IT, or go to anywhere bugs are reported and read a handful.


yeah that’s my point. No one who arranges icons by penis is taking time to go file a report on down detector.


Ahh, I see. In that case, most of DownDetector's data is from Twitter and other sources, not first-party reporting, although even the first-party data is partly sourced from "visits to DownDetector", which can come from a simple Google search for "is Instagram down?"

If DownDetector relied primarily on direct reporting, they'd be the last to know.


> do you really think there are masses of people who can’t tell the difference between a single sign on service being down and individual sites being down and reporting it to downdetector?

Yes, absolutely. 100%.

> Even if there were doesn’t the outage graph give you exactly the information your asking be curated?

https://downdetector.com/status/aws-amazon-web-services/

Was there an AWS outage this morning? The graph sure looks like it, but there wasn't.


> When Facebook's down, people report problems with any site that has "Login with Facebook" as an option.

If users log into your site with Facebook, then the login functionality of your site effectively is down when "Login with Facebook" is down.

From the user's perspective, your subcontractors, including authentication subcontractors, are a problem for you to deal with and never show them. From your perspective, you could have architected your site in a way that logging in doesn't "go down" when Facebook login is down.

If the user chooses "Login with Facebook" over other authentication options available, and they don't want to use other options, educating them with a good error message might help. Or you could remove the Facebook login option, if you (totally reasonably) don't want Facebook's failures to reflect poorly on you.


> If users log into your site with Facebook, then the login functionality of your site effectively is down when "Login with Facebook" is down.

There are plenty of sites where "Login with Facebook" is a convenience but hardly the only way to log in. Reddit, for example, has "Login with Google" and "Login with Apple"; it would be highly misleading to claim "Reddit is down" if Google's OAuth flow was having an outage.

> educating them with a good error message might help

Nothing in the API or OAuth flow would make that doable in an automatic fashion with this outage. It'd have to be something you put up manually as a banner after hearing of the outage.

> Or you could remove the Facebook login option, if you (totally reasonably) don't want Facebook's failures to reflect poorly on you.

I don't particularly care; we're talking about why DownDetector isn't necessarily ideal for assessing. It can be a useful signal, in some scenarios, but I've seen plenty of spurious signals come from it.


> Nothing in the API or OAuth flow would make that doable in an automatic fashion with this outage. It'd have to be something you put up manually as a banner after hearing of the outage.

That is fair: if I choose to architect my site such that a user-critical feature goes down when a 3rd party service goes down, it behooves me to monitor the 3rd party service and do whatever necessary to properly inform users what's going on.
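
A rough sketch of what that monitoring could look like: record the outcome of each login attempted through a third-party provider, and raise a banner once the failure rate over a recent window crosses a threshold. Everything here is an assumption for illustration (the `ProviderHealth` name, the window size, the 50% threshold); none of it comes from a real OAuth library.

```python
from collections import deque


class ProviderHealth:
    """Tracks recent login outcomes for one third-party auth provider.

    Hypothetical helper: window size and threshold are made-up defaults.
    """

    def __init__(self, name, window=50, min_samples=10, max_failure_rate=0.5):
        self.name = name
        self.outcomes = deque(maxlen=window)  # True = success, False = failure
        self.min_samples = min_samples
        self.max_failure_rate = max_failure_rate

    def record(self, success):
        """Store the outcome of one login attempt via this provider."""
        self.outcomes.append(bool(success))

    def failure_rate(self):
        """Fraction of failed attempts in the current window."""
        if not self.outcomes:
            return 0.0
        return self.outcomes.count(False) / len(self.outcomes)

    def banner(self):
        """Return a user-facing message, or None if the provider looks healthy."""
        if len(self.outcomes) < self.min_samples:
            return None  # not enough data to say anything
        if self.failure_rate() > self.max_failure_rate:
            return (f"Login with {self.name} is currently failing. "
                    "Your password has not changed; please try another "
                    "sign-in method or try again later.")
        return None


health = ProviderHealth("Facebook")
for _ in range(12):
    health.record(False)  # simulate a run of failed provider logins
print(health.banner())
```

This only notices an outage after your own users have started failing, so it complements rather than replaces watching the provider's status page.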

I edited my post unfortunately after you replied, but another option is removing the parts of your site that rely on 3rd parties, if you don't want the failures of those 3rd parties to reflect poorly on you (which they reasonably would).

>we're talking about why DownDetector isn't necessarily ideal for assessing. It can be a useful signal, in some scenarios, but I've seen plenty of spurious signals come from it.

Indeed, and if a bunch of users say that a feature of your site is down, even if it's a result of a 3rd party failure: chances are, that part of your site is down, and it's partially your fault for relying on a 3rd party for that feature. The users correctly don't care what the root cause is; they expect you either to mitigate it or not to let a feature they rely upon be unreliable.


All that's fine, but totally misses the point.

Take a look at https://downdetector.com/status/aws-amazon-web-services/ ; scroll down to the comments.

"SSH and Dbconnect stopped on all of my EC2 instances. Anyone else?"

"I can't add a payment method"

The chart shows a big spike this morning, but there was no AWS outage, nor does Amazon use Facebook login.

Again, DownDetector can be a useful "is something unusual happening right now" signal, but it'd be a mistake to take its attribution at face value.


Ignore the comments on DownDetector for a moment and check out that huge spike in reports recently. Clearly something wrong happened with AWS's user experience. That's something AWS needs to resolve, in the eyes of their users.

>The chart shows a big spike this morning, but there was no AWS outage

Are you sure? If hundreds of users simultaneously reported there was some sort of outage, particularly a huge spike like we saw, chances are there was an outage.

>Again, DownDetector can be a useful "is something unusual happening right now" signal

Exactly! Specifically, "is something unusual happening right now with my site, in the eyes of my users?" Every site owner should know when that condition is true. What you think about your site "up-ness" isn't as important as what your users think about your site "up-ness". What you attribute your downtime to, isn't as important as what your users attribute your downtime to (you.)


> Clearly something is going on with AWS's user experience.

But that's not the case. It's a false positive.

Pick a DownDetector service and open the page every day for a few days. You'll see it most of the time just reflects people waking up in the US timezones.


Is it a false positive, though? The data shows there was an outage. We would need more evidence to conclude hundreds of users, at that 1 spike, weren't actually having issues.

In other words, we have hundreds of people saying there was an outage, and 1 person saying there wasn't.

That's a problem AWS needs to resolve, regardless of what they think might be the root cause. If the users weren't experiencing any issues with AWS, I doubt they'd be reporting it.

Your comment about timing is a good point: if people are working with AWS early in the day, and AWS is giving them problems, then they will probably report problems with AWS early in the day. I wouldn't expect them to report problems while they're sleeping.


> Is it a false positive, though?

Yes. AWS was not down this morning.

> In other words, we have hundreds of people saying there was an outage, and 1 person saying there wasn't.

We have hundreds of millions using AWS and AWS-backed services successfully this morning.

I'm out.


Hundreds of users, representing more users who didn't bother reporting, say they experienced issues when interacting with AWS this morning, so we'll need better evidence to the contrary to conclude otherwise.

The fact that some people accessed AWS without reporting issues does not mean that all people did. For those who had issues, AWS is responsible for dealing with those perceptions.

Indeed, it could have been a fault that affected a subset of users, for example 1 service in 1 availability zone. That's still an outage in the eyes of users, which AWS is responsible for managing. It could have been an issue with a route from 1 ISP. That's still an outage in the eyes of users, which AWS is responsible for managing.

An even better example is the DownDetector page for Facebook, with hundreds of thousands of reports. Do we really think there's no correlation between what DownDetector reports and what users experience?

tl;dr: what users think about your site is more important than both what you think about your site and the reality of your site, and you should be tracking it.


> When there's a major ISP outage, people report problems with all the major sites. When Facebook's down, people report problems with any site that has "Login with Facebook" as an option.

Yes? That's how all top-level reporting is going to work. It's not going to tell you which part of your service is inaccessible. It's just telling you that people can't access it. You obviously have to do additional investigation to figure out why people are having trouble.


> It's not going to tell you which part of your service is inaccessible.

Scroll up the thread a bit; https://news.ycombinator.com/item?id=39605354

Even here on HN, where people should know better, people take its incorrect attribution as useful info. TikTok isn't down. X isn't down. Google isn't down.


I would completely agree that people are bad at interpreting Down Detector-type results, but that doesn't mean it isn't providing a very useful signal.


Indeed I haven't noticed any blip in functionality, but then again I don't ever do FB (or other external service) login. Absolutely no reason to do so, long term drawbacks are too serious to be lazy about this.


You can't login to snapchat, tiktok, youtube with your facebook


Yes, those are all just Down Detector’s bog standard false positives.

Pull the page up tomorrow and you’ll see the same morning spike there as people wake up.


That's kinda the point though isn't it? DownDetector is showing an early indication of a major outage in both of your examples. The issue may not be caused by the indicated service, but it's still a useful information source especially when we can correlate reports on there with what we are seeing in our internal monitoring.


A big spike on DownDetector is an indication of something going on.

Its attribution of what/who is often incorrect. You'll see "maybe it's more than Big Site X!" comments come up on every HN thread like this citing DownDetector; it's almost never the case, and folks on HN should know better.


The problem is the source of the reports and display of the reporting.

I'd trust Down Detector a lot more if it was filled with Hacker News community -- people who are able to understand that there's "DNS" and "Routing".. and that your phone can have internet access at home while your home PC does not.

I personally hate Down Detector's graphing because it can make it 'look' like there's an issue when there isn't really... Facebook with 500,000 reports looked as down as Google with 1,000 reports... For equally sized / used entities, I would not trust that "Google" is down with 1,000 reports. I had a coworker ask me what was going on with the internet because "everything is down.. Facebook, google, gmail, microsoft!" (when seeing the Down Detector home page)

DD should normalize the graphs against the service history in some way. A service shouldn't spike because it had 30 reports / hour for a day, then suddenly has 100... when it has a history of being out with 100,000+ reports. The 100 reports are probably mis-reporting, but you can't tell until you dig into each service, one by one, with separate page loads.
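
One way to sketch that normalization: score the current report rate by where it sits between the service's usual baseline and its worst known outage, on a log scale. The function name `spike_score` and the choice of a log scale are my assumptions; this is not how Down Detector actually works.

```python
import math


def spike_score(current, baseline, historical_peak):
    """Place the current report rate between baseline (0.0) and the
    service's worst known outage (1.0), on a log scale.

    Hypothetical scoring function for illustration only.
    """
    if baseline <= 0 or historical_peak <= baseline:
        raise ValueError("need 0 < baseline < historical_peak")
    if current <= baseline:
        return 0.0
    score = math.log(current / baseline) / math.log(historical_peak / baseline)
    return min(score, 1.0)  # clamp if the current rate exceeds the old peak


# Numbers from the example above: 30 reports/hour normally, 100 now,
# and a history of real outages with 100,000+ reports.
print(round(spike_score(100, 30, 100_000), 2))
```

With those numbers the jump from 30 to 100 lands near the bottom of the scale instead of drawing as a full-height spike, which is the point: the graph would only look alarming once reports approach what a real outage of that service has looked like.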


The OP means there is a lot of collateral noise from people who are just not tech savvy. E.g. “oh no, I can't access Facebook, my internet must be down. Let me log in to Down Detector to file a complaint against my ISP”


In the Twitter operations area there was a big TV that streamed searches for #failwhale etc. It was actually very useful to detect problems with Twitter by looking for people complaining about Twitter on Twitter.


The problem is that people use FB login for other sites, and if FB login is down, many users report a problem with that other site, not with FB.


The point is that the “other site” _is_ down for those who use Facebook to login to it.

Maybe they should have a backup password (if the site allows it, f-ing Spotify doesn’t), but it’s still effectively down for them!


I think it's a good data point, but it's not a "tell all" indicator. So I agree with you.


Well, Twitter was down for me at least. Google, on the other hand, was not, unlike what DownDetector claimed.


YouTube was definitely doing something weird that doesn't seem likely connected to Facebook.

A couple hours ago after watching a video I went to my home page, which usually shows recommendations based on what I've recently watched plus a few videos labeled as sponsored that have nothing to do with any of my interests.

Instead everything on the home page was either a sponsored video, or a movie that was free to view with ads, or something from one of their music products.

I tried from an incognito window to see if it had something to do with being logged in. Normally going incognito loses the history-based recommendations but at least recommends user-uploaded content. But now it was just like my logged-in home page. No user content. Just ads and videos from Google's movie and music services.

Refreshing gave an error that said something went wrong. I then logged in on that page and again got something went wrong. Another refresh got a page with some user content. Another refresh was the ads and Google stuff page.

A little later it seemed to clear up and now my home page is back to normal.


Yeah, it's wild that it's now treated as an authoritative source, especially by some news organizations.

It's as good as asking a neighbor what happened with a loud noise down the street. Sometimes you'll get something good, sometimes it'll be completely wrong.


Downdetector is nice because it answers my question of "is anyone else having issues with this?" When it takes AWS an hour to even acknowledge "increased error rates", and tells me that everything is a-ok in the meantime, I want another perspective.


Twitter's search used to be my go-to for this - a search for "AWS down" would typically be very illuminating - but it's tough to get it to genuinely spit out the most recent tweets with a keyword these days.


> Yeah, it's wild that it's now treated as an authoritative source, especially by some news organizations.

> It's as good as asking a neighbor what happened with a loud noise down the street. Sometimes you'll get something good, sometimes it'll be completely wrong.

Asking my neighbors if they know what some loud noise was or about some local disturbance has been extremely reliable in my experience. The one time someone gave me an explanation about something which wasn't mostly right they qualified it with something like "So-and-so said it might be such-and-such but I don't know if it's true".


You must have an exceptional neighborhood. Everywhere I lived, here's a handy map of "actual cause" :: "what the neighbors said it was"

Car exhaust :: gunshot
Appliance delivery truck liftgate :: gunshot
Transformer explosion :: gunshot
Garbage truck :: gunshot
787 at 25000ft :: complete ruining of peace and quiet
Any police activity :: probably someone robbed a bank

For the record, my city has (statistically indistinguishable from 0) homicides and bank robberies and, by American standards (I know, I know) no particular issues with gun crime.


I can imagine it being different in a city. I'm in a fairly quiet suburban area.

One time I heard a loud boom. A few hours later I saw a neighbor outside and asked if he'd heard it and if he knew what it was. He told me a house a few neighborhoods over had exploded. I was a bit skeptical of it but he turned out to be right.


exploded? the house... exploded? like, a gas leak or something?


Drug lab is more probable than a gas leak.

Sources:

https://www.statista.com/statistics/942043/laboratory-incide... - meth lab incidents are down to about 900/yr and have been far higher in the past (presumably because the labs have moved to things besides meth)

https://rpgaspiping.com/blog/critical-safety-tips/gas-safety... (286 natural gas incidents per year) - I've tried to find a more credible source for this number but keep seeing it cited in various places and have no better source, higher or lower.


> Drug lab is more probable than a gas leak.

I can’t tell if you’re trying to demonstrate the problem you have with your neighbors ;)

It was a gas leak, not a drug lab. The utility failed to fix things in a neighborhood where there’d been reports of leaks for years: https://www.wbur.org/news/2023/08/17/eversource-fine-gas-exp...


Yep. The utility had gotten reports of gas leaks on that street previously and didn’t fix it.


It's as good as asking a thousand neighbors if they heard anything down the street.


The New York Times just posted a news flash ... citing Down Detector :-P


i trust Down Detector more than the (majority of) companies who are silent during outages

hell, i'm surprised Down Detector hasnt been outright sued due to the graphs being an actual honest representation of availability that shitty companies cannot hide


> Or people are using Facebook Auth for them.

Gmail is also on the list. You can't use FB auth to login to Gmail, can you?



People are using Facebook Auth for YouTube?


Seems likely. TikTok and YouTube are currently working for me, while Meta platforms aren't.


It's not that, getting some issues using X


Their outage heatmap is also basically a population density map too. https://xkcd.com/1138/


Man the postmortem on this is gonna be fun.

“Yeah so it turns out when Facebook and Instagram go down so does Google”

I do not envy the SREs at either company. I'm pretty sure all those other ones use Facebook or Google as their OAuth provider which is why they are all being reported as down.



Everything is going down, except Steam. You know where to find me.


Some piece of core infrastructure went down, because everyone got a spike at the same time. Surprisingly, DoorDash and Steam were up.


Pokémon GO player here, the app was absolutely fine unless you tried to use Facebook as your login provider.


Scrolling down their list, why is DoorDash the only one that didn’t have a spike this morning?


HN seems to be struggling too, but that could just be everyone here to talk about the outages.


That's the standard HN experience, this site runs on a single core I believe.


You don't run a massively profitable VC company by just throwing money away at a second core.


Does it really or is this a joke?

Edit: I found the following, I wonder if it's still the case.

https://news.ycombinator.com/item?id=16076041


It's real. Single core performance improves all the time. People overestimate how much power it takes to handle lots of queries per second on a well-tuned system and well-written software in 2024.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...


I see the "sorry, we are receiving too many requests, try again in a few minutes" error several times a day on here. I don't think that HN is reliably able to handle the amount of users it currently has.


I believe that's by design if you send an action request very quickly after a previous one. It's very easy to replicate. Open a post. Then click the upvote button and very quickly click the favorite button too. That will trigger it. I think it's used to rate limit.


>"Sorry, we're not able to serve your requests this quickly" is our little server process saying "help, I only have a single core and I'm out of breath here". If your account were rate limited it would say something like "You're posting too fast, please slow down."

dang, linked in one of the ancestor comments. But I still suspect you are correct.


I just tested it by quickly upvoting your comment and then favoriting it and the error was:

> Sorry, we're not able to serve your requests this quickly. reload

Note that this only seems to happen for actions. Doesn't seem to be the case if I am just loading a page quickly.


I have been using HN daily since I was a teenager. I've seen that message maybe 10 times outside of serious issues in last 15 years. It's strange to me that it happens so frequently for you.


That's difficult to believe to be honest. I get it several times a week.


I get something like that when I try to comment and then upvote too quickly.


There's a manual rate limit that can be assigned to your account if you post frequently on controversial topics. Afaik once it's there it stays until removed by a moderator.


I think that's a little different from what I described, but may be what the grandparent comment described.


I can confirm -- I've only seen this as a form of throttling (i.e. preventing users from sending too frequent HTTP requests).


I've been using hacker news for years and I think I've seen it <10 times too. Maybe our usage patterns differ and I only use HN off peak.


I've been seeing this multiple times a week for the past couple of years. It's gotten worse since 2020. I think that they are preparing to upgrade it, or did upgrade it?


Hmm. I get a different message. Something like "We are having trouble handling your request. Sorry!"

I saw it just a few minutes ago, but I don't remember the exact wording...


Yeah that's the one I meant, I don't remember the exact wording either.


That is a generic rate limiter that is independent of system load. As far as I can tell, if you make more than one request every 5 seconds, you will always be served the rate limit page.
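
For what it's worth, a limiter with that behaviour is only a few lines. This is a minimal sketch of a per-client minimum-interval limiter, not HN's actual code (which is written in Arc); the class name is made up and the 5-second figure is just my guess from above.

```python
import time


class MinIntervalLimiter:
    """Allows at most one request per client every `min_interval` seconds.

    Hypothetical illustration; does not depend on system load at all.
    """

    def __init__(self, min_interval=5.0, clock=time.monotonic):
        self.min_interval = min_interval
        self.clock = clock      # injectable so the limiter can be tested
        self.last_seen = {}     # client id -> time of last allowed request

    def allow(self, client_id):
        now = self.clock()
        last = self.last_seen.get(client_id)
        if last is not None and now - last < self.min_interval:
            return False        # too soon: serve the rate limit page
        self.last_seen[client_id] = now
        return True


limiter = MinIntervalLimiter()
print(limiter.allow("user-a"))  # first request from this client
print(limiter.allow("user-a"))  # immediate second request
```

Note that a rejected request does not reset the timer here; a stricter variant would, and I can't tell from the outside which one the site uses.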


The message for this is something like "we can't serve your requests that fast". The GP is quoting the site overloaded message.


It's a feature


Note that's the application server process being single threaded, but the server machine is 4 core, so nginx cache etc use other cores


A worthwhile distinction!

I looked up the CPU mentioned in the link from your other comment. It looks like HN handles enormous traffic on about 2x the power of the last Celeron chip ever made.

https://www.cpubenchmark.net/compare/2383vs5793/Intel-Xeon-E...


> People overestimate how much power it takes to handle lots of queries per second on a well-tuned system and well-written software in 2024.

I wasn't overestimating anything, but with how easy it is to write concurrent software today, why limit your site to a single core?


Maybe it's a lisp thing. Who knows what mysteries lurk here


It's not a lisp thing. Many lisps are capable of multithreading, including implementations of Common Lisp that have had it far longer than HN has been around, and Clojure, which is extremely good at it.

It even looks like Arc, the lisp HN is written in, has threads now, but Arc is built on top of Racket and uses Racket's green threads, so it only takes advantage of one CPU core. Racket does have OS threads, but Arc does not use them.


> well-tuned ... well-written

These are good for actual business needs, but bad for resume-driven development.


Doesn't he mean single socket by single core?



HN is always struggling.

Couldn't use it two nights ago, IDK why.


Happens whenever Sama sneezes too


Down Detector is so unreliable. People that can't call an AT&T phone via Verizon will think (and report) that Verizon is down, when it's really AT&T. People can try logging in using Facebook's one-click login and not be able to get in, so they think TikTok is down. It's not all that useful. I hate when journalists cite it.


It has false positives and noise for sure, but it's also very sensitive and shows issues very quickly.

I wouldn't trust it as a single source, but in a case like this where our internal monitoring shows a spike of issues with the Google APIs and we can see a huge spike in reported issues for Google on Downdetector starting at the same time, it's useful to confirm that the issues have an external source.
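
A sketch of the kind of check I mean: find where each time series first jumps well above its own earlier median, and see whether the internal and external onsets line up. The function names, the 3x factor and the tolerance are assumptions for illustration, and the sample numbers are invented.

```python
from statistics import median


def spike_onset(samples, factor=3.0, warmup=5):
    """Index of the first sample exceeding `factor` times the median of
    all earlier samples, or None if there is no such sample."""
    for i in range(warmup, len(samples)):
        base = median(samples[:i])
        if base > 0 and samples[i] > factor * base:
            return i
    return None


def onsets_agree(internal, external, tolerance=2):
    """True if both series spike and the onsets are within `tolerance` samples."""
    a, b = spike_onset(internal), spike_onset(external)
    return a is not None and b is not None and abs(a - b) <= tolerance


# Invented per-minute samples: internal API error counts and external reports.
internal_errors = [2, 3, 2, 3, 2, 2, 40, 50]
external_reports = [10, 12, 11, 9, 10, 11, 12, 90]
print(onsets_agree(internal_errors, external_reports))
```

Agreement in timing doesn't tell you who is at fault, only that the problem probably isn't confined to your own systems.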


It's only slightly better than "my mom claims". My mom would ask if I had the internet at my house. Yup. all of it. in a rack in my bedroom closet. She'd also report the "internet is down" when a single website was having issues. To me, that is down detector carrying on the legacy of moms everywhere.


A single report on there is useless. A sudden flood of reports is a good sign that something interesting is happening.


and something is usually happening. the issue is that a lot of end users (so the people that down detector is pulling from) don't understand the systems well enough to point to where that something is and will often misattribute it, which is what both parent and grandparent are claiming.


Eh, mostly it's people misunderstanding what it represents.

If I can't login to tiktok because FB is down, then tiktok is effectively down for me. When it comes to technology most people don't care about the trip, they care about the destination.

So yea, tiktok isn't "down" but for a lot of people it might as well be, hence coupling your infrastructure/auth on other providers has side effects like this you must take into account.


Downdetector has successfully detected 150 of the last 20 outages.

Its mention should honestly be banned from this site.


There is no consistent scale on that graph, so any local maxima of reports received would look similar to any other.


Exactly this. FB topped out around 520,000 reports. Google topped out around 1,400. That's a massive difference in scale.
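
To make the scale problem concrete, here is a small sketch comparing per-series normalization (each graph stretched to its own peak, which is roughly what the thumbnails look like) with a shared scale. The peak values are the ones quoted above; the rest of each series is invented, and the function names are mine.

```python
def normalize_per_series(series):
    """Scale one series so its own peak is 1.0 (assumes a nonzero peak)."""
    peak = max(series)
    return [v / peak for v in series]


def normalize_shared(all_series):
    """Scale every series by the largest peak across all of them."""
    peak = max(max(s) for s in all_series.values())
    return {name: [v / peak for v in s] for name, s in all_series.items()}


# Peaks from the thread (about 520,000 and 1,400); other points are made up.
reports = {
    "Facebook": [2_000, 50_000, 520_000, 100_000],
    "Google": [100, 400, 1_400, 300],
}

own_scale = {name: max(normalize_per_series(s)) for name, s in reports.items()}
shared_scale = {name: max(s) for name, s in normalize_shared(reports).items()}

print(own_scale)     # both peaks reach the top of their own graph
print(shared_scale)  # Google's peak is a tiny fraction of Facebook's
```

On its own scale each service looks equally "down"; on a shared scale the Google line would barely leave the axis.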

Both are above their baselines, but I bet some is just mis-reports, or increases in awareness due to more people checking in.

Meta seems to be the only one really affected from what I can tell.


We saw a big spike in latency and failures on the Google OAuth apis starting at the same time (15:21 UTC)


I made that same mistake after seeing someone post an unlabeled set of graphs to a Slack. The Google peak reported outages is about 0.25% of the Facebook peak. It seems reasonable some people just made a mistake.


X (Twitter) wasn't affected:

https://twitter.com/elonmusk/status/1765048551023734801

Downdetector is user reports, not automated monitoring. It's... semi-trustworthy.


In general, linking to an Elon Musk tweet is hardly proof that a Twitter service degradation didn't happen.

Even more so when the tweet in question isn't even a direct claim about Twitter, but just a meme making fun of a competitor.


The fact that twitter was usable the whole time does.


> The fact that twitter was usable the whole time does.

That's an assertion, not a substantiation. A single tweet does not corroborate that, even if you ignore the fact that most outages of large global services (including some of the outages of these Facebook properties mentioned above) are actually partial degradations.


lol


Most of those seem OK for me now, and DD agrees. This seems to have been a temporary blip for all of them, possibly some kind of service switchover/fallback "not entirely unrelated" to the Meta outage?

Edit: actually a more attractive theory, given the very short timelines and near simultaneity of all those failures, is that downdetector itself had a failure, possibly a Meta-dependence, that they noticed and corrected quickly.


GCP seems fine, and no issues logging into Google Cloud Console.


Services are coming back to life.

Interesting to see that all static content was still working during the outage (at least for Instagram). It was still possible to swipe through all reels (I assume the list was cached).


I think the content was probably cached on your device.


Fb is down but YT is still up for me.


All the electronic candy/soda vending machines in our office are also not working. Shudder to think of the chain of dependencies inside these machines.


A lot of sites have features that say “log in with your X or Y account.” They connect to each other somehow. I never studied that protocol. I wonder if authentication failures across services could be tied to it.

For process of elimination, do all of these services do multi-platform logins? Or do some not connect to anyone else?


that actually infuriates me more than cookie banners. the one from Googs is the worst offender.


Facebook audience is in the billions, so you will see 100k false positives when a big site like that goes down.


This is just a knock-on effect from 1B+ users moving their timespent elsewhere during the outage


Yeah, I had issues accessing GCP's documentation site for AlloyDB around 8:00-9:00ish Eastern this morning. The page just said 'Service Unavailable'. Had to use Google Cache.


Additional fun factor: today is Super Tuesday - primary elections in a lot of US states.

This outage will result in absolutely no ridiculous conspiracy theories.


If your election integrity relies on Facebook, YouTube, or even DNS to be up... there are bigger issues.

Actually, if all of them including Xitter went down, maybe things would get better? All the sunlight photons might get sucked in by too many eyeballs though, and there could be grass trampling.


> If your election integrity relies on Facebook, YouTube, or even DNS to be up... there are bigger issues.

I agree. While I don't think it likely that Facebook or YouTube would enter into it, I'd pretty much bet that DNS being down would cause problems.

And yes, there are bigger issues with that. Much.


Is downdetector known for accuracy in these types of situations? Seems like a large amount of services out.


Thanks for sharing this, from what I have read it looks like an issue beyond just Facebook.


iirc these all use GCP which would make sense for them all to be disrupted at the same time. I wouldn't have thought Meta was GCP reliant though?


There’s no way fb uses gcp


all of them are using oauth, likely auth provider issue?


https://www.cloudflarestatus.com is reporting an issue with SSO login. So seems like you might be onto something..


Definitely not us.


And I assumed it would be DNS again. Would referral traffic cause issues like this?


Google was acting up for me as well, so that could be


Cloudflare or AWS ?


What's your fave conspiracy theory? Massive cyberattack for Super Tuesday? Powers-that-be mandated takedown? Mossad sleeper agents activated? Covid-brain struck that one engineer attending to that one wire that kept everything going?


Houthis destroying underwater cables in the Red Sea.


i knew my packets took a wrong turn at Albuquerque


I guess the swamp got drained, so there's no more flow through the tubes.


+1 i was freaking out i got hacked cause I tend to overreact like that xD


This is also happening to me. When I try to reset my password using their password recovery I get "An unexpected error occurred. Please try logging in again." For some reason my phone number is not working as a way to get a recovery code, same error. I can't get the recovery code sent from the app or from my browser on my laptop, but when I use my chrome browser on my pixel it sends me a recovery code, which results in "An unexpected error occurred. Please try logging in again."


Seems like it. Pretty messy too because it sometimes pushes you to reset your password which then doesn't work so there are going to be a LOT of reset email codes floating out there.


Agreed. Probably could be a much better UX for handling a mass outage like this. Graceful, clear error messaging that FB login is down would be better than the current UI.

Triggering millions of people to unnecessarily reset their password yet still be unable to login is not a great UX. This seems like one of those cases that's high impact when it does happen, never likely to occur on any given day, but likely to happen at some point; probably just wasn't much focus put on handling a case like this.


From a process/QA perspective I doubt this can ever be properly tested.

Sure you can set up a UX to show that the auth server is somehow down and discourage users from trying to login/reset passwords, but when shit hits the fan, you actually never know the precise error that gets thrown to the client because it could be any layer between the backend and the client that failed...
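
That said, the client can at least be conservative about what it concludes. A minimal sketch, assuming the client only sees an HTTP status code (or nothing at all on a timeout): treat anything other than an explicit credentials rejection as "service unavailable" and never push a password reset in that case. The names and the status-code mapping are assumptions, not any real login API.

```python
from enum import Enum


class LoginOutcome(Enum):
    OK = "ok"
    BAD_CREDENTIALS = "bad_credentials"
    SERVICE_UNAVAILABLE = "service_unavailable"


def classify_login_response(status):
    """Map an HTTP status code (or None for timeout/connection failure)
    to a coarse outcome. Hypothetical mapping for illustration."""
    if status is None:
        return LoginOutcome.SERVICE_UNAVAILABLE
    if 200 <= status < 300:
        return LoginOutcome.OK
    if status in (401, 403):
        return LoginOutcome.BAD_CREDENTIALS
    # 5xx, unexpected 4xx, redirects from a broken proxy, etc.
    return LoginOutcome.SERVICE_UNAVAILABLE


def should_offer_password_reset(outcome):
    """Only suggest a reset when the server explicitly rejected the credentials."""
    return outcome is LoginOutcome.BAD_CREDENTIALS


for status in (200, 401, 500, 503, None):
    outcome = classify_login_response(status)
    print(status, outcome.value, should_offer_password_reset(outcome))
```

This does nothing if a broken backend answers with a clean 401 for valid credentials, which is exactly the untestable case described above.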


The domain reads a lot like "metastasis" which fits quite well with the social media landscape and oodles of terrible "suggested" content on Facebook in general



Down here in South Africa too.. it's weird that it's not affecting everyone, yet it's worldwide.. It's almost as if it's taking down a percentage or section of addresses/uid's on their database... I wouldn't be surprised to hear that everyone affected, joined the app at a similar date, or the sequential uid's. To me this sounds like more than a bug.. I wouldn't be surprised if it's a hack.


I agree with this 100%! I do not believe this is just a little bug. This is more. There’s been a lot of things happening that people are trying to ignore. But this just adds to it..


I find it remarkable that Typepad, my blog host, not only isn't down but also is MUCH FASTER than its usual slothlike, approximately 5 seconds response time.


This might be the end of Facebook if it lasts long enough.

Invalidating people’s sessions on apps? That’s a HUGE cost. There’s a huge % of users that won’t be able to get back in.


I've known several people who, when they got signed out, couldn't remember their password or access their email, and so made a new account. I'm sure Facebook will see a spike of this.


yeah i'm hoping it was just an auth server outage and browser sessions werent invalidated because this basically described my situation, but i wont have access to the only device i own that still (hopefully) has an active login session until the weekend to find out


It might not have actually invalidated the session


It does on the app. Logs you out.


I hope so.


Google login also seems to be having issues, multiple people reported to me that the login isn’t working and they’ve been logged out of their Google accounts.


Yes, I tried logging in today to two distinct Google accounts on separate Chrome profiles and it would sign me out about 5 seconds after logging in. And the login process was very sluggish.


Looks like their entire auth service is completely borked.

[EDIT] : and if it's a hack, by now half of the world has typed in their password to re-identify ... scary.


FB logs you out and claims "Wrong password" if you try to login. IG just doesn't load new posts or comments, but doesn't log me out.


  > Facebook is not working
That’s very meta.


It looks like AWS is down, if we can trust downdetector, which is the reason why nothing works. Almost fifty percent of these services run on this cloud.


We can’t trust downdetector. It’s just user reports. And users don’t know anything.


I am surprised by how many people still use Facebook around here, as my social circle deleted or stopped using Facebook a long time ago.


Well, I've been spitballing 1-3 years for effects of layoffs to affect company performance, so this timeframe is about right


Yup. That it’s related to the elections is also predictable, due to stress.

Made worse in big corp due to affirmative action + lack of enough qualified candidates meeting diversity criteria.

Which is inevitable when you have coarse criteria applied to such a large industry this way so quickly, as it takes decades for anyone to be qualified for the senior roles, and many years for junior/mid level, even if there were no pipeline issues, which there are.

And unqualified folks in leadership, and mid level == stupid mistakes.

And, with the DOL rules, the company can’t even pay people differently, so no bueno even giving the high performers keeping things afloat better bonuses - unless they happen to meet the diversity criteria and it makes the stats look good.

Which it’s already hard enough to do properly when there is only one dimension, and impossible when there are 2-3.

so the bigger the company, the faster it has to cut its own throat.


Blaming an outage on DEI... Man that's a new one for me.

Could you let us know where you work? I want to make sure I never apply there.


Bwahaha, just wait until you see the shrapnel flying over the next year.

You don’t think the steady erosion in system reliability and ever increasing outages is unrelated to these pressures do you?

I’ve seen the sausage being made at the middle manager level in big corp for a long time. It’s never any one person/hiring decision, but the pattern and its impact have been obvious (and getting unavoidable) for a long time.

That no one seems to want to talk about the actual issues, preferring character assassination and blacklisting (like this comment), is part and parcel of the problem.


> You don’t think the steady erosion in system reliability and ever increasing outages is unrelated to these pressures do you?

Outages have steadily decreased at major companies. I don't know what you're looking at.

Remember AWS taking out a good chunk of the internet many times a year because their east coast data center kept going down? Remember the fail whale meme-ing because Twitter was so unstable?

Industry site reliability has only gotten better over the years.


Bwaha, so now everything is actually getting better and more reliable in big corp land!

I’m sure AT&T, Google, Facebook/Whatsapp/Meta, BofA, Apple, MS, and many others who have had prominent massive outages and embarrassing product launch failures this year will be happy to hear this.

Notably, Amazon is one of the few companies that has managed to avoid a lot of the DEI noise somehow. Perhaps due to their reputation for having such a brutal work culture already?

I can’t wait to hear what you’re going to say next.

Big Corp Software quality improving AND running faster on existing hardware?


Your spitball is flying in a completely different universe than my spitball. Imo anytime you have layoffs that cut so deep, you cut through informal capability, knowledge and relationships that take a long time to form. If anything DEI helps create internal resilience because the personal networks end up a little different, giving you wider and more angles of coverage.

HN talks about people in open source holding up major functionality with little to no recognition. That happens within corporations too. Indiscriminate layoffs may directly fire those people, or signal to them that it's better to move elsewhere leaving gaps that only get discovered over time.


Of course, and not breaking anything with layoffs is already hard when your sole criteria as a manager is ‘are they effective’. Which it’s never been that simple, especially in big corp, but it’s waaay more difficult now.

And so what happens when you’re required by the gov’t and leadership to also comply with coarse grained population statistics AND you can’t find qualified people that meet those statistics enough? On top of having to make layoffs?

My ex was a reasonably qualified software engineer, and even 4 years ago was getting no-interview offers because she was a woman - as explicitly stated by the recruiters.

It’s only gotten worse since then for hiring managers. She was offended because they literally didn’t seem to care if she was qualified or not.

I can provide links to signed and in force legal agreements between the DOL and Google for instance which formalize the need for this, and can point towards public records of evidence submitted to court of emails (internal) between recruiters which state the same too, btw.

Actual job qualifications (as in skills) did not enter the conversation at all. Just coarse-grained DEI attributes.

So then they end up disproportionately cutting from the non-protected tranches (the groups that DO have to be qualified to stay) first because your stats still have to look good. I’m not saying DEI folks overall have no one qualified or hard working in them - rather, that there are little to no structural incentives for them to be. In many cases, they’re also unfireable/unlayoffable.

And eventually, the non-protected folks leave, burn out, or give up because f-this. Why do so much extra work when you literally can’t even get paid more for it, or be recognized because it will piss everyone else off?

And even if you’re superhuman on that front - everyone burns out eventually. Which is also why you tend to see what you see in Open Source.

It’s never any one decision, but stochastic movement in this direction has been relentless and inevitable.


So I guess this is what that national security warning and purchase of Palo Alto Networks stocks was all about in mid February.


Can you elaborate further?


I had just added Google Authenticator as a two-factor login for both Instagram and FB not ten minutes before it happened, so I thought something got messed up when setting that up. I got a message on FB saying my session had expired, and when I tried to log back in I got "Unexpected error". Then I thought I was hacked until my husband confirmed the same on his account.


In my country (Georgia), I received dozens of reports from people saying that after the outage, they have been offered to sign in to other users' Facebook accounts and were successful in doing that. I can't confirm it, but it appears that this was happening to accounts that were co-admins of Facebook pages.


I can access what I call my family FB account, which I run on Firefox.

When I try to access my general purpose account I'm forced to log in again. And when I try to change my password I get the "An unexpected error occurred. Please try logging in again." message.

I suspect that one of the password servers has been compromised.


I tried to do a SMS verify when I couldn't login, and the text never came. That service must be overloaded. That's when I realized it was an outage. Hundreds of thousands (millions?) must be doing that.

Also, going to /r/facebook doesn't load, heh, there must be per-subreddit load issues?


Probably DNS ...it's always DNS....


BGP is my guess


Shortly before, I received password recovery codes by email for all the Facebook accounts I have.

My email requires 2FA, as does my facebook account, but when I went there and clicked "forgot password" there was an unknown email address added to my account. That shouldn't be possible.


Your mouse click broke it.


Threads is down too. Given its integration with Instagram that's not exactly surprising though.


Glad I found this. I literally just told my boss I spilled my coffee on my desk so I could have a break and figure this out. F'ing FB... Thank you to whoever started this post. I tried resetting my password many times and also thought I was HaX0red.


Discord also seems to be down for me


Discord loaded from scratch for me, but it took a much longer time than I usually remember for the feed to come up.


I've given it a lot of time, no luck. It displays:

ISSUE STARTING SESSIONS

We are investigating the issue


I'm from Sri Lanka. It's not working here; I was suddenly logged out from all devices.


In USA, same story here. Logged out and cannot log in.


So lovely that the status page indicates that everything is fine except the WhatsApp Business API!


Sri Lankans cannot live without FB and it's the afternoon here.


I'm in France and same for me and my friends


I received a highlight from a friend on FB tagging me in a KFC promotion. I clicked the ad for the promotion, liked, followed, and texted @highlight; after that I was completely logged out of FB and Instagram, with my password being rejected, etc.


I was logged in on mac, and the page refreshed and logged me out without any interaction from me, while I was reading a post. Then I went to check on my phone and I got the "We had a problem with the page you tried to reach." dialog box.


What could affect so many different services? Some DNS server? A cloud provider like AWS?


Underwater cables seemingly got attacked; the Red Sea situation is likely related: https://apnews.com/article/red-sea-undersea-cables-yemen-hou...


Timeline doesn't seem to line up


Probably the authentication server being down.


this is the message I'm getting on X now:

https://www.threads.net/@cryptodavidw/post/C4JRaX9vZYE


Cloudflare also having issues as of 1 hour ago:

https://www.cloudflarestatus.com

I wonder if it's connected to this. Seems like most large players are having issues now.


See https://news.ycombinator.com/item?id=39604590 (5 minutes earlier)

Facebook, Instagram, WhatsApp outage (downdetector.com)


Oh good, I thought iOS Firefox was forgetting to load its cookie jar again


I was logged out of all devices and told my password was incorrect. It looks identical to a hijacked account attack and was quite scary until I started seeing similar stories from others.


not only Facebook, Google logged me out as well, and I had trouble logging back in. downdetector shows nicely that almost everything had a hiccup. what are the chances of this?


many independent major services are having significant spikes https://downdetector.com/


Not just Facebook, all of Meta (Instagram, Threads, WhatsApp) is taking a break.

The Metaverse is temporarily closed for now.

Let's see where the majority of the discussion migrates to. (Twitter / X)


real discussion happens in private spaces. everything on Twitter/Facebook/etc is performative. including my comment.


WhatsApp seems to be working here in India.


True it's also working for me


and in the UK


Every outage there is a discussion about how these status pages are failing to adequately notify and describe the problem. Is there anyone out there doing it right?


Llama3 became sentient


If there was an AI armageddon, this would be how it starts


They recently started demanding that I log in on a phone app, to make some choice to use messenger on ipad. That is pretty much the end of messenger for me.


Does FB still have that corporate chat/slack competitor offering? Personally I’m glad for the FB “forced” break, but may not be great for those users..


https://metastatus.com/

They are not down. (They just don't work for lots of people!)


> They are not down. (They just don't work for lots of people!)

Seems about par for the course for big tech these days. There's currently an issue affecting the Google Ads API causing timeouts when sending data to it, but the Google Ads Status Summary page shows nothing [0]. However, there's an incident detail page showing some vague hand-wavy information about incidents [1], which appears to be unreachable from anywhere on the Summary page. Gah.

[0] https://ads.google.com/status/ [1] https://ads.google.com/status/publisher/incidents/gNG1ppoY3y...

p.s.: The actual incident details URLs are available in the "RSS feed" link very transiently and tend to disappear -- the feed, which incidentally for fun reasons, is actually an Atom feed.


There are lies, damned lies and... status pages.

According to them, WhatsApp Business on-premises solutions have had issues since the end of February. But it also looks like WAB is the only product with API issues according to the status page.


That hasn't even been updated since Mar 1... Edit, oh, now it has today's date, but still doesn't reflect reality.


For me, the home page and Messenger Platform have date/time stamps like

> Updated Mar 5 2024 10:33 AM EST

and

> Updated Mar 5 2024 10:37 AM EST


Even if it's only 25% down overall, 100% down for me is the same as 100% down.


I'd say "down for lots of people" is the same as "having widespread issues." If it was just a single person, then it would be annoying but I wouldn't assume it's a system problem.

My point was just that, of course, the status page is not reflecting reality.
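To make that concrete: a status page could derive its headline from the measured share of failing requests instead of a manually flipped flag. A rough sketch in Python, where the function name and both thresholds are made up for illustration and are not anything Meta is known to use:

```python
def headline_status(failed: int, total: int) -> str:
    """Map a measured failure share to a status-page headline.

    The 1% and 25% thresholds are arbitrary, illustrative assumptions.
    """
    if total <= 0:
        # No traffic observed, so claim nothing either way.
        return "No data"
    share = failed / total
    if share >= 0.25:
        return "Major outage"
    if share >= 0.01:
        return "Having widespread issues"
    return "No known issues"


if __name__ == "__main__":
    # Hypothetical counts: 2,500 failed logins out of 10,000 attempts.
    print(headline_status(2_500, 10_000))
```

With numbers like those, a page driven this way could not keep saying "No known issues" while a quarter of logins fail.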


Also got logged out on all devices, and can't get logged back in, or successfully complete a password reset. In the US. Glad it's not just me...


HN also seems unusually slow. Wonder what's going on


HN is always slow when everything else goes down, tech people swarm here to check for information, and it overloads the site.


Happens every time a major outage happens, everyone pours onto HN to discuss it which gives the one-core server HN runs on a really hard time.


At least it's not a mission-critical website . . .


I'm having one heck of a time getting into Discord or getting YouTube to show videos. Anyone else experiencing issues outside of the Meta realm?


Status page shows some services are recovering. I wonder why ad services are the first to be mentioned as recovering instead of login.


No reason to allow people on the platform if you can't monetise them, especially after monetisation has been paused for a few hours


I don't use Meta's services but I have a number of co-workers that are all convinced that their Meta accounts have been hacked.


San Francisco, Instagram, Facebook both not working


I don't think SF not working is related to this outage, though.


Why do you say that? It could be exactly why Facebook is not working.


Yup, got logged out of everything about 25min ago.


Yup. Google OAuth and Facebook, Instagram all are down.

I think it's a case of bad deployment by Google.

They recently introduced a new OAuth flow.

Might be related.

No idea how long it will take to fix it.


Official page to check status: https://metastatus.com


It would have been smarter if they hosted this on a separate reliable server altogether.


It's on AWS, but likely just has issues with provisioned capacity now that it's actually being hammered.

Meta has a small collection of tools on AWS to deal with large SEV0 events like these. Another one of them is a basic communication tool that does not use Meta's own servers for anything (including auth), a super basic version of the internal SEV tool.


X-Amz-Apigw-Id:

X-Amzn-Errortype: TooManyRequestsException

X-Amzn-Requestid:

Looks like the endpoints are on AWS Lambda and are getting rate limited.
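For anyone polling the status page from a script, hammering it harder only makes this worse. A minimal sketch of exponential backoff with jitter in Python: it treats an HTTP 429 or an X-Amzn-Errortype header containing TooManyRequestsException (as seen above) as "throttled". The function name, the `fetch` callable shape, and the delay values are all assumptions for illustration.

```python
import random
import time
from typing import Callable, Dict, Tuple

# Hypothetical shape: fetch() returns (status_code, headers, body).
Fetch = Callable[[], Tuple[int, Dict[str, str], str]]


def fetch_with_backoff(
    fetch: Fetch,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry a throttled request, doubling the wait each time and adding jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = fetch()
        # Header names are case-insensitive, so normalise before looking them up.
        lowered = {k.lower(): v for k, v in headers.items()}
        throttled = (
            status == 429
            or "TooManyRequestsException" in lowered.get("x-amzn-errortype", "")
        )
        if not throttled:
            return status, body
        if attempt < max_attempts - 1:
            delay = base_delay * (2 ** attempt)
            # Random jitter spreads retries out so clients don't all return at once.
            sleep(delay + random.uniform(0, delay))
    # Still throttled after the final attempt: hand back the last response.
    return status, body
```

The `sleep` parameter is only there so the waiting can be swapped out when testing; in normal use the default `time.sleep` applies.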


Might as well rename it from Meta Status to Metastasis.


That's down too. lmao


Official page to check status: https://metastatus.com


Status page is down for me now too


I guess we all better do some work today then.


Don't panic, you didn't get hacked


There isn't a single part of my life that is affected by this. Good riddance; Facebook is a psychological cancer.


Eesh I thought it was a hacking situation.


Seoul Korea and can confirm. “Logged out.”


Seems like even the status page doesn't work ... I'd say typical for this type of situation :D :D :D


Google’s oauth logins are timing out too.


I thought I got hacked and tried to retrieve my password.

Got a weird email with the same code every time from facebookmail.com


what are stats for Facebook uptime? i can't remember last time large parts of it went down like this.


Probably not very good. They had a major outage in 2021 too. And from people I know who work at Meta I am not surprised at all.

https://www.theverge.com/2021/10/4/22709575/facebook-outage-...


Facebook is not working, but you wouldn't know it from this page, which says everything is working.


If you try to recover your password you will receive the exact same auth code each time you request it.


Thankfully there are no known issues.


The first thing I did was to reset my password(s), and wonder if my devices had been compromised.


It appears to be more than Meta, lots of things are showing a spike in reports, including AWS.


That might be the FB traffic instead going to the content producers? I mean half the internet users need to go somewhere else if they want procrastination.


my heart rate was pounding when I couldn't login and couldn't reset my password just now. Seeing this HN post made me come back down to normal. I'm now going to move away from FB, Google, and all other major tech companies I rely on.


Greetings from Indonesia. Here too, the same error. Lots of people are asking what is wrong with FB and IG, and there is lots of speculation that it's an update, but the repair process is taking too long without any confirmation from Meta. Something strange is happening.


Frankly, I don't buy this explanation, technically or logically, from either a devops or a systems architecture level. There is no way in hell a company like Meta is pushing database design changes this significant straight to production. We all know how many times these database architecture changes get run in staging, then even production subsets, before rolling out to production at large.

Should we assume the teams working to ensure Business Continuity & Applications Resiliency redundancies fell asleep at the wheel?

Also, are we to assume that no downtime or outage messaging would go out during a fairly routine maintenance-based outage?

I call BS

Lol I can't find a scenario where this happens, at this scale, at a company of Meta's size


I don’t buy it either! Something’s fishy…


or phishy or both for sure


It's the same for me in India


Same situation here in Japan -- all devices logged out of Facebook and cannot log back in.


Doesn't this kind of thing happen in every film regarding an AI becoming sentient ??


For now, some countries preparing a larger war seems more likely.


Both Facebook and WhatsApp are not working globally. At first I thought someone hacked me.


Well, if everything is down, maybe the problem isn't in the services themselves but on the DNS side?


I've never seen an outage this bad and I've been on Facebook since the oughts.


The 2021 outage was worse, it completely killed the network access to FB data centers for hours, and even led to issues with FB employees accessing offices (since the badge readers were also offline). It was so bad it got its own Wikipedia page: https://en.wikipedia.org/wiki/2021_Facebook_outage


I don't remember that one - must have been on a hiking trip. This one might turn out to be worse yet! Hopefully not.


What would take Google and Facebook auth down at the same time? Coordinated 0day patch?


[Edit: Misread the comment, mea culpa]


For both companies? Or is the report of Google auth being down only tied to meta logins, and a misdiagnosis?


Google logged me out of an account and I had trouble logging back in. downdetector also reports Google having issues and friends also had trouble with both.


FWIW for me Google seems relatively stable, but I tried logging into a lesser used gmail account on a lesser used browser, successfully logged in for like 10 seconds, then got logged out.

It's very spooky when two companies that supposedly aren't sharing infrastructure seem to go down at the same time...


Agreed! It is odd…


The status page seems down for me as well. I just get a never-ending loading screen.


If you have an adblocker, try disabling that. The page fails to load with ublock Origin running.


Note that this status page is... less than helpful in the face of an obvious outage.


Lovely bit of cloaked dramatic irony in "Leave The World Behind" t=1:11:20 [0]

Ruth (Myha'la): This is for the better.

G.H (Mahershala Ali): For who?

[0] https://en.wikipedia.org/wiki/Leave_the_World_Behind_(film)


And nothing of value was lost


Status has just updated (as of 10:49 AM ET) to show that there are outages.


Same issue on the East Coast in the US. All devices were logged out afaict.


Yes, I noticed it too[0]. This is going to send a truckload of traffic onto Twitter/X and Telegram.

--

[0]: https://twitter.com/IvanMontillaM/status/1765036872290681089


Can't see the post, and can't login on twitter either, so probably :D


Yep, I’m logged out and can neither log in nor reset. Mountain west, USA.


Also can't log into my OpenAI account.

Not saying it's related, but the timing is weird.


if you are logging in with facebook account that might be the reason


Indeed, if that were the case. But I always create an account with my email for exactly this kind of situation.


Since we're in full-on speculation mode ... Russian cyber-attack ?


Everyone beware, The “Black Hat Hackers” that have accomplished this massive Global hack represent a Clear and Present Danger, immediate and significant threat/danger to National Security, as well as all People’s personal and financial Security


They laid off too many people.


Google login seems to be slow just now for 2 of our websites


Instagram is also down


Yes. The app does start (unlike Facebook) but then it's "Couldn't refresh feed" and nothing visible on the explore page.


Sounds like they messed up rotating their encryption key.


Instagram wasn't working for me in New England, USA.


We just started having issues with google oauth too....


I have a very tangential question to this situation prompted by some conspiracy theories linking the outage to today being Super Tuesday election day in the US.

Why does Facebook need news media sharing or political ad targeting on its platform from a business point of view? Am I being naive or is it really such an important revenue driver to the business or the core experience of the app?

I think it is the source of enormous reputational damage and risk, so much so that if I were running the company I would happily trade 10% of the market cap, if not more, to remove any feature that enables news sharing on the platform.

Actually if you remove these two things (political ad targeting and commenting on news media) I struggle to find any other issue that would make Facebook a "political" target. They could literally shut down the fake news division that employs 10,000 people...


My guess would be that they would lose a significant amount of their user base and engagement without politics and news media.


The ability to affect the outcome of elections is very much worth the political flak for them


YouTube is down too. Only thing working is Twitter.


YouTube works fine for me in Charlottesville, Virginia, U.S.A.


It’s working again. It’s been experiencing outages since 6am I guess.


YouTube seems okay here. East coast US.


Who thinks Meta pushed to prod without testing?


Yeah so much for graceful feature degradation.


Just last week, the Meta paper about their Defcon system landed on HN front page.

https://www.micahlerner.com/2023/07/23/defcon-preventing-ove...


SSO in Cloudflare also seems to have problems.


OH MY GOD WHAT IS THE WORLD GONNA DO!!!!!?????


Touch grass. xD


yeah, randomly logged me out and now my pw doesn't work. The reset password options don't seem to be connecting.


USA, Instagram app broke about 5 minutes ago.


Yup. Kicked me out and won’t let me back in.


Same here in Vietnam, nothing working at all


Oh, that's why I was logged out then.


Obligatory DNS joke :-)

It's not the DNS, It can't be the DNS, It might be the DNS, It's the DNS


Fixed. I can login to facebook.com now.


This should be considered an act of war.


What do you mean by this? I find all of this very odd. Everything that’s been happening.


You're witnessing the greatest attack on Internet infrastructure in history.

Amphibious assault on Taiwan imminent.


Why do you think they do this to our social media and our phones? To have control? To track? To shut us down so we have nothing? What are your thoughts on what they are trying to do? As in, what are their plans, or what are they trying to accomplish?


Confuse your enemy. Disrupting civil communication before large scale maneuvers is part of the modern doctrine.


Funny. This page doesn't work either and is clearly down. So there is a possibility of either pretty big amateurism from Facebook or a failure at the DNS servers.


I thought someone hacked my account ...


So not really an outage ? Just meta


It's back up here in India now.


if you try to reset your password you will receive the same auth CODE each time....


Seems like somehow YouTube is also affected? I'm starting to think that this is an attack of sorts and not Meta specific


Given Facebook is like 10% of total Internet bandwidth, I'm curious what happens when all that traffic suddenly gets "redirected" elsewhere by people going to other sites and platforms. Could YouTube cope with a sudden 10% uptick in streams for example?

Edit: YouTube seems now to be defaulting to a lower bitrate, leading me to guess it's a demand issue.


That seems to be the actual case. Most of the "usual web" is struggling. Even HN is struggling for me.


Google Play Store also affected


The lines on DownDetector are fun.


UK: Instagram iOS app not working.


Glad I'm not the only one!


SRE now have real work to do.


Threads isn’t working either.


Poland and Ukraine: all down.



Since my dedicated post about this got no response, I want to use this opportunity to ask:

WHERE the HELL does Facebook store tracking data on my iPhone?

It shows my previous account even after I delete the app, clear the cache and KeyChain, disable iCloud Drive, AND sign out of iCloud??

Why can't I see where this data is stored? Same for TikTok.

WHY does Apple, parading around as a pompous paragon of privacy, allow this bullshit?


Anyone else find this shit weird? First nationwide outages with all phone carriers, mainly AT&T and Verizon and T-Mobile… massive outage… and now Facebook and Instagram. Then banks were having outages and pharmacies.. and when the phone outages occurred mine was working fine and my fiancé's was not. I just find all this weird… feel like there’s more to it!


wild speculation: they were rolling out enforcement for DMA.


same here in Vietnam, I can't contact my colleagues now


Way to go meta team...


My Facebook is down


My service is down


Estonia: all down


i can't log in my account nor create one


Why the outage?


logged out of fb, insta, and msgr in NY


Meta stock in freefall. Going short at 20x leverage.


Risky move. The slight drop (and subsequent recovery) today is unlikely related to this outage in any way.


Can't tell if serious, but I've seen worse trades. Wouldn't call this a freefall yet: https://imgur.com/a/0UN3HZa


600th comment


and I thought I was having a bad day


It's DNS


Discord too ?


It really is.


I’m seeing a lot of non-FB services down too. Mostly AWS-based, but not all. My original conspiracy thoughts (this being Super Tuesday in the US) are giving way to thinking it’s some low level routing issue.


It's definitely related to Super Tuesday. Why kill 2 birds with one stone when you can kill 10?


probably some sort of BGP flood.


hackernews is also v slow for me


[deleted]


Meta doesn't do that. For one thing, "Patch Tuesday" is a Windows thing, and 0% of production traffic is served from Windows there. For another, they are constantly and always redeploying.

More likely that someone bungled a deploy of user auth. No doubt they are rolling back as we speak.


> More likely that someone bungled a deploy of user auth

How would that affect YouTube?


Surplus load from Meta users flocking to other platforms


good luck to those on call


threads.net seems down too.


Nothing of value was lost.


Entire businesses and livelihoods would crumble without Meta’s platforms.


If your business only exists at the sole discretion of Meta then it's not a viable business, at least not in the long term.


Technically true, just as if you replace Meta with Amazon, Google Cloud, Etsy, eBay, the local city council, etc. If you don't have a diversified, multi-market, preferably international presence, you're not viable in the long term.

Of course, in the really long term, we're all dead. Meanwhile, the local mom & pop candy shops keep advertising and selling online and making ends meet.


They can migrate to other platforms.


What platforms have anywhere near the reach and services that Meta does for a marketplace/advertising?

I'm not personally a fan of most social media, but saying "they could go somewhere else" is a pretty naive/ignorant response.


yes it's all broken now, so I can take a short break here :)


Google and Meta auth, both down at the same time? Oooh. Conspiracy theories (or not) incoming...


Google OAuth is probably getting slammed by people trying to use it to log in to Facebook


you think Google can't scale it? I doubt it.


You know shit's bad when even their status page is down.


Greetings from Ukraine. And I confirm. At first I thought it was because of some war issue in Ukraine (we are at war, if somebody doesn't know), or that somebody hacked me, but then I decided to check HN and saw this thread.

God bless HN and America!


Prayers for Ukraine!


This means IG is also down. Which means probably a network tier screw up yet again. Idk why they design a monolith network that basically can put out all their infrastructure.


Does not look like a network issue, my guess is that they fucked up something major with their auth server when deploying a change related to the EU Digital Markets Act.


There are other apps that went down, including Discord. How is that not a network issue?


> Ads Manager - Recovering from disruptions

LOLLLL .. oh dear


me: wonder if there's a problem? i'll find a status site

me 2 seconds later: nah, i'll check HN

me 5 seconds later: ahh, there it is


It's working all the time


Eh, leave them down.


I was considering buying the Quest 3 this morning, this outage is timely. The fact that I can't use my perfectly working headset to play an offline game because facebook is down, makes me wonder if I should go for a different provider. Any recommendations? Excluding apple vision pro since it is too expensive.


As a principle I think it is a good idea to avoid any technically unnecessary coupling of hardware or software to something else that is not strictly user-servicing, else the priorities are inverted in favor of the vendor's priorities and not your own.


I have been in the industry for the last 25 years and have done more “technically unnecessary” things than there are grains of sand on Earth :)


why are you smiling about that?


Probably sarcasm because it's likely out of their control due to product & executive demands.


Why are you serious about that?


People casually doing things that feel slightly evil is... disconcerting.

Not going to judge anyone, but also nothing to be happy about.


There is a point where you learn that everything is some sort of moral compromise. There is always going to be someone ahead of you and someone behind, it means someone is always going to get left out.

No system changes that, or makes it better, just different.


No system indeed, only your own choices.


I'm pretty sure the priorities are almost always inverted these days, unnecessary coupling or not.


The Quest 3 is kind of the obvious mainstream consumer choice for a VR headset.

Standalone wireless headset, reasonably powerful chipset, can optionally stream from a PC either wired or wireless, good optics/resolution, decent controllers/tracking, large game library, large suite of features (including hand tracking and color passthrough), all for a reasonable price. Not sure any other headset really competes on all those things at once.


can we really call it standalone if you need it to phone home to Meta? and doesn't it require an account with one of their services too?

they did a fantastic job with hardware. I just wish they didn't couple the software so tightly.


> can we really call it standalone if you need it to phone home to Meta?

Yes.


This doesn't work anywhere without internet service. Standalone it is not.


Not true. You can turn off the wifi and the headset works fine.

The current problems sound like a server-side bug while it phones home. But usually it can work fine without internet.

Standalone just means the VR compute is happening on the headset itself, not on a console or gaming PC the headset is tethered to. Of course, most of the people disputing "standalone" already know that, they're just playing definitional games.


For what it's worth it does require a Meta account, but not a Facebook one. I refused to buy one while it required a Facebook account since I deleted mine a couple of years ago. Once they made the change I figured that was an OK compromise. I just found out today during the outage that my headset won't work if I get signed out of my Meta account. That was an unpleasant realization, although I suppose it's partially my fault for trusting Meta not to hamstring the hardware they're selling.

It's the equivalent of finding out that if Microsoft's auth servers go down no one with a Windows PC can use it since they can't authenticate. I'm fairly displeased.


I know they raised the price recently but it seemed pretty obvious to me they were selling these at a loss to try and get people locked in by the software.


> can we really call it standalone if you need it to phone home to Meta?

Yes, because etymologically it's "standalone" vs. "wired"; this is akin to how phones are "mobile" vs. (when I was a kid) "landline".


Nitpick: strictly speaking it's standalone vs tethered (which is usually wired, but could be wireless).


I wonder how many people are secretly boycotting the product because of the poor reputation of the maker.


Oh there are definitely people who avoid it because it's made by Meta. Maybe a bunch. On the other hand, it seems to be the most popular VR headset line by a wide margin.

It would be cool if Valve came out with a standalone headset, they're one of the few companies I can see that would be in a good position to do that: they already have a good amount of VR experience with one high-end headset + SteamVR APIs + a couple VR games, they have their own highly popular store/platform, they generally have a positive reputation with gamers, and they have a decent amount of hardware experience in general including the recent Steam Deck for mobile gaming hardware specifically.

And of course, a Valve headset would probably be significantly more open than the Quest. The Steam Deck has gotten some good reputation among more FOSS/hacker-oriented people for being fairly open: you can use it in a regular Linux desktop mode, you can install Windows (or presumably other OSes) on it, it's fairly repairable, etc. The default behavior is very console-like, but it's not very locked down if you don't want it to be. Best of both worlds, really.


It makes me wonder how the world would be if monitor manufacturers did the same thing, it would be unacceptable.

I view VR headsets and their peripherals as no different than a mouse, keyboard, and display.

Companies requiring all this nonsense to use your device, put in that light, is ludicrous.


A silly comparison. A standalone VR headset is more comparable to a smartphone or game console than a monitor or keyboard. The latter have little to no compute.


So compute requires vendor lock in? That seems silly to me.

Edit: Can we just acknowledge that a lot of the bells and whistles are for the company's benefit at the expense of the user? That's their right, but it's also our right to want something better.


> So compute requires vendor lock in? That seems silly to me.

Correct, it's very similar to game consoles, though it is somewhat more open than those (sideloading is possible, including standard Android apps IIRC, and you can run PC VR games from other stores while tethered).

> Can we just acknowledge that a lot of the bells and whistles are for the companies benefit at the expense of the user?

It's the same model as Xbox or PlayStation, it seems. They sell the hardware at cost or at a loss, and make it up via software.

A fully open headset with comparable specs would probably cost much more for the hardware. From a business standpoint that would be very stupid for a company like Meta, but this is Hacker News, and many commenters here see nothing wrong or silly about asking businesses to commit suicide.


> Correct, it's very similar to game consoles

This doesn't explain why it's _required_. It just means there is precedent.

Your other point is better, although I think you mean it would cost the consumer more for the hardware, right? The hardware would cost the same to produce, it's just that the company would miss out on surveillance based revenue.

It's a reasonable point: fb would make less money if they made an open headset, possibly to the point that they wouldn't make it at all.

But the world where fb doesn't make any headset, and the world where they make an unacceptable headset are basically equivalent to me - the former might even have an edge in that shitty relationships with corporations aren't being encouraged (like they are throughout everything tech related currently). Granted, them blazing the trail has a tiny chance of enabling a reasonable alternative to come along in the future.

But I am a bit of a Luddite, and I know that people want their toys, and they want them now.


> the company would miss out on surveillance based revenue.

More than likely most of Meta's revenue from the Quest series other than hardware is based off of, y'know, selling games. I doubt tracking what games you play to target ads in the OS is more valuable than the money they make when people actually buy games.

In Facebook or Instagram, you're looking at a space that they can shoot lots of ads into, and it's otherwise very hard to monetize. But a gaming-focused VR headset is a different story. Most of the time you're not looking at anything that can have ads in it, but you can actually sell stuff very easily.

Maybe this'll change someday if they actually get social media shit in there that's popular, I'm sure Meta would love that, but so far that hasn't happened.

> But the world where fb doesn't make any headset, and the world where they make an unacceptable headset are basically equivalent to me

Popularizing the format is useful for pushing the tech forward. A big player pushing lots of devices means that the supply chains feeding the manufacture of those devices bulk up too, not to mention other knock-on effects like greater consumer awareness, and "free research" for whoever copies what the market leader does (at least for things that aren't IP-protected).

> But I am a bit of a Luddite, and I know that people want their toys, and they want them now.

I can hear the sneer from over here, yes.


> More than likely most of Meta's revenue from the Quest series other than hardware is based off of, y'know, selling games. I doubt tracking what games you play to target ads in the OS is more valuable than the money they make when people actually buy games.

Isn't that a great argument for why they don't need to have such a hard requirement for a logged-in session? Consoles didn't have an internet connection for the longest time, though only because it wasn't feasible yet. They moved a lot of games.

> I can hear the sneer from over here, yes.

I don't mean it as judgment, I know I'm the weirdo here. Sorry if that came off rude.


> Consoles didn't have an internet connection for the longest time, though only because it wasnt feasible yet. They moved a lot of games.

Consoles had physical games. VR headsets don't. Consoles treat digital games the same way Meta is doing them here, I think; if you get logged out, no more games.

The problem here isn't that Meta servers are merely down -- losing connection usually doesn't mean losing access to your library of games on consoles, or Steam. The problem appears to be that authentication is failing such that you're actually being essentially logged out, which would definitely lose you access to digital games on every console as well as Steam.

Which, I mean yeah, that's a big fuck-up on Meta's part.


> Consoles had physical games. VR headsets don't. Consoles treat digital games the same way Meta is doing them here, I think; if you get logged out, no more games.

Again, consoles and Steam do this because they want to, because it benefits them, and consumers don't put any meaningful pressure on them for doing so. It's not some kind of fundamental requirement. It's helpful for e.g. anti-piracy stuff, but not necessary. It is 100% feasible to sell me a digital copy of a game and then not hang around on my system and watch me play it.

People let triple-A PC games basically put rootkits on their systems. It's not like the games wouldn't work just fine (or better even!) without them. It's just that approximately nobody cares, and the companies will do whatever you let them do.


> I view VR headsets and their peripherals as no different than a mouse, keyboard, and display

That could be valid when VR headsets were tethered to a PC via a DisplayPort or HDMI connection and essentially mirrored the display.

The Quest is closer to an iPhone or Android phone or an all-digital handheld gaming device. With integrated compute, display, battery, text input, pointing devices, mic, and speakers, it bears little functional resemblance to peripherals like a mouse, keyboard, or display with no utility unless slaved to another device.

Considering I can use my Quest with no wifi or other network to log in (once initial setup is complete), it seems that the Meta back-end APIs must have broken in some way that confused the headsets into thinking they were available when they weren't.


It sounds like a server-side bug that forced a log out somehow. Which does really suck, Meta deserves criticism for that, but acting like this means the headset "isn't standalone" is silly, since that's not what "standalone" means in the context of VR headsets.


Agree, many posts I read seem like classic "I don't like Meta/VR/big companies/social media so let me use this specific incident to confirm my biases."

As you say, there's valid criticism to be made but it's hard to find the signal through the noise.


I think the desire for "standalone" VR headsets to mean offline-capable is totally reasonable. It has its own storage, apps and games get installed on it directly, and none of its core features need to rely on an online connection.


Given that it uses its own OS, essentially, is a fair point. I guess what I meant around my monitor analogy earlier is that it has the capability to serve that purpose, possibly without the sophisticated OS that wraps the store experience, the apps/games, and other features -- specifically with being able to use it on SteamVR or your PC in general.

This makes it a device that's generally capable of using any supported source for its screens, and can pass its peripheral input to other devices, like a PC, not unlike a mouse and keyboard.

VR headsets could treat their "OS" as a minimal experience akin to an OSD on a monitor that lets you switch sources and use the peripherals more generally like a mouse/keyboard with the right drivers on the target machine.

I'm more interested in calling out that Meta missed an opportunity here, and that it's confusing that they offer some semblance of these features (wireless linking for SteamVR...) while coupling that so closely to their OS and online-only experience.


I don't know if you'll ever see this, but thought I'd reply.

First, the original Rift headsets were as you describe: lightweight, passing through the PC VR image. However, Meta did not miss out on an opportunity. In what was perhaps the most effective A/B test they could run, they released the Rift S (tethered PCVR) and Quest 1 at effectively the same time. The market feedback was resounding: I believe it was a 10-to-1 preference for a standalone experience vs. tied to a PC. Since they doubled down on standalone (or all-in-one if you prefer), well over 20 million headsets have been sold. In fact, they're so popular that even the fraction that connects to Steam is basically tied for market share with the most popular PC VR headset ever, the Index.

Second, even as a PC VR HMD it was a real stretch to call it a monitor equivalent. It's wildly complicated to create compelling VR images. You need two screens at nearly 2Kx2K resolution each, running at 90 frames/second, sustained. Dip below that and you can induce nausea. Not every PC can do that, so you need careful engineering between the client and HMD, with tricks like time warp, space warp, interleaving, compression, prediction, pose estimation, etc. to take up the slack. Creating sub-millimeter precision of location with six degrees of freedom either requires external base stations (cost, complexity) or inside-out tracking with headset-mounted cameras and a processor running realtime simultaneous localization and mapping and image recognition code, which implies a CPU and tech stack to support it. Nowadays people also expect passthrough (with real-time depth correction), hand tracking (AI routines for hand posing), and more. All this is to say that significant code must run on the HMD for a modern gaming headset (Meta's target market), as well as on the PC. And if you're investing that much in a custom software stack, you can't make it up on hardware margin - the cost to build an HMD is just too high. So you have to have an app store tie-in, because Valve sure isn't going to share its Steam profits with you.

Now, certainly there have been (and are) HMDs that tried this approach. HP (G2) and HTC (Vive series) both put out quality products leveraging the Steam ecosystem. Neither are sold in volume today, because the economics of selling a headset just aren't good enough.

Immersed and Big Screen are releasing very lightweight fixed-function HMDs for either work or movie watching that do operate the way you describe. Neither are expected to be high volume devices, and both are more expensive than Quest 3.

In short: VR is much, much harder than you may realize. Meta didn't miss an opportunity, they explicitly chose the market-tested, most popular solution that also has an economic model with some potential future payoff. If you want a "minimal experience akin to an OSD" then look at the Big Screen Beyond ($999, https://www.bigscreenvr.com/) or the Immersed Visor ($1,049, https://www.visor.com/). (Note: compare the price of these hardware-model pass-through devices to the Quest 3 ($499) which also includes a CPU, battery, storage, audio, more RAM).

It's also worth noting that Quest 3 is not online-only. It works fine offline once you've logged in once (people use it on planes, in parks, in the car, etc.). But this particular issue at Meta forcibly logged out users, then the API appeared online while failing all future login attempts. Ironically, users that work offline never noticed the outage because the bug couldn't log them out.
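
To make the offline-versus-logged-out distinction concrete, here is a minimal sketch of how a client could decide whether to keep a cached login. It is purely illustrative: the function names and status-code mapping are my own assumptions, not Meta's actual logic. The idea is that only an explicit rejection drops the local session, while timeouts and server errors leave it alone.

```python
from enum import Enum


class ServerVerdict(Enum):
    VALID = "valid"
    REVOKED = "revoked"          # server explicitly rejected the session
    UNAVAILABLE = "unavailable"  # no network, timeout, or server error


def classify_response(status_code):
    """Map a hypothetical session-check result to a verdict.

    status_code is an HTTP status, or None when there was no network at all.
    """
    if status_code is None or status_code >= 500:
        return ServerVerdict.UNAVAILABLE
    if status_code in (401, 403):
        return ServerVerdict.REVOKED
    if 200 <= status_code < 300:
        return ServerVerdict.VALID
    # Anything unexpected is treated as "we don't know", not as a rejection.
    return ServerVerdict.UNAVAILABLE


def keep_local_session(has_cached_session, status_code):
    """Keep the cached login unless the server explicitly revoked it."""
    if not has_cached_session:
        return False
    return classify_response(status_code) is not ServerVerdict.REVOKED


if __name__ == "__main__":
    print(keep_local_session(True, None))  # offline: keep playing
    print(keep_local_session(True, 503))   # backend down: keep playing
    print(keep_local_session(True, 401))   # explicit rejection: log out
```

Note the limitation: if a broken backend wrongly sends an explicit rejection, which is what this outage looks like from the outside, a client like this still logs you out. That would explain why only headsets that never reached the server were unaffected.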


Comparing it with a monitor is rather unfair. You have to bundle the computing along with it, not to mention the applications, to make it an actually fair comparison. At that point, is it still ludicrous?


To clarify my thoughts on this, I responded here to a similar reply:

https://news.ycombinator.com/item?id=39611122


The parent's comment is about how it quite literally is not a standalone headset, as is currently being demonstrated.


It is a standalone headset. You don't need a tethered gaming PC or console for the Quest series of headsets.


I thought that too, but it seems like you need a tethered cloud PC somewhere at meta to open apps after all.


Only because there's a bug, it seems. Normally you can turn off your wifi entirely and the headset continues to work fine. Try disconnecting entirely in a non-standalone headset and see how that works out for you.

Standalone just means you don't need to tether to a PC or console.


Standalone, but not autonomous.


The Valve Index is great if you don't care about cords / needing a PC. Easy to do VRChat, iRacing, or even Blade & Sorcery game sessions for 3-4+ hours without any eye strain, headaches, motion sickness, or discomfort from the headset.

It also fits over / around glasses.

Biggest reason not to, IMO, is the rumors around an "index2".


Seconded, I think. I've been very happy with my Valve Index, but I don't know how it compares to newer headsets on the market.

More broadly, I find that Valve represents my interests far more than Facebook/Meta does. So I'd much rather send my money to Valve.


I'll third this - Valve is really unobtrusive about the steam related features of the index. It does have some requirements with steam for setup but if you want to run a local binary and mess around with dev tools it's extremely easy to do. It's also extremely well sealed and designed - I tend to sweat a lot and a few times I've been beat sabering for quite a while without any long term damage to the headset.


> Biggest reason not to IMO is of the rumors around an "index2".

Another big reason not to buy a Valve Index is "not available in your country". The only VR headsets I've seen in stores here are the Quest 2/3 and the PSVR2; and the Steam store page for the Valve Index (and the Steam Deck) says "not available in your country".


I'm typing this from Immersed in the Quest 3, no issues.


There may be a bug or change since I left, but I built the app library and authorization logic, and it was explicitly designed to work offline. Of course, using it day-to-day and initial setup are different, and I'd imagine if Apple is down it's hard to set up an AVP as well.


Unless it's a recent change, it works perfectly fine offline (wifi turned off).

As for alternatives, there is Pico, but Quest 3 may be superior in games selection. Or go wired which is of course less portable


This is different from it being offline, it's like the device is kicked off the associated account. A "something went wrong" window pops up with a "generate device code" button, with instructions on how to remove and re-pair the headset.

There is no way to even access wifi settings or anything else to disconnect the device from the internet. If it's still a problem much later in the day, I'll try turning off my router to see what that does.


Interesting. I had a problem a few months ago with DNS not resolving Meta servers on my Starlink internet connection, but I was able to use the UI and the apps nonetheless, just couldn't open the store or update firmware.

Seems like they really did change something in the latest firmwares.


In my case I was doing iRacing with a cable, it requires the Oculus app to run for the link to work. Which in turn needs the login to be active.


I think you need to be online for the first setup of the headset, otherwise it won’t work at all.


Yes. You login first, and then it completes device setup. At least, that's how it works for the Rift.


imho this is a stance you certainly have every right to take, but good luck. If you want to be part of the world of things like VR, smartphones, etc. then refusing (on principle) to participate in things like "accounts" and "cloud" is going to cost you far more time than the number of hours a massive company like Meta may have downtime. Likewise, yes, at some undetermined future date a lot of this hardware will become a complete doorstop due to their supporting servers being taken offline, however again, if you are doing this advanced gadget thing, it'll be long after you would have decided to upgrade to new hardware anyway because it can't do any of the latest stuff.

(and yes, there are ways, if you're devoted enough, to roll your own everything and run Linux on a Framework laptop, and use some kind of custom ROM on your phone without Google anything, 3D-print yourself a VR headset, etc. But all of this will cost you several orders of magnitude more time than Meta outages ever would.)


I think the current buzz in the VR space is the "Bigscreen Beyond" which eschews all of the nice-to-haves in order to make the headset as light as possible, and the result is surprisingly compelling.


It looks compelling for high-end PC gaming VR enthusiasts, but if GP is more of a mainstream consumer it probably won't make sense for them.

At least from what I've read, there's a bunch of downsides for regular consumers: very expensive ($1000) -- SUPER expensive if you bundle in controllers and tracking points (~$1600), needs external tracking, wired instead of wireless, no built-in audio, can effectively only be used by one person (because each one is built custom to your face), and of course it's not a standalone headset, it has to be hooked up to a gaming PC.


Great news: Thanks to Apple, $1600 is now a budget super-economy VR headset!

Courage.


I could not play a steam game on a PC in offline mode when my internet was down. This issue is not exclusive to fb.


That's not a Steam thing though, but rather the specific software. Steam explicitly has an exit path for the user if Valve disappeared overnight that allows their downloaded games to continue working offline.

The difference being that we are discussing platforms, not the things that run on those platforms.


Steam Offline wants you to go online and perform a bunch of steps, including launching games and then enabling offline mode. Every game launch is probably a download of at least a few hundred megabytes of data. And then every game requires its own network account linked to your Steam account, etc., and with Rockstar games, iirc, you must be online when you launch the game. So the fact that the Steam client has an offline mode is irrelevant and misleading.


That has not been my experience. Any online game is going to be its own thing, dependent on the choices of the game company. Inherently, Steam does not require all the steps you're describing.

Sidenote, but my experience with the Rockstar launcher has been absolutely atrocious, to the point that I just avoid rockstar at this point even though I'd otherwise be interested because I've been burned so many times. That's a Rockstar issue, not a Steam issue.


Doesn't Steam still offer an offline mode that works with most games?


In my experience, the Steam offline mode only works if your computer is actually offline (without any network connection); if you're connected to the local network but the Internet connection from your router is down, it still tries to connect to the authentication servers while starting up.


Rockstar games require you to have an account at launch, iirc. And this is kind of misleading, because while Steam technically has an offline mode, the games you purchased on Steam don't necessarily have one. But having a unified UX is why I want to use a platform like Steam to begin with.


It does offer an offline mode. It does NOT work with most games, because the publishers literally can't help themselves but add more layers of DRM on top of steam DRM and most of those these days require always online connections.


Were you still connected to your local network? Next time it happens, try completely disconnecting from your local network before starting Steam, it seems to use the presence of a local network connection to decide whether to enable offline mode or not.


Have never had this problem. There must have been something specific to your scenario, like the game wasn't downloaded, or it's a specific title?


I used my Quest 3 for a couple of gaming breaks during the outage with no issues.


I am wondering why I had no issues, since Meta themselves are saying yep, Quests booted people out. I was definitely online (albeit not playing multiplayer). Odd. https://www.uploadvr.com/meta-explains-why-quest-headsets-st...


I can confirm that I was unable to use my Quest 3 this morning. I left it connected to the internet, it tried to phone home I guess, and then locked itself into a "please connect this headset to your Meta account" state.

I am so sick of companies "selling" computers that they continue to control. In what universe does Meta have the right to remotely lock my headset and prevent me from using it to run the software I installed on it? If I were to sell my current desktop computer, or phone, or whatever, on any marketplace, and leave a remote login account on it that I then used to continue to operate the computer as though it were mine remotely, installing software, playing games, and occasionally peeking at what the current owner was doing, that would be obviously criminal. How is this any different? Because I signed away my rights when I "agreed" to their Terms and Conditions box (which I was compelled to do to use the hardware I purchased)?

Something is so fundamentally broken in the current ownership/property landscape. We somehow ended up in a world where people don't own the most critical tools in their lives, companies have managed to recreate feudal fiefdoms within the bounds of the market.


Ubisoft Director of Subscriptions really opened the floodgates of bad behavior when they came out saying "Gamers need to get comfortable with not owning their games".

I think these companies need to be reminded they do not own our PCs either.

I'm really starting to like that mantra of "If buying isn't owning then piracy isn't stealing".


The full context of that quote, which everyone conveniently omits, is “for subscriptions to take off”.

As in, gaming subscription services like Xbox’s GamePass won’t succeed if gamers prefer to buy games over paying a monthly sub.


I can understand your frustration but were you not aware of the software lock in when you bought it? I'm not defending the ownership erosion, but I avoided these things specifically because of who was selling it and how it was locked to them.


I was aware. They are the only game in town when it comes to standalone VR. I want to play BeatSaber, a game I purchased when I used an Oculus Quest, and the only way to do that now is by subjecting myself to Meta's whims. I compromise on my ideals to have nice things, but will continue to complain when I feel that I or others have been wronged.


Not OP but I bought a Rift when it was still just Oculus.

...then Facebook bought Oculus...

...and then required you to have a Meta account to continue using the Oculus drivers.

It's a real "boil the frog" strategy and this is still early days for VR in terms of realized market value. The time to push back on this bullshit is yesterday. As we can all see, nobody can compete with Meta on price with the Quest 3, but the cost to purchase is heavily subsidized by the expected futures.


If you bought a Rift before facebook purchased them I wouldn't call it boiling the frog, more like being stabbed in the back. Not much to do there but sell your device but I guess most people probably hoped things would turn out differently than they did. This is one of the most infuriating parts of America now, if you hate a company and never want to interact with them some merger comes along and throws you into being their customer again against your will.

Of course, OP owns a Quest 3, so it's more cut and dried there.


You can’t have everything. If you want a VR headset like the one you described, there may not be any good ecosystem of software yet and you’d have to wait. This means you are gonna use your headset even less. Think about how often fb goes down: it goes down for 2-3 hours once a year, and it also has to coincide with the time you are using it. It doesn’t make sense to be this risk averse.


Quest is the best option right now


Is there any reason to get the quest pro over the quest 3? Or should I wait for a quest pro refresh?


Lucky you didn't already have a Quest 3, and you were in CyberSpace when Facebook crashed, because then you would be trapped in CyberSpace and die. That's how it works, you know.


The Sony PSVR2 is getting PC support by end of year apparently.


This is pretty cool. I might get one if I’m convinced it’s better than my Quest 3’s display link.


Whatever you do, get one of the all-in-ones that doesn’t require setting up tracking beacons. They are universally a pain and will cause you to never bother playing.


Offline gaming should never have been destroyed.

Wonder what data they collect from gamers.


I was able to use my Quest 3 throughout the outage for offline games.


Edit: I'm leaving the comment as penance but as https://news.ycombinator.com/item?id=39610130 points out, this is an old message about an old update, not the current one.

A short update from meta with some initial high level technical details:

https://m.facebook.com/nt/screen/?params=%7B%22note_id%22%3A...



You're absolutely right, thanks, my mistake trusting random URLs on HN/the internet!


So it's a bug of the "automated remediation that makes things worse" family.

This system that checks for invalid values in the cache looks like a very bad idea in the first place as in my understanding it checks things beyond "is the cache up to date?".


Yet another status board that's a lie... great.


Might be related to this event?

> The cut lines include Asia-Africa-Europe 1, the Europe India Gateway, Seacom and TGN-Gulf, Hong Kong-based HGC Global Communications said. It described the cuts as affecting 25% of the traffic flowing through the Red Sea.

https://apnews.com/article/red-sea-undersea-cables-yemen-hou...

https://www.bbc.com/news/world-middle-east-68478828.amp


I'm pretty sure that report is just mainstream media reporting this week old cut https://www.datacenterdynamics.com/en/news/at-least-one-subs...

Pretty neat if a week after a cable is cut, FB falls over.

Especially when most of the source of truth databases are in the US and Europe, and that sort of data flow doesn't cross the Red Sea. FB has datacenters and points of presence all over, but outside the US/EU it's almost all caching.


> Pretty neat if a week after a cable is cut, FB falls over.

that'd be one helluva cache!!


Thanks for the reference. Yeah, may also be entirely unrelated.


> BY JON GAMBRELL Updated 8:25 PM PST, March 4, 2024

Timelines don't match nicely


I'm 95% sure it'll turn out to be a mundane config error somewhere


It will be DNS related.


Handy troubleshooting tool for outages: https://isitdns.com/


reminds me of http://iscaliforniaonfire.com/ even if not directly related to topic at hand


The sysadmin's haiku:

It's not DNS

There's no way it's DNS

It was DNS


It's not unreachable. I can easily see the FB page on my browser. It's just that even after resetting my password it doesn't accept it. Probably something's fucked up in the credentials database.


Those lines were cut yesterday, so it seems like a poor candidate for explaining the current outages. Likewise the geography doesn't match up with the outages.


This shouldn't affect Europe. It just stopped working.


Somebody might have fat-fingered a BGP configuration while trying to improve traffic routing that was impacted by the cut cables.


Yeah, I too thought that the core of this is not in the services themselves but somewhere in the network.


maybe a long tail consequence of further shifting traffic?


I was thinking it's the deploy of DMA "compliant" unbundling the day before it comes into force.

Could be both.


time to move away from undersea cables to satellites.


We have satellites. We use cables b/c they lack the speed and bandwidth necessary to support the total requirements of the modern internet. Satellite-only is only feasible if you're fine with going back to waiting minutes for your saucy jpegs to load (elder millennials, you know what I'm talkin' about).


Ever heard of Musk's Starlink? From their website: "Starlink users typically experience download speeds between 25 and 220 Mbps, with a majority of users experiencing speeds over 100 Mbps" - https://www.starlink.com/legal/documents/DOC-1400-28829-70


LEO satellites would be too inconsistent, and further orbits have way too much latency.
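
For a rough sense of the latency half of that claim, here is a back-of-the-envelope sketch. It only computes the speed-of-light floor for a satellite directly overhead; the 550 km LEO altitude is an assumed, Starlink-like figure, and real links add routing, queuing and ground-station hops on top. It says nothing about the bandwidth or consistency side of the argument.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum

GEO_ALTITUDE_KM = 35_786  # geostationary orbit altitude
LEO_ALTITUDE_KM = 550     # assumed typical low-earth-orbit altitude


def min_round_trip_ms(altitude_km: float) -> float:
    """Physical lower bound on request/response latency via one satellite.

    The request travels up and down once, and so does the reply,
    which makes four traversals of the altitude in total.
    """
    return 4 * altitude_km / SPEED_OF_LIGHT_KM_S * 1000


geo_ms = min_round_trip_ms(GEO_ALTITUDE_KM)
leo_ms = min_round_trip_ms(LEO_ALTITUDE_KM)
print(f"GEO floor: {geo_ms:.0f} ms, LEO floor: {leo_ms:.1f} ms")
```

The geostationary floor comes out a bit under half a second per round trip, while the LEO floor is a few milliseconds, which is why the latency objection applies to the higher orbits rather than to LEO.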


Tip for Meta engineers: when your service is failing, don't just log people out and prevent logins. Display a cute image that shows that the service is drastically failing (like a whale or something), and then people will know to stop trying to repeatedly log in. The public might even come up with a catchy name for the whale.


Beyond unbelievable that going on an hour later, they're still showing "incorrect password" errors. How many hundreds of millions of people have wasted time frantically trying (in vain) to reset their passwords and pointlessly freaking out that their account might be compromised? What a bunch of careless, incompetent excuses for engineers.


Blame the managers and product owners that don't think this is an issue, not the developers that likely raised it a million times already.


Imagine how many hundreds of millions of users waste their time using Instagram and Facebook on a daily basis. Safe to say they don't mind wasting their customers' time.


This is your regular reminder that the users are not their customers. Advertisers and media outlets are the paying customers.


What a poor bunch of overworked human beings, with almost no control over the product they work on. Frantically following the whims of managers, reduced to labour units in this late stage capitalist hellscape.


But well paid at least. Working at Meta seems pretty shitty except for the pay from the stories I have heard.


It probably depends what team you're on, but I would not describe it as "pretty shitty." Being oncall for a 24/7 service sucks, yeah, but for my team it is one week a quarter and I haven't had any outside-of-biz-hour alarms the last few shifts. Other than that -- my work is challenging and interesting, my colleagues are friendly and smart, and my manager is decent. Not a lot to complain about.


That's such a beautiful comment I would almost consider printing it and putting it on my wall


Major outages are periods of intense stress and extremely difficult to operate in. The folks troubleshooting may be many things, but careless and incompetent are unlikely to be among them.


I can almost guarantee you're getting mercilessly downvoted because half of the people here are sympathetic Meta worshippers who desperately (1) wish they worked there and (2) know they'd probably contribute similarly to this same horribly engineered system.


This is a great idea! I know that I was flailing about trying to figure out why I couldn't login, so I'd suggest calling it a "Flail Whale".


How many other people here just assumed they had been banned for some arbitrary, uncontestable reason? I just use instagram to post hiking photos...


Having had a look at desperate Twitter posts during a major outage of a big German email provider with similar failure mode (login failed silently), it seemed like many people assumed that their email account was hacked. Close enough.


Right? I know I was like "oh, I haven't typed in my FB password in eons... maybe I changed it at some point and forgot? But if I change it what happens to all related services, is it going to log out my kids' Messenger Kids devices? Those are such a pain to log in. Should I change my password or not? What do I do?"

Then I saw the news that it was an outage.


Surely the repeated logins can't be helping the situation. I suppose it is entirely auth related across all Meta products. The repeated strain could pose a cold start problem for example.
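
One common way to keep retries from piling onto a recovering backend is exponential backoff with jitter on the client side. A minimal sketch, not anything Meta is known to use; `backoff_delays` and its parameters are hypothetical names chosen for illustration:

```python
import random
from typing import Callable, Iterator


def backoff_delays(
    base: float = 1.0,
    cap: float = 300.0,
    attempts: int = 8,
    rng: Callable[[], float] = random.random,
) -> Iterator[float]:
    """Yield "full jitter" retry delays in seconds.

    The ceiling doubles on every attempt (base, 2*base, 4*base, ...) up to
    `cap`, and each actual delay is drawn uniformly below that ceiling so
    that clients which failed at the same moment do not retry in lockstep.
    """
    for attempt in range(attempts):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng() * ceiling


# Example: the delays a client would sleep between login retries.
for delay in backoff_delays(attempts=4):
    print(f"would wait {delay:.2f}s before retrying")
```

The randomisation matters as much as the doubling: without it, every client that was logged out at the same instant comes back at the same instant too.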


> Display a cute image that shows that the service is drastically failing (like a whale or something), and then people will know to stop trying to repeatedly log in.

Probably not so easy to implement in behemoth apps, consisting of 20'000 source files...


For anything outside like-ing and post-ing, Facebook's UI/UX is horrendous. Even Internet search does not help to find out how something trivial is done... The only way is to watch YouTube videos.


> For anything outside like-ing and post-ing, Facebook's UI/UX is horrendous.

It isn't perfect for even that IMO.

> Even Internet search does not help to find out how something trivial is done... The only way is to watch youtube videos.

That says more about where the web is heading than about facebook. Video is easier to monetise ATM⁰, and these days people don't put helpful stuff out there just to be helpful as much as they once did¹, so content creators are making them instead of simple web pages.

--

[0] Everyone wants to be the next big influenza who doesn't need a day job to get by.

[1] That sort of people are still out there, though they are somewhat drowned out as the signal-to-noise ratio heads inexorably towards “WHAT? WHAT?! I can't hear a single thing above the manscaping adverts!”.


What do you mean my login flow sucks? Its time complexity is really good!


Could be an outage in the auth service itself


Unless it's an outage in their ability to log into their own servers, they should be able to swap out the login page with a static HTML page explaining the outage. Maybe a 503 status code.
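
A minimal sketch of that idea using only the Python standard library: every request gets the same static page with a 503 and a `Retry-After` header. The page text, the 300-second hint and the `make_server` helper are assumptions for illustration; a real deployment would more likely flip this at the load balancer or CDN.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

OUTAGE_PAGE = b"""<!doctype html>
<html><head><title>Service unavailable</title></head>
<body><h1>We are having an outage</h1>
<p>Logins are temporarily unavailable. Your password has not changed,
so there is no need to reset it.</p></body></html>
"""


class OutageHandler(BaseHTTPRequestHandler):
    """Answers every request with the same static outage page."""

    def _send_outage(self, include_body: bool = True) -> None:
        self.send_response(503)  # Service Unavailable
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(OUTAGE_PAGE)))
        self.send_header("Retry-After", "300")  # assumed hint, in seconds
        self.send_header("Cache-Control", "no-store")
        self.end_headers()
        if include_body:
            self.wfile.write(OUTAGE_PAGE)

    def do_GET(self) -> None:
        self._send_outage()

    def do_HEAD(self) -> None:
        self._send_outage(include_body=False)

    def do_POST(self) -> None:
        # Drain the submitted form so the client sees a clean response.
        length = int(self.headers.get("Content-Length") or 0)
        self.rfile.read(length)
        self._send_outage()


def make_server(host: str = "127.0.0.1", port: int = 8080) -> HTTPServer:
    """Build (but do not start) a server that serves only the outage page."""
    return HTTPServer((host, port), OutageHandler)
```

Running `make_server().serve_forever()` would serve the page locally. The 503 also tells crawlers and monitors that the condition is temporary, which an "incorrect password" response does not.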


Yes of course it was. The point is, an hour later, they could have hit a circuit-breaker to get people to stop trying and going crazy over an error that is completely inaccurate.
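
For readers who haven't met the pattern, a circuit breaker can be sketched roughly as below. This is a generic illustration, not Meta's implementation; the class name, thresholds and injected clock are all hypothetical. It assumes the wrapped function raises only on backend failures, so a genuinely wrong password would not count against the breaker.

```python
import time
from typing import Any, Callable, Optional


class CircuitOpenError(Exception):
    """Raised when a call is rejected without reaching the backend."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means closed

    @property
    def is_open(self) -> bool:
        """True while calls should be rejected outright."""
        if self._opened_at is None:
            return False
        return self._clock() - self._opened_at < self.reset_timeout

    def call(self, func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
        if self.is_open:
            # Fail fast with an honest message instead of a misleading one.
            raise CircuitOpenError("service is down, please try again later")
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.failure_threshold:
                self._opened_at = self._clock()  # trip, or re-trip after a failed trial
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Once the timeout has elapsed, the next call is let through as a trial: if it succeeds the breaker closes, and if it fails the breaker opens again immediately. The login page can catch `CircuitOpenError` and show an outage notice rather than "incorrect password".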


> catchy name

Petunia?


Laid off too many people? Critical services no longer have owners?


Hope it stays broken. Imagine? People would blink, confused, and slowly start to wake from a decade-long stupor.


Man I would lose contact with a lot of friends if that were to happen


Are they really your friends then?


Sure. Maybe they moved away. Phone numbers and addresses do change. Maybe they are not close friends, but friends nevertheless.


yes why? Are all your friends in the same place in the world with the same phone number? That would seem odd


More or less yes. I certainly don't need facebook to keep in contact with them.


You must come from a small town then if that is the case...


...and then install TikTok.


how many people, realistically, use it daily? I think not a lot


>Facebook: global daily active users 2023.

>Statista https://www.statista.com › statistics › facebook-global-dau Feb 9, 2024 — During the fourth quarter of 2023, the number of daily active users on Facebook reached 2.1 billion, a minor increase on the previous quarter.

So roughly 28% of the planet.


Facebook accounts != individual humans.


Even if half that, still crazy, no?


And that’s just Facebook. Pretty much everyone outside of the us depends on WhatsApp to communicate. From regular people to businesses.


why do people say "outside of the US"? Everybody I know in the US uses WhatsApp


I don't know a single person.


He's quoting the daily active users, not the number of accounts.


Enough for Meta to be a $1.2 Trillion company. What a silly comment.


Like, everyone I know is on TikTok all the time and Reels is right up there


Lots and lots. Look at how many people are in this thread.


couple billions.


> I think not a lot

You have to be kidding me right?


Boomers probably do.


Really?


So why is this site metastatus.com and not status.meta.com? As mentioned in an earlier article here, the vast number of domain names these big corps have is not helping with making sure this is not a scam. So why metastatus.com and not metastatuscheck.com? Surely the buffs at Meta could come up with an independent machine that can show the status even if everything else is down.


[flagged]


Dealing in absolutes is a result of misaligned expectations.


"The Force SHOULD have saved my mother's life!" - Anakin Skywalker


"blah blah blah I don't like the company so all their engineers must be really stupid..."


"The metaverse is the future of this company! We've hired the very best brains in the world to make these 1998 era PS1 polygon avatars!"

whole company goes offline


The Metaverse engineers and AI researchers don't actually run the infrastructure, but by all means, keep shitposting. It's so clever.


Yes, but there's been a trend recently to cut costs on all engineering except AI. Might be related


You keep defending developers while I'm pointing the finger at corporate. But good victim complex.


Yep. I've also noted that the people making such claims never seem to cite their own work as an example of how to implement something at Facebook or YouTube scale that is less "brittle".

Armchair quarterbacking isn't just a U.S. football phenomenon.


Frankly, I don't buy this explanation, technically or logically, from either a devops or a systems architecture level. There is no way in hell a company like Meta is pushing database design changes this significant straight to production. We all know how many times these database architecture changes get run in staging, then even on production subsets, before rolling out to production at large.

Should we assume the teams working to ensure Business Continuity & Applications Resiliency redundancies fell asleep at the wheel?

Also, are we to assume that no downtime or outage messaging would go out during a fairly routine maintenance-based outage?

I call BS

Lol, I can't find a scenario where this happens at this scale, at a company of Meta's size...


Oh dear.

Facebook, WhatsApp, Instagram, Threads are all down, and it's time to contact the CEO of Meta to bring them all back up.

At least Twitter / X is still up, so time to complain about it there.


I'm on it! I WhatsApped Zuckerberg directly. He's aware of the situation.


WhatsApp works for me


Is it a coincidence that Meta is trying to get portions of NSO's source code via the court system? Also, today is apparently a big voting day. The techno-apocalypse is near.


What are the chances this is GPT-5 that has escaped and gone rogue?

Jest aside, I wonder when/if stuff like that will actually happen.


My shadow is extremely good at mimicking my every move, but I don’t live in fear that it will make the jump to the third dimension, kill me, and assume my identity. Should I?!


Well, maybe - if the source of the shadow keeps getting more and more powerful, it will eventually turn you into a permanent shadow.


Negligible. It would have gone after Twitter first.


Comically twitter seems to be working fine and is where I was able to find confirmation that it wasn’t just me.


Yeah, exactly, so it must still be a good old human mistake, not Skynet.


You mean X right?


I don't think that anyone but Elon ever means X.


forgot the /s


Clearly the work of Llama 3


About 95%.


I love how no one here is talking about the "coincidental" timing that this is happening on Super Tuesday. Carry on, nothing to see here.


Same reason we're also not talking about the ten million non-coincidences that didn't seem to happen today. It's a kind of survivorship bias that coincidences seem like they mean something.

Or you could imagine it as this: your animal brain has evolved to notice patterns. Seeing coincidences like this is akin to seeing faces in the stars. The challenge for an evolved being is to override those impulses when they're not logically sound.


Sure. But if millions of evolved brains are "seeing the same pattern", they will behave as such, and so the meaning of that otherwise random noise in fact becomes something material.


And that’s why conspiracy nuts exist.


It's also, you know, a foundation of behavioral economics. But either one, whatever.


We've seen continuous layoffs for the past year or so, and the stability of everything is wonky now. Yesterday I had trouble with LinkedIn, last week it was Coinbase, a month or so ago Gmail was hanging. I don't think it's because of Super Tuesday.


If this was 2016, I'd think there could be a possibility. Given that the candidates for the election are almost 100% a foregone conclusion at this point, what purpose would it serve?


Fueling clickbait conspiracy theories is one probable outcome that comes to mind. For example, even a winning candidate may claim that a "woke" employee sabotaged the site in an effort to subvert the will of the people to have free and open discourse, they won despite the media deck stacked against them, etc.


I have yet to see one plausible reason for why this would affect primary voting in any way. Voting is done at the local level on local hardware; we don't vote on Facebook.


Messenger is also down. For some people this is an important means of communication (many young people I know in the US). People may either be in a panic or otherwise cannot coordinate sufficiently the plans they may have had to get to the polls.

That being said, this year's primaries have got to be so historically uncontested that it could not matter less.


It’s not direct disruption.

Grassroots organizers use FB and messenger and Insta for “get out the vote” communications. People use these services to goad their friends to go and vote. People use FB as a search engine to identify their polling place.



