Facebook, Instagram, and WhatsApp outages

cronix · on July 3, 2019

One interesting thing I noticed is the ALT text that is displayed in place of the missing images. It shows how their image recognition is classifying everything. I guess a lot of my friends drink as a lot of the alt text was like "Image may contain: drink"

oceliker · on July 3, 2019

That's an accessibility feature and has been around for at least a couple of years, I believe.

chirau · on July 3, 2019

I think OP is referring to the fact that the alt-text is now all ML generated. But yes, you are right, it's been around for a while

amputect · on July 3, 2019

I noticed that too (mtg meme group just showed up as a wall of "image may contain:text" placeholders), but I didn't realize it was an outage and I thought that I'd accidentally turned on some kind of really annoying safe-mode.

the-dude · on July 3, 2019

I noticed those too. Except it contained something else.

cyanbane · on July 3, 2019

Browsing it right now is a good exercise for devs who need to make sure they think about low vision/blind users who browse via audio, etc.

ie "Image may contain: a dog, sunglasses"

rzzzt · on July 3, 2019

Automatic alt text is a nifty feature: https://www.facebook.com/help/216219865403298?helpref=faq_co...

nilayj · on July 3, 2019

Looks like Facebook image store is having issues. Facebook, Instagram and WhatsApp use the same image store, so all the services are affected.

ceocoder · on July 3, 2019

However image metadata service seems to be up, this is what I see for most images on FB now.

https://imgur.com/a/uC2L354

thrusong · on July 3, 2019

I'm pretty sure this data is kept in MySQL and memcached.

giancarlostoro · on July 3, 2019

My concern there is my understanding has always been that WhatsApp implements end to end encryption through the Signal protocol, how can images be stored and be considered end-to-end encrypted? So are only text messages end-to-end? Ignoring logs being uploaded to the cloud.

I wouldn't send any sensitive pictures over WhatsApp then.

octorian · on July 4, 2019

Because the actual data for those images, that is stored on Facebook's media servers, is just a blob of ciphertext. Only the sending and receiving clients actually have the keys, not the servers.
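
To make that concrete, here is a toy sketch. It is not the Signal protocol or anything WhatsApp actually runs: it uses a one-time-pad XOR purely for illustration, and the `server_storage` dict and the `media/abc123` path are made-up names. It only shows that the server can hold the blob without being able to read it.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR two equal-length byte strings (toy cipher, illustration only)."""
    return bytes(d ^ k for d, k in zip(data, key))


# Pretend image payload; the contents are irrelevant to the server.
image = b"pretend these are the bytes of a photo"

# Key exists only on the sending and receiving clients (assumption: it was
# agreed out of band; real protocols derive it via a key exchange).
key = secrets.token_bytes(len(image))

# The sender uploads only ciphertext; this is all the media server stores.
blob = xor_bytes(image, key)
server_storage = {"media/abc123": blob}  # hypothetical storage path

# The recipient downloads the blob and decrypts it locally with the key.
received = xor_bytes(server_storage["media/abc123"], key)
print(received == image)
```

An outage of the blob store makes the ciphertext unavailable, but it says nothing about whether the content was readable by the server in the first place.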

justapassenger · on July 4, 2019

E2E encryption doesn't mean P2P networking. Data still flows through servers.

kabwj · on July 3, 2019

Probably blob store, not just image store.

londons_explore · on July 3, 2019

Could also be a networking issue.

If network capacity is severely reduced, the bulk static content (images/media) is what you'd switch off first.

The fact it's global makes that unlikely tho.

edwintorok · on July 3, 2019

https://developers.facebook.com/status/issues/34337305659485...

jug · on July 3, 2019

That’s also what it looks like to me. The services themselves work and are responsive here. But oh boy, images have been down for the longest time now for such top websites.

perrwa · on July 3, 2019

On Instagram, ad posts still fully display. Would that be a different image store or delivery?

mtgx · on July 3, 2019

Probably a good opportunity for more people to realize that all three services belong to the same company.

hbosch · on July 3, 2019

It’s common knowledge.

3stripe · on July 3, 2019

For HN readers sure

the-dude · on July 3, 2019

I find the outages of the last few days a curious coincidence : Slack, Google, Cloudflare, Azure, now FB.

edit : added Slack after @kache_ mentioned it, added Azure after stevenhubertron mentioned it.

JonathonW · on July 3, 2019

Cloudflare's already stated that their outage yesterday was their own mistake: https://blog.cloudflare.com/cloudflare-outage/

eof · on July 4, 2019

Sure but if there were some other weird conspiracy, they would still likely say something like this.

jbverschoor · on July 4, 2019

Of course. And banks are never hacked or stolen from.

stevendgarcia · on July 3, 2019

All of my VPS servers have been slammed in the last 5 days with bruteforce SSH attacks. I've never seen anything this aggressive. One of my servers got PWNed, which has never happened to me before. There's definitely some wild, shady shit going on.

pfundstein · on July 3, 2019

I highly recommend installing fail2ban to automatically firewall IPs with consecutive failed attempts, or if possible, disable password authentication altogether and use key auth.

stevendgarcia · on July 3, 2019

Solid advice. I had fail2ban installed and enabled. SSHD Root/password login turned off and only ssh key had access. My firewall was also airtight, or so I thought. Clearly I mucked up a config or setting somewhere because the odds of someone getting past all that are extremely low. One thing I had not prepared for was IP spoofing which I learned can be prevented with a few net.ipv4.conf tweaks. I also just purchased a static IP from my provider so I can lock down ssh access even further. Here's hoping I never have to deal with this headache again! fingers crossed

xorcist · on July 4, 2019

You don't generally "get past" your firewall rules and into your box unless you have accounts that are not password protected.

If you really had password logins turned off, you need to identify and isolate how they gained access before you put that box online again. Never "hope" or "cross fingers" that it doesn't happen again. Unless you are an interesting target for some reason, chances are that these attacks are automated and you are running some insecure software somewhere.

Start by taking a snapshot of the machine before you do anything else. Go through the logs. Are there any unwanted processes? How were they started? Are there any unwanted binaries in the filesystem? How were they uploaded? Try to find IP addresses that can be tied to any unwanted login, and search your logs for any previous occurrences.

Pay special attention to any web-reachable software you have installed.

acegopher · on July 4, 2019

What was the vector by which they gained access? I have fail2ban, password login turned off, key access, airtight firewall, etc. and now am worried.

steve19 · on July 4, 2019

If they exploited a service or web app they might have gotten shell access. Gaining access through SSH with fail2ban in place is extremely unlikely, unless fail2ban was badly configured.

cbluth · on July 4, 2019

Me too

pfundstein · on July 4, 2019

How did you discover the breach, and did you determine the vector? My guess is that it was a pivoted breach from another system on the LAN such as your PC.

stevendgarcia · on July 4, 2019

I'm still picking up the pieces but from my logs I can see that hundreds of successive login attempts were made from different IPs, effectively circumventing fail2ban with what I can only assume is some form of automated IP spoofing. I'm hoping that strict ipv4 settings and ssh ip range restrictions will mitigate this in the future. I also used this python script to harden my SSH security with better algorithms. https://github.com/arthepsy/ssh-audit

nickphx · on July 4, 2019

No, you were not seeing spoofed traffic. There are that many compromised machines actively scanning.

stevendgarcia · on July 4, 2019

It's scary to admit this but you are probably right. The first thing these bots do is use server resources to scan ports and brute force their way into other machines. I don't want to think about how many machines are pwned like this. Very sobering!

snazz · on July 4, 2019

This is also perfectly normal for the Internet, yes? If you have a server with an IPv4 address, expect many attempts per day.

OJFord · on July 4, 2019

> One thing I had not prepared for was IP spoofing which I learned can be prevented with a few net.ipv4.conf tweaks.

Do you have a handy link for more info about that?

`rp_filter`? https://www.slashroot.in/linux-kernel-rpfilter-settings-reve...

ghostpepper · on July 7, 2019

Did you have password login disabled for all accounts or just the root account?

avian · on July 4, 2019

+1 for disabling password auth. On the other hand I find fail2ban pretty useless these days. Most attacks I see seem to come from botnets where you only get a few requests from each IP.
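
A rough sketch of why per-IP banning struggles here. This is a simplification and not fail2ban's actual code: it ignores the time window entirely, `max_retry` is only loosely modeled on fail2ban's `maxretry` setting, and the IPs are documentation-range placeholders.

```python
from collections import Counter


def banned_ips(failed_attempts, max_retry=5):
    """Return the IPs whose failed-login count reaches the threshold.

    failed_attempts: iterable of source IPs, one entry per failed login.
    """
    counts = Counter(failed_attempts)
    return {ip for ip, n in counts.items() if n >= max_retry}


# One noisy host: 300 failures from a single address.
single_host = ["203.0.113.7"] * 300

# A botnet: the same 300 failures spread over 100 hosts, 3 tries each.
botnet = [f"198.51.100.{i}" for i in range(100) for _ in range(3)]

print(banned_ips(single_host))  # the single host is caught
print(banned_ips(botnet))       # nobody reaches the threshold
```

Same number of guesses in both cases, but in the second one no individual address ever trips the limit, which is why disabling password auth matters more than the ban list.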

lilyball · on July 4, 2019

I don't trust key-only auth; what if I need to access my machine from a new computer I haven't done this with before?

Is there any way to configure SSH to use a custom high-entry password that's different from the user's local password? My local password is something reasonable for me to type regularly (e.g. for sudo prompts), but I'd love to have a super long password just for SSH that I have to copy from my password manager each time.

0lpbm · on July 4, 2019

If this is a scenario that happens often, you probably should invest in a portable method to hold your ssh keys, such as a smart card. I have only used YubiKey for this purpose, but I'm sure others, like the Nitrokey, work similarly.

dahfizz · on July 4, 2019

This could be done simply by creating a new user with your super long password. Lock down ssh so you can only log in as this user, and let them `sudo su` into your normal working user.

lilyball · on July 4, 2019

I mean, yeah, but I don't want to do that. Partially because I don't want the friction, and partially because that won't work with any other tools that tunnel over ssh (e.g. sftp).

eitland · on July 4, 2019

There will be friction, but sftp and port tunneling - which are my most used features besides plain ssh - should be possible?

I mean either you sftp to a shared folder that can be accessed from your regular user as well or you use a staging area with a cron job (or you load/unload the staging area manually.)

lilyball · on July 4, 2019

If I'm sftp'ing to my server it's because I want to access my files. Not a special shared folder that I then have to separately ssh in, su to the real user, and move into place.

I'm not deploying a website so a staging area isn't applicable. This is just a VPS that I use for various purposes.

dahfizz · on July 4, 2019

If your ssh user has the right privileges, you can read and write your real user's files as the ssh user just fine. I do get that this isn't ideal though.

ahje · on July 4, 2019

There are definitely cases where you might need to expose an SSH setup where passwords are allowed to the World, but there is usually little reason to allow anyone to log in directly as root.

Set a long password for the user you log in as (correct horse battery staple-style passwords are perfect for such things), and make sure to put SSH on an alternate port to keep the more basic bots away and thereby reduce the noise.

Having to type a long password for sudo prompts is a bit of a pain, but that trade-off is worth it from a security perspective.

jmalicki · on July 4, 2019

"a custom high-entry password that's different from the user's local password?"

You mean an RSA private key? (no really, that's exactly what it is... you can put your private key in your password manager and copy it to your computer)

lilyball · on July 5, 2019

I can manually type in a 30-letter password that I can see from my password manager on my phone. I'm pretty sure I'm not going to be manually typing in an RSA private key stored on my phone.

rorykoehler · on July 4, 2019

You can also set up a VPN and limit all port 22 activity to your VPN's IP.

pbhjpbhj · on July 4, 2019

I know it's just an obscurity measure but step one is surely don't expose SSH on 22? Dropped my (home) attacks/scans to zero from hundreds per day.

rorykoehler · on July 4, 2019

You could do that but if you become a target due to unexpected viral growth then it's useless.

pbhjpbhj · on July 4, 2019

It's more "stops you appearing on Shodan" level. But surely if you can reduce the traffic then you are more likely to notice proper attacks.

rorykoehler · on July 4, 2019

Shodan is amazing. Creepy too. I used to spend hours trawling video feeds on there. Fascinating.

stevendgarcia · on July 4, 2019

Yep. I just paid my ISP for a static IP address to accomplish this exact thing.

dharmab · on July 4, 2019

You should disable password auth entirely and whitelist the IPs you connect from in your firewall. If you need to connect from a large range of possible IPs, use a bastion host with 2FA and restricted to IP blocks from countries you actually connect from.

qwertox · on July 3, 2019

Several European banks are also having issues with their login sites and online services. Backends are working though. I know, it's far fetched to throw them in the same bucket, and I won't, but it feels like some kind of reorganization is happening on the net, which in the worst case would be some kind of global state-sponsored MITM-layer getting set up, maybe to prepare for new techs like 5G and TLS1.3.

tenpies · on July 3, 2019

My metaphorical tinfoil hat tells me that all the sudden security meetings yesterday (Putin, European Security Council, US VP Pence) were about this.

wybiral · on July 3, 2019

If I were going to buy into any theory other than coincidence my bet would be on an emergency patch situation where we're seeing the rushed deployment.

But it's probably just a coincidence.

api · on July 4, 2019

That is not terribly far fetched. Cisco powers a lot of the core and has had nasty zero days before.

gevz · on July 4, 2019

And on top of that, a highly classified Russian submarine allegedly used to tap undersea cables sank on Monday.

caiobegotti · on July 3, 2019

One of the biggest banks in Brazil (state owned) is currently having the same problem, bb.com.br

trevor-e · on July 3, 2019

I agree that the probability of these occurring so close together is very unlikely.

Separately, this is a huge blow for Instagram/Facebook since this is one of the busiest social media sharing weeks in the US. I bet a lot of their engineers are answering pagers on vacation.

dmitrygr · on July 3, 2019

If you have your work with you, it isn't a vacation.

wongarsu · on July 3, 2019

If it's emergencies only and you get compensated handsomely many people will feel it's worth it. Of course reality often looks different.

avh02 · on July 3, 2019

depends on how you value your vacation time with family/friends/to yourself.

pfundstein · on July 3, 2019

In my experience, people jostle to take part in a rotating on-call roster where they are sufficiently compensated, in my case $1700/week, to occasionally answer one or two calls.

dheera · on July 3, 2019

A lot of my vacation destinations don't even have cellular signal. Taking a laptop is often not an option on some trips either.

stevenhubertron · on July 3, 2019

Add Azure to the list. Its routing was down for about 2 hours yesterday.

the-dude · on July 3, 2019

Actually, I started wondering the last few minutes: what if the next world war would start on the internet? Or is this a common assumption already?

I had not connected Huawei to my conspiracy theory yet, but just did.

dreamcompiler · on July 3, 2019

People who make defense policy have been assuming the answer to that question was "yes" for about 20 years now. It's sometimes described as "cyber precedes kinetic."

https://www.rand.org/pubs/monographs/MG1215.html

TuringNYC · on July 3, 2019

Ok, so unfriendly sovereign nations are trying to “hurt” us by making social media inaccessible? I’m half joking here, but perhaps we should all be out, about, and enjoying Independence Day in person rather than through our phone screens.

(Yes I realize WhatsApp is a different segment, an important way people communicate.)

altec3 · on July 3, 2019

All three apps are huge for communication. It doesn't seem far fetched that a nation would test its capabilities taking down different forms of communication.

the-dude · on July 3, 2019

[flagged]

mandelbrotwurst · on July 3, 2019

Parent is referring to the holiday tomorrow in the United States.

wongarsu · on July 3, 2019

I think GP understood quite well and wanted to express their dissatisfaction with the assumption that everyone is American.

dvtrn · on July 3, 2019

Is there an official HN stance on this? There have been times I've commented, with frustration, that a statement or position can be hard to grok for those who don't have an automatically American context in mind (you know, because they might not live in America), and the downvotes come swiftly, followed by snarky comments about "this is an American messageboard".

So I'm actually genuinely curious what YC/HN's official position on this is. Asking broadly.

TuringNYC · on July 4, 2019

There was already a US-centric context via the parent comment:

It was a link to a RAND report warning the US Military of cyber-attacks in the US, the report starting off as: "The chances are growing that the United States will find itself in a crisis in cyberspace, with the escalation of tensions associated with a major cyberattack, suspicions that one has taken place, or fears that it might do so soon."

TuringNYC · on July 4, 2019

The comment I was replying to was literally a link to a RAND report warning of such things in the US, the report starting off as:

"The chances are growing that the United States will find itself in a crisis in cyberspace, with the escalation of tensions associated with a major cyberattack, suspicions that one has taken place, or fears that it might do so soon."

mandelbrotwurst · on July 3, 2019

If that's the case, then they should have made that critique directly rather than beat around it with questions that they didn't actually need answered.

the-dude · on July 3, 2019

Thank you. So we are basically waiting for a Twitter outage.

wallflower · on July 3, 2019

If Dropbox goes down, then something serious may be going on.

cnorthwood · on July 3, 2019

I think Twitter DMs are having issues too...

RandallBrown · on July 3, 2019

You might love Neal Stephenson's latest book "Fall, or Dodge In Hell"

It's not quite about a world war on the Internet, but it is about what could happen if the Internet suddenly stops working somewhere.

kfrzcode · on July 4, 2019

Nitpick, but it's:

Fall; or, Dodge in Hell

Bayart · on July 4, 2019

People would start by attacking telecoms much like they used to start with rail and harbours.

partiallypro · on July 3, 2019

If I'm not mistaken virtually all of the ones in the list above were caused by networking problems. Given how many Chinese parts are in those interfaces, it does make you wonder.

It could also be just a really strange coincidence.

ernsheong · on July 4, 2019

So, people like to slam on GCP but not Azure. The Azure outage didn’t make the front page of HN, but GCP's wasn’t even an outage, just increased latencies due to rerouting, and it still made the front page of HN?

samstave · on July 3, 2019

So the options are:

1) a secret cyber-war is going on...

2) Some entity is installing new spying slurps

3) A (critical) IX that we weren't aware of is having major issues... and we don't know who to blame on that one...

dogecoinbase · on July 3, 2019

Honestly, I think it's much simpler than that -- the last decade has been a general trend of building more centralized, less resilient systems. The global internet is in an inherently unstable state because of this (i.e. when problems occur, the internet is no longer able to route around them because the problems are internal to large organizations which both control huge portions of the internet and which aren't internally incentivized to separate components -- see e.g. the AWS status images living on S3). These outages are simply more and more frequent as the system's instability increases. And they will continue to become more frequent as time goes on, until organizations realize that they can't control their own uptime unless they run their own infrastructure, at which point the pendulum will start swinging back the other way. Rinse, repeat.

lovecg · on July 3, 2019

4) People love to find patterns where none exist

anon_z88 · on July 3, 2019

This is a fallacy all humans experience, however, what is the probability specifically that all these problems are unrelated? At what probability is it mathematically impossible that these problems are unrelated?

AFAF

NateEag · on July 3, 2019

The answer to your last question is 0.

"Impossible" means "impossible", not "unbelievably improbable".

That's even more true when you ask about "mathematically impossible", as it reinforces the idea of formal logic being the relevant domain, where precise meaning of words is fundamental.

If you adjust your question to be "at what probability is it unreasonable to claim these problems are unrelated?", then the answer is subjective - different people have different standards for reasonability.

I think we'd need mountains more data than we have about the incidents to compute a meaningful probability, anyway.

lucasverra · on July 3, 2019

Is that a pattern?

samstave · on July 4, 2019

5) patterns exist and people who have been online for decades can see them.

(Your HN account is 4 months old. Mine is ten years old)

basch · on July 3, 2019

0day patch gone wrong? We've seen before that these companies work together to patch issues behind the scenes before announcing them.

dharmab · on July 4, 2019

4) it's summertime in the predominantly online part of the world and general human activity is at a peak

samstave · on July 9, 2019

Hmm...

I like this theory...

It's innocent, but I'd like to see data behind it.

Karawebnetwork · on July 4, 2019

1 & 2: Could it be? https://futurism.com/russian-sub-fire-internet-cables

samstave · on July 9, 2019

USS Jimmy Carter

leothekim · on July 3, 2019

Empire BlueCross/BlueShield's site is down. Not quite in the same category but sucks if I want to file claims.

https://www.empireblue.com/

All4All · on July 3, 2019

+ A large unnamed government contractor that has also been having some rare network outages throughout the week.

edwintorok · on July 3, 2019

HN still seems to be working, perhaps they should consider hosting their own cloud?

oblio · on July 3, 2019

HN is probably running on a LispMachine next to pg's desk. I wanted to say "under his desk", but you know, LispMachines...

tim333 · on July 3, 2019

Seems it's still a single server. I think PG is off on sabbatical still though. https://news.ycombinator.com/item?id=18184749

notimetorelax · on July 4, 2019

Might be a desire to release new versions ahead of July 4th holiday.

badrabbit · on July 4, 2019

"When it rains,it pours"

In my experience this is so true! The opposite applies as well: for some reason everything behaves quietly all at once and I turn over every leaf and look under every carpet for a "problem" with the monitoring or something worse.

Probability is funny, for example I don't think anyone has explained exactly why numbers that start with '1' are so common. I think it's just geometric symmetry of complex event chains.

senderista · on July 4, 2019

https://en.wikipedia.org/wiki/Benford's_law
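
If you want to see it for yourself, here is a small sketch. Assumptions: powers of 2 are used as the sample only because they are a well-known sequence that follows the law closely, and the sample size of 1000 is arbitrary. It compares observed leading-digit frequencies against the predicted log10(1 + 1/d).

```python
import math
from collections import Counter


def leading_digit(n: int) -> int:
    """First decimal digit of a positive integer."""
    return int(str(n)[0])


def benford_expected(d: int) -> float:
    """Benford's law: probability that the leading digit equals d."""
    return math.log10(1 + 1 / d)


N = 1000  # arbitrary sample size
counts = Counter(leading_digit(2 ** k) for k in range(N))

print("digit  observed  expected")
for d in range(1, 10):
    print(f"{d:>5}  {counts[d] / N:>8.3f}  {benford_expected(d):>8.3f}")
```

Roughly 30% of the values start with 1 and under 5% start with 9, which is the effect being described above.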

badrabbit · on July 5, 2019

Thank you, I think this has to do with higher order numbers' symmetry with each other

TheAceOfHearts · on July 4, 2019

There was a huge Comcast outage in the bay area last week as well, although most people probably didn't notice since it happened at night.

vitorgrs · on July 3, 2019

Twitter notifications and DMs were also down today.

XCSme · on July 3, 2019

Heard some rumors that China might have something to do with this, related to the Hong Kong protests, but they're just rumors.

cryoshon · on July 3, 2019

where are you seeing the rumors, if i may ask? i'm not interested in propagating mistruths, but i'm interested in digging deeper.

salemh · on July 3, 2019

Not the poster you replied to, but searching OSINT seems to cover this. Open source intel - though a lot of that is actually on Twitter.

woliveirajr · on July 3, 2019

Whatsapp is having problem delivering downloads now (at least where I live)

nonbirithm · on July 4, 2019

Could it be a physical infrastructure problem? There are occasional issues with fiber cuts that cause outages. I'd imagine many companies rely on such a critical piece of infrastructure.

hobbescotch · on July 3, 2019

Apple Music seems to be having issues too

Bombthecat · on July 4, 2019

If I were to take a wild guess: it's the heat waves. They drain people's concentration...

noncoml · on July 3, 2019

Why curious? Is there anything to make you believe there is foul play?

Karawebnetwork · on July 3, 2019

Curious because those are big services people expect to see up 99% of the time. To have them all fail in the same 48h is a curious coincidence.

prab97 · on July 3, 2019

99% is actually a lot below SLA for these services. It means the services are down for 3 days 15 hours in a year.
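
The arithmetic, as a quick sketch (assuming a 365-day year; the other availability levels are included only for comparison and are not claims about any particular service's SLA):

```python
HOURS_PER_YEAR = 365 * 24  # assumes a non-leap year


def yearly_downtime(availability_percent: float) -> tuple[int, float]:
    """Allowed downtime per year as (whole days, remaining hours)."""
    down_hours = HOURS_PER_YEAR * (100 - availability_percent) / 100
    days, hours = divmod(down_hours, 24)
    return int(days), hours


for pct in (99.0, 99.9, 99.99):
    days, hours = yearly_downtime(pct)
    print(f"{pct}% uptime -> {days} d {hours:.1f} h of downtime per year")
```

For 99% that comes out to 87.6 hours, i.e. 3 days and about 15.6 hours.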

reaperducer · on July 3, 2019

> Why curious?

People naturally find coincidences curious. It's how we find causality.

SmellyGeekBoy · on July 3, 2019

Often even when there is none to be found.

simlevesque · on July 3, 2019

Jeff Bezos cast a spell to crush his opponents /s

moret1979 · on July 11, 2019

And today, eight days later, Twitter.

ratsimihah · on July 3, 2019

Imagine if Google is next

maxdo · on July 3, 2019

Google had network issues this morning in us region

kerng · on July 3, 2019

They were down yesterday

TechRemarker · on July 3, 2019

I first noticed Instagram, then realized it was Facebook and their other properties such as WhatsApp as well. It's very odd, all of these major outages this week. But Cloudflare at least said it was an employee error, so I'm assuming it's not subtle attacks by other countries, unless companies like Cloudflare are legally required to provide an alternate story for national security.

ksec · on July 3, 2019

According to Digital Attack [1] it seems the world is constantly on DDoS.

[1] https://www.digitalattackmap.com

Radle · on July 3, 2019

"Notable Recent Attacks" from 2016, the site is dead and shows fancy graphic.

dboreham · on July 3, 2019

People went on vacation and left interns running the show?

supergauntlet · on July 3, 2019

Feels like every day there's another one of these outages. Shame that everything is so centralized that there's a single point of failure that affects 3 of the biggest communication platforms around.

y04nn · on July 3, 2019

All those outages are strange. I wasn't able to access my personal server for a few hours today (I was not even able to ping the IP). When I was able to reconnect, everything seemed to be OK. Maybe there are some routing/BGP issues or an attack (again). Sadly, I didn't try to traceroute the server IP.

cfitz · on July 3, 2019

Second this, with my DigitalOcean VPS's. DigitalOcean posted an incident report yesterday. They say "global networking issues" were caused by a "major provider", but unfortunately there isn't much more detail than that [1].

[1] https://status.digitalocean.com/incidents/qvdcj7yx4030

devin · on July 3, 2019

That was due to google cloud networking issues.

jammygit · on July 3, 2019

GCP provides infrastructure for digital ocean?

captn3m0 · on July 3, 2019

No, but their Data Centres may be sharing upstream fibre provider?

Google mentioned a Fiber cut upstream.

_nickwhite · on July 3, 2019

Not only Facebook, but Instagram (which shares common infrastructure) has also been mostly down (read-only mode it seems). The most popular Twitter hashtag right now is #instagramdown

parthdesai · on July 3, 2019

WhatsApp as well.

agoodthrowaway · on July 4, 2019

This outage coincides with FBs PSC (performance summary cycle) time. I wonder if this is folks trying to push features so they get “impact” for PSC.

lopespm · on July 5, 2019

Very good point. I wonder if the recent outages on other well known services could be heavily influenced by a similar phenomenon. If this holds water, it would be interesting to have an article or study around this issue. I certainly would be interested in reading it.

edwintorok · on July 3, 2019

Do we need a 'Show Outage:' tag? HN seems to have become the de facto outage reporting place...

eneveu · on July 4, 2019

Because it's interesting to discuss the impact of those outages and the reasons for them, to learn from those experiences.

smaili · on July 3, 2019

One interesting thing I've noticed from this (not sure if this is just my experience or if others noticed as well) but none of the ads seem to be impacted by this. Certainly makes me wonder if those are on a more prioritized and entirely separate infra and SLA that is designed to be more resilient and highly available.

jedberg · on July 3, 2019

It's a lot easier to serve ads than your custom photos. The ads are more generic, easier to cache globally, and show to a lot of people. They can also fall back to super generic ads if the database is unavailable to customize them.

But your photos don't have a fallback.

londons_explore · on July 3, 2019

Fallback is the main reason. Some webservices even fallback to non-moneymaking ads for charities in the case of a technical fault, because shareholders react badly to an ads outage, but they don't notice an hour or two of charity ads...

Mountain_Skies · on July 3, 2019

Our local transit system installed video boards showing arrival times for the next trains. It was paid for and operated by an advertising company who in return were able to run ads on their video boards. The functionality for arrival times rarely worked but the displaying of ads never failed. It was obvious where the company's priorities were at.

adjkant · on July 3, 2019

Intersection right?

klodolph · on July 3, 2019

> Certainly makes me wonder if those are on a more prioritized and entirely separate infra and SLA that is designed to be more resilient and highly available.

There's absolutely no doubt that the ads are served differently. Think about it this way:

- Ads: Small number, each served to many people, funded with real cash, but you can't get the cash if you don't serve the ads.

- Photos: Large number, each served to few people, funded by money left over from ads, and if you don't serve them your (non-paying) customers get frustrated.

My guess is that ads is funded well enough that they can run at lower resource utilization, and much more effort is made to run photos at high resource utilization. That's how I'd run it.

londons_explore · on July 3, 2019

Images: Serve requested image.

Ads: Conduct an auction in milliseconds between hundreds of parties between hundreds of millions of ad creatives with lots of very large in-memory machine learning systems, all to decide which ad to serve to maximize revenue.

Different problem entirely.
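
A deliberately tiny sketch of the contrast. Everything here is hypothetical: the scoring rule of bid times predicted click rate is just one common way to rank candidates, not a description of what any real ad system does, and the field names and numbers are made up.

```python
def serve_image(store: dict, image_id: str) -> bytes:
    """Images: a plain key lookup, nothing to decide."""
    return store[image_id]


def serve_ad(candidates: list) -> dict:
    """Ads: score every candidate and pick the best one.

    Assumed scoring: bid * predicted click rate (expected value per view).
    """
    return max(candidates, key=lambda ad: ad["bid"] * ad["predicted_click_rate"])


image_store = {"img-1": b"...image bytes..."}
candidates = [
    {"id": "a", "bid": 2.00, "predicted_click_rate": 0.010},  # score 0.020
    {"id": "b", "bid": 0.50, "predicted_click_rate": 0.050},  # score 0.025
    {"id": "c", "bid": 5.00, "predicted_click_rate": 0.001},  # score 0.005
]

print(serve_image(image_store, "img-1"))
print(serve_ad(candidates)["id"])
```

Note the highest bidder does not win here; the prediction step is where the heavy machine learning would sit in a real system.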

disgruntledphd2 · on July 4, 2019

In systems like this, you tend to want to maximise some other outcome than revenue (like clicks or conversions). In general, both GOOG and FB make money out of DR advertisers who care about this, so you tend to make more money if you optimise for the thing they care about (revenue goes up as a second-order effect).

kevlawrence · on July 3, 2019

This gives me a lot of confidence in their crypto currency.

smt88 · on July 3, 2019

1) Cryptocurrency ops are so vastly different from running a social website that I can't even think of any overlap.

2) I hate Facebook as a company, but as a builder/scaler of web apps for many years, I'm continually blown away by the speed and reliability of their website. Their operations are mind-blowing.

The only comparable apps (in terms of scale) are Gmail and YouTube, and Gmail is simpler in certain key areas (e.g. mail delivery isn't millisecond-sensitive for a user).

reaperducer · on July 3, 2019

> Cryptocurrency ops are so vastly different from running a social website that I can't even think of any overlap

I know nothing about how cryptocurrency works, but wouldn't social media outage sources like multiple server failures, hurricane, tornado, sliced fiber line, etc... affect the kind of cryptocurrency that Facebook is embarking on?

Or is there something in the "distributed" nature of cryptocurrency that makes it more resilient? Is Facebook using that model, too?

dboreham · on July 3, 2019

No. Yes.

dullgiulio · on July 3, 2019

And Google Search. That's actually treated even more specially (compared to Gmail and YT) inside Google and it has amazing reliability.

It would be interesting to know more details. I bet results are sometimes incomplete, but somehow Google manages to keep the system correct enough nobody notices a sudden quality drop.

I also understood Search has some special QoS at all levels: from the network to the scheduling of jobs...

kevlawrence · on July 5, 2019

I don’t have extensive knowledge of how crypto works, but for 1), most, if not all large-scale applications like FB operate with distributed systems somewhere in the chain (ex. image and video processing), which is the core of how ledgers work, and how mining works.

Their website tech is impressive, I’ll give them that. I still wouldn’t trust them with my money.

basch · on July 4, 2019

How would BGP routing failures and cut fiber NOT also impact digital currency networks?

smt88 · on July 4, 2019

As long as the ledger is replicated on machines that are outside the affected areas, the network would be fine.

kofisarfo · on July 3, 2019

Imagine if this meant not being able to send or receive currency.

nemothekid · on July 3, 2019

You don't have to imagine - Wells Fargo was down for several hours earlier this year and no one could send or receive any money.

People couldn't pay bills, transfer money and some couldn't even buy gas.

londons_explore · on July 3, 2019

> some couldn't even buy gas.

How culture changes perspectives! Over here in Europe I was thinking "doesn't sound so bad". Then I realised over in the US, gas means petrol, and without petrol America doesn't work.

I'll reply to any replies after I've taken the electric train home.

adventured · on July 3, 2019

You're not being fair to Europe's very impressive contributions. After all, European companies lead the way in the production of rolling pollution machines, aka ICE vehicles.

Besides, the US has cleaner air than most of Europe, thanks to the intense diesel particulate pollution across European cities. Who can forget the infamous Paris smog photos from a few years ago (which is still an ongoing problem)?

Don't worry though, the US invented the modern electric car, twice, in the form of GM's EV1 and Tesla. We'll lead the European automakers out of the dark ages - underway right now as they all chase Tesla - and take care of the problem of the ICE automobile that Europe was heavily responsible for. In the process cities like Paris will be smog free again, finally.

tomglynch · on July 3, 2019

I don't think he meant to offend you. Moving around in many if not most European cities without a car is a lot easier than to do so in the US. The US also has higher cars per capita: https://en.m.wikipedia.org/wiki/List_of_countries_by_vehicle...

emptyfile · on July 4, 2019

Don't twist an arm patting yourself on the back like that...

thrusong · on July 3, 2019

I would say cash is used much less often in Canada than it used to be and most people just tap or swipe their Interac debit cards. Outages happen from time to time, usually during the mad Christmas shopping season, but life goes on.

baby · on July 3, 2019

Yeah. It happened a few times with Monzo in London. Nobody could pay for a day or something. Life went on.

thrusong · on July 3, 2019

It's really only gone down for short periods of time here, maybe a couple of hours at most, and it's usually limited to one of the major banks or retailers (like all the Walmarts in town).

jeromegv · on July 3, 2019

Anecdata is always funny because I hardly know anyone that uses debit. Credit all the way.

Fogest · on July 3, 2019

In Canada though? Because in Canada it is definitely rare to see cash now except among boomers. A lot of places, like gyms, don't accept cash even for purchases like drinks or towels.

karlding · on July 3, 2019

If you frequent small (generally Asian) restaurants, it's quite common to see signs saying they only accept cash in Vancouver. Otherwise, if they take credit, they may attempt to incentivize paying with cash via discounts.

perardi · on July 3, 2019

Same, here in Toronto. My favourite Asian fruit market only takes cash, and a lot of the corner stores have a minimum purchase for cards.

But any business with any sort of margin (as in, I’m not buying grapes for 99¢/pound) takes contactless payments. My gym would look at me like I’m crazy if I tried to pay with cash, and Billy Bishop airport has been cash-free for years.

quibono · on July 4, 2019

Why is that? What are some advantages of having a credit card versus a debit one?

dharmab · on July 4, 2019

Customer protection: credit cards allow you to reverse charges on defective or fraudulent purchases if you are unable to find recourse with the vendor. The card issuers usually side with the cardholder as long as a basic paper trail of a good faith attempt to resolve the issue with the vendor is included.

Also, credit cards often include insurance on purchases. For example, one of my cards will reimburse my hotel/car rental/plane tickets in certain cases of trip interruption if I purchase those items on that card.

thrusong · on July 11, 2019

There are similar protections on debit cards/chequing accounts here in Canada thanks to our strictly regulated banking system.

fhsm · on July 4, 2019

Failure condition - your conceptual exposure and, at least in the US, legal exposure is more favorable in the face of fraud.

Would you rather have someone thieve a dollar from your wallet or fraudulently assign you a debt?

randlet · on July 4, 2019

Ignoring the obvious one that you can pay for things you don't currently have enough money for (bad idea in general), I pay for almost everything on a visa card for travel points.

ceejayoz · on July 3, 2019

Not hard. My bank's had the occasional (twice in four years, IIRC) debit/credit swipe outage of a couple of hours. It happens.

anbop · on July 3, 2019

And? That’s why cash exists.

the-dude · on July 3, 2019

In NL cash transactions of 10k+ already needed to be reported to the authorities.

But a new law will forbid any cash transactions higher than 3k.

urtrs · on July 3, 2019

In Greece, any cash transaction higher than 500 euros is forbidden by law.

dlphn___xyz · on July 3, 2019

What's the reasoning behind the new law?

i_cant_speel · on July 3, 2019

It looks like an attempt to prevent money laundering

https://nltimes.nl/2019/07/01/dutch-govt-ban-cash-payments-e...

FourthProtocol · on July 3, 2019

The revenue service is unaware of cash payments/purchases made by private parties and so cannot tax such transactions. Unless those private parties volunteer the information.

buboard · on July 3, 2019

Has there ever been a major Bitcoin, Ethereum, or Stellar outage?

BearsAreCool · on July 3, 2019

This is basically the point of decentralized systems. For Bitcoin to work, the protocol relies on whoever is online, so effectively everyone using it would have to go offline for it to fail. Additionally, Bitcoin really doesn't get updates that would change how things function dramatically enough for an error to be introduced that would take things down. In comparison, FB and friends rely on a handful of servers that typically can be taken out by one poorly planned software upgrade.

This is a gross simplification, but the basic idea is that more servers and diversity of the servers (which you get from decentralised protocols) should lead to crashes not taking out the whole network.
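
As a back-of-the-envelope sketch of that idea: if each node is down with some probability and failures are independent (a big assumption, and exactly what server diversity is trying to buy you), the chance that every node is down at once shrinks exponentially with the number of nodes. The function name and the numbers are made up for illustration.

```python
def p_total_outage(p_node_down: float, n_nodes: int) -> float:
    """Probability that all n_nodes are down at the same time.

    Assumes each node fails independently with probability p_node_down.
    Correlated failures (e.g. the same bad upgrade rolled out everywhere)
    break this assumption and make the fleet behave like a single node.
    """
    if not 0.0 <= p_node_down <= 1.0:
        raise ValueError("p_node_down must be between 0 and 1")
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return p_node_down ** n_nodes


# One node that is down 1% of the time, versus ten independent ones.
single = p_total_outage(0.01, 1)
many = p_total_outage(0.01, 10)
```

If a single software upgrade hits every server at once, the failures are no longer independent and the effective `n_nodes` is closer to 1, no matter how many machines there are.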

ceejayoz · on July 3, 2019

Facebook's more akin to an exchange for Libra, isn't it? There are 28 companies partnering up for the system.

buboard · on July 3, 2019

They have one wallet. But there is no reason for a wallet to depend on Facebook being online, and presumably there will be 100 points of failure for the ledger on launch.

pigeons · on July 4, 2019

https://medium.com/stellar-developers-blog/may-15th-network-...

sekai · on July 4, 2019

Nope, that's the beauty of a decentralized network: it never goes offline while there are at least some clients connected. You could disrupt the network in other ways, though.

narnianal · on July 3, 2019

I can answer that. For money! https://www.youtube.com/watch?v=MiJPsOI61qE

(quote from a TV show where the anti-hero destroys an intergalactic empire by setting the value of all money from 1 to 0; in the scene, the alien "white house" is having a discussion about what to do without money)

kache_ · on July 3, 2019

I wonder what kind of economic damage all these recent outages add up to. How much harm are they causing, in terms of productivity and value? I'd imagine Facebook's social networks are less impactful than the Slack outages, which could completely cripple a company if it were primarily remote.

jressey · on July 3, 2019

I think we should think about it more as the economic gain being experienced when people don't have these things to distract them.

aphextim · on July 3, 2019

There was an old joke, I believe from Jerry Seinfeld, which came out around the time the Y2K scare was in full force.

I am probably going to butcher it as I am no comedian but it went something like:

>All these people are freaking out about this whole Y2K thing. What are we going to do when the internet and computers all stop working?! I always respond with, "I know it will be horrible, it will be like living in the 80s again"

Now clearly I realize if all computers went down there would be real world consequences and issues, however I still like the joke.

navaati · on July 3, 2019

Damn, of course! It's a Stranger Things marketing ploy!

misterprime · on July 3, 2019

In The Morning, Dude Named Ben.

the-dude · on July 3, 2019

You ain't seen nothing yet: https://en.wikipedia.org/wiki/SQL_Slammer

JMTQp8lwXL · on July 3, 2019

Ad content (images and video) seems to be loading fine on the iOS App. So the damage is likely minimal.