It can't be a training pipeline, because the IPs are all around India.
Sample code from Stack Overflow being used by some major app is the most likely candidate. It's also possible that the image fetch call is a vestigial appendix that doesn't even display the image, which would make tracking this down extra challenging.
Same advice I gave a w3c.org admin who was lamenting how much traffic people generate by not caching XML schemas. Yes, you have to serve the requests. But you don't have to try to serve them in 100 ms. If a human is on the other end, 1-2 seconds is just fine. If it's not a human, then the person running the batch process will surely notice when it goes from 3 minutes to 10 minutes because it fetches the same schema 200 times.
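The deliberate-slowdown idea is cheap to implement with async i/o, since a waiting coroutine ties up almost no resources. A minimal sketch (the payload, delay value, and handler name are all made up for illustration):

```python
import asyncio

SCHEMA_XML = b"<schema/>"   # stand-in for the static schema file
DELAY_SECONDS = 0.05        # illustrative; the point above suggests 1-2 s

async def serve_schema(reader, writer):
    # Read and discard the request headers.
    await reader.readuntil(b"\r\n\r\n")
    # The artificial delay: an idle coroutine parked here costs almost nothing.
    await asyncio.sleep(DELAY_SECONDS)
    header = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: application/xml\r\n"
        b"Content-Length: " + str(len(SCHEMA_XML)).encode() + b"\r\n\r\n"
    )
    writer.write(header + SCHEMA_XML)
    await writer.drain()
    writer.close()
    await writer.wait_closed()
```

Well-behaved clients that cache never see the delay; the ones hammering you do, without costing you anything per connection beyond the socket itself.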
Well any time you start yanking levers and spinning dials you'd better know where the breaking points in your system are.
If you care about the traffic because you're already having trouble with that many simultaneous requests, then you are definitely not going to solve that problem by increasing the response time by a factor of 10.
But an important property of reverse proxies is that once the proxy sees the last byte of the response, the originating server is no longer involved in the transaction. The proxy server is stuck ferrying bits over a slow connection, and hopefully is designed for that sort of work load. If the payload is a static file, as it is in both of these cases, then it should be cheap for the server to retrieve them.
Yes, but slowloris isn't really a big deal if you've got a modern http(s) server with async i/o. It costs nearly nothing to keep an idle connection open while waiting 3 seconds before sendfile()ing the schema XML.
You can run out of sockets, but that's easy to tune. I don't know the limits on other systems, but FreeBSD lets you set the maximum up to physical pages / 4 with just boot-time settings. With 4 KB pages, that's about 1 million sockets per 16 GB of RAM.
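For reference, the FreeBSD knob is the kern.ipc.maxsockets loader tunable; the value here is illustrative:

```
# /boot/loader.conf -- raise the socket cap at boot
kern.ipc.maxsockets="1000000"
```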
Worst case, if you start running out of sockets because you're sleeping, sample the socket count once a second and adjust the sleep time to avoid hitting the cap. You could also use that sampling to decide whether to keep HTTP connections open or close them.
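That feedback loop could look something like this (the function name, cap, and thresholds are all invented for illustration):

```python
SOCKET_CAP = 1_000_000  # e.g. the FreeBSD boot-time limit for 16 GB of RAM

def adjust_delay(current_delay, open_sockets,
                 cap=SOCKET_CAP, high_water=0.8,
                 step=0.1, ceiling=3.0):
    """Shrink the artificial delay as open sockets approach the cap;
    drift back toward the full delay when there's headroom."""
    if open_sockets / cap > high_water:
        return current_delay / 2          # back off: free sockets faster
    return min(ceiling, current_delay + step)
```

Sample the socket count once a second, feed it in, and use the returned delay for new connections; the same reading can drive the keep-alive-or-close decision.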
I should add: select() on millions of sockets is going to suck, so you'll need kqueue/epoll/whatever your kernel's "select, but better" interface is.
It doesn't have to be quick to have bad effects. See https://discussions.apple.com/thread/7908738 for example. I can't find the link now, but there was also an article about one of the older designers (IBM?) making sure the terminal cursor blinked at a 1:2 ratio to reduce the problem.
As a rule, just don't serve blinking images to random unsuspecting people.