martinald's comments

Just to be aware when you say "Even with all 6 environments and other projects running, the server's resource usage remained low. The average CPU load stayed under 10%, and memory usage sat at just ~14 GB of the available 32 GB."

The load average in htop isn't a CPU percentage - it's roughly the average number of runnable tasks, so it scales with the core count. With the 8 CPU cores in your screenshot, a load average of 0.1 is actually about 1.25% (10% / 8) of total CPU capacity - even better :).
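
If you want to sanity-check the arithmetic yourself, a quick sketch (Linux-only, since it reads /proc/loadavg):

    import os

    # 1-minute load average = average number of runnable tasks
    load1 = float(open("/proc/loadavg").read().split()[0])
    cores = os.cpu_count()
    # Divide by core count to express it as a fraction of total CPU capacity.
    print(f"load {load1} on {cores} cores ~= {load1 / cores:.1%} of total capacity")
    # e.g. a load of 0.1 on 8 cores prints roughly 1.2%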

Cool blog! I've been having so much success with this type of pattern!


Sharp eye! Thanks. Fixed

Because that's the main constraint for building them - how much power can you get to the site, and the cooling involved.

Also the workloads completely change over time as racks get retired and replaced, so it doesn't mean much.

But you can basically assume that with GB200s right now, 1GW is ~5 exaflops of compute, depending on precision type and my maths being correct!


Yes! The varying precisions and maths feel like just the start!

Look at next-gen Rubin with its CPX co-processor chip to see things getting much weirder & more specialized. It's there for prefilling long contexts, which is compute intensive:

> Something has to give, and that something in the Nvidia product line is now called the "Rubin" CPX GPU accelerator, which is aimed specifically at parts of the inference workload that do not require high bandwidth memory but do need lots of compute and, increasingly, the ability to process video formats for both input and output as part of the AI workflow.

https://www.nextplatform.com/2025/09/11/nvidia-disaggregates...

To confirm what you are saying, there is no coherent unifying way to measure what's getting built other than by power consumption. Some of that budget will go to memory, some to compute (some to interconnect, some to storage), and it's too early to say what ratio each may have, to even know what ratios of compute:memory we're heading towards (and one size won't fit all problems).

Perhaps we end up abandoning HBM & DRAM! Maybe the future belongs to high bandwidth flash! Maybe with its own Computational Storage! Trying to use figures like flops or bandwidth is applying today's answers to a future that might get weirder on us. https://www.tomshardware.com/tech-industry/sandisk-and-sk-hy...


As a reference for anyone interested - the cost is estimated to be $10 billion for EACH 500MW data center - this includes the cost of the chips and the data center infra.

With such a price tag, the power plant should be included.

This really is a function of a few things:

1) (Mainly) the huge increase in upstream capacity of residential broadband connections with FTTH. It's not uncommon for homes to have 2gbit/sec up now and certainly 1gbit/sec is fairly commonplace, which is an enormous amount of bandwidth compared to many interconnects. 10, 40 and 100gbit/sec are the most common and a handful of users can totally saturate these.

2) Many more powerful IoT devices that can handle this level of attack outbound. A $1 SoC can easily handle this these days.

3) Less importantly, CGNAT is a growing problem. If you have 10k (say) users on CGNAT that are compromised, it's likely that there's at least 1 on each CGNAT IP. This means you can't just null route compromised IPs as you are effectively null routing the entire ISP.

I think we probably need more government regulation of these IoT devices. For example, having a "hardware" limit of (say) 10mbit/sec or less for all networking unless otherwise required. 99% of them don't need more than this.
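
To make that concrete, a toy token-bucket sketch of what a ~10mbit/sec egress cap could look like (illustrative only - a real device would enforce this in the NIC or kernel, and the numbers are made up):

    import time

    class TokenBucket:
        def __init__(self, rate_bps=10_000_000, burst_bits=1_000_000):
            self.rate = rate_bps          # refill rate: ~10 Mbit/s
            self.capacity = burst_bits    # allowed burst size in bits
            self.tokens = burst_bits
            self.last = time.monotonic()

        def allow(self, packet_bits):
            now = time.monotonic()
            # Refill tokens for elapsed time, capped at the burst size.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.rate)
            self.last = now
            if self.tokens >= packet_bits:
                self.tokens -= packet_bits
                return True               # within the cap: send it
            return False                  # over the cap: drop or queue it

    bucket = TokenBucket()
    print(bucket.allow(1500 * 8))         # a 1500-byte packet passes easily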


> If you have 10k (say) users on CGNAT that are compromised, it's likely that there's at least 1 on each CGNAT IP. This means you can't just null route compromised IPs as you are effectively null routing the entire ISP.

How about we actually finally roll out IPv6 and bury CGNAT in the graveyard where it belongs?

Suddenly, everybody (ISPs, carriers, end users) can blackhole a compromised IP and/or IP range without affecting non-compromised endpoints.

And DDoS goes poof. And, as a bonus, we get the end to end nature of the internet back again.


From having worked on DDoS mitigation, there's pretty much no difference between CGNAT and IPv6. Block or rate limit an IPv4 address and you might block some legitimate traffic if it's a NAT address. Block a single IPv6 address... And you might discover that the user controls an entire /64 or whatever prefix. So if you're in a situation where you can't filter out attack traffic by stateless signature (which is pretty bad already), you'll probably err on the side of blocking larger prefixes anyway, which potentially affects other users, the same as with CGNAT.

Insofar as it makes a difference for DDoS mitigation, the scarcity of IPv4 is more of a feature than a bug.


(Having also worked on DDoS mitigation services) That "entire /64" is already a hell of a lot more granular than a single CG-NAT range serving everyone on an ISP though. Most often in these types of attacks it's a single subnet of a single home connection. You'll need to block more total prefixes, sure, but only because you actually know you're only blocking actively attacking source subnets, not entire ISPs. You'll probably still want something signature-based for the detection of what to blackhole though, but the combination does scale further on the same amount of DDoS mitigation hardware.

You can heuristically block IPv6 prefixes in a big enough attack by blocking a prefix once a probabilistic % of the nodes under it are themselves blocked. I think it should work fairly well, as long as the attacking traffic has a signature.

Consider simple counters - "IPs with non-malicious traffic" and "IPs with malicious traffic" - to probabilistically identify the cost/benefit of blocking a prefix.

You do need to be able to support huge block lists, but there isn't the same issue as with CGNAT, where many non-malicious users are definitely getting blocked.
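
A rough sketch of that counter idea (thresholds are invented for illustration; real mitigation boxes would do this in the fast path, not Python):

    import ipaddress
    from collections import defaultdict

    MIN_FLAGGED = 5      # don't judge a prefix on one or two addresses
    BLOCK_RATIO = 0.8    # block the /64 once 80%+ of its seen addresses are bad

    stats = defaultdict(lambda: {"bad": set(), "good": set()})
    blocked_prefixes = set()

    def observe(addr, malicious):
        prefix = ipaddress.ip_network(addr + "/64", strict=False)
        if prefix in blocked_prefixes:
            return "already blocked"
        bucket = stats[prefix]
        (bucket["bad"] if malicious else bucket["good"]).add(addr)
        bad, good = len(bucket["bad"]), len(bucket["good"])
        if bad >= MIN_FLAGGED and bad / (bad + good) >= BLOCK_RATIO:
            blocked_prefixes.add(prefix)   # cost/benefit says block the prefix
            return f"blocked {prefix}"
        return "blocked address only" if malicious else "ok"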


You should block the whole /64, at least. It's often, but not always, a single host - and the /64 boundary is what's standardized.

Usually a /64 is a "local network", so in the case of consumer ISPs that's all the devices belonging to a given client, not a single device.

Some ISPs provide multiple /64s, but in the default configuration the router only announces the first /64 to the local network.


Presumably a compromised device can request arbitrarily many new IPv6 addresses from DHCP, so the entire block would be compromised. It would be interesting to see if standard DHCP could limit auto-leasing to guard the reputation of the network.

Generally, IPv6 does autoconfiguration (I've never seen a home router with DHCPv6), so there's no need to ask for anything. Even for IPv4, I've never seen a home router enforce DHCP (even though it would force the public IP).

But the point stands, you can't selectively punish a single device, you have to cut off the whole block, which may include well-behaved devices.


In mobile networks it's usually a single device.

This DDoS is claimed to be the result of <300,000 compromised routers.

That would be really easy to block if we were on IPv6. And it would be pretty easy to propagate upstream. And you could probabilistically unblock in an automated way and see if a node was still compromised. etc.


> That would be really easy to block -- if we were on IPv6.

Make that: If the service being attacked was on IPv6-only, and the attacker had no way to fall back to IPv4.

As long as we are dual-stack and IPv6 is optional, no attacker is going to be stupid enough to select the stack which has the highest probability of being defeated. Don't be naive.


It'd be far more acceptable to block the CG-NAT IPv4 addresses if you knew that the other non-compromised hosts could utilize their own IPv6 addresses to connect to your service.

Better to rely on ip blocks than on NAT to bundle blocks.

I am a bit split on this topic. There are some privacy concerns with using IPv6. https://www.rfc-editor.org/rfc/rfc7721.html#page-6

Some time ago I decided not to roll out IPv6 for our site due to these concerns (a couple of million visitors per month). We have Meta ads reps constantly encouraging us to enable it, which also doesn't sit right with me.

Although I believe fingerprinting is sophisticated enough to work without using IPs, so the impact of using IPv6 might not make a meaningful difference.


It's hilarious that you have privacy concerns while at the same time using Meta ads.

I am guessing they're trying to limit the privacy harm to normal channels that the slightly savvy can understand rather than completely eliminate it.

Reportedly this is often incorrectly implemented, with the /64 prefix still being a stable static address.

Is there any money an ISP would make, or save, by sinking money and effort into switching to IPv6? If there's none, why would they act? If there is some, where?

For instance, mobile phone operators, which had to become ISPs a decade or two ago, had a natural incentive to switch to IPv6, especially as they grew. Would old ISPs make enough from selling some of their IPv4 pools?


Presumably they'd lose money when a DDoS originating from their network causes all their IPs to get blocked.

Less expensive IP space, more efficient hardware, and lower complexity if you can eliminate NAT.

They already lease them out. TELUS in Canada, a traditional old ISP, rents a large portion of their space to a server provider in LA ("Psychz") mostly used for Chinese GFW-evading VPNs.

The ISPs have to submit plans on how their IPs will be used by the public, especially for IPv4; ARIN shouldn't approve this kind of stuff. Unless they lied in their IP block application, in which case their block should be revoked.

I filled out one of these for Cogent to get a /24. I was being honest, but all I had to put down was services that require their own IP. I even listed a few, but nowhere near the 253.

They also never came back with "what about NAT" or "what about host-based routing".


Not sure what you filled out, but blocks are usually handed not to end users but to providers that will sublease the IPs to their clients. So if you are asking for a block for a couple of your HTTP servers, that's a no. If you rent HTTP servers to, say, local small businesses, then that's a yes.

> How about we actually finally roll out IPv6 and bury CGNAT in the graveyard where it belongs?

That depends on the service you are DDoSing actually having an IPv6 presence. And lots of sites really don't.

It doesn't help to have IPv6 if you need to fall back to IPv4 anyway. And if botnet authors know they can hide behind CGNAT, why would they IPv6-enable their bots when all sites and services are guaranteed to be reachable via IPv4 for the next 3 decades?

(Disclaimer: This comment posted on IPv6)


Is it advantageous to be someone who supports IPv6 on a day like today?

Isn't it enough that the target of the DDoS only accepts IPv6?

> 3) Less importantly, CGNAT is a growing problem. If you have 10k (say) users on CGNAT that are compromised, it's likely that there's at least 1 on each CGNAT IP. This means you can't just null route compromised IPs as you are effectively null routing the entire ISP.

Null routing is usually applied to the targets of the attack, not the sources. If one of your IPs is getting attacked, you null route it, so upstream routers drop traffic instead of sending it to you.


Sorry, late here. You are right. I meant filter the IP in question.

Haha, that last part is pretty wild. Rather than worrying about systemic problems in the entire internet, let's just make mandates crippling devices that China, where all these devices are made, will definitely 100% listen to. Sure, seems reasonable. Systems that rely on the goodwill of the entire world to function are generally pretty robust, after all.

If they don’t then the devices are not sold in the United States. It’s quite simple.

Great to know that smuggling hardware into the US has been completely stopped.

If the analysis above is accurate, a few smuggled devices would not be an issue, as long as the zillions of devices sold at Walmart are compliant.

Congratulations on the creation of a thriving new black market in which the main beneficiary is organized crime! What could go wrong?

Do you take issue with the concept of laws or are you just being annoying?

I'm sorry that you find thinking about second order dynamics annoying, but that's what you have to do if you actually want effective laws. Just making laws doesn't magically fix problems. In many cases it just makes much more exciting problems.

I'm annoyed because you didn't actually come up with an interesting response. Yes, when you make laws people can break them. But you need to explain why there is an incentive to break them, and whether it will happen to the extent that it will actually be a problem to enforce. Personally, I don't see people scrambling to get DDoS attack vectors in their house by any means necessary.

> I think we probably need more government regulation of these IoT devices. For example, having a "hardware" limit of (say) 10mbit/sec or less for all networking unless otherwise required. 99% all of them don't need more than this.

What about DDoSs that come from sideloaded, unofficial, buggy, or poorly written apps? That's what IoT manufacturers will point to, and where most attacks historically come from. They'll point to whether your Mac really needs more than 100mbps.

The government is far more likely to figure it out along EU lines: Signed firmware, occasional reboots, no default passwords, mandatory security updates for a long-term period, all other applicable "common sense" security measures. Signed firmware and the sideloading ID requirements on Android also helps to prevent stalkerware, which is a growing threat far scarier than some occasional sideloaded virus or DDoS attack. Never assume sideloading is consensual.


>What about DDoSs that come from sideloaded, unofficial, buggy, or poorly written apps? That's what IoT manufacturers will point to, and where most attacks historically come from.

Any source for this claim? Outside of very specific scenarios which differ significantly from the current botnet market (like Manjaro sending too many requests to the AUR, or an Android application embedding a URL to a Wikipedia image), I cannot remember one occurrence of such a bug being versatile enough to create a whole new cybercrime market segment.

>They'll point to whether your Mac really needs more than 100mbps.

It does, because sometimes my computer bursts up to 1gbps for a sustained amount of time, unlike the average IoT device that has a predictable communication pattern.

>Signed firmware and the sideloading ID requirements on Android also helps to prevent stalkerware, which is a growing threat far scarier than some occasional sideloaded virus or DDoS attack. Never assume sideloading is consensual.

If someone can unlock your phone, go into the settings, enable installation of apps for an application (e.g. a browser), download an APK and install it, then they can do quite literally anything, from enabling ADB to exfiltrating all your files.


Historically, it was called Windows XP and Vista about 15 years ago (Blaster, Sasser, MyDoom, Stuxnet, Conficker?). Microsoft clamped down, hard, across the board, but everyone outside of Big Tech is still catching up.

Despite Microsoft's efforts, 911 S5 was roughly 19 million Windows PCs in 2024, in news that went mostly under the radar. It spread almost entirely through dangerous "free VPN" apps that people installed all over the place. (Why is sideloading under attack so much lately? 19 million people thought it would make them more secure, and instead it turned their home internet into criminal gateways with police visits. I strongly suspect this incident, and how it spread among well-meaning security-minded people, was the invisible turning point in Big Tech against software freedom lately.)

https://www.fbi.gov/investigate/cyber/how-to-identify-and-re...

> if someone can unlock your phone, go into the settings, enable installation of apps for an application (ex. a browser), download an apk and install it then they can do quite literally anything, from enabling adb to exfiltrating all your files.

Which is more important, and a growing threat? Dump all her photos once; or install a disguised app that pretends to be a boring stock app nobody uses, that provides ongoing access for years, with everything in real-time up to the minute? Increasingly it's the latter. She'll never suspect the "Samsung Battery Optimizer" or even realize it came from an APK. No amount of sandboxing and permissions can detect an app with a deliberately false identity.


> Signed firmware and the sideloading ID requirements

Ending the last corner of actually free market in software is quite a cost for something that wouldn't prevent DDoS.

> sideloaded, unofficial, buggy, or poorly written apps? That's what IoT manufacturers will point to, and where most attacks historically come from

Is that actually true? What evidence do we have, vs. vulnerabilities in the OEM software (the more common case)?


> A $1 SoC can easily handle this these days.

Could you elaborate?


I think there's some exaggeration as few $1 SoC parts come with 10G Ethernet, and >1G to the home is not common, but pretty much any home router can saturate its own uplink - it would be useless if it couldn't!

Not always the case. Generating traffic can be more computationally intensive than routing the traffic. I've run speed tests locally on a few (consumer) routers and the results have been less than stellar compared to the expected results when they were just routing traffic. Granted these tests were a few years ago and things have progressed, but how often are people upgrading their routers?

Correct.

Also, most 1Gbit/s and faster routers have hardware-accelerated packet forwarding, aka "flow offloading", aka "hardware NAT", where forwarded packets mostly don't touch software at all.

Some routers even have an internal "CPU" port on the packet core with a significantly slower line rate than that of the external ports. So traffic that terminates/originates at the router is necessarily quite a bit slower, regardless of a possibly extra-beefy processor and efficient software. Not really a problem, since that traffic would normally be limited to UI, software updates, ARP/NDP/DHCP, and the occasional first packet of a forwarded network connection.


An Allwinner H616 is a quad-core ARM part and can definitely saturate gigabit Ethernet with packet generation.

1Gbit upload is extraordinarily rare.

It’s not; most places that give you gigabit fiber will give you a symmetric connection.

Define most places? I know I don't get one (UK) and neither does my German friend or Texan friend.

I've only ever seen one despite having used 4 different ISPs for gigabit, and that one was special. It was in an apartment I rented in a converted office tower; the line was done via a B2B provider and included in the rent.


I'm in the US and most fiber providers (I checked a handful: AT&T, Sonic, Google Fiber, Frontier) all provide symmetric connections.

Aren't most residential fiber deployments PONs which generally do not offer symmetric bandwidth? E.g. 10G-PON has 10G down / 2.5G up.

Depends on the country; it's not common here.

Yup. Spectrum in Michigan will give you up to 2gbps down but not anything more than 200mbps up.

Is Spectrum fiber or DOCSIS? I didn't realize anyone was pushing these kinds of numbers for fiber. What's the point other than screwing the users?

Penny pinching. Afaik asymmetric PON is the cheapest possible network tech at scale.

Nope. Less than a percent of a percent. Symmetric plans are extra cost and offered primarily to businesses.

Almost all homes have no ability to exceed gigabit. In fact, almost all new homes don't even have data wiring. People just want their Netflix to work on WiFi.


I didn't pay anything extra for symmetric.

Most places do not have fiber.

We know. The problem is that the above comment said "extraordinarily rare" which is a very different and incorrect threshold.

But for those that do...symmetric is the norm. The number of fiber connections is only going up.

Symmetric is not the norm. The infra costs are not worth it. Symmetric is primarily a business offering.

This is probably technically true but very misleading. Fiber penetration in the US has been consistently rising for over a decade now and it is not at all uncommon to have either Google Fiber, Fios, or a local fiber provider available to you in a big city. I bet within the next decade most places will have gigabit fiber available.

There are probably more English speakers using the Internet in India than there are in the USA... let alone the hundreds of millions elsewhere.

You can't just assume everyone is talking about your country online.


Does it really matter? The grandparent comment states the bandwidth is becoming even more readily available in the US, while the article itself says the bots were largely hosted by US ISPs, and that's obviously enough bandwidth to already cause global disruptions. But that's just the source of the attack, and who is on the receiving end is another.

I get being too US-centric, but I think it's interesting if the US has the right combination of hosting tons of infected devices and having the bandwidth to use them on a much larger scale compared to other countries and possible implications.


You can assume the county when it is in the title.

>DDoS Botnet Aisuru Blankets US ISPs in Record DDoS


The US is a big place. But the world is bigger. The internet works across the whole world.

There's a long way to go before fibre is commonplace across the world.


Seems more likely that residential modems will be required to use ISP-provided equipment that has government mandated chips, firmware, etc to filter outbound traffic for DDoS prevention.

Why should they be required to have hardware in their own network to filter that out when the ISP is obviously receiving all of their traffic anyway?

Sometimes the attack, or amplification, comes from the ISP-provided router and its bargain basement firmware.

Of course there is. If you've got all your internet egress tied up with DDoS attacks from your network it is a big problem.

Most eyeball networks have a lot of inbound traffic and not very much outbound, but interconnections with other networks are almost always symmetric, so there's a lot of room for excess egress before it causes pain to the ISP.

When I ran a large web site that attracted lots of DDoS, it didn't really seem worthwhile to track down the source and try to contact ISPs. I had done a lot of trying to track and stop people sending phishing mail under our name, and it's simply too much work to write a reasonable abuse report that is unlikely to be followed up on. With email, mostly people seem to accept the Received headers are probably true; with DDoS, you'd be sending them pcaps, and they'd be telling you it's probably spoofed, and unless I've got lots of peering, I'm not going to be able to get captures that are convincing... so just do my best to manage the inbound and call it a day.


I think we’re just starting to see attacks that big - which might start some practical mitigations (or they’ll just upgrade transit).

No, it shouldn't do. "All" you're doing is having a small model draft the response and then having the large model "verify" it. When the large model diverges from the small one, you restart the drafting from that point.
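
For anyone unfamiliar, a minimal sketch of that draft-and-verify loop (greedy acceptance only; draft_model and big_model are stand-ins for a small and a large next-token predictor, not any real API):

    def speculative_decode(prompt_tokens, draft_model, big_model, k=4, max_new=64):
        tokens = list(prompt_tokens)
        target_len = len(tokens) + max_new
        while len(tokens) < target_len:
            # 1) The small model cheaply drafts k tokens ahead.
            draft = []
            for _ in range(k):
                draft.append(draft_model(tokens + draft))
            # 2) The large model checks the draft; in a real system this is a
            #    single batched forward pass over all k positions - that's the win.
            accepted = []
            for t in draft:
                big_t = big_model(tokens + accepted)
                if big_t == t:
                    accepted.append(t)       # agreement: keep the cheap draft token
                else:
                    accepted.append(big_t)   # divergence: take the big model's token
                    break                    # and restart drafting from this point
            tokens.extend(accepted)
        return tokens[:target_len]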

Why is having so many bands a bad thing? Demand for data is so much higher now you need (ideally) hundreds of MHz of spectrum in dense areas. You need some way to partition that up as you can't just have one huge static block of spectrum per auction.

The issue with LTE isn't bands, it's the crappy way they have done VoLTE and also seemingly learnt nothing for VoNR.

They should have done something like GET volte.reserved/.well-known/volte-config (each carrier sets up their DNS to resolve volte.reserved to their IMS server, which provides config data to the phone). It would have given pretty much plug-and-play compatibility for all devices.

Instead, the way it works is that every phone has a (usually) hopelessly outdated lookup table of carriers and config files. It sort of works for Apple because they can push updates from one central place, but for Android it's a total mess.
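
To make the proposal concrete, a sketch of that hypothetical discovery flow (volte.reserved and /.well-known/volte-config are made-up names from the comment above, not any real standard; the config fields are guesses):

    import json
    import urllib.request

    def fetch_volte_config(timeout=5):
        # The phone would issue this over the carrier's data bearer, so the
        # carrier's own DNS can point volte.reserved at its IMS config server.
        url = "http://volte.reserved/.well-known/volte-config"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return json.load(resp)   # e.g. IMS domain, P-CSCF address, codecs

    if __name__ == "__main__":
        cfg = fetch_volte_config()
        print(cfg.get("ims_domain"), cfg.get("p_cscf"))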


> Why is having so many bands a bad thing? Demand for data is so much higher now you need (ideally) hundreds of MHz of spectrum in dense areas. You need some way to partition that up as you can't just have one huge static block of spectrum per auction.

Because different countries use different sets of bands. That was true for GSM too, but quad band phones were reasonably available. Many phones were at least tri band, so you would at least have half the bands if you imported a 'wrong region' tri-band.

But now, you'll have a real tough time with coverage in the US if you import a EU or JP phone.


With a "quad band" LTE phone of bands 2, 7, 20 and say 12 you would get pretty much worldwide coverage. It'd just be slower because you can't access other ones. Not sure what the issue is?

The issue is the import phones I want to buy don't support those bands. An example phone I might want (Xperia 10 IV) supports 12 bands for LTE, my carrier (US T-Mobile) supports 6, but the intersection is only 2 bands (the old GSM bands) and I know my carrier doesn't always have coverage on those bands. I've got enough dead zones without throwing out 4 bands.

Plenty of phones support all reasonable bands. The intentional 4G brokenness is much worse.

I think the fibre optic analogy is a bad one. The key reason supply massively outstripped demand was that optical equipment massively improved in efficiency.

We are not seeing that (currently) with GPUs. Perf/watt has basically completely stalled out recently, while tokens per user has easily gone up 100x+ in many use cases (take Claude Code usage vs normal chat usage). It's very very unlikely we will get breakthroughs in compute efficiency in the same way we did in the late 90s/2000s for fiber optic capacity.

Secondly, I'm not convinced the capex has increased that much. From some brief research, the major tech firms (hyperscalers + Meta) were spending something like $10-15bn a month in capex in 2019. Now, if we assume that spend has all been rebadged as AI and adjust for inflation, it's a big ramp but not quite as big as it seems, especially when you consider construction inflation has been horrendous virtually everywhere post-COVID.

What I really think is going on is some sort of prisoner's dilemma with capex. If you don't build then you are at serious risk of shortages, assuming demand does continue in even the short and medium term. This then potentially means you start churning major non-AI workloads along with the AI work from e.g. AWS. So everyone is booking up all the capacity they can get, and let's keep in mind that only a small fraction of these giant trillion-dollar numbers being thrown around, especially from OpenAI, are actually hard commitments.

To be honest, if it wasn't for Claude Code I would be extremely skeptical of the demand story. But given I now get through millions of tokens a day, if even a small percentage of knowledge workers globally adopt similar tooling, it's sort of a given we are in for a very large shortage of compute. I'm sure there will be various market corrections along the way, but I do think we are going to require a shedload more data centres.


> We are not seeing that (currently) with GPUs. Perf/watt has basically completely stalled out recently, while tokens per user has easily gone up 100x+ in many use cases (take Claude Code usage vs normal chat usage). It's very very unlikely we will get breakthroughs in compute efficiency in the same way we did in the late 90s/2000s for fiber optic capacity.

At least for gaming, GPU performance per dollar has gotten a lot better in the last decade. It hasn't gotten much better in the past couple of years specifically, but I assume a lot of that is due to the increased demand for AI use driving up the price for consumers.

Why wouldn't Moore's Law continue?


The difference is that with fiber you can put more data on the same piece of glass or plastic or whatever just by swapping the parts at the ends. And those are a relatively small part of the cost - most of which is just getting the thing in place.

With GPUs and CPUs, you need to replace the entire thing. And now they are the most expensive part of the system.

The other option is doing more with the same computing power, but we have utterly failed at that in general...


It's been worse than that. Datacentres basically need to be completely rebuilt, especially for Blackwell chips, as they mostly require liquid cooling, not air cooling as before. So you don't just need to replace the hardware, you need to replace all the power AND provide liquid cooling, which means completely redesigning the entire datacentre.

Yeah, but the question is whether your demand for Claude Code would be as high as it is, if Anthropic were charging enough to cover their costs. Not this fake "the model is profitable if you ignore training the next model" stuff but enough for them to actually be profitable today.

This is a crucial question that often gets overlooked in the AI hype cycle. The article makes a great point about the disconnect between infrastructure investment and actual revenue generation.

A few thoughts:

1. The comparison to previous tech bubbles is apt - we're seeing massive capex without clear paths to profitability for many use cases.

2. The "build it and they will come" mentality might work for foundational models, but the application layer needs more concrete business cases.

3. Enterprise adoption is happening, but at a much slower pace than the investment would suggest. Most companies are still in pilot phases.

4. The real value might come from productivity gains rather than direct revenue - harder to measure but potentially more impactful long-term.

What's your take on which AI applications will actually generate enough value to justify the current spending levels?


This reads like a ChatGPT response.

Because the weather is very changeable. You may get a lull in the wind for a couple of mins, enough to land.

I've been on a couple of flights like that. Once where we did two attempts and landed on the 2nd, the other where we did 3 but then had to divert. Other planes were just managing to land in the winds before and after our attempts.

The other problem is (as I found out on that flight) that mass diversions are not good. The airport I diverted to in the UK had dozens of unexpected arrivals, late at night. There wasn't the ground staff to manage this so it took forever to get people off. It then was too full to accept any more landings, so further flights had to get diverted further and further away.

So, if you had a blanket must-divert rule you'd end up with all the diversion airports full (even with flights that could have landed at their original airport) and a much more dangerous situation, as your diversions are now in different countries.


Every time I read one of these articles the main issue I have is that it doesn't take into account the huge shortages of compute that are going on all the time. Anthropic and Google especially have been incredibly unreliable, struggling to keep up with demand.

Each of the main providers could easily use 10x the compute tomorrow (albeit arguably inefficiently) by using more thinking for certain tasks.

Now - does that scale to the 10s of GWs of deals OpenAI is doing? Probably not right now, but the bigger issue as the article does point out in fairness is the huge backlog of power availability worldwide.

Finally, AI adoption outside of software engineering is incredibly limited at work. This is going to rapidly change. Even the Excel agent Microsoft has recently launched has the potential to result in hundredfold increases in token consumption per user. I'm also skeptical of the AI sell-through rate being an indicator that it's not popular for Microsoft. The later versions of M365 Copilot (or whatever it is called today) are wildly better than the original ones.

It all sort of reminds me of Apple's goal of getting 1% in cell phone market share, which seemed laughably ambitious at one point - a total stretch goal. Now they are up to 20% and smartphone penetration as a whole is probably close to 90% globally of those that have a phone.

One potential wild card though for the whole market is someone figuring out a very efficient ASIC for inference (maybe with 1.58bit). GPUs are mostly overkill for inference and I would not be surprised if 10-100x efficiency gains could be had on very specialised chips.


The huge demand exists right now because the cost of a token is near zero, and companies have figured out one weird hack for gaining value in the stock market, which is to brag about how many tokens are being crammed into all manner of places that they may or may not belong.

Customer value must eventually flow out of those datacenters in the opposite direction to the energy and capex that are flowing in.

Do people actually want all this AI? I see Studio Ghibli portraits, huge amounts of internet spam, workslop... where is the value?


> Each of the main providers could easily use 10x the compute tomorrow (albeit arguably inefficiently) by using more thinking for certain tasks.

That's true for everyone with regard to any resource.

The question is whether the 10x increase in resources results in 10x or more increase in profit.

If it doesn't, then it doesn't make sense to pay for the extra resources. For AI right now, the constraint is profit per resource unit, not the number of resource units.


Why have spot H100 prices been going down then? It was roughly $3/hr a year ago and now it is closer to $2.2/hr.


Reminds me of the fiber boom


"The later versions of M365 copilot (or whatever it is called today) are wildly better than the original ones."

I find AI agents work very poorly within the Microsoft ecosystem. They can generate great HTML documents (because it's an open format, maybe?) but for Word documents the formatting is so poor I've had to turn it off and just do things manually.


Opposing anecdote: I got consistent performance out of Grok and Qwen (17 providers on Openrouter) throughout the day but Gemini gets slow and dumb at times.


BlackRock just bought a data center for $40bn last week. BlackRock. The animals that bought all the housing once. They must be stupid, I guess.


BlackRock did not buy “a” data center, it bought a data center company with 78 data centers. I have no comment on whether or not it was a good deal, but your framing is silly.


Interesting - it seems to use 'pure' vision and x/y coords for clicking stuff. Most other browser automation with LLMs I've seen uses the DOM/accessibility tree, which absolutely churns through context but is much more 'accurate' at clicking stuff because it can use the exact text/elements in a selector.

Unfortunately it really struggled in the demos for me. It took nearly 18 attempts to click the comment link on the HN demo, each a few pixels off.
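
For comparison, a rough Playwright-style sketch of the two approaches (Playwright is my assumption for illustration - the tool in the demo may do this quite differently, and the coordinates here are hypothetical model output):

    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        page = p.chromium.launch().new_page()
        page.goto("https://news.ycombinator.com")

        # Vision-style: the model looks at a screenshot and guesses pixel
        # coordinates, so being a few pixels off means a missed click.
        shot = page.screenshot()       # the image the model would "see"
        x, y = 512, 348                # hypothetical model-predicted coords
        page.mouse.click(x, y)

        # DOM/accessibility-style: exact element targeting, but the serialized
        # tree passed to the model eats far more context.
        page.get_by_role("link", name="comments").first.click()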


18 attempts - emulating the human HN experience when using mobile. Well, assuming it hit other links it didn't intend to anyway. /jk

