Cheaper is an understatement... it's less than 1/10 the price for input and nearly 1/8 for output. Part of me wonders if they're using their massive new investment to sell API access below cost and drive out competitors. If they're really getting Opus 4.1 performance for half of Sonnet's compute cost, they've done really well.
With effectively unlimited demand, I can't see that strategy working. It's not like taxis, where you might take a trip or two a day but wouldn't take 100 even if it were cheap enough. With AI, you really would 100x your usage.
I'm not sure I'd be surprised. I've been playing around with GPT-OSS the last few days, and the architecture seems really fast for the accuracy/quality of the responses, way better than most local weights I've tried over the last two years or so. And since they released that architecture publicly, I'd imagine they're sitting on something even better privately.