I am Don Quixote, building an app that abstracts over models (i.e. allows user choice), while providing a user-controlled set of tools and letting users write their own "scripts", i.e. precanned dialogue/response steps, to permit e.g. building a search flow.
Which is probably what makes me so cranky here. It's very hard keeping track of all of it while doing my best to leverage the models that are behind Claude's agentic capabilities, and all the Newspeak of Google PR makes it consume almost as much energy as the rest of the providers combined. (I'm very frustrated that I didn't realize until yesterday that 2.0 Flash had quietly gone from 10 RPM to 'you can actually use it'.)
I'm a Xoogler and I get why this happens ("preview" is a magic wand that means "you don't have to get everyone in bureaucracy across DeepMind/Cloud/? to agree to get this done and fill out their damn launchcal"), but, man.
Apple is generally on the side of customers, but this is a clear example of how customer-hostile their policy was. As a customer, I had to jump through hoops to buy a book on their premium platform.
Apple Maps, "Intelligence", Siri, etc. all run on device because Apple is in the business of selling devices. As many as they can. Whereas Google is in the business of selling you to advertisers.
It's literally a major difference in their fundamental business models.
Apple does both. Under Tim Cook, services have become nearly as profitable as hardware, and the price of making services compared to hardware is comically low. That's why we have AppleTV and Apple News and Apple Arcade, for all they're worth as a motley crew of subscriptions.
I've never had to buy a service plan for any of my PC laptops.
OTOH, every Apple owner I know has an AppleCare plan. And has had to use it. Multiple times. For nearly every device they own. And this is a sample population of over a thousand over more than a decade.
Yes, Apple service is great. I couldn't tell you what Dell Service is like, or Lenovo, or HP, because I've never had to use them. PC laptops and Android phones...just work.
And given that Androids are more popular in the parts of the world where reliability is essential, it's pretty clear that most of the world agrees that Apples are the inferior device.
Good luck trying to paint the picture that AppleCare is a waste of money. I've owned hundreds of Apple products since AppleCare became a thing. I've "had to use it" once, and that one use replaced a $4500 product for $0. The ROI on AppleCare always works out.
If you've had AppleCare since it became a thing, you've basically paid the same price for AppleCare over that time as it would have cost to replace the $4500 product on its own.
Android devices and PC laptops don't need service plans. But as you've demonstrated, Apple products do. Even the $4500 ones.
I guess this is a side-effect where, as things stood, developers were incentivized to just forgo IAP and force users to jump through hoops to find how to give them money; and that in turn wasn’t customer friendly. But in general I much prefer IAP to whatever payment system the developer uses. It makes it so easy to do things like change or cancel any payments I have.
In general I think centralized stores are customer friendly but anti-developer. As a less controversial example, see how many gamers will wait months or years for a game to leave the Epic Games Store and come to Steam.
Centralized stores are only superficially consumer friendly. The store owner is too well positioned to rent seek, and they will inevitably do so -- as Apple in fact is.
With React Native, yes, but not yet with Swift. There have been quite a few requests, to my surprise, though--I figured CloudKit etc. would be sufficient on iOS, but I don't have experience there.
I don't understand the knee-jerk skepticism. This is something they are doing to gain trust and encourage users to use AI on WhatsApp.
WhatsApp didn't use to be end-to-end encrypted; then in 2021 it was - a step in the right direction. Similarly, AI interaction in WhatsApp today is not private, which is something they are trying to improve with this effort - another step in the right direction.
What's the motive behind "to gain trust and encourage users to use AI on WhatsApp"? Meta isn't a charity. You have to question their motives because their motive is to extract value from users who don't pay for the service, and I would say that WhatsApp has proven to be a harder place to extract that value than their other ventures.
btw whatsapp implemented the signal protocol around 2016.
"motive is to extract value out of their users who don't pay for a service"
that is called a business.
If you find something deceitful in the business practice, that should certainly be called out and even prosecuted. I don't see why an effort to improve privacy has to get a skeptical treatment just because of "big business bad", bla bla.
Supabase's decision to take a dependency on Deno caused, IMO, indirect pain to a lot of devs. I have wasted quite a bit of time trying to find or load a package that I needed. And now, with Deno 2.0, apparently everything is Node compatible... I don't know what the whole point was.
We don’t have automated evals, latency, or cost comparisons yet. But, Promptrepo does offer versioning and lets you deploy the same model across providers for comparison. Automating these comparisons is definitely on our roadmap.
Is Best-of-N Sampling standard practice these days in Inference? Sounds expensive on the face of it. I am surprised because I thought the trend was towards cheaper inference.
For reasoning models, this would actually improve exploration efficiency and hence possibly allow higher performance for the same compute budget. As in, if you want to sample multiple rollouts for the same prompt, it's more efficient if the model can produce diverse thought directions and consider them to find the best response, as opposed to going down similar trajectories and wasting compute.
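For reference, the basic best-of-N idea is simple: sample N candidate responses for the same prompt, score each one (typically with a reward model or verifier), and return the top scorer. A minimal sketch with toy stand-ins for the model and scorer (all names here are illustrative, not any provider's API):

```python
def best_of_n(prompt, generate, score, n=4):
    """Sample n candidate completions for the same prompt and keep the
    one the scoring function (e.g. a reward model) likes best."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins: a "model" that cycles through canned answers, and a
# "reward model" that scores by word count.
canned = iter(["short", "a medium answer", "the longest canned answer here"])
pick = best_of_n("prompt", lambda p: next(canned), lambda r: len(r.split()), n=3)
# pick == "the longest canned answer here"
```

The expense the parent asks about is visible right in the loop: cost scales linearly with n, which is why it only pays off when the candidates are actually diverse.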
Almost all of the efficiency gains have come from shedding bit precision, but the problem is that AI labs are now running out of bits to shed. The move to reduced precision inference has been masking the insane unsustainability of compute scaling as a model improvement paradigm.
Is there really a limit on bits to shed? I suspect not.
Take N gates, normalize them, represent them as points on the surface of a hypersphere. Quantize the hypersphere as coarsely as you need to get the precision you want. Want less precision but your quantization is getting too coarse? Increase N.
Fast algebraic codes exist to convert positions on a hyperspheric-ish surface to indexes and vice versa.
Perhaps spherical VQ isn't ideal-- though I suspect it is, since groups of weights often act as rotations naturally-- but some other geometry should be good if not.
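To make the idea above concrete, here is a minimal spherical-VQ toy: normalize a group of N weights, store its norm plus the index of the nearest unit vector in a codebook, and reconstruct from those two numbers. A random codebook stands in for the fast algebraic codes mentioned, and all function names are made up for illustration:

```python
import numpy as np

def spherical_quantize(w, codebook):
    """Quantize a weight group: keep its norm plus the index of the
    closest codebook direction (closest = largest dot product)."""
    norm = np.linalg.norm(w)
    idx = int(np.argmax(codebook @ (w / norm)))
    return norm, idx

def spherical_dequantize(norm, idx, codebook):
    """Reconstruct the group as (stored norm) x (codebook direction)."""
    return norm * codebook[idx]

# Random codebook of K unit vectors in N dimensions. Storage per group is
# log2(K) bits plus one scale, i.e. log2(K)/N bits per weight -- which is
# the "increase N for fewer bits" knob from the comment above.
rng = np.random.default_rng(0)
N, K = 8, 256
codebook = rng.normal(size=(K, N))
codebook /= np.linalg.norm(codebook, axis=1, keepdims=True)

w = rng.normal(size=N)
norm, idx = spherical_quantize(w, codebook)
w_hat = spherical_dequantize(norm, idx, codebook)
```

The reconstruction preserves the group's norm exactly; the angular error depends on how densely the codebook covers the sphere, which is where a structured algebraic code would beat this random one.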