justanotheratom's comments

Yay! Do you use Gemini in the Gemini app, AI Studio, or Vertex AI?


I am Don Quixote, building an app that abstracts over models (i.e. allows user choice) while providing a user-controlled set of tools, and letting users write their own "scripts", i.e. precanned dialogue/response steps, to enable e.g. building a search flow.

Which is probably what makes me so cranky here. It's very hard keeping track of all of it and doing my best to lever up the models behind Claude's agentic capabilities, and all the Newspeak from Google PR makes it consume almost as much energy as the rest of the providers combined. (I'm very frustrated that I didn't realize until yesterday that 2.0 Flash had quietly gone from 10 RPM to "you can actually use it".)

I'm a Xoogler and I get why this happens ("preview" is a magic wand that means "you don't have to get everyone in bureaucracy across DeepMind/Cloud/? to agree to get this done and fill out their damn launchcal"), but, man.


Apple is generally on the side of customers, but this is a clear example of how customer-hostile their policy was. As a customer, I had to jump through hoops to buy a book on their premium platform.


> Apple is generally on the side of customers,

I'm not convinced;

Planned obsolescence, repair restrictions, phone-home privacy issues, vendor lock-in, etc.

Apple is certainly innovative which helps consumers, but that's about it. The rest is bare minimum for the price point.


Apple Maps, "Intelligence", Siri, etc. all run on-device because Apple is in the business of selling devices. As many as they can. Whereas Google is in the business of selling you to advertisers.

It's literally a major difference in their fundamental business models.


Apple does both. Under Tim Cook, services have become nearly as profitable as hardware, and the price of making services compared to hardware is comically low. That's why we have AppleTV and Apple News and Apple Arcade, for all they're worth as a motley crew of subscriptions.


I've never had to buy a service plan for any of my PC laptops.

OTOH, every Apple owner I know has an AppleCare plan. And has had to use it. Multiple times. For nearly every device they own. And this is a sample population of over a thousand over more than a decade.

Yes, Apple service is great. I couldn't tell you what Dell Service is like, or Lenovo, or HP, because I've never had to use them. PC laptops and Android phones...just work.

And given that Androids are more popular in the parts of the world where reliability is essential, it's pretty clear that most of the world agrees that Apples are the inferior device.


Good luck trying to paint the picture that AppleCare is a waste of money. I've owned hundreds of Apple products since AppleCare became a thing. I've "had to use it" once, and that one use replaced a $4500 product for $0. The ROI on AppleCare always works out.


If you've had AppleCare since it became a thing, you've basically paid the same price for AppleCare over that time as it would have cost to replace the $4500 product on its own.

Android devices and PC laptops don't need service plans. But as you've demonstrated, Apple products do. Even the $4500 ones.


There was also a conflict of interest, considering Apple has their own bookstore.


I guess this is a side-effect where, as things stood, developers were incentivized to just forgo IAP and force users to jump through hoops to find how to give them money; and that in turn wasn’t customer friendly. But in general I much prefer IAP to whatever payment system the developer uses. It makes it so easy to do things like change or cancel any payments I have.

In general I think centralized stores are customer friendly but anti developer. As a less controversial example, see how many gamers will wait months or years for a game to leave the Epic game store and go on Steam.


Centralized stores are only superficially consumer friendly. The store owner is too well positioned to rent seek, and they will inevitably do so -- as Apple in fact is.


Steam doesn't impose anti-competitive measures on games, though.


Maybe run it through a few other LLMs depending on how much confidence you need: o3-pro, Gemini 2.5 Pro, Claude 3.7, Grok 3, etc.


Then you need to be able to formally prove the equivalence of various TLA+ programs (maybe that's a solved problem?)


No idea about SOTA but naively it doesn't seem like a very difficult problem:

- Ensure all TLA+ specs produced have the same inputs/outputs (domains; mostly a prompting problem and can be solved with retries)

- Check that all TLA+ specs produce the same outputs for the same inputs (making them functionally equivalent in practice; this might be computationally intensive)

Of course that assumes your input domains are countable but it's probably okay to sample from large ranges for a certain "level" of equivalence.

EDIT: Not sure how that will work with non-determinism though.
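The sampling idea above can be sketched in plain Python rather than TLA+, treating two specs as black-box functions (the two "spec" functions here are hypothetical stand-ins, and the whole approach assumes deterministic specs, per the EDIT above):

```python
import random

def equivalent_by_sampling(spec_a, spec_b, domain, n_samples=1000, seed=0):
    """Probabilistic functional-equivalence check: sample inputs from the
    shared domain and compare outputs. Passing is only evidence of
    equivalence, not proof -- and it assumes both specs are deterministic."""
    rng = random.Random(seed)
    for _ in range(n_samples):
        x = rng.choice(domain)
        if spec_a(x) != spec_b(x):
            return False, x  # counterexample found
    return True, None

# Two toy "specs" that agree everywhere on the sampled domain.
spec1 = lambda n: n * (n + 1) // 2   # closed form
spec2 = lambda n: sum(range(n + 1))  # iterative form

ok, cex = equivalent_by_sampling(spec1, spec2, domain=list(range(1000)))
```

For a real guarantee you'd want refinement checking in a model checker like TLC rather than sampling, but this gives the "certain level of equivalence" the comment mentions.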


I didn't mean generating separate TLA+ programs. Rather, other LLMs review and comment on whether this TLA+ program satisfies the user's specification.


can websets enrich a column with images?


There aren't currently any vision LLMs involved. But if you asked for image links, it'd probably find you something!


Triplit sounds awesome. Any precedent of someone using it from a native iOS app?


With React Native yes, but not yet with Swift. There have been quite a few requests, to my surprise though--I figured CloudKit, etc. would be sufficient on iOS, but I don't have experience there.


CloudKit is not cross-platform.


I don't understand the knee-jerk skepticism. This is something they are doing to gain trust and encourage users to use AI on WhatsApp.

WhatsApp was not always end-to-end encrypted; then in 2021 it was - a step in the right direction. Similarly, AI interaction in WhatsApp today is not private, which is something they are trying to improve with this effort - another step in the right direction.


What's the motive "to gain trust and encourage users to use AI on WhatsApp"? Meta aren't a charity. You have to question their motives because their motive is to extract value out of their users who don't pay for a service, and I would say that whatsapp has proven to be a harder place to extract that value than their other ventures.

BTW, WhatsApp implemented the Signal protocol around 2016.


"motive is to extract value out of their users who don't pay for a service" that is called a business.

If you find something deceitful in the business practice, that should certainly be called out and even prosecuted. I don't see why an effort to improve privacy has to get skeptical treatment just because "big business bad", blah blah.


Privacy was reduced from where it already stood by the introduction of an AI assistant to an E2E messaging app.

Had they not included it in the first place they would then not have to 'improve privacy' by reworking the AI.

I agree with OP and am highly sceptical of Meta's motives.


You would be correct to be skeptical when they introduced AI into conversations, which btw is opt-in.


Supabase's decision to take a dependency on Deno caused, IMO, indirect pain to a lot of devs. I wasted quite a bit of time trying to find or load a package that I needed. And now, with Deno 2.0, apparently everything is Node-compatible... I don't know what the whole point was.


Is there a well-established toolchain for fine-tuning these models?


Unsloth. Check their Colab notebooks.


A fine-tuning ops platform requirement: compare evals, latency, and cost across models.


We don’t have automated evals, latency, or cost comparisons yet. But, Promptrepo does offer versioning and lets you deploy the same model across providers for comparison. Automating these comparisons is definitely on our roadmap.


Is Best-of-N Sampling standard practice these days in Inference? Sounds expensive on the face of it. I am surprised because I thought the trend was towards cheaper inference.


For reasoning models, this would actually improve exploration efficiency and hence possibly allow higher performance for the same compute budget. That is, if you want to sample multiple rollouts for the same prompt, it's more efficient if the model can produce diverse thought directions and consider them to find the best response, as opposed to going down similar trajectories and wasting compute.
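Best-of-N itself is simple to state: draw N candidate responses and keep the one a scoring function likes best. A minimal sketch, with a stand-in sampler and reward function (both hypothetical; real systems use an LLM sampler and a learned reward model):

```python
import random

def best_of_n(sample_fn, reward_fn, n=8, seed=0):
    """Draw n candidates and return the highest-scoring one.
    Cost scales linearly with n -- this trades inference
    compute for response quality."""
    rng = random.Random(seed)
    candidates = [sample_fn(rng) for _ in range(n)]
    return max(candidates, key=reward_fn)

# Stand-ins: "responses" are numbers; the reward prefers values near 42.
sample_fn = lambda rng: rng.uniform(0, 100)
reward_fn = lambda x: -abs(x - 42)

best = best_of_n(sample_fn, reward_fn, n=16)
```

The diversity point above shows up here too: if `sample_fn` kept returning near-identical candidates, raising `n` would buy almost nothing.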


Not standard but one of several techniques, you can see them in our open source inference proxy - https://github.com/codelion/optillm

Cerebras has used optillm for optimising inference with techniques like CePO and LongCePO.


Almost all of the efficiency gains have come from shedding bit precision, but the problem is that AI labs are now running out of bits to shed. The move to reduced precision inference has been masking the insane unsustainability of compute scaling as a model improvement paradigm.


Is there really a limit on bits to shed? I suspect not.

Take N gates, normalize them, represent them as points on the surface of a hypersphere. Quantize the hypersphere as coarsely as you need to get the precision you want. Want less precision but your quantization is getting too coarse? Increase N.

Fast algebraic codes exist to convert positions on a hyperspheric-ish surfaces to indexes and vice versa.

Perhaps spherical VQ isn't ideal-- though I suspect it is, since groups of weights often act as rotations naturally-- but some other geometry should be good if not.
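A toy version of that scheme, assuming a random spherical codebook rather than a proper fast algebraic code: normalize a group of N weights onto the unit hypersphere, snap its direction to the nearest codebook entry, and store only the index plus one scalar norm.

```python
import math, random

def normalize(v):
    s = math.sqrt(sum(x * x for x in v))
    return [x / s for x in v], s

def quantize_group(weights, codebook):
    """Spherical VQ sketch: store (codebook index, scalar norm) instead of
    full-precision weights. For a fixed codebook size K, bits per weight
    (log2(K)/N plus the shared norm) shrink as the group size N grows."""
    direction, norm = normalize(weights)
    # Nearest codebook direction by cosine similarity (brute force here;
    # an algebraic code would compute the index directly).
    idx = max(range(len(codebook)),
              key=lambda i: sum(a * b for a, b in zip(codebook[i], direction)))
    return idx, norm

def dequantize_group(idx, norm, codebook):
    return [norm * x for x in codebook[idx]]

# Random unit-vector codebook (hypothetical; a real scheme uses structure).
rng = random.Random(0)
N, K = 8, 256
codebook = [normalize([rng.gauss(0, 1) for _ in range(N)])[0] for _ in range(K)]

w = [rng.gauss(0, 1) for _ in range(N)]
idx, norm = quantize_group(w, codebook)
w_hat = dequantize_group(idx, norm, codebook)
```

Here 8 weights compress to an 8-bit index plus one norm; increasing N while keeping K fixed coarsens the sphere quantization exactly as the comment describes.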

