You can’t strip arbitrary words from the input because you can’t assume anything about their role in the sentence. The word could be an explicit part of the question ("how do you say please in French?") or a piece of data the user is asking about.
Each call goes through an LLM-lite categorizer (NNUE mixed with deep learning), and the resulting body has something along the lines of a "politenessNeededForSense: boolean" field. If it is false, you can trust that we strip all politeness before engaging with Claude 4. Saved roughly $13,000,000 this FY.
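If that's roughly the shape of it, the gate is just a cheap classifier sitting in front of the expensive call. A minimal sketch, where the classifier, the field name, and the token list are all stand-ins rather than anything real:

```python
# Illustrative only: the categorizer, field name, and token set are hypothetical stand-ins.
POLITENESS_TOKENS = {"please", "thanks", "kindly"}

def strip_politeness(prompt: str) -> str:
    """Remove standalone politeness words from the prompt text."""
    return " ".join(
        w for w in prompt.split()
        if w.lower().strip(",.!?") not in POLITENESS_TOKENS
    )

def route_prompt(prompt: str, categorize, main_model) -> str:
    """Gate the expensive model call behind a cheap politeness check.

    `categorize` is assumed to return something like
    {"politenessNeededForSense": bool}; `main_model` is the expensive call.
    """
    verdict = categorize(prompt)
    if not verdict.get("politenessNeededForSense", True):
        prompt = strip_politeness(prompt)  # flagged as filler, safe to drop
    return main_model(prompt)
```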
Seems like you could detect whether it matters. If it's the first or last word, the user is addressing the model and you can strip it; otherwise you can't.
It's a very naive heuristic, but worth looking into. You could always test it if one word is really costing that much money, or build another, smaller model that detects whether the word is part of the important content or not.
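The position check is only a few lines either way, so it's cheap to trial. A naive sketch of what you'd test (token set and function name are just illustrative):

```python
# Naive first/last-word heuristic; the token set is illustrative.
POLITENESS = {"please", "kindly", "thanks"}

def maybe_strip(prompt: str) -> str:
    """Drop a politeness word only when it is the very first or very last word."""
    words = prompt.split()
    if not words:
        return prompt
    if words[0].lower().strip(",.!?") in POLITENESS:
        words = words[1:]
    elif words[-1].lower().strip(",.!?") in POLITENESS:
        words = words[:-1]
    return " ".join(words)
```

So "Please summarize this report" loses the leading word, while "How do you say please in French?" keeps it because it isn't at an edge, which is exactly the case the objection above is worried about. It will still mis-fire on prompts that genuinely end with the word as content, which is why you'd want to measure it before trusting it.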
There are hundreds of other opportunities for cost savings and efficiency gains that don’t have a visible UX impact. The trade-off just isn’t worth it outside of some very specialized scenarios where the user is sophisticated enough to deliberately omit the word anyway.