You can’t strip arbitrary words from the input because you can’t assume anything about their role in the sentence. The word could be an explicit part of the question ("how do you say please in French?") or a piece of data the user is asking about.
Each call goes through an LLM-lite categorizer (NNUE mixed with deep learning), and the resulting body has something along the lines of a "politenessNeededForSense: boolean" field. If it is false, you can trust that we strip all politeness before engaging with Claude 4. Saved roughly $13,000,000 this FY.
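If that's roughly the shape of it, the gate is just a cheap classifier sitting in front of the expensive call. A minimal sketch, where the classifier, the field name, and the token list are all stand-ins rather than anything real:

```python
# Illustrative only: the categorizer, field name, and token set are hypothetical stand-ins.
POLITENESS_TOKENS = {"please", "thanks", "kindly"}

def strip_politeness(prompt: str) -> str:
    """Remove standalone politeness words from the prompt text."""
    return " ".join(
        w for w in prompt.split()
        if w.lower().strip(",.!?") not in POLITENESS_TOKENS
    )

def route_prompt(prompt: str, categorize, main_model) -> str:
    """Gate the expensive model call behind a cheap politeness check.

    `categorize` is assumed to return something like
    {"politenessNeededForSense": bool}; `main_model` is the expensive call.
    """
    verdict = categorize(prompt)
    if not verdict.get("politenessNeededForSense", True):
        prompt = strip_politeness(prompt)  # flagged as filler, safe to drop
    return main_model(prompt)
```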
Seems like you could detect whether it matters. If it's the first or last word, the user is addressing the model and you can strip it; otherwise you can't.
It's a very naive heuristic, but worth looking into. You could always test it if one word is really costing that much money, or build another, smaller model that detects whether the word is part of the important content or not.
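The position check is only a few lines either way, so it's cheap to trial. A naive sketch of what you'd test (token set and function name are just illustrative):

```python
# Naive first/last-word heuristic; the token set is illustrative.
POLITENESS = {"please", "kindly", "thanks"}

def maybe_strip(prompt: str) -> str:
    """Drop a politeness word only when it is the very first or very last word."""
    words = prompt.split()
    if not words:
        return prompt
    if words[0].lower().strip(",.!?") in POLITENESS:
        words = words[1:]
    elif words[-1].lower().strip(",.!?") in POLITENESS:
        words = words[:-1]
    return " ".join(words)
```

So "Please summarize this report" loses the leading word, while "How do you say please in French?" keeps it because it isn't at an edge, which is exactly the case the objection above is worried about. It will still mis-fire on prompts that genuinely end with the word as content, which is why you'd want to measure it before trusting it.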
There are hundreds of other opportunities for cost savings and efficiency gains that don’t have a visible UX impact. The trade-off just isn’t worth it outside of some very specialized scenarios where the user is sophisticated enough to deliberately omit the word anyway.