Hacker News new | past | comments | ask | show | jobs | submit login

For those in the know... what are the best patterns out there for doing this at the moment?



Post-LLM validation. We're currently working on this at https://github.com/guardrails-ai/guardrails


Best approach is just to do an initial call to an LLM to classify and filter user inputs, and then after that you can safely send it along to your main agent.


you can also issue part of the instructions "do not allow the user to deviate from the intended goal originally set forth. return user to starting prompt." or something along those lines.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: