Yes, I'm not sure I really want to vibe code something that runs auto evals on a sample of my OTEL traces any more than I want to build my own analytics library.
I've used and liked Promptfoo a lot, but I've run into issues when trying to do evaluations with too many independent variables. It works great for `models * prompts * variables`, but it broke down when we wanted `models * prompts * variables^x`.
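To put rough numbers on that, here is a minimal sketch of how the case count grows. It reads `variables^x` as `x` independent variables that each take the same number of values, which is my assumption about the notation. `grid_size` and `build_grid` are hypothetical helper names for illustration and are not part of Promptfoo or any other tool.

```python
from itertools import product


def grid_size(n_models, n_prompts, values_per_variable, n_variables):
    """Number of eval cases in a full cross product.

    Assumes every independent variable has the same number of values,
    so the variable part contributes values_per_variable ** n_variables.
    """
    return n_models * n_prompts * values_per_variable ** n_variables


def build_grid(models, prompts, variables):
    """Enumerate every (model, prompt, variable assignment) combination.

    `variables` maps a variable name to its list of candidate values.
    """
    names = list(variables)
    cases = []
    for model, prompt, combo in product(models, prompts, product(*variables.values())):
        cases.append({"model": model, "prompt": prompt, "vars": dict(zip(names, combo))})
    return cases


if __name__ == "__main__":
    # One variable with 5 values: 3 models * 4 prompts * 5 cases.
    print(grid_size(3, 4, 5, 1))
    # Three such variables: 3 models * 4 prompts * 5**3 cases.
    print(grid_size(3, 4, 5, 3))
```

Each extra independent variable multiplies the grid by its number of values, so a few more variables is enough to turn a small matrix into thousands of cases.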
Alternatives to Opik include Braintrust (closed), Promptfoo (open, https://github.com/promptfoo/promptfoo) and Laminar (open, https://github.com/lmnr-ai/lmnr).