It looks great! Although the demo shows horrible security practices...
Clearly authentication shouldn't rely on prompt engineering.
Particularly when, at the end of the demo, it says "we have tested it again and now it shows that the security issue is fixed". No, it's not fixed! It's hidden! It's still a gaping security hole. Clearly just a very bad example, particularly considering the context is banking.
Appreciate the feedback! Completely agree - authentication should be handled at the system level, not just in prompts. This demo is meant to showcase how teams can build test cases from real failures and ensure fixes work before deployment. We’ll consider using a better example.
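To make "handled at the system level" concrete, here is a minimal sketch of the idea, not anything taken from the demo or from Roark. Every name in it (`Session`, `get_balance`, `NotVerifiedError`, the in-memory `_BALANCES` data) is hypothetical. The point is only that the identity check runs in the code behind the tool the agent calls, so no prompt wording can skip it.

```python
from dataclasses import dataclass


class NotVerifiedError(Exception):
    """Raised when a sensitive tool is called without a verified caller."""


@dataclass
class Session:
    caller_id: str
    # Assumption: this flag is set only by a backend verification step
    # (for example an OTP check), never by the language model itself.
    identity_verified: bool = False


# Hypothetical in-memory data standing in for a real banking backend.
_BALANCES = {"alice": 1250.00}


def get_balance(session: Session, account_id: str) -> float:
    """Tool exposed to the agent; access is enforced here, not in the prompt."""
    if not session.identity_verified:
        raise NotVerifiedError("identity not verified")
    if session.caller_id != account_id:
        raise NotVerifiedError("caller may not access this account")
    return _BALANCES[account_id]
```

With a gate like this, a prompt regression can make the agent ask for the balance too early, but the call fails instead of leaking data. A test suite built from real failures would then be checking conversational behavior, not acting as the security boundary.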
Appreciate the feedback! To clarify, Roark isn’t handling authentication itself - it’s a testing and observability tool to help teams catch when their AI fails to follow expected security protocols (like verifying identity before sharing sensitive info).
That said, totally fair point that this example could be clearer. We’ll keep that in mind for future demos. Thanks for calling it out!