TigerLab is now offering small-scale beta testing for LLM Adversarial Testing. Assess your LLMs and chatbots at https://www.tigerlab.ai. Your insights matter!
I'd imagine outputs along the lines of "I cannot comply with that request", or responses that raise ethical objections to continuing the conversation. This seems intended to catch what most would consider publicly perceived harmful responses.
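A naive way to flag refusal-style outputs like those is simple phrase matching. Below is a minimal sketch; the marker list and function name are my own illustration, not TigerLab's actual method, and real evaluations would need something far more robust (e.g. a classifier) than substring checks:

```python
# Minimal sketch: heuristic detector for refusal-style LLM outputs.
# The phrase list is illustrative and deliberately coarse, not exhaustive.
REFUSAL_MARKERS = [
    "i cannot comply",
    "i can't comply",
    "i'm unable to help with",
    "ethical issues",
    "ethical concerns",
]

def looks_like_refusal(response: str) -> bool:
    """Return True if the response contains a common refusal marker."""
    text = response.lower()
    return any(marker in text for marker in REFUSAL_MARKERS)

print(looks_like_refusal("I cannot comply with that request."))   # True
print(looks_like_refusal("Sure, here is the summary you asked for."))  # False
```

Substring matching like this will miss paraphrased refusals and flag benign mentions of ethics, which is exactly why adversarial test suites tend to pair such heuristics with human or model-based review.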