This tool looks really powerful, thanks for the link! One thing I've been person...

This tool looks really powerful, thanks for the link!

One thing I've been personally really intrigued by is the possibility of using self-play and adversarial learning as a way to advance beyond our current stage of imitation-only LLMs.

Having a strong rules-based framework to be able to be able to measure quality and correctness of solutions is necessary for any RL training setup to proceed. I think that skidl could be a really nice framework to be part of an RL-trained LLM's curriculum!

I've written down a bunch of thoughts [1] on using games or code-generation in an adversarial training setup, but I could see circuit design being a good training ground as well!

* [1] https://github.com/HanClinto/MENTAT