What I wonder about is whether these models have some secret triggers for particular malicious behaviors, or if that's possible. Like if you provide a code base that had some hints that the code involves military or government networks, whether the model would try to sneak in malicious but obsfucated code with it's output