Note that OP's traffic light problem would suffer the same problem. Ie. a smart ...

Note that OP's traffic light problem would suffer the same problem.

Ie. a smart model, knowing it cannot say a word, will give the next best solution - for example maybe saying "A P P L E" or maybe "I'm afraid I'm not able to do that".

However, a constrained model does not know or understand its own constraints, so keeps trying to do things which aren't allowed - and even goes back and tries to redo these things which aren't allowed, because to the model it is a mistake which needs correcting.