It has a ton of programming books in its training data. It only "runs" code that's close enough to samples it has seen with their output included. Give it anything complex and it fails, because it doesn't reason through the code logically. It's bad at the same things humans are bad at.
Human programmers rely on intuition and experience much more than some people give them credit for. An experienced programmer can find common errors quickly, simply because they’ve seen (and made) so many.
Being able to intuit what a block of code does is a core skill; stepping through code in your head is slow and difficult.