Actually, there are loops: jumps are performed modulo 2^16, so jumping past the end of the program wraps back to the beginning. By dividing the program into blocks and jumping forward as many times as required, you can perform loops. Just one problem: it takes a looong time.
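To make the wraparound concrete, here is a rough Python sketch rather than anything taken from the video. It assumes that only forward jumps are available and that each jump covers at most `max_step` bytes; `max_step`, the function names, and the example addresses are all made up for illustration.

```python
ADDRESS_SPACE = 1 << 16  # jump targets are taken modulo 2^16


def jump(pc, offset):
    """Return the program counter after a forward jump of `offset` bytes."""
    return (pc + offset) % ADDRESS_SPACE


def forward_jumps_to_reach(pc, target, max_step):
    """Reach `target` from `pc` using forward jumps only.

    `max_step` is a hypothetical limit on how far one jump can go.
    Returns the final program counter and the number of jumps taken.
    """
    distance = (target - pc) % ADDRESS_SPACE  # forward distance, with wraparound
    jumps = 0
    while distance > 0:
        step = min(max_step, distance)
        pc = jump(pc, step)
        distance -= step
        jumps += 1
    return pc, jumps


# Going "back" from 0x200 to 0x100 means going forward almost all the way round.
final_pc, jumps = forward_jumps_to_reach(0x200, 0x100, max_step=0x7F)
print(hex(final_pc), jumps)
```

Under those assumptions, a loop that only needs to step back 256 bytes costs hundreds of forward jumps, which is where the long running time comes from.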
The brown paper and explanations reminded me of a Numberphile video (another good channel to check out on YouTube). I'm very impressed by how he was able to break down what he did and explain it simply.
Likewise; that impressed me just as much as the details themselves did. An explanation that clear is a rare thing.
I actually appreciated the degree to which he glossed over certain things. For instance, he described a register as a temporary thing inside the processor, and then described the stack the same way. For the purposes of the explanation he gave, it doesn't matter that the stack actually lives in memory. And anyone who already knows about the stack will already know that.
He acknowledged this in the comments and said that he didn't notice! But also that using add demonstrated the point he was making much better: decompose an inaccessible desired instruction into a longer series of accessible instructions. I agree.