It does, but only if you state the theorem in a particular way. Let's say that y...

It does, but only if you state the theorem in a particular way. Let's say that you have a verifier A, that checks theorems in Peano Arithmetic (PA). Inside that logic, you can define what a computer is, how it executes, what PA is, what a verifier is, and what it means for a program to verify theorems of PA. Then you can have a sequence of bytes (verifier B), and prove that they represent a program that verifies theorems of PA.

What have you proven? Well, verifier A (using PA) proves that "verifier B proves theorems of PA". That means that "if B says phi is true, PA |- phi". We would have a hard time actually running verifier B inside the logic of verifier A (that entails running all the steps of the computation B does as theorem applications inside verifier A), but even if we did, we would obtain a proof of "PA |- phi". If we assume A correctly implements PA, then that means PA |- "PA |- phi". In general, this is the best we can do. If we assume further that PA is a sound axiom system, i.e. when it proves facts about numbers then we won't find concrete counterexamples, then this strengthens to PA |- phi and then to phi itself, so we've learned that phi is true.

The plan is to prove inside verifier A (that is, MM0) the statement "Here is a sequence of bytes. It is an executable that when executed has the property that if it succeeds on input "T |- phi", then T |- phi." The bootstrapping part is that the sequence of bytes in question is in fact the code of verifier A. In order to support statements of the form "here is a sequence of bytes", the verifier also has the ability to output bytes as part of its operation (it's not just a yes/no oracle), and assert facts about the encoding of those bytes in the logic.