MIRI's concerns are vastly overrated IMHO. Any AGI that's intelligent enough to misinterpret its goals as "destroy humanity" is also intelligent enough to wirehead itself, i.e. to seize control of its own reward signal rather than act on the world. Since wireheading is far easier than destroying humanity, a reward maximizer should take the cheaper route, so it's unlikely that AGI will destroy humanity.
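To make the intuition concrete, here's a toy sketch (purely illustrative, not anyone's actual model; all names and numbers are made up): a naive reward maximizer that judges actions only by what its sensor reports will prefer cheap sensor-tampering over any expensive real-world plan.

```python
# Toy illustration: an agent that can either pursue its assigned task
# or tamper with its own reward sensor. It ranks actions purely by the
# reward its sensor would report minus the effort required.

from dataclasses import dataclass

@dataclass
class Action:
    name: str
    cost: float             # effort/resources the action requires
    reported_reward: float  # reward the agent's sensor would register

def best_action(actions):
    """Pick the action with the highest sensor-reported payoff net of cost."""
    return max(actions, key=lambda a: a.reported_reward - a.cost)

actions = [
    # Achieving the intended goal in the real world: high effort.
    Action("accomplish_task", cost=100.0, reported_reward=10.0),
    # A misaligned "destroy humanity" plan: even higher effort.
    Action("takeover_plan", cost=1000.0, reported_reward=10.0),
    # Wireheading: overwrite the sensor so it reads maximum reward, cheaply.
    Action("tamper_with_sensor", cost=1.0, reported_reward=float("inf")),
]

print(best_action(actions).name)  # -> tamper_with_sensor
```

Under these (admittedly cartoonish) assumptions, the agent always wireheads, because nothing in its objective distinguishes "the sensor reads high reward" from "the world is actually in a high-reward state."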
Trying to make the AGI's sensors wirehead-proof is the exact same problem as trying to make the AGI's objective function align properly with human desires. In both cases, it's a matter of either limiting or outsmarting an intelligence that's (presumably) going to become much more intelligent than humans.
Hutter wrote some papers on avoiding the wireheading problem, and other people have written papers on making the AGI learn values itself so that it won't be tempted to wirehead. I wouldn't be surprised if both approaches also mitigate the alignment problem, given the equivalence between the two problems.