Thanks! Alex had been studying how to get better compression by repacking the tiles, based on a scheme I still haven't quite figured out, but we were limited by how much data the CPU could push to the PPU during vblank. The MMC3 solution let us basically DMA our way around the problem and push a much larger screen update far more often. The original version ripped through playback so fast that we had to slow it down to get a decent number of frames into the available 32 CHR ROM banks (it's a mapper 4 NES ROM).
The homebrew community was a fantastic resource! Shoutouts to nesdev, Shiru, and the countless articles we googled!