Neat, but static ads are slowly becoming a thing of the past... Companies like Acast and Midroll are building solutions that inject dynamic ads into podcasts at the time of stream/download.
Sounds like that's potentially solvable by breaking the podcast down into chunks by speaking voice, then flagging any ~30s sections whose speaking voice differs from the rest of the episode. Guest speakers shouldn't get caught by this, as there'd be back-and-forth conversation rather than a mostly unbroken 30s chunk.
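A rough sketch of that heuristic, assuming some speaker-diarization step has already produced `(start, end, speaker)` segments. The function name `flag_ad_candidates` and every threshold (20-45s run length, 10% share of talk time, 1s pause tolerance) are made up for illustration and would need tuning on real episodes:

```python
from collections import defaultdict

def flag_ad_candidates(segments, min_len=20.0, max_len=45.0,
                       max_share=0.10, max_gap=1.0):
    """Flag unbroken ~30s runs spoken by a voice that is rare in the episode.

    segments: list of (start_sec, end_sec, speaker_label), sorted by start.
    All thresholds are illustrative guesses, not tuned values.
    """
    # Merge back-to-back segments from the same speaker (short pauses allowed)
    # so that one ad read becomes one continuous run.
    runs = []
    for start, end, speaker in segments:
        if runs and runs[-1][2] == speaker and start - runs[-1][1] <= max_gap:
            runs[-1] = (runs[-1][0], end, speaker)
        else:
            runs.append((start, end, speaker))

    # Total talk time per speaker across the whole episode.
    talk = defaultdict(float)
    for start, end, speaker in runs:
        talk[speaker] += end - start
    total = sum(talk.values())
    if total == 0:
        return []

    # Keep runs that are roughly ad-length AND come from a voice that
    # accounts for only a small share of the episode.
    flagged = []
    for start, end, speaker in runs:
        length = end - start
        if min_len <= length <= max_len and talk[speaker] / total <= max_share:
            flagged.append((start, end, speaker))
    return flagged


# Toy episode: the host talks throughout, an unfamiliar voice reads for 30s,
# and a guest chimes in with short turns.
episode = [
    (0, 300, "host"), (300, 330, "ad_voice"), (330, 400, "host"),
    (400, 410, "guest"), (410, 420, "host"), (420, 428, "guest"),
    (428, 600, "host"),
]
candidates = flag_ad_candidates(episode)
```

The guest's short turns fall under the length cutoff, so only the 30s `ad_voice` run is flagged. An ad read by the host in their own voice would slip through, since nothing about the voice differs from the rest of the episode.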