Why couldn't you switch between input and output fast enough that the switching is inaudible, prioritizing output, and still get low-fidelity but usable input?
Producing output requires vibrating the headphone's driver elements. That vibration completely swamps any vibration induced by sound in the room. Cutting the output for long enough to let those vibrations die down would certainly be noticeable.
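
To put a rough number on "long enough", here is a minimal sketch that treats the driver as a simple underdamped resonator, which is a simplification of a real headphone element. The resonant frequency, Q factor, and required decay below are hypothetical illustration values, not measurements of any actual headphone.

```python
import math

def ring_down_time(f0_hz, q, drop_db):
    """Seconds for a damped resonator's free vibration to decay by drop_db.

    Assumes an underdamped second-order resonator (Q > 0.5), whose
    amplitude envelope after the drive stops is exp(-pi * f0 * t / Q).
    """
    # Solve exp(-pi * f0 * t / Q) = 10 ** (-drop_db / 20) for t.
    return (drop_db / 20.0) * math.log(10) * q / (math.pi * f0_hz)

# Hypothetical driver: 100 Hz resonance, Q = 1, and the leftover
# vibration must fall 60 dB before room sound is distinguishable.
t = ring_down_time(100.0, 1.0, 60.0)
print(f"{t * 1000:.1f} ms of silence per switch")
```

With these made-up values the wait comes out to roughly 22 ms per switch, and that is before spending any time actually listening. A higher Q or a lower resonance makes it proportionally longer.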