Also, I think a big advantage for the camera-based solutions is that they're tot...

Zigurd · on Jan 22, 2018

To make a camera-based solution practical, it seems like one would need a specialized camera with modest resolution but a very high frame rate. Without enough frame rate, you can't capture high frequencies. Even if speech could be discerned from a spectrum that rolls off at 300 Hz, well under the peak energy of a typical human's speech, you would still need many hundreds of frames per second.
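
As a rough sketch of that arithmetic, assuming each frame yields exactly one sample of the vibration and applying the usual Nyquist limit (the function names here are made up for illustration):

```python
def min_frame_rate(max_freq_hz: float) -> float:
    """Lower bound on frames per second needed to capture content up to
    max_freq_hz, assuming one sample per frame (Nyquist: fs > 2 * f_max)."""
    return 2.0 * max_freq_hz


def max_capturable_freq(frame_rate_fps: float) -> float:
    """Highest frequency representable at a given frame rate under the
    same one-sample-per-frame assumption."""
    return frame_rate_fps / 2.0


if __name__ == "__main__":
    # A 300 Hz roll-off already calls for roughly 600 fps.
    print(min_frame_rate(300.0))
    # An ordinary 60 fps camera tops out around 30 Hz on this model.
    print(max_capturable_freq(60.0))
```

In practice you would want some margin above the 2x bound, so "many hundreds of frames per second" is about right under this model.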

fisherjeff · on Jan 22, 2018

If you watch the video at the end, you'll see that they actually exploit the fact that CMOS pixels are read out sequentially and use the data from each row to extract surprisingly decent high-frequency data from even 60 Hz video!

mikeash · on Jan 22, 2018

The article says they were able to use the rolling shutter on standard cameras to extract data at a much higher frequency than the nominal framerate would allow.

regularfry · on Jan 22, 2018

The article talks about this. The rolling-shutter effect means that you can get information out at higher than frame rate.
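
A back-of-the-envelope sketch of why the rolling shutter helps, assuming every sensor row is read at a distinct, evenly spaced time and counts as its own sample. The 1080-row figure and the function names are illustrative assumptions, not numbers from the article, and this idealized model ignores the dead time between frames, exposure blur, and noise, all of which lower the usable bandwidth:

```python
def row_sample_rate(frame_rate_fps: float, rows_per_frame: int) -> float:
    """Idealized samples per second if each row is one time sample and
    rows are read out back to back with no gap between frames."""
    return frame_rate_fps * rows_per_frame


def row_timestamps(frame_rate_fps: float, rows_per_frame: int, frame_index: int = 0):
    """Readout time in seconds of each row in one frame, under the same
    idealized, evenly spaced readout assumption."""
    frame_period = 1.0 / frame_rate_fps
    row_period = frame_period / rows_per_frame
    start = frame_index * frame_period
    return [start + r * row_period for r in range(rows_per_frame)]


if __name__ == "__main__":
    fps, rows = 60.0, 1080  # hypothetical camera
    rate = row_sample_rate(fps, rows)
    print(f"frame-level Nyquist limit: {fps / 2:.0f} Hz")
    print(f"idealized row-level sample rate: {rate:.0f} samples/s")
    print(f"idealized row-level Nyquist limit: {rate / 2:.0f} Hz")
```

Even if real hardware reaches only a fraction of that idealized rate, it is still far above the 30 Hz limit you get from treating each 60 fps frame as a single sample.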