Minimodem – general-purpose software audio FSK modem (whence.com)
152 points by marcodiego on April 2, 2022 | 37 comments


If you are interested in transmitting data over sound between airgapped devices, make sure to also check out ggwave [0]. I've been working on this library on and off during the past year in my free time. I focused on making an FSK protocol that is robust in noisy environments at the cost of low bandwidth. Found some fun applications in various hobby projects.

[0] https://github.com/ggerganov/ggwave


Could this be used to embed e.g. a sports "game clock" time in broadcast TV/audio/video streams, in order to synchronize an air-gapped device next to the media signal reproduction unit?

For example, at a grill during the game.

FWIU, Chromecast devices, for example, have ultrasonic pairing.


Digital broadcasting adds significant latency so it would be in sync with the TV but still out of sync with the actual game (and other broadcast receivers).

The same issue affects radio time signals (e.g. https://en.wikipedia.org/wiki/Greenwich_Time_Signal, where the last beep is the exact top of the hour), so some broadcasters no longer play them on internet streams/digital radio, rather than be inaccurate.


So live TV could broadcast [ultrasonic] timecodes during ad breaks such that the gameclock would be synchronized even when the viewer initially tunes in during ad breaks, but "Video On Demand" would be fine because it's not the current time timecode, it would be a game time timecode?

Selecting between multiple audible sources [sports bar TVs, radios] might require a somewhat-unique signal code; otherwise only a wired link such as USB would work with more than one game on, because there's naturally noise in that airgap channel.


Yes absolutely! (Edit: or maybe not :-) see sibling comment for more info)

Actually, the "Waver" youtube video linked at the top of the README has an embedded ultrasound transmission at around 0:36. I can for example decode the message on my iPhone by simply running the Waver app and playing the video on my PC.

Some people have already done some work for AV sync in the issues of the project [0].

[0] https://github.com/ggerganov/ggwave/issues/46


https://www.adafruit.com/product/3421 https://learn.adafruit.com/adafruit-i2s-mems-microphone-brea... :

> The I2S is a small, low-cost MEMS mic with a range of about 50Hz - 15KHz, good for just about all general audio recording/detection.

From https://en.wikipedia.org/wiki/Ultrasound :

> Ultrasound devices operate with frequencies from 20 kHz up to several gigahertz.

Raspberry Pi Pico (RP2040), Pi Zero [2 W]: $5+

Power supply, case, wall-mount:

Mic: ~$6

Huge [red] 7-segment display with I2C and voltage regulator components taped to the wall next to the sports bar TVs:


Please, submit it to f-droid!


Is it possible to use ML to have greater bandwidth at lower SNR?


Probably, but I suppose you would need lots of training data in various surrounding conditions and hardware (mic, speakers).

I wanted to experiment with using a mic array in order to improve the SNR. Got a 4 Mic ReSpeaker [0] some time ago, but haven't played with it yet. My expectation is that capturing the audio simultaneously with multiple mics should reduce the noise and thus improve the robustness of the transmission, which would allow the bandwidth to be increased.

[0] https://respeaker.io/4_mic_array/
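
The expectation about multiple mics can be illustrated with a small simulation. This is only a sketch under idealized assumptions that are not from the comment: every mic receives the same, perfectly time-aligned signal, and each adds independent Gaussian noise. The function names, the 1 kHz tone and the 48 kHz sample rate are arbitrary choices for illustration.

```python
import math
import random

def snr_db(signal, noisy):
    """SNR in dB of `noisy` measured against the known clean `signal`."""
    sig_power = sum(s * s for s in signal) / len(signal)
    noise_power = sum((n - s) ** 2 for s, n in zip(signal, noisy)) / len(signal)
    return 10 * math.log10(sig_power / noise_power)

def simulate(num_mics, num_samples=20000, noise_std=1.0, seed=0):
    """Return (SNR of one mic, SNR of the average of all mics), both in dB."""
    rng = random.Random(seed)
    # Clean test tone: 1 kHz sampled at 48 kHz (assumed values).
    signal = [math.sin(2 * math.pi * 1000 * n / 48000) for n in range(num_samples)]
    # Idealized array: same aligned signal at every mic, independent noise per mic.
    mics = [[s + rng.gauss(0, noise_std) for s in signal] for _ in range(num_mics)]
    # Average the mics sample by sample.
    averaged = [sum(column) / num_mics for column in zip(*mics)]
    return snr_db(signal, mics[0]), snr_db(signal, averaged)

if __name__ == "__main__":
    single, combined = simulate(4)
    print(f"one mic: {single:.1f} dB, four mics averaged: {combined:.1f} dB")
```

Under these assumptions, averaging N mics cuts the noise power by a factor of N, so four mics should gain roughly 10*log10(4), about 6 dB. A real array would also need the channels delay-aligned, and room noise is rarely fully independent between mics, so the practical gain is likely smaller.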


I'd imagine training data would be the limiting factor.


Funny that this is top of HN right now. Spent a ton of time in this repo lately trying to understand how modems work. Fascinating how we can use waves to send data (audio or RF). So much of the world relies on these modulation/demodulation approaches.

Also, it was only a few days ago I finally learned that modem is short for MOdulator DEModulator. My mind was blown.


You might be interested to know about the Morse keying requirement for Ham radio licenses before the early 90s.

I think ham radio licenses lost popularity between then and approx 2018. Demonstrating fluency with Morse was necessary for a license long ago. Nowadays, there’s a renewal of interest but many new Ham operators can write a Morse codec faster than they can learn fluency by hand. I think the requirement for demonstrating Morse by hand in order to obtain a HF license will be dropped if it isn’t already.


The Morse requirement has been dropped for all classes of license (in the USA): http://arrl.org/learning-morse-code


Thanks, I guess I’m living in the past


Hah, I had to learn the most basic understanding of Morse for my private pilot certificate. When I finally got to the checkride and had to fly off a VOR radial I asked the examiner if he wanted me to identify the station (by the 3 letter Morse code broadcast) and he said nah.


> I think the requirement for demonstrating Morse by hand in order to obtain a HF license will be dropped if it isn’t already

The Morse requirement was dropped by the FCC for all license classes back in 2006.

But even with SDR making radio experimentation easier and more rewarding than ever, ham radio seems to be continuing its slow decline and the average age of operators continues to rise... and we continue to lose spectrum (e.g. we just lost the 9cm band).


Not required in UK afaik.


The blessing of a recent birth that you stand on the shoulders of giants :)


If you're looking for a library to do FSK (or a variety of other telephony-related protocols), SpanDSP is worth looking into as well. I've used it to decode some radio-based telemetry that uses Bell 202 for data transmission.

https://github.com/freeswitch/spandsp
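
For readers new to FSK, here is a minimal sketch of what a Bell 202 style modulator does: one tone for a 1 bit (mark, 1200 Hz) and another for a 0 bit (space, 2200 Hz), at 1200 baud. It does not use SpanDSP and is not its API. The function name and the 48 kHz sample rate are assumptions for illustration, and framing (start/stop bits) is left out.

```python
import math

MARK_HZ = 1200.0     # Bell 202 tone for a 1 bit
SPACE_HZ = 2200.0    # Bell 202 tone for a 0 bit
BAUD = 1200          # Bell 202 symbol rate
SAMPLE_RATE = 48000  # assumed sample rate, gives 40 samples per bit

def fsk_modulate(bits, sample_rate=SAMPLE_RATE, baud=BAUD):
    """Return float samples in [-1, 1] encoding `bits` as phase-continuous FSK."""
    samples_per_bit = sample_rate // baud
    phase = 0.0
    out = []
    for bit in bits:
        freq = MARK_HZ if bit else SPACE_HZ
        step = 2 * math.pi * freq / sample_rate
        for _ in range(samples_per_bit):
            out.append(math.sin(phase))
            # Carry the phase across bit boundaries so the waveform has no jumps.
            phase = (phase + step) % (2 * math.pi)
    return out

if __name__ == "__main__":
    audio = fsk_modulate([1, 0, 1, 1, 0])
    print(len(audio), "samples")
```

Keeping the phase continuous when the tone switches avoids clicks at bit boundaries, which would otherwise spread energy across the spectrum.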


I've had the "pleasure" of trying to transmit data / timecodes through active speakers with a lot of signal processing.

I don't imagine any QAM-like stuff working. FSK does, but the transitions get a little wonky, and the different frequencies will often have different phase shifts. ASK modulation should work, but haven't gone there yet.

But it's interesting. I know a thing or two (literally) about encoding in the radio spectrum, but none of the encodings I've found deal with systems that are so non-linear.


In my experience, the non-linearity is not the issue. It's the multipath that gets you with these things. Your signal echoes off of everything. I was able to get QAM working with Quiet[1] just fine but only at a few centimeters. FSK is much more reasonable. If you're having issues with intersymbol interference it's probably still multipath getting you. I'd say slow the symbol rate down to avoid it. I've had very good success with FSK with Quiet.

1: https://quiet.github.io/quiet-js
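
The advice to slow the symbol rate can be made concrete with some rough arithmetic. The sketch below assumes sound travels at about 343 m/s (dry air near 20 °C) and uses an arbitrary rule of thumb that a symbol should last about ten times the echo delay. Both function names and the margin are assumptions for illustration, not anything taken from Quiet.

```python
SPEED_OF_SOUND_M_S = 343.0  # approximate, dry air at about 20 degrees C

def echo_delay_s(extra_path_m):
    """Delay of an echo whose path is `extra_path_m` longer than the direct path."""
    return extra_path_m / SPEED_OF_SOUND_M_S

def max_symbol_rate(extra_path_m, margin=10.0):
    """Symbol rate at which one symbol lasts `margin` times the echo delay.

    `margin` is an assumed rule-of-thumb safety factor, not a measured value.
    """
    return 1.0 / (margin * echo_delay_s(extra_path_m))

if __name__ == "__main__":
    for extra in (0.5, 3.43, 10.0):
        delay_ms = echo_delay_s(extra) * 1000
        print(f"extra path {extra} m: echo delay {delay_ms:.2f} ms, "
              f"suggested rate {max_symbol_rate(extra):.1f} symbols/s")
```

For example, an echo that travels 3.43 m further than the direct sound arrives about 10 ms late, so with the assumed margin the symbol rate would be held to around 10 symbols per second. At a few centimeters the delays are tiny, which is consistent with QAM only working at very short range.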


I am proud I contributed to this repo with a PR that substitutes the FFT with a Goertzel filter; it simplifies things a lot.


If you don't mind a question (and there is not much detail on the home page for OP), does this imply frame-based or sliding-window-based processing? If so is that as short as a byte or two and does not introduce much delay (i.e. only a character or two)?

Asking because I want to send single characters (or two) from a controller.


I don't know how the implementation of the Goertzel algorithm is used in Minimodem, but the FFT requires you to process a whole window at a time (which can slide) while the Goertzel algorithm works sample by sample. I'm pretty ignorant about signal processing but I did this in 02019; my notes are at https://dercuano.github.io/notes/goertzel-minsky-pll-prefix-....
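
To make the sample-by-sample point concrete, here is a minimal sketch of the Goertzel recurrence used to decide between two FSK tones. It is not Minimodem's code. The function names, the 48 kHz sample rate and the 1200/2200 Hz tone pair are assumptions for illustration.

```python
import math

def goertzel_power(samples, target_hz, sample_rate):
    """Squared magnitude of `samples` at `target_hz` via the Goertzel recurrence."""
    coeff = 2.0 * math.cos(2.0 * math.pi * target_hz / sample_rate)
    s_prev = 0.0
    s_prev2 = 0.0
    # One multiply and two adds per incoming sample; no buffering is required.
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2 = s_prev
        s_prev = s
    # Power is read out once, at the end of the window.
    return s_prev * s_prev + s_prev2 * s_prev2 - coeff * s_prev * s_prev2

def detect_bit(samples, mark_hz=1200.0, space_hz=2200.0, sample_rate=48000):
    """Return 1 if the mark tone dominates the window, else 0."""
    mark = goertzel_power(samples, mark_hz, sample_rate)
    space = goertzel_power(samples, space_hz, sample_rate)
    return 1 if mark > space else 0

if __name__ == "__main__":
    # One bit period at 1200 baud and 48 kHz is 40 samples.
    tone = [math.sin(2 * math.pi * 1200 * n / 48000) for n in range(40)]
    print(detect_bit(tone))
```

The state is just two numbers per tone, updated as each sample arrives, so the only delay is the length of the window over which the power is read out. For a one-bit window that is a single bit period, which suits sending one or two characters at a time.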


Now someone just needs to do an open source acoustic coupler and we can stage 80s phreaker re-enactments like European medieval societies.


Minimodem was originally created to be able to get debugging information out of a computer with no network connectivity - if audio is working, one can transmit kernel and network driver debug dumps and decode them on another computer.


I think I remember an old patch that would replay the lines of a kernel panic using morse code on the keyboard lights.


The ingenuity of computer hackers is literally on another plane of existence sometimes. It's these kinds of ideas which make me proud to do computer science.


> Minimodem can be used to transfer data between nearby computers using an audio cable (or just via sound waves), or between remote computers using radio, telephone, or another audio communications medium.

It should be noted that any signal that can be modulated and demodulated can be used to exchange information, so either visible or IR light, ultra/infra sound etc. It's just a matter of building the right hardware interface.


Do the audio cables need to have crossed wiring? Microphone wire of one connecting to the speaker wire of another?


Yes, but beware that directly connecting a speaker output to a microphone input can be dangerous for the input port. Computer headphones or small speakers outputs are generally safe because of their low power, but anything that can directly drive more powerful speakers can damage a microphone or line input if directly connected without any attenuation network in between.


Yes, as a wild guess, putting 100 ohm or 1000 ohm resistors in series with each lead might make it safe for inputs and outputs, whether wired correctly or incorrectly. There is no need to keep low impedance to drive a speaker or headphone.
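
A quick way to sanity-check resistor values like these is plain Ohm's law. The sketch below treats the output as an ideal voltage source and the input as a pure resistance. The function name, the 1 V source and the 10 kOhm input impedance are assumptions for illustration, so check the real figures for your hardware.

```python
import math

def series_resistor_effect(v_source, r_series_ohm, r_load_ohm):
    """Return (volts across the load, current in amps, attenuation in dB)."""
    current = v_source / (r_series_ohm + r_load_ohm)
    v_load = current * r_load_ohm
    atten_db = 20 * math.log10(v_load / v_source)
    return v_load, current, atten_db

if __name__ == "__main__":
    # Assumed example: 1 V output, 1000 ohm series resistor, 10 kOhm input.
    v_load, current, atten_db = series_resistor_effect(1.0, 1000.0, 10000.0)
    print(f"{v_load:.3f} V at the input, {current * 1e6:.1f} uA, {atten_db:.2f} dB")
```

With these assumed values the input still sees about 0.91 V, under 1 dB of attenuation. A series resistor alone mostly limits current into a high-impedance input. Lowering the voltage meaningfully would also need a resistor across the input to form a divider.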


If we’re talking analog, those cables usually have capacitors to limit DC and a stereo-mixing resistor because of the headphone impedance (KOhms) and microphone impedance (20-200 Ohms).


If we're posting audio modems, here's another one I've contributed to, Quiet Modem. Haven't contributed to it much lately but happy to answer questions about it.

https://github.com/quiet


I'd like to see this project but trying to not sound 'nasty'.

I.e., I want to fit a few hundred bps of data inaudibly into some nice classical music.

It could be hidden up at 18kHz, but to be honest that still sounds nasty for those of us who have ears that still work that high.

Better would be to hide the data in harmonics and neighbouring frequencies of the music. Human ears, if they are hearing a sine wave at frequency X have reduced sensitivity for audio at frequencies close to X and close to multiples of X. I think you could use that effect to send data much more pleasantly.


What the world needs now is this running in webassembly to relay messages through a websocket or something.


I would so love to have a terminal+modem for my mobile phone.



