After about a month of trying to debug a real-time process control program written in assembler by my boss, I had to resort to desperate measures. The program was about 150 pages long and written in unstructured assembly language. It ran on a minicomputer with the company's own operating system. The program controlled a machine full of hydrofluoric acid that etched semiconductor chips on big slices of silicon. It was a real-time program with a few dozen asynchronous parts and couldn't be slowed down because of the process machinery that it ran.
If it hadn't been for the other junior programmer assigned to the project, I might have just given up in despair, but the two of us would come in on weekends when the clean room was down and crank up the program and look for problems in its execution.
We finally debugged the problems by taking a much larger computer with a fast line printer and using it to do direct memory access to the minicomputer, hex dumping the section of memory containing the bad program's data structures as fast as the line printer could print the pages. As my colleague called out what was going on with the silicon etch machine, I annotated our core dumps as they printed, and we could go back to our desks with the 9-inch-thick listing and figure out what was going wrong.
A core dump is when your transistor-based computer has a fault and has to dump the magnetic cores to tape. You parse it by running a magnetometer along the tape to read the values. Real programmers just harness gamma rays directly.
Sun's "desktop productivity" tools for OpenWindows supported core dumps really well. The file manager skeuomorphically showed a core dump as a red bomb. You could effortlessly double click on the bomb, and it would predictably attempt to load it into the XView text editor, which would then intuitively explode, replacing the core dump with an even bigger, more sophisticated core dump.
It was the dreaded "thick printed output" in the ancient days of IBM mainframes. It meant your program had crashed overnight. In those days you batch compiled and batch ran your program from a punch-card job deck. If you were a poor student you probably ran your jobs at night, because your dollars went three times farther then. Researchers on fat government grants and the school administration could afford day runs.
For the most part these dumps weren't too useful beyond the contents of the machine registers. You could probably figure out where you died in the code. If you were computing a scientific array of numbers in memory, you could probably see how far the computation got and whether it was generating garbage. Also, in those days you bought something like 16 or 64 kilobytes of core, so a full dump printout was enormous.
I think you could tailor the format of the dump with the Job Control Language, but I don't remember.
It feels like a lot of people have an allergic reaction to core files. (Maybe because they remind you that your program just crashed? Or because they seem to take superhuman understanding to work with?) What's great about them is that they decouple the debugging process from restoring service. In production, when something's broken that you have a hunch will be fine if you restart it, you can save a core file, restart it, restore service, and then debug it. In dev and test, when another dev or tester runs into a bug in your area, they can just send you a core file instead of waiting for you to drop everything and debug it right away.
Glad you liked it. Unfortunately, core files can be massive, and debugging them requires a lot of the assets from the environment where the fault occurred. Last but not least, it would be good if they gave us more of a head start on root cause. These are all things we're directly tackling at Backtrace.
That sounds interesting. I heard about you all through Abel at Surge last month and I'm looking forward to checking that out.
I work primarily on illumos, where we do dump program text, and we're working on getting CTF (the Compact C Type Format) everywhere. It's almost always possible to debug without the original binaries, and usually on different systems as well.
And for your next trick, undump: convert the core file into an executable program which recreates the process in the same state it was originally dumped (nearly).
This was (is?) a step in compiling Emacs. Compile the C code into an uninitialized binary of the Lisp interpreter, start the interpreter, load in a bunch of Lisp code, dump core, then undump the core into /usr/bin/emacs.
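From memory, the classic (pre-dump-rework) Emacs build went roughly like this; the exact flags and paths varied by release, so treat this as a sketch rather than the literal recipe:

```shell
# Build the bare, Lisp-less interpreter ("temacs") from the C sources
make temacs

# Run it, load all the editor's Lisp, then dump the running image
# back out as a new executable with that Lisp pre-loaded
./temacs -batch -l loadup dump

# The dumped image is what gets installed as the real editor
install emacs /usr/local/bin/emacs
```

The win was startup time: loading megabytes of Lisp once at build time instead of on every launch.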
Those were the days (1976).