After about a month of trying to debug a real-time process control program written in assembler by my boss, I had to resort to desperate measures. The program was about 150 pages long and written in unstructured assembly language. It ran on a minicomputer with the company's own operating system. The program controlled a machine full of hydrofluoric acid that etched semiconductor chips on big slices of silicon. It was a real-time program with a few dozen asynchronous parts and couldn't be slowed down because of the process machinery that it ran.
If it hadn't been for the other junior programmer assigned to the project, I might have just given up in despair, but the two of us would come in on weekends when the clean room was down and crank up the program and look for problems in its execution.
We finally debugged the problems by taking a much larger computer with a fast line printer and using it to do direct memory access to the minicomputer, hex dumping the section of memory containing the bad program's data structures as fast as the line printer could print the pages. As my colleague called out what was going on with the silicon etch machine, I annotated our core dumps as they printed, and we could go back to our desks with the 9-inch-thick listing and figure out what was going wrong.
A core dump is when your transistor-based computer has a fault and has to dump the magnetic cores to tape. You parse it by running a magnetometer along the tape to read the values. Real programmers just harness gamma rays directly.
Sun's "desktop productivity" tools for OpenWindows supported core dumps really well. The file manager skeuomorphically showed a core dump as a red bomb. You could effortlessly double click on the bomb, and it would predictably attempt to load it into the XView text editor, which would then intuitively explode, replacing the core dump with an even bigger, more sophisticated core dump.
It was the dreaded "thick printed output" in the ancient days of IBM mainframes. It meant your program had crashed overnight. In those days you batch compiled and batch ran your program from a punch-card job deck. If you were a poor student you probably ran your jobs at night, because your dollars went three times farther then. Researchers on fat government grants and the school administration could afford day runs.
For the most part these dumps weren't too useful beyond the contents of the machine registers. You could probably figure out where you died in the code. If you were computing a scientific array of numbers in memory, you could probably see how far the computation got and whether it was generating garbage. Also, in those days you bought something like 16 or 64 kilobytes of core, so a full dump printout was enormous.
I think you could tailor the format of the dump with the Job Control Language, but I don't remember.
It feels like a lot of people have an allergic reaction to core files. (Maybe because they remind you that your program just crashed? Or because they seem to take superhuman understanding to work with?) What's great about them is that they decouple the debugging process from restoring service. In production, when something's broken that you have a hunch will be fine if you restart it, you can save a core file, restart it, restore service, and then debug it. In dev and test, when another dev or tester runs into a bug in your area, they can just send you a core file instead of waiting for you to drop everything and debug it right away.
Glad you liked it. Unfortunately, core files can be massive, and debugging them requires a lot of the assets from the environment where the fault occurred. Last but not least, it would be good if they gave us more of a head start on root cause. These are all things we're directly tackling at Backtrace.
That sounds interesting. I heard about you all through Abel at Surge last month and I'm looking forward to checking that out.
I work primarily on illumos, where we do dump program text, and we're working on getting CTF (the Compact C Type Format) everywhere. It's almost always possible to debug without the original binaries, and usually on different systems as well.
And for your next trick, undump: convert the core file into an executable program which recreates the process in the same state it was originally dumped (nearly).
This was (is?) a step in compiling Emacs. Compile the C code into an uninitialized binary of the Lisp interpreter, start the interpreter, load in a bunch of Lisp code, dump core, then undump the core into /usr/bin/emacs.
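From memory, the classic (pre-dump-rework) Emacs build went roughly like this; the exact flags and paths varied by release, so treat this as a sketch rather than the literal recipe:

```shell
# Build the bare, Lisp-less interpreter ("temacs") from the C sources
make temacs

# Run it, load all the editor's Lisp, then dump the running image
# back out as a new executable with that Lisp pre-loaded
./temacs -batch -l loadup dump

# The dumped image is what gets installed as the real editor
install emacs /usr/local/bin/emacs
```

The win was startup time: loading megabytes of Lisp once at build time instead of on every launch.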
Those were the days (1976).