Descent into Darkness: Understanding your system’s ABI is the only way out

Estragon · on March 15, 2010

TL;DR: Slides from a talk describing how to rewrite the Ruby VM in memory while it is running in order to collect garbage which Ruby is not clever enough to collect for itself.

Yes, it sounds masochistic to me, too.

crux_ · on March 15, 2010

Nitpicky correction: It doesn't do any garbage collection; it helps you figure out where you are leaking references.

scott_s · on March 15, 2010

Except that it's wrapped up in a nice Ruby library, so you can benefit from someone else's masochism to get runtime memory stats on your own programs.

jey · on March 15, 2010

a) Clever. b) Cringe!

I really like Ruby as a language, but the ecosystem is just not up to snuff, which is why I've gone back to Python. The docs are complete, the standard implementation is rock solid and only getting better (cf. unladen-swallow), and I can interface with C stuff easily (cf. Cython). I'm looking forward to Ruby maturing.

tptacek · on March 15, 2010

We push Ruby harder than most people ever push Python and I'm just not seeing the maturity problems you're alluding to. Maybe you could be more specific? We've got Ruby talking to chipsets, controlling WinDBG, debugging processes on Win32, Linux, and OSX, controlling DynamoRIO, running god knows how many different network protocols, calling into Java, routing raw packets with pcap, debugging J2EE container VMs, and fuzzing libraries written in C and C++.

(By the way, don't waste time with the Python C interface. Use ctypes. Ruby calls it Ruby/DL or "ffi".)
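
For anyone who hasn't used it, here is a minimal sketch of the ctypes approach, assuming a POSIX system where the C library can be loaded by name. The wrapper name `c_strlen` is made up for illustration; `strlen` is just a convenient libc function to call.

```python
import ctypes
import ctypes.util

# Locate the C library. find_library may return None on some systems;
# on POSIX, CDLL(None) then falls back to the symbols already loaded
# into the current process.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(data: bytes) -> int:
    """Hypothetical wrapper: length of a NUL-terminated byte string via libc."""
    return libc.strlen(data)


print(c_strlen(b"hello"))  # prints 5
```

No compiler or extension module is involved; the signature is declared in Python and the call goes straight into the shared library.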

gte910h · on March 15, 2010

Don't waste time with ctypes, use weave or sip or swig :OP

tptacek · on March 15, 2010

I'd rather eat a bug than ever use SWIG again.

mst · on March 16, 2010

Seconded. And I'm perfectly comfortable using XS to bolt perl and C together.

That may be a function of my having spent enough time in the guts of the perl VM to already know exactly what I want, and to not want anything getting in the way while I'm implementing it, mind.

gte910h · on March 17, 2010

Oh me too, I much prefer SIP and weave.

devinj · on March 16, 2010

Or Cython. That one is nice.

erydo · on March 15, 2010

Why not just add tracing code to the source and recompile? I don't immediately see an advantage to this kind of hot-patching in this case.

tbrownaw · on March 15, 2010

The way presented has very high cost of initial creation, and almost negligible setup overhead per user ('user' being a programmer who wants memory use info).

Patching and recompiling has moderate-low cost of initial creation, but also moderate-low setup overhead per user.

Plus, some of us think this kind of thing is just plain fun. :)

ice799 · on March 16, 2010

This.

I built this because I like working on this sort of thing and because it lowers the bar for everyone else who doesn't know how to or want to rebuild their binaries.

Also, it depends a lot on your infrastructure. Sometimes it's easier to distribute and maintain patched Rubies, and sometimes it's easier to just require a ruby gem. The gem is designed and built in such a way that it should be resistant to most changes made to the Ruby VM.

jrockway · on March 16, 2010

"Click here to download a version of the patched Ruby for OS X". That's as easy, for the end-user, as rewriting the executable on the fly. And it's probably easier for you guys too ;)

erydo · on March 15, 2010

I'd completely missed the reusability aspect of things; I guess because the tight coupling to specific addresses would mean redoing a lot of the work for different builds. But once you figure it out, it looks like it could be trivial to replicate for future minor revisions.

And I agree, it's fun stuff! That kind of hacking really makes me smile. I was just curious about the pragmatic motivations.

ice799 · on March 16, 2010

I didn't hardcode any addresses. That's why it works on so many different Ruby builds.
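
To illustrate the general idea of resolving addresses at runtime instead of hardcoding them, here is a rough sketch in Python using ctypes. It is not how the gem does it; it only asks the dynamic loader where a symbol lives in the current process, and it assumes a POSIX system. The helper name `symbol_address` is made up for this example.

```python
import ctypes
from typing import Optional

# On POSIX, CDLL(None) gives a handle to the running process itself,
# so attribute lookups resolve symbols the process has already loaded.
_process = ctypes.CDLL(None)


def symbol_address(name: str) -> Optional[int]:
    """Hypothetical helper: address of an exported symbol, or None if absent."""
    try:
        func = getattr(_process, name)
    except AttributeError:
        # The loader could not find the symbol in this process.
        return None
    # Reinterpret the function pointer as a plain address.
    return ctypes.cast(func, ctypes.c_void_p).value


addr = symbol_address("malloc")
print(hex(addr) if addr is not None else "malloc not found")
```

The address differs between builds and even between runs, which is the point: it is looked up each time rather than baked in.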

erydo · on March 16, 2010

Sorry, I didn't mean to imply that you had hardcoded the addresses, only that you'll have to go through the process of finding the addresses anew for each build. That's not an insurmountable problem (as you've shown), but it does make things slightly harder to automate.

However, I admit to having had time to only read through it casually, so please take any incorrect statements solely as misunderstanding on my part.

psadauskas · on March 15, 2010

His point was that he was lazy, and wanted to be able to memtrace his production apps using the distro-provided rubies. Not recompiling was the whole point of the exercise.

erydo · on March 15, 2010

I understood that not recompiling was the point; I was wondering why it was a valid point.

"Because you want to do it to a running production app" seems far-fetched to me -- not saying that it is, I had just discounted that reasoning.

tbrownaw's point of 'high initial cost, low marginal cost' makes a good amount of sense, though.

mrcharles · on March 15, 2010

This stuff is just epic. I love it even though I'll never have a use for it, and don't see the point in the first place.

God bless crazy hackers, I live vicariously through your insanity.

techiferous · on March 16, 2010

It's actually quite useful. Memory is often the scarce resource on web servers. This helps you troubleshoot Ruby memory leaks, which could end up saving you a lot of money in server costs (and speed up the performance of your app).