Research paper is also an executable x86 program [pdf] (tom7.org)
189 points by notmysql_ 10 months ago | 28 comments



Here's the companion video for the paper: https://www.youtube.com/watch?v=LA_DrBwkiJA


Well that was a fantastic watch. Thanks for sharing.


for bonus points, the video should also be executable :)


Reminds me of a PDF that is also a bootable x86 image, from PoC||GTFO [1]: specifically section 8 of the second issue (0x02), “This OS is also a PDF”.

[1] https://pocorgtfo.hacke.rs/


Not only that, but it is an executable x86 program written in a printable subset of x86 instructions (so no self-modifying code), as noted in section 3 with a comparison to the similarly printable EICAR anti-virus test file.
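
To make "printable subset" concrete, here is a minimal sketch of the constraint. It assumes "printable" means the ASCII bytes 0x20 through 0x7E; the helper name `is_printable` is made up for illustration and is not from the paper.

```python
def is_printable(code: bytes) -> bool:
    """True if every byte is printable ASCII (assumed range: 0x20..0x7E)."""
    return all(0x20 <= b <= 0x7E for b in code)

# 'P' (0x50) and 'X' (0x58) are the one-byte encodings of PUSH AX and POP AX,
# so this tiny instruction sequence is itself printable text.
print(is_printable(b"PX"))

# INT 21h encodes as CD 21. The opcode byte 0xCD is outside the printable
# range, so a program restricted this way cannot contain it literally.
print(is_printable(b"\xcd\x21"))
```

The two calls print `True` and `False` respectively.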


Yes. This is nuts (in a good way). I remember the first time I saw the video on YouTube: the author was going through how he was going to do this with only printable characters, and I had a "no fucking way" moment, followed by being in awe at the solution he found.


Many years ago, I wrote up a post on doing this kind of thing in plain DOS .com files: https://imrannazar.com/articles/x86-printable-opcodes

It's good to see the principle can be expanded to EXEs; I'll have to dig into this some more.


Justine is probably adding both targets to the αcτµαlly pδrταblε εxεcµταblε toolchain


When he talked about the inability to jump to certain places, it reminded me of a PowerPoint I read a decade or two back that discussed the disassembly of Skype. They used any and every trick in the book to make disassembly impossible, like calculating an int, feeding it to a cosine instruction, and using the result as the jump distance. I tried finding the PowerPoint, but alas Google is garbage these days. Maybe the author can find some hints in there to reduce the amount of code coming out of the compiler.

Wish I had come up with this compiler, great stuff.



It isn't, unfortunately. Thanks for looking, though. The presentation I had was a PowerPoint that focussed entirely on the disassembly and anti-debugging tricks, nothing about networking or traffic analysis.


One of my employees once wrote a specification for his vignette correction algorithm in Postscript.

The illustrations and charts were actual examples of his algorithm, being executed at render time.


Lazy question, sorry: I only briefly skimmed the PDF, and it doesn't do this, but hypothetically, could one design a PDF generator that produces a spec-compliant file and uses this technique to chain-load another arbitrary base64-encoded binary stored inside the PDF? Maybe someone has already done that.


I love this guy's content so much. Easily one of my favorite programming content creators around


Tom7 the goat


Contrary to what's in the paper, I'm pretty sure IMUL-by-constant is in fact useful, since you can use subtraction:

  x * (a - b) === x * a - x * b
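and this applies even when losing the top half of the product, since truncated multiplication and subtraction are both arithmetic modulo a power of two.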
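
A quick sanity check of that identity, as a sketch assuming 16-bit registers. The constants below are arbitrary values picked for illustration, not the ones the paper's encoding actually allows, and the helper names are made up.

```python
MASK16 = 0xFFFF  # keep only the low 16 bits, like a truncating multiply

def mul16(x: int, c: int) -> int:
    """Low 16 bits of x * c (the top half of the product is discarded)."""
    return (x * c) & MASK16

def mul_by_difference(x: int, a: int, b: int) -> int:
    """x * (a - b) mod 2**16, built from two multiplies and one subtraction."""
    return (mul16(x, a) - mul16(x, b)) & MASK16

# Compare against the direct product for a spread of inputs, including
# cases where a < b so that (a - b) is negative before wrapping.
for x in range(0, 0x10000, 257):
    for a in (0x20, 0x41, 0x7E, 0x7E7E):
        for b in (0x21, 0x5A, 0x2020):
            assert mul_by_difference(x, a, b) == mul16(x, a - b)
```

So a multiplier that is not directly available as an immediate can still be reached as the difference of two that are, at the cost of an extra multiply and a subtract.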


Is it executable by renaming the .txt file to an .exe or what?


Yep, it has the telltale MZ header at the top, so it's either a DOS or Windows executable. [1]

The PDF appears to be a readable, formatted version; to get the actual executable you'll need the raw text sans newlines (as described in the paper).

[1] https://en.m.wikipedia.org/wiki/DOS_MZ_executable
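
A rough sketch of that check. The helper names are made up, and the paper describes the exact extraction procedure; this only shows the "drop newlines, look for MZ" idea.

```python
def strip_newlines(text: bytes) -> bytes:
    """Remove CR and LF bytes, leaving every other byte untouched."""
    return text.replace(b"\r", b"").replace(b"\n", b"")

def looks_like_mz(data: bytes) -> bool:
    """True if the data starts with the two-byte 'MZ' signature."""
    return data[:2] == b"MZ"

# Hypothetical usage on a made-up snippet of formatted text:
formatted = b"MZ some printable\nprogram text\n"
raw = strip_newlines(formatted)
print(looks_like_mz(raw), len(raw))
```

The signature only says the file claims to be an MZ executable; it doesn't prove the rest of the header is valid.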


Trivia: all Windows EXEs run on DOS, but most of them just print something like "this program doesn't run on DOS" and terminate.

There are exceptions, like REGEDIT.EXE of Windows 95.


And you can replace the default DOS stub file with `/STUB` in the MSVC linker. Different linkers can and do use different stub files, though MSVC and GCC happen to use the same stub file because the original author of the BFD code---that was eventually used by GNU ld to generate a PE file---had apparently no idea what it was...

    /* this next collection of data are mostly just characters.  It appears
       to be constant within the headers put on NT exes */
    filehdr_in->pe.dos_message[0]  = 0x0eba1f0e;
    filehdr_in->pe.dos_message[1]  = 0xcd09b400;
    filehdr_in->pe.dos_message[2]  = 0x4c01b821;
    /* ... snip ... */
    filehdr_in->pe.dos_message[13] = 0x0a0d0d2e;
    filehdr_in->pe.dos_message[14] = 0x24;
    filehdr_in->pe.dos_message[15] = 0x0;
    filehdr_in->pe.nt_signature = NT_SIGNATURE;
(Of course, this code dates back to 1996 or earlier, so the existence of DOS stubs might not have been common knowledge if you didn't do any Windows programming.)
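
To see what those "characters" really are, you can unpack the quoted words as little-endian 32-bit values. This sketch only uses the words shown above; the snipped ones are left out, and the helper name is made up.

```python
import struct

# The first three words and the last three words from the BFD snippet.
head_words = [0x0EBA1F0E, 0xCD09B400, 0x4C01B821]
tail_words = [0x0A0D0D2E, 0x24, 0x0]

def words_to_bytes(words):
    """Lay out 32-bit words in memory order (little-endian, as on x86)."""
    return b"".join(struct.pack("<I", w) for w in words)

head = words_to_bytes(head_words)
tail = words_to_bytes(tail_words)

print(head.hex(" "))  # 0e 1f ba 0e 00 b4 09 cd 21 b8 01 4c
print(tail[:5])       # b'.\r\r\n$'

# Those first 12 bytes are 16-bit x86 code, not text:
#   0e        push cs
#   1f        pop  ds
#   ba 0e 00  mov  dx, 0x000e   ; offset of the message
#   b4 09     mov  ah, 0x09     ; DOS "print $-terminated string"
#   cd 21     int  0x21
#   b8 01 4c  mov  ax, 0x4c01   ; DOS "exit", return code 1
```

The tail is the end of the message: a period, the newline bytes, and the `$` terminator that DOS function 09h expects.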

On the other hand, lld uses a functionally equivalent but slightly different stub file [1], because its message prints no newlines.

[1] https://github.com/llvm/llvm-project/blob/d7642b2/lld/COFF/W...


Even DLLs are executables. About 100 bytes in is where it says the real executable type. Some are OS/2 1.x, Win3.x, Win32... and so on. I think there is also a platform byte (x86, ARM, MIPS, etc.). My Google-fu is failing on the list of different types at the moment.
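
For what it's worth, here is a sketch of how that lookup works as far as I know: the 4-byte field at offset 0x3C of the MZ header (`e_lfanew`) points to the newer header, whose signature gives the type, and in PE files the platform is a 16-bit machine field right after the signature. The tables below are deliberately partial and the function name is made up.

```python
import struct

# Two-byte signatures of the older "new" header formats (partial list).
NEW_HEADER_SIGNATURES = {
    b"NE": "NE (16-bit Windows or OS/2 1.x)",
    b"LE": "LE (e.g. Windows VxD drivers)",
    b"LX": "LX (32-bit OS/2)",
}
# A few well-known PE machine values (partial list).
PE_MACHINES = {0x014C: "x86", 0x8664: "x86-64", 0x01C0: "ARM", 0xAA64: "ARM64"}

def describe_executable(data: bytes) -> str:
    """Classify an MZ file by the header that e_lfanew points to."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return "not an MZ executable"
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    sig = data[e_lfanew:e_lfanew + 4]
    if sig == b"PE\0\0" and len(data) >= e_lfanew + 6:
        (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
        return "PE, machine " + PE_MACHINES.get(machine, hex(machine))
    if sig[:2] in NEW_HEADER_SIGNATURES:
        return NEW_HEADER_SIGNATURES[sig[:2]]
    return "plain DOS MZ"

# Hypothetical usage on a synthetic header: MZ, padding, e_lfanew = 0x40,
# then a PE signature followed by the x86 machine value.
fake = b"MZ" + bytes(0x3A) + (0x40).to_bytes(4, "little")
print(describe_executable(fake + b"PE\0\0" + (0x014C).to_bytes(2, "little")))
```

In a genuinely old DOS EXE the bytes at 0x3C may not be a meaningful offset at all, so treat the "plain DOS MZ" fallback as a guess.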


I guess by chmod +x .


This is a very impressive and fun piece of work!


What does it do when executed?


It plays music specified in a printable format given as an argument. By default it plays a default song (I didn't bother to see what it was), and 3 other songs.

The author describes the sound as 'a bad organ'.

The paper is not much of a slog, but I did zone out around the 16th part, before I could get to the loops and the lack of access to the asm INT opcode.


reminds me of PoC || GTFO e-zine stuff :D fun things! cool piece!


Knuth would approve of this literate programming?


hard core science!



