Being self-taught, I decided: what better way to learn programming than starting with the basics? Assembly was my first language. I could read and program in it, so I considered myself to know the language.
It wasn't until I created a simple 8086 emulator, one that takes the raw machine code, translates it into assembly instructions, and then actually emulates what those instructions do, that I finally felt like I REALLY knew assembly.
My suggestion to others who want to learn assembly is to skip any assembly books. Using whatever language you want, first write a translator from machine code into assembly instructions, and then write an emulator. You only need to implement a small subset of the instructions; compile some simple programs on godbolt to see which instructions you need to implement.
Other than that, all you really need is the 8086 manual; it has all the information. I also found this site useful when implementing the flags: https://yassinebridi.github.io/asm-docs/8086_instruction_set.... This takes less time than finishing a book and you learn a LOT more.
The goal is not to program in assembly at all but to truly understand the cost of everything and what you can expect from your hardware.
Becoming skilled at GDB and knowing how to generate an assembly listing from your tooling is a key skill that really helps with understanding.
I learned assembly by reading the assembly listings from the C compiler. It is extremely interesting to be able to internalize how high-level constructs are compiled and optimized.
There was a reverse engineering guide that I quite liked that introduced you to assembly by first writing C examples, compiling them and then analyzing the disassembled output.
It was quite a long guide, but I would recommend it to anyone starting out. I don't have it in my bookmarks it seems, but I will try to update my comment tomorrow when/if I find it.
Edit: damn, I guess it was https://beginners.re/ before it became paywalled. The Web Archive still has copies of the book, but if you like it you should consider buying it, even if it means signing up for Patreon m-( I still have a few versions of the book somewhere as well. I'll have to dive in again to see if it is as good as I remember.
single file, chmod +x, compiles itself and executes the binary, can easily give the -S flag to gcc or clang (uncomment one line) or better yet run objdump on the binary.
The great thing is it still works if you include a bunch of local headers that are a hassle to supply to godbolt. Latency locally is a win.
The #if 0 trick for the compiler lines and #else for the code is a good one. Got it from Rusty Russell of iptables fame iirc. Write your script in C, why not?
I'm sorry, I thought it was public. Can't work out how to change it. Nuts.
#if 0 //instructions to build and run
THIS_FILE="$0"
BIN_FILE="/tmp/$(basename "$0")"
gcc -std=c11 -O0 -g -march=native "$THIS_FILE" -Wall -Wextra -o "$BIN_FILE"
if [ $? -ne 0 ]; then
  echo "Bug in your C code or there is something wrong with your operating system's c compiler..."
  exit 1
fi
# run it
"$BIN_FILE" "$@"
retval=$?
# uncomment below to examine the generated machine code
#
# objdump -DC "$BIN_FILE" | less -p '<main>'
# uncomment below to examine the assembly language the compiler
# thinks it is generating
#
# gcc -S -std=c11 -O0 -g -march=native "$THIS_FILE" -Wall -Wextra -o "${BIN_FILE}.s"
# vim "${BIN_FILE}.s"
# clean up
rm "$BIN_FILE"
exit $retval
#else // c program
#include <stdio.h>
#include <stdint.h>
int main(int argc, char **argv)
{
    for(int i = 0; i != argc; ++i) {
        printf("argv[%d] = %s\n", i, argv[i]);
    }
}
#endif //end c program
Casey Muratori (of Handmade Hero fame, works/worked at RAD Game Tools for a long time) just did that as part 1 of his Performance Aware Programming series he’s doing on his Substack: https://www.computerenhance.com/
The “homeworks” only require implementing basic data transfer, arithmetic, and logic instructions, but I enjoyed it so I implemented everything except the interrupt handler stuff (into, etc.), and the BCD stuff (aaa, etc.).
I agree that it’s a good way to learn, and Casey provides a reference implementation.
Vaguely related, since a lot of people here are mentioning assembly as a first programming language for learners: Knuth's "The Art of Computer Programming" and its associated fictitious "MMIX" processor ("MIX" on the volumes that aren't yet on their new edition).
Knuth's reasoning seems to be that higher-level languages go in and out of fashion all the time, but hardware and its associated assembly is quite sticky, so it's more "timeless". It's also a smaller set of primitives, so less overwhelming for the learner.
MMIX assembly is easier to understand than x86, having been designed specifically with learners in mind, and GCC even has a backend for MMIX, so you can write C code and see how GCC would translate it to MMIX assembly.
I did something very similar. Assembly was not my first language (it was my 4th), but I decided to learn it by writing a compiler and linker in it.
In for a penny, in for a pound.
> The goal is not to program in assembly at all but to truly understand the cost of everything and what you can expect from your hardware.
Entirely this. Also, to help you understand more deeply how computers really work.
That said, being able to program in assembly is still of great use to me. I do it to this day, usually on ARM processors -- not entire programs anymore, but critical parts.
My approach to learning assembly was to let the C compiler generate assembly (gcc -S -c) from C code and then read the assembly to see how C code is mapped to assembly code. I have written a detailed article on this here: https://www.avabodh.com/cin/cin.html
Step 2. Create a file format that encodes sequences of instructions and operands for your calculator
Step 3. Create an interpreter for that file format that runs your file. Add an accumulator and flags that represent overflow and such. And an instruction pointer.
Step 4. Add comments support to your file format (optional)
Step 5. Add support for logical operators, comparisons to your file format and interpreter
Step 6. Add support for labels and jumps to your file format and interpreter
Step 7. Add support for a stack, memory and related operators to your interpreter
Why not recommend the computer program whose docs you're linking? Emu8086 is one of the nicest tools I've ever used. It's probably unobtainable these days though since last time I checked it's no longer for sale. I have it though if anyone wants to do a midnight rendezvous.
I never realized it was a program until just now that you mentioned it. I only ever used that single page I linked when I came upon it one day while searching for how instructions affected the flags register.
> but what was the benefit of translating from machine code to assembly instructions
In order to read it, I suppose? There's no reason for memorizing binary encoding schemes. I mean, you will learn that 0x90 is NOP on x86 but that doesn't help you a whole lot.
x86 performance is usually better with smaller instructions, which can be accomplished by writing code that doesn't require prefix bytes and also using certain instructions that sometimes have alternative shorter encodings for specific registers.