How to write assembly code for the iPhone

chj · on Nov 11, 2012

Writing assembler for iPhone is a pain in the ass. I don't know which version its gas is based on, but it is definitely a very, very old one. The most important missing feature is the ability to create a macro that takes arguments. A more recent change is that some guy at Apple decided to switch conditional instructions like ldrneb to ldrbne, ruining my existing code base (> 10K lines). Now I need to remember which way is right for iPhone and which is right for Android.

Btw: there is zero support from Apple, so your only chance is to find this kind of blog post, and they are very few. Bookmark it now if you may need it someday.

brigade · on Nov 11, 2012

There is documentation of the OS X assembler [1] that generally applies on iOS as well. In particular, you can have macros with up to 10 arguments, numbered $0 to $9. (actually this is shown in the article...)

I'm pretty sure ldrneb was never a valid instruction; GNU as tended to be more flexible than ARM's specification of the assembly language. Some of this flexibility used to be allowed by ARM (maybe), the rest was deprecated with UAL, and llvm only supports ARM's UAL syntax. You can get this behaviour from GNU as via .syntax unified (which you really ought to be using, since it makes it really easy to support assembling to both ARM and Thumb-2).

[1] https://developer.apple.com/library/mac/#documentation/Devel...
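
The reordering discussed above (ldrneb becoming ldrbne) is mechanical for loads and stores: the older ordering puts the condition code before the size suffix, while UAL puts it after. A minimal sketch in C, assuming lowercase input and covering only ldr/str mnemonics; to_ual is a hypothetical helper written for illustration, not part of any toolchain:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* ARM condition codes (two letters each). */
static const char *const CONDS[] = {
    "eq", "ne", "cs", "hs", "cc", "lo", "mi", "pl",
    "vs", "vc", "hi", "ls", "ge", "lt", "gt", "le", "al"
};

/* Returns 1 if the first two characters of s are a condition code. */
static int is_cond(const char *s) {
    for (size_t i = 0; i < sizeof CONDS / sizeof CONDS[0]; i++)
        if (strncmp(s, CONDS[i], 2) == 0)
            return 1;
    return 0;
}

/* Hypothetical helper: rewrite an old-style load/store mnemonic such as
 * "ldrneb" (base, condition, suffix) into UAL order "ldrbne"
 * (base, suffix, condition). Anything else is copied unchanged. */
void to_ual(const char *old, char *out, size_t out_size) {
    assert(old != NULL && out != NULL && out_size > 0);
    size_t len = strlen(old);
    int is_ldst = strncmp(old, "ldr", 3) == 0 || strncmp(old, "str", 3) == 0;
    if (is_ldst && len >= 5 && is_cond(old + 3)) {
        /* base (3 chars) + suffix (after the condition) + condition (2 chars) */
        snprintf(out, out_size, "%.3s%s%.2s", old, old + 5, old + 3);
    } else {
        snprintf(out, out_size, "%s", old);
    }
}
```

Calling to_ual("ldrneb", buf, sizeof buf) leaves "ldrbne" in buf. Mnemonics that carry no condition, such as ldrb or ldrsb, pass through untouched.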

chj · on Nov 11, 2012

You're my hero. Thanks a lot.

Unfortunately this works differently from the assembler on Linux (Android), so I need to keep two different sets of macros. No matter what, this is still better than nothing!

stephencanon · on Nov 11, 2012

One good option for more portable macros in assembly is to use the C preprocessor. It doesn't always provide all the features you might like, but its behavior is defined by a standard, so it is very predictable.

The iOS toolchain applies the C preprocessor to assembly sources by default; it's been a while since I've used the mainline gnu tools, but I think it is automatically applied to .S (capital S) files by default there.

If you have more questions or issues with the iOS toolchain, please file bugs and/or post your question to the devforums. There are plenty of people who use assembler every day who will be happy to help out.

And just to echo grandparent: the unified assembler language has been standard on ARM for several years now; pre-unified mnemonics like strneb are deprecated.

chj · on Nov 12, 2012

I do use the C preprocessor whenever I can. But since ARM assembler uses # to refer to constants, and the C preprocessor uses # for a different purpose, you can't use it most of the time. I'd love to know if you have a work around.

>If you have more questions or issues with the iOS toolchain, please file bugs and/or post your question to the devforums. There are plenty of people who use assembler every day who will be happy to help out.

Thanks. Didn't know that..

>And just to echo grandparent: the unified assembler language has been standard on ARM for several years now; pre-unified mnemonics like strneb are deprecated.

I just spent a few hours cleaning up the code to be UAL compatible. The Android NDK seems to support both ".syntax unified" and ".syntax divided", but the iOS assembler from Xcode 4.5 only supports unified. Besides the conditional instructions, with UAL enabled you can't assemble the following instruction with the Android NDK's assembler:

orr r1,r0,lsl#8

instead, you must give the dest reg explicitly:

orr r1,r1,r0,lsl#8
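
For readers less familiar with the shifted-operand form, here is a rough C model of what the two spellings above compute. It is only a sketch of the semantics, with registers modelled as uint32_t values; the function names are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

/* Three-operand form: orr rd, rn, rm, lsl #shift  ->  rd = rn | (rm << shift) */
uint32_t orr_lsl(uint32_t rn, uint32_t rm, unsigned shift) {
    assert(shift < 32);  /* shifting a 32-bit value by 32 or more is undefined in C */
    return rn | (rm << shift);
}

/* Two-operand shorthand: orr rd, rm, lsl #shift is read as
 * rd = rd | (rm << shift), i.e. the destination doubles as the first source. */
uint32_t orr_lsl_implicit(uint32_t rd, uint32_t rm, unsigned shift) {
    return orr_lsl(rd, rm, shift);
}
```

With r1 = 0x12 and r0 = 0x34, both forms leave 0x3412 in r1. The only difference is whether the assembler is willing to fill in the repeated register for you.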

stephencanon · on Nov 12, 2012

> I do use the C preprocessor whenever I can. But since ARM assembler uses # to refer to constants, and the C preprocessor uses # for a different purpose, you can't use it most of the time. I'd love to know if you have a work around.

My "work around", such as it is, is that the C standard says that # shall be followed by a macro parameter; if it isn't (as will be the case with # in ARM assembly) the behavior is undefined by the standard, and it happens that LLVM does the "right thing" (from the perspective of someone who wants to get things done without a lot of hassle) and simply passes the # through unchanged.

This is obviously imperfect, but it works for my purposes. It may not be sufficient for your needs.

The inability of (some builds of?) gas to support implicit sources under .syntax unified has also annoyed me.
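
One way to sidestep the # question entirely: the restriction applies only to function-like macros, so an immediate can be routed through an object-like macro, where # is an ordinary token. The sketch below is plain C so it can be compiled anywhere; it stringifies the expansion to show the text the preprocessor would hand on. SHIFT8, ORR_SHIFTED and expansion_contains are illustrative names, not anything from a real toolchain:

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification so the argument is macro-expanded before '#' applies. */
#define STR(...)  #__VA_ARGS__
#define XSTR(...) STR(__VA_ARGS__)

/* Object-like macro: '#' has no special meaning here, so '#8' is fine. */
#define SHIFT8 lsl #8

/* Function-like macro: the immediate arrives via SHIFT8, so there is
 * no bare '#' in this replacement list. */
#define ORR_SHIFTED(rd, rm) orr rd, rd, rm, SHIFT8

/* The text the preprocessor produces for ORR_SHIFTED(r1, r0). */
static const char EXPANDED[] = XSTR(ORR_SHIFTED(r1, r0));

/* Returns nonzero if the expanded text contains needle. */
int expansion_contains(const char *needle) {
    assert(needle != NULL);
    return strstr(EXPANDED, needle) != NULL;
}
```

In a real .S file the two #define lines for SHIFT8 and ORR_SHIFTED would be used directly, without the stringification scaffolding. The cost is one extra object-like macro per immediate, which may or may not be tolerable for a large code base.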

saurik · on Nov 11, 2012

Apple (NeXTSTEP, really) forked their version of gas so long ago that it predates the formation of "GNU binutils". While Apple has occasionally gone back and merged updates for some files, as far as I've been able to tell, most of the code has been stagnant since ~1990. The version of GAS they are using is pretty much 1.38. :(

(One of the most irritating things I had to do while supporting all of the initial iPhone jailbreak efforts was to reimplement a ton of gas macro features from later versions of the assembler so I could compile the "normal programs" that people expected to work on this system.)

stephencanon · on Nov 11, 2012

The current Apple devtools default to using the (much more modern) llvm-mc assembler, not gas. Patches and bug reports are always welcome!

saurik · on Nov 13, 2012

FWIW, this would not have particularly helped me (even with a time machine to take the current compilation chain back to when we are talking about, which is December of 2007), as Clang is unable to compile a bunch of the kinds of things I need to compile (it barely even compiles my stuff correctly: the compiler itself often crashes; this was certainly the case a few years ago, when the project we are discussing was actually relevant... arguably my main contribution to the iOS toolchain community was to do the work required to get us off of the LLVM compiler and onto mainline gcc).

To the extent to which that is then some kind of veiled challenge or poke (as I'm having a very difficult time understanding why you are telling me to do something now when I've been talking about work I did five years ago, although work that I still rely on), I will then make the counter-poke: GCC is open source, as is gas; it is certainly the case that "patches and bug reports are always welcome". Just because something exists does not mean it is entitled to "patches and bug reports" when there are better alternatives. I personally have no real interest in being able to use Apple's off-the-shelf toolchain, nor do I appreciate their reasons for forking so far.

A key problem here, of course, is that many people in the Apple development community have a strong negative impression of both GCC and GAS; but, as far as I can tell, this is almost entirely the fault (and probably strategic plan) of Apple, who have been working with a 5-year old version of GCC hooked up to a 20-year old version of GAS. It is incredibly painful for me to see all of this great work and effort being ignored and rewritten from scratch by a new generation of people whom Apple have brainwashed into thinking this is then a technical issue and not a licensing one.

Regardless, in my attempts (two weeks ago) to use LLVM MC (disassembler), I found a handful of issues that I had to fix (on SVN head) to get it to even generate correct disassembly (although one or two of the bugs were in the table definitions and should also affect assembly) on the first project I tried to disassemble with it: I thought it would even be a simple case (a bootrom, which would thereby have no relocations or anything particularly tricky to understand or decode: just some instructions).

The result of that is, from being on the IRC channel (where after having a vague discussion about some MC API and usage issues, I attempted to bring up some of these bugs), I did and still do not get the impression that anyone there really seems to care about these kinds of encoding issues, so the bugs I found are likely just going to rot.

Honestly, I do not feel even remotely bad about this, because binutils (which I have already recoded my project to use instead) does not have any of these issues, and is also open source. I really believe that Apple's forking of the community efforts is harmful to the cause, and I would frankly rather not encourage their behavior: if they want their toolchain to work well, and they want my help to make that happen, they are going to need to accept the licensing requirements of GCC; otherwise, I'm quite happy to watch them have to do all of the reimplementation work entirely on their own.

(Some might now point out that binutils doesn't provide all of the same data that LLVM theoretically provides, but the way LLVM provides it is actually sufficiently useless--in addition to being incorrect--that I do not mind. The LLVM API, especially at the MC level, is highly overrated: it was seriously sufficiently bad that I ended up in a situation where parsing the string output it provides was more useful than attempting to extract meaning from their MCInst objects. I could find the attitude "stop using that thing that works and instead contribute patches to LLVM" defensible if LLVM were better designed, but it sadly simply is not; it is slightly easier to use as a library to do backend compilation, and it has better error reporting, but both of these would have been simple and welcome contributions to gcc.)

chj · on Nov 11, 2012

Did you write a translator or fix the assembler itself?

saurik · on Nov 11, 2012

As I had hoped was clear from context (but I guess wasn't), I was definitely implementing these features for the assembler. In addition to being a better result (for example, requiring fewer changes to the compiler I was also spending a lot of time trying to get cobbled together out of Apple's fork of gcc 4.0, mainline gcc 4.2, and the LLVM gcc 4.2 project that, while generating horribly broken code due to LLVM being a horrible target for ARM, had a ton of Apple internal patches; alternatively one could edit every app that required these features to mess with a preprocessor, but that would scale horribly), it was actually much easier to do... implementing the full gas macro system outside of gas and its assembly parser seems unnecessarily brutal: often you just needed to fix tiny things, like "this argument is now optional" or "you can now specify this as a variable".

__float · on Nov 11, 2012

10K lines? Can you elaborate a little on what this is used for? I am having trouble seeing why Obj-C (it is just C after all) can't be used for the majority.

chj · on Nov 11, 2012

C is sufficient for most apps; however, what I am working on happens to be one of the rare few: an x86 emulator (http://www.aemula.net). We need to make good use of every register, and the overhead of function calls is considered unacceptable.

panic · on Nov 11, 2012

Your C compiler can also generate NEON instructions if you use NEON intrinsics:

http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html

http://blog.llvm.org/2010/04/arm-advanced-simd-neon-intrinsi...
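
As a rough idea of what the intrinsics route looks like, here is a small sketch that adds two float arrays four lanes at a time. vld1q_f32, vaddq_f32 and vst1q_f32 are standard arm_neon.h intrinsics; the function name add_f32 and the multiple-of-4 length requirement are assumptions made for the example, and a scalar fallback is included so the file still builds on non-NEON targets:

```c
#include <assert.h>
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Adds two float arrays element-wise; n is assumed to be a multiple of 4. */
void add_f32(const float *a, const float *b, float *out, size_t n) {
    assert(a != NULL && b != NULL && out != NULL && n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
        /* Load four floats from each input, add them in one vector op, store. */
        float32x4_t va = vld1q_f32(a + i);
        float32x4_t vb = vld1q_f32(b + i);
        vst1q_f32(out + i, vaddq_f32(va, vb));
#else
        /* Scalar fallback with the same result. */
        for (size_t j = 0; j < 4; j++)
            out[i + j] = a[i + j] + b[i + j];
#endif
    }
}
```

The compiler still does register allocation and scheduling here, which is exactly the part an emulator author may want to control by hand, so this is not necessarily a substitute for hand-written assembly.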

z3phyr · on Nov 11, 2012

I've heard that ARM assembly is easier to grasp than x86..

sparky · on Nov 11, 2012

ARM is still a bit icky, but nearly anything this side of a VAX mainframe will be easier to grasp than x86.

If anyone's looking to learn, MIPS is really easy to pick up. There's a good simulator (SPIM[1]) with terminal and GUI interfaces and lots of tutorials, as it's used in a lot of undergrad computer architecture courses.

[1] http://spimsimulator.sourceforge.net/

pjmlp · on Nov 11, 2012

As I grew up with x86 I beg to differ.

mahmud · on Nov 11, 2012

Opinions are trivial to hold. Backing them up, OTOH, is an entirely different matter.

"x86" instructions now number in the hundreds. Almost any RISC ISA will max at 30% of that, with less than half of the addressing modes. "Easy" is an adjective, but "easier" is an ordering relation: ARM is easier than x86.

Someone · on Nov 11, 2012

I do not know either instruction set well enough to make any claim as to which is easier, but instruction count on its own is not the best measure. Certainly, orthogonality plays a role, too. In that respect, ancient x86 was horrendous compared to the 68000 (yes, there is a multiply instruction, but you have to have your data in register X or Y; Z does additions only). The 68000 had different types of registers, too, but it certainly was way easier to remember which fell in what class (D0 through D7 are data registers, A0 through A7 are address registers).

Also, there is the x86 thing (that, IIRC, ARM and Z80 have, too) to give part of a register an entirely unrelated name (upper half of register X is called A, lower half B). That is avoidable complexity, if one is willing to give up backwards compatibility of source code.

stephencanon · on Nov 11, 2012

ARMv7 has 426 distinct instructions (counting sections in the architecture reference manual).

x86 does have somewhat more instruction names, but not dramatically more (and the blowup is largely because of differences in naming conventions; Intel, for instance uses different names for [vector|scalar][single|double]float operations, whereas they are all the same name on ARM).

What makes ARM nicer (and it is somewhat nicer, in my experience as someone who spends hundreds of hours writing assembly code for both architectures every year) is:

1. non-destructive operations (finally coming to Intel too!)

2. better orthogonality, fewer weird holes in the ISA (especially in the vector ops)

3. no piecemeal vector extensions (all the various extensions that may or may not be available on x86 are madness).

When you really get down to it, though, none of these make a huge difference; they're just niceties.

brigade · on Nov 11, 2012

ARM has a lot of instructions too. Exactly how many depends on how you count them - are pld and pldw different instructions? mov and movw? uadd16 and sadd16? vadd.i16 and vadd.i8? vadd.f32 <Dd>, <Dn>, <Dm> and vadd.f32 <Sd>, <Sn>, <Sm>?

Conservatively (relative to x86 where I'd imagine everyone counts pavgb and pavgw as separate instructions), ARM has well over 300 instructions. Include significantly different permutations of instructions, and you're easily over 500, which probably outnumbers x86.

Even if you discount vector and floating point, ARM still has around a hundred distinct instructions.

pjmlp · on Nov 11, 2012

I still beg to differ.

I find the three-operand form of many ARM instructions confusing, as well as the possibility of having multiple execution modes.

I did Z80, x86, 6800 and MIPS programming back in the day.

arthulia · on Nov 11, 2012

http://www-cs-faculty.stanford.edu/~eroberts/courses/soco/pr...

That is probably because there are fewer instructions to memorize.

ksec · on Nov 11, 2012

Interesting that after all these years compilers still haven't caught up with the speed of hand-written assembly. Or is there some LLVM optimization not being used yet?

stephencanon · on Nov 11, 2012

Compilers and assembly programmers serve two different roles.

- A compiler is expected to produce reasonably fast code in microseconds.

- An assembly programmer is expected to produce the fastest code possible in hours or days or weeks.

Given hours or days, and development work focused on such a task, compilers would be able to compete with good assembly programmers; it simply hasn't been the focus of much compiler development, because most users don't want to wait three days for their code to compile.

What's remarkable is that compilers are able to fairly consistently beat assembly programmers of average skill, while delivering results thousands of times faster.