Big-endian vs. little-endian is an ancient flamewar that isn't going to go away any time soon, but sure, let's argue.
Once you've spent as much time twiddling bits as I have (as the author of proto2 and Cap'n Proto), you start to realize that little endian is much easier to work with than big-endian.
For example:
- To reinterpret a 64-bit number as 32-bit in BE, you have to add 4 bytes to your pointer. In LE, the pointer doesn't change.
- Just about any arithmetic operation on integers (e.g. adding) starts from the least-significant bits and moves up. It's nice if that can mean iterating forward from the start instead of backwards from the end, e.g. when implementing a "bignum" library (see the sketch after the code below).
- Which of the following is simpler?
// Extract nth bit from byte array, assuming LE order.
(bytes[n/8] >> (n%8)) & 1
// Extract nth bit from byte array, assuming BE order.
(bytes[n/8] >> (7 - n%8)) & 1
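To make the "bignum" point concrete, here's a minimal sketch (hypothetical code, not taken from proto2 or Cap'n Proto) of multi-precision addition with 32-bit limbs stored least-significant first, i.e. in little-endian limb order. The carry starts at limb 0, so the loop simply walks forward through the array:

#include <stdint.h>
#include <stddef.h>

// Hypothetical sketch: add two big numbers stored as arrays of 32-bit limbs,
// with limb[0] the least significant (little-endian limb order).
// Returns the final carry out.
uint32_t bignum_add(uint32_t *out, const uint32_t *a, const uint32_t *b, size_t n) {
    uint64_t carry = 0;
    for (size_t i = 0; i < n; i++) {         // forward: least-significant limb first
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        out[i] = (uint32_t)sum;              // low 32 bits of this column
        carry = sum >> 32;                   // high bits carry into the next limb
    }
    return (uint32_t)carry;
}

With big-endian limb order, the same loop would have to run from the end of the array back toward the start.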
There's really no good argument for big-endian encoding except that it's the ordering that we humans use in writing.
> There's really no good argument for big-endian encoding except that it's the ordering that we humans use in writing.

And not even always that. We call our numbering system "Arabic", but for the Arabs, it's little-endian.
For some reason humans seem to want high powers on the left, even if it makes no sense in a left-to-right language.
Take polynomials: they are typically written big-endian:
ax^2 + bx + c
But infinite series have to be little-endian:
c_0 + c_1*x + c_2*x^2 + ...
If you think for a moment about how you would write multiplication, you will see the latter form is much easier to reason about and program with.
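To illustrate (a sketch, not code from the thread): store the coefficients little-endian, so a[i] is the coefficient of x^i, and multiplication falls straight out of the indices:

#include <stddef.h>

// Sketch: multiply two polynomials whose coefficients are stored
// little-endian, i.e. a[i] is the coefficient of x^i (na, nb >= 1).
// out must have room for na + nb - 1 coefficients.
void poly_mul(const double *a, size_t na, const double *b, size_t nb, double *out) {
    for (size_t k = 0; k < na + nb - 1; k++)
        out[k] = 0.0;
    for (size_t i = 0; i < na; i++)
        for (size_t j = 0; j < nb; j++)
            out[i + j] += a[i] * b[j];       // x^i * x^j contributes to x^(i+j)
}

The exponent is just the array index, which is exactly what the series form above makes obvious.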
I think figure 1 illustrates a common misunderstanding. People who object to little-endian are often imagining it in their heads as order (3): they imagine that the bits of each byte are ordered most-significant to least-significant (big-endian), but then for some reason the bytes are ordered the opposite, least-significant to most-significant (little-endian). That indeed would make no sense.
But because most architectures don't provide any way to address individual bits, only bytes, it's entirely up to the observer to decide in which order they want to imagine the bits. When using little-endian, you imagine that the bits are in little-endian order, to be consistent with the bytes, and then everything is nice and consistent.
> When using little-endian, you imagine that the bits are in little-endian order, to be consistent with the bytes, and then everything is nice and consistent.
But isn't that kind of at odds with how shifting works? (i.e. that a left shift moves towards the "bigger" bits and a right shift moves toward the "smaller" ones.) Perhaps for a Hebrew or Arabic speaker this all works out nicely, but for those of us accustomed to progressing from left to right it seems a bit backwards...
> To reinterpret a 64-bit number as 32-bit in BE, you have to add 4 bytes to your pointer. In LE, the pointer doesn't change.
But one shouldn't do that very often: those are two different types. The slight cost of adding to the pointer is negligible.
> Just about any arithmetic operation on integers (e.g. adding) starts from the least-significant bits and moves up. It's nice if that can mean iterating forward from the start instead of backwards from the end, e.g. when implementing a "bignum" library.
-- is a thing, just as ++ is.
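For comparison, a hypothetical sketch of the same multi-precision addition with limbs stored most-significant first (big-endian limb order): the loop counts down instead of up, though it still has to begin at the least-significant limb:

#include <stdint.h>
#include <stddef.h>

// Hypothetical sketch: add two big numbers stored as 32-bit limbs with
// limb[0] the most significant (big-endian limb order).
// Returns the final carry out.
uint32_t bignum_add_be(uint32_t *out, const uint32_t *a, const uint32_t *b, size_t n) {
    uint64_t carry = 0;
    for (size_t i = n; i-- > 0; ) {          // backward: limb[n-1] is least significant
        uint64_t sum = (uint64_t)a[i] + b[i] + carry;
        out[i] = (uint32_t)sum;
        carry = sum >> 32;
    }
    return (uint32_t)carry;
}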
> There's really no good argument for big-endian encoding except that it's the ordering that we humans use in writing.
That's like saying, 'there's really no good argument for pumping nitrogen-oxygen mixes into space stations except that it's the mixture we humans use to breathe.'
It's simplicity itself for a computer to do big-endian arithmetic; it's horrible pain for a human being who has to read a little-endian listing. A computer can be made to do the right thing. Who is the master: the computer or the man?
That line of argument suggests you'd be happier with a human-readable format like JSON. Which is another eternal flamewar that we aren't likely to resolve here. Needless to say I like binary formats. :)
Floating point isn't legible either, nor are bitfields, opcodes, instruction formats, etc. That's why we use computers to do the 'right thing' and format the data if we need to read it.
I don't have a preference for either one, but using little-endian when most/every processor you will be targeting supports it makes more sense than using big-endian + extra work on x86 just so you can read it with less effort in a memory dump.
I've just begun learning about endianness, and I'm sorry if this comes off as pedantic, but doesn't the last example refer to bit numbering rather than (byte) endianness?
Endianness applies to both! Or, it should, if you're being consistent. It makes no sense to say that the first byte is the most-significant byte, but that bit zero is the least significant bit of the byte.
Because all modern computer architectures assign addresses to bytes, not bits, it's up to us to decide which way to number the bits. But we should always number the bits the same way we number the bytes.
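A small self-contained check of that consistency (hypothetical code, assuming a little-endian host): bit n of the byte array, with bits numbered least-significant first within each byte, matches bit n of the integer loaded from those bytes.

#include <assert.h>
#include <stdint.h>
#include <string.h>

// Sketch: on a little-endian host, numbering bits LSB-first within each byte
// makes "bit n of the byte array" agree with "bit n of the loaded integer".
static int bit_from_bytes(const uint8_t *bytes, int n) {
    return (bytes[n / 8] >> (n % 8)) & 1;    // little-endian bit numbering
}

int main(void) {
    uint64_t v = 0x0123456789abcdefULL;
    uint8_t bytes[8];
    memcpy(bytes, &v, sizeof bytes);         // byte layout of v (little-endian host assumed)
    for (int n = 0; n < 64; n++)
        assert(bit_from_bytes(bytes, n) == (int)((v >> n) & 1));
    return 0;
}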
> There's really no good argument for big-endian encoding except that it's the ordering that we humans use in writing.

I think the correct answer won here.