1: Yes, you will eventually need to mmap more memory. Since you want to minimize the number of mmap calls needed for things like heap allocations, you generally allocate larger chunks than necessary and then malloc etc. hand out pieces of those chunks. It doesn't matter too much in terms of memory waste, since on the back end Linux allocates physical memory lazily: the extra allocated block of virtual memory doesn't get physical backing until something touches it. For this reason, a malloc call can appear to complete successfully, yet you hit an out-of-memory condition only once you try to actually use your "successfully" allocated memory.
2. It turns out most kernels don't actually enable the NX bit automatically, the reason given being that early processors had buggy implementations, so by default the NX flag is not honored. As a result, all memory is executable by default unless explicitly set not to be (i.e. using mprotect), and even that only applies if the kernel is compiled to support NX, which it may not be. This is changing, and some systems, such as those running SELinux or similar, are supposedly stricter, but I haven't actually tested this out myself.
1. Elaborate more on this, as this doesn't agree with my understanding of it. glibc malloc allocates differently depending on the size of the allocation: glibc uses a dynamic mmap threshold which, as I previously stated, starts at 128 KiB, but when blocks larger than the current threshold are freed, the threshold is adjusted up to the size of the freed block. That is, unless you've set any of the slew of parameters that disable the dynamic threshold, in which case you'll go back to either "old school" brk() memory management (tracking current and previous results of sbrk(0)) or arenas for small allocations managed through mmap().
2. If only this were true; I regularly pine for the days of EIP -> 0xc0c0c0c0. It is true that some compilers generate entirely RWE memory pages, but usually for quite different reasons (executable stacks for GCC trampolines, for example).
At a guess: eventually, yes. I don't think this actually writes over the heap at all - it writes to the .bss section instead. In theory, if left to execute (without the INT3s) it'd probably segfault as soon as it hit the end of the page containing .bss.
malloc could be used to expand the heap, which conveniently appears after .bss. The pointer returned would probably still need to be followed, since heap allocations might not be contiguous (and mprotect needs to be used to mark the pages executable).
Not necessarily: you can keep writing until you hit the end of the uninitialized data segment, which typically extends 128 KiB (or M_MMAP_THRESHOLD) past the last heap allocation (provided that allocation was itself under 128 KiB; otherwise the memory will be, as you state, mmapped, which gives no contiguity guarantees).
Also, it seems weird to me that the heap pages are marked as executable by default.