I do enjoy these 21st Century partial rediscoveries of the good old trusty Fibonacci Polyphase merge sort. It's like the late 70's / early 80's all over again, only with commodity rather than custom hardware.
The book references are:
External Sorting, Section 5.4 in The Art of Computer Programming, Vol. 3: Sorting and Searching by Donald Knuth, Addison-Wesley, 1973.
and
External Sorting, Chapter 13 in Algorithms by Robert Sedgewick, Addison-Wesley, 1983.
Sorting and merging blocks is the beginning of the journey, learning the various tricks to doing it optimally is where it gets interesting, although perhaps no longer relevant; 30 years ago the OPs approach might take nearly a day with polyphase tweaks reducing the grind down to under half an hour, these days it's all over in a fraction of a second and more or less bound by I/O throughput speed - does optimising sort times still matter?
There's some online discussion here: http://flylib.com/books/en/3.55.1.115/1/
The book references are: External Sorting, Section 5.4 in The Art of Computer Programming, Vol. 3: Sorting and Searching by Donald Knuth, Addison-Wesley, 1973.
and External Sorting, Chapter 13 in Algorithms by Robert Sedgewick, Addison-Wesley, 1983.
Sorting and merging blocks is the beginning of the journey, learning the various tricks to doing it optimally is where it gets interesting, although perhaps no longer relevant; 30 years ago the OPs approach might take nearly a day with polyphase tweaks reducing the grind down to under half an hour, these days it's all over in a fraction of a second and more or less bound by I/O throughput speed - does optimising sort times still matter?