
I found the reply to Robert Tarjan interesting, and I wonder how many people who conduct programmer job interviews would have guessed that an algorithm's Big-O complexity could be such a poor predictor of its actual performance!

"Another issue, when we come down to earth, is the efficiency of algorithms on real computers. As part of the Stanford GraphBase project I implemented four algorithms to compute minimum spanning trees of graphs, one of which was the very pretty method that you developed with Cheriton and Karp. Although I was expecting your method to be the winner, because it examines much of the data only half as often as the others, it actually came out two to three times worse than Kruskal's venerable method. Part of the reason was poor cache interaction, but the main cause was a large constant factor hidden by O notation."

Question: in what ways can a large constant factor in an algorithm cause such a dramatic performance difference on modern hardware? Can anyone give examples? I can only assume that the "large" constant factor in this example was large enough to always overflow, requiring bignum functions to work.



I think the situation may have been like this:

Algorithm A runs in roughly A·n steps and algorithm B in roughly B·n^2 (made-up examples). So one would say that A is better. But if the constant A happens to be very large and B much smaller, then B is actually faster until n becomes very large. Conventional analysis says A is better in general, but its poor practical performance is "hiding" in the huge constant A, and it loses to B in real situations, where n never gets big enough for A's asymptotic advantage to become relevant.
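A quick sketch of that crossover, using made-up cost functions (the constants 1000 and 1 are arbitrary, just to make the point):

```python
# Hypothetical step counts: algorithm A is O(n) with a large constant,
# algorithm B is O(n^2) with a small one (made-up numbers).
def cost_a(n):
    return 1000 * n      # constant A = 1000

def cost_b(n):
    return n * n         # constant B = 1

# Despite its worse Big-O, B is cheaper until n exceeds A/B = 1000.
crossover = next(n for n in range(1, 10**6) if cost_a(n) < cost_b(n))
print(crossover)  # 1001: the first n where A finally wins
```

If your real inputs never get past a thousand elements, the "worse" algorithm is the right choice here.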


Jon Bentley (I think) once described a situation where the problem was swapping two substrings within a string (again, I think). One of the algorithms used the double-reversal trick[1] and the other didn't; both were O(n). He implemented both and ran them on a large string on his (small) machine. One program finished in seconds, while the other he terminated after several minutes.

The difference was that the string was large enough to force the machine to swap memory out to disk, and the slow version was exactly pessimal with respect to the machine's paging algorithm. It would touch a location on a page, forcing that page to be brought back into memory, and then wouldn't touch the page again until it had touched all of the other pages of the string. The fast program did a lot of work on one page before moving to the next and finished relatively quickly; the slow program thrashed. The same effect can be seen with modern machines' memory caching.
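For reference, the double-reversal trick itself is simple. This is a sketch of the index logic only (Python slicing copies the data; the version that matters for locality works in place in C, sweeping each block sequentially, which is exactly the page-friendly access pattern described above):

```python
# Double-reversal trick for swapping two adjacent blocks AB -> BA:
# reverse each block, then reverse the whole thing.
def swap_blocks(s, k):
    """Turn s[:k] + s[k:] into s[k:] + s[:k] using three reversals."""
    a = s[:k][::-1]        # reverse the first block
    b = s[k:][::-1]        # reverse the second block
    return (a + b)[::-1]   # reverse the concatenation

print(swap_blocks("abcdef", 2))  # "cdefab"
```

Each of the three reversals scans memory sequentially, so every page (or cache line) is used completely before the next one is touched.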

However, this kind of complexity analysis is intended to abstract over machine-specific details like cache line sizes and paging algorithms, to let algorithms be compared in the abstract. Usually, if you don't trip over weird edge cases and if the problem is large enough (an assumption that is part of the analysis), an algorithm with a better Big-O bound will kinda, sorta, maybe always be faster than one with a worse bound.

(I think you have a misapprehension: the "large constant factor" is part of the simplification process from the hairy formula that actually describes an algorithm's performance (see Knuth's discussion of recurrence relations) to the abstract Big-O summary: f(n) = O(n), g(n) = O(n^2), .... It is not any literal factor in the program itself. For example, f(x) = x and f(x) = 2x are both O(n), but the second one will be twice as slow.)

[1] https://tecnocode.co.uk/2008/11/19/string-swap-algorithms/


Take radix sort: its runtime is O(d * n), where d is the number of digits and n is the count of numbers being sorted. For practical purposes d is a constant in a given implementation. Compare that with efficient comparison-based algorithms (like merge sort) that are O(n log(n)): below a certain input size, the constant d can be larger than log(n), making radix sort the slower choice despite its "better" bound.
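A minimal LSD radix sort sketch (base 10, nonnegative integers) that makes the O(d * n) shape visible: one full pass over all n numbers per digit, so d passes total.

```python
# LSD radix sort sketch for nonnegative integers, base 10.
# Runtime is O(d * n): the outer loop runs d times, the inner
# bucketing pass touches all n numbers each time.
def radix_sort(nums, base=10):
    if not nums:
        return nums
    digits = len(str(max(nums)))              # d in O(d * n)
    for pos in range(digits):                 # one pass per digit
        buckets = [[] for _ in range(base)]
        for x in nums:
            buckets[(x // base**pos) % base].append(x)
        nums = [x for b in buckets for x in b]
    return nums

print(radix_sort([170, 45, 75, 90, 802, 24, 2, 66]))
# [2, 24, 45, 66, 75, 90, 170, 802]
```

With d = 3 digits here, each element is handled 3 times; merge sort on 8 elements does log2(8) = 3 levels of merging, so at this size the two are in the same ballpark and the constants decide the winner.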



