Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

glibc qsort's implementation is in libc.so, not in the header. GCC doesn't have anything to work with.

It's also an apples-to-oranges comparison, since std::sort and qsort implement different algorithms.

A lot of std::sort's performance is actually from using the version without any callbacks. If you pass a comparator function which just compares two integers the obvious way, it gets much slower. So one of std::sort's biggest advantages is actually not that it uses templates, but that it's specialized for the common case of not needing a custom callback. Theoretically the compiler should make the two cases the same, but apparently GCC is too dumb (that's not a slight on GCC; I think people expect too much from compilers):

  ------------------------------------------------------------------------
  Benchmark                              Time             CPU   Iterations
  ------------------------------------------------------------------------
  std_sort_random                 52881299 ns     52873089 ns           14
  std_sort_with_callback_random   63319633 ns     63307876 ns           11
  qsort_random                   106803314 ns    106784567 ns            7
  external_sort_random            97642851 ns     97640888 ns            7
  std_sort_sorted                  8433311 ns      8432564 ns           82
  std_sort_with_callback_sorted   13868016 ns     13865170 ns           50
  qsort_sorted                    28098439 ns     28093720 ns           26
  external_sort_sorted            33629020 ns     33628108 ns           21
external_sort is just std::sort hidden behind an extern function implemented in a separate .o file. Those benchmarks are from sorting 1MB of random and already-sorted data (as indicated in the names). I think it's important to test such cases, because often online people benchmark code which is written all in a single file, whereas real-life C++ projects are usually organized in such a way that every little class is in its own little file, which gets compiled into a separated object file, and then it all gets linked together without LTO. And then those same people go on to claim performance benefits of their language without actually using the setup which enables those benefits, which IMO is a bit dishonest.

When I drill further down into everything I want to drill into, maybe I'll publish the source for the benchmarks somewhere.



> If you pass a comparator function which just compares two integers the obvious way, it gets much slower. So one of std::sort's biggest advantages is actually not that it uses templates, but that it's specialized for the common case of not needing a custom callback.

This is not true. `std::sort`'s default comparator is a `std::less` object. The advantage comes from using a stateless callback functor object. If you pass a capture-less lambda instead of a function pointer, you can reap the same benefits as using the default comparator. Even if that capture-less lambda just forwards to a specific free function anyway.

In short, `std::sort(beg, end, [](auto x, auto y) { return foo(x,y); })` can be faster than `std::sort(beg, end, &foo)`.




Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: