Compile-Time Sort in D (dlang.org)
115 points by teleforce on Feb 12, 2022 | 57 comments


I like to use compile time code to generate fixed tables, like these:

https://github.com/dlang/dmd/blob/master/src/dmd/backend/ope...

Notice how a static immutable array is created by a lambda at compile time. The lambda never appears in the executable. Before D, I'd use a separate C executable to generate the tables, which were then #include'd.

The lambdas can also include sanity checks:

https://github.com/dlang/dmd/blob/master/src/dmd/backend/ope...
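
The shape of the pattern, as a minimal sketch (a made-up table, not the actual opcode tables linked above):

    // A static immutable array initialized by a lambda that runs via CTFE.
    // The lambda itself never ends up in the executable.
    static immutable int[16] squares = () {
        int[16] t;
        foreach (i, ref e; t)
            e = cast(int)(i * i);
        assert(t[3] == 9);   // sanity check, also evaluated at compile time
        return t;
    }();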


(2017)

More impressive: manipulating strings at compile-time.

Even more impressive: using string manipulation at compile-time to build valid D code, then using string mixins to inject it into the current program, passing it on to the compiler to be compiled as part of the program being built. This is done by std.regex to compile regular expressions to native code during compilation.

Even further impressive: using a string import to read a DSL from disk, transpiling it into D code, and passing it on to the compiler. This is used by Vibe.d to compile HTML templates to native code.
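
A tiny sketch of the string-mixin idea (a made-up getter generator, nothing like what std.regex actually emits):

    // Build D code as a string at compile time...
    string makeGetter(string name)
    {
        return "int get_" ~ name ~ "() { return " ~ name ~ "; }";
    }

    int counter;

    // ...then hand it back to the compiler with mixin().
    mixin(makeGetter("counter"));   // injects: int get_counter() { return counter; }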

Granted, all this is fairly old news, and a few other languages have since caught up.


pretty much lisp macros with the constraints of a compiled language.

Template Haskell is similar.


Template Haskell and Rust proc macros aren't similar to pervasive CTFE.

TH/proc macros just give you lexical tokens; you then parse them (typically adding a monstrous dependency on haskell-src-exts/syn) and emit an AST.

The way D is designed, you don't need such a special facility because CTFE is everywhere. You can run normal functions at compile time. You can have compile-time "static if" and "static foreach" statements, and use various introspection calls (confusingly called "traits") in those.
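
For instance, a rough sketch combining CTFE, `static foreach`, `static if` and `__traits` introspection (the Point struct is just an example):

    struct Point { int x; int y; }

    string describe(T)()
    {
        string s = T.stringof ~ ":";
        static foreach (name; __traits(allMembers, T))
            s ~= " " ~ name;
        static if (is(T == struct))
            s ~= " (struct)";
        return s;
    }

    pragma(msg, describe!Point());  // printed by the compiler: Point: x y (struct)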

Zig has similar "comptime" functionality. C++ now has "if constexpr" but not any of the introspection features.


Most people don't compare D to Lisp or other languages with excellent metaprogramming. D is almost exclusively compared to C++ and C++ went through a painful metaprogramming phase of its own in the 2000s.

While C++ is not by any means a great meta-language, it's improved considerably since that time. D still has some interesting metaprogramming features, but on the whole I don't think they're enough to justify the rest of the baggage that comes with the language: buggy implementations, releases that constantly break valid source code and make it a pain to work with dependencies, and a poor/unscalable garbage collector.

The ability to write a string literal that contains D code, perform raw string manipulations on it and then compile the result certainly sounds neat on paper, and if you're working on a solo project it can be a fun exercise to blog about, but beyond the novelty you'd hardly find a mature or reliable codebase written by a team of professionals using hacks like these.


You can use malloc/free and RAII if you prefer it to automatic memory management. It's nice to have a choice.
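
A minimal sketch of what that looks like (Buffer here is just an illustration):

    import core.stdc.stdlib : malloc, free;

    struct Buffer
    {
        ubyte* ptr;
        size_t len;

        this(size_t n)
        {
            ptr = cast(ubyte*) malloc(n);
            len = n;
        }

        @disable this(this);   // no accidental copies of the raw pointer

        ~this()                // RAII: freed deterministically at scope exit
        {
            free(ptr);
        }
    }

    void main()
    {
        auto b = Buffer(1024); // never touches the GC
    }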

Having a high-performance GC requires inserting write gates into the executable code. This is only worthwhile if GC is the only allocation strategy the language uses. As the GC is optional in D and is not actually used that much, it is not worth the general slowdown of the write gates.

> you'd hardly find a mature or reliable codebase written by a team of professionals using hacks like these.

I wouldn't be so sure, as mature and reliable codebases use the hackish unhygienic preprocessor macros every day :-)


> I wouldn't be so sure, as mature and reliable codebases use the hackish unhygienic preprocessor macros every day :-)

Isn't it funny how the most mature software (and often, by extension, the most "reliable", i.e. software that has been in use the longest, with the most bugs beaten out by years of hacks and patches) seems so far away from "best practices"?

I've always wondered about the root of this. Are programmers trudging along using "brute force" more effective, or is it just pure survivor bias on projects/products that people didn't give up on?


It's probably useful to distinguish between two kinds of "best practices".

One kind is things dealing directly with the underlying thing you are trying to accomplish with a program. Say it is a program doing some sort of physics or engineering calculation that at some point needs to add a bunch of floating point numbers. Best practice might include using something like Kahan summation [1] to reduce the error.
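
For concreteness, a minimal Kahan summation sketch (in D, to match the rest of the thread; illustrative only):

    double kahanSum(const double[] xs)
    {
        double sum = 0.0;
        double c = 0.0;          // running compensation for lost low-order bits
        foreach (x; xs)
        {
            double y = x - c;    // apply the compensation
            double t = sum + y;  // low-order bits of y may be lost here
            c = (t - sum) - y;   // recover what was lost
            sum = t;
        }
        return sum;
    }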

I'd expect to see that kind of best practice in both mature programs and new programs because it is something that directly relates to what the program actually is trying to do.

The other kind of "best practices" are those concerning the environment and practice of programming itself.

Many of those seem designed to deal with the case where there is a lot of turnover in who is working on the software. People join a project, work on it for a year or two, and then move off to something else.

You adopt these best practices so that new people will be productive faster so you can get useful work out of them before they leave. So these best practices tend to change as fad and fashion come and go.

The mature software that has been in use for a very long time is, I think, more likely to have a much higher percentage of its work done by people who have been working on it long term, to have lower turnover, and when someone new does come on, they stay longer.

They don't need to have their best practices match the fad and fashion best practices used by the shorter term projects.

[1] https://en.wikipedia.org/wiki/Kahan_summation_algorithm


Probably because best practices change at a faster rate than the coding standards of old codebases. I suspect this observation exhibits some survivorship bias as well; then again, you get things like OpenSSL...


Is it possible to use the standard library without the GC though? It would be nice to have at least some string/container/IO libraries that are designed to be used ergonomically without the GC.


Large amounts of the standard library are deliberately lazy, so they don't actually allocate at all. Exceptions, however, do end up using the GC, but there is a flag to work around that.

Libraries that don't use it at all are available. At work (a hedge fund) we have a large library that must be consumed from Excel. This means it cannot use the GC because [reasons]; it's not a particularly big issue.

Garbage collection is not a bug, by the way. It's one of the biggest productivity wins D has over C++. Performance is often near-as-makes-no-difference identical, only now you eliminate all memory safety issues without thinking about lifetimes.

D also has very nice composable allocators in the standard library.
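
For example, a small sketch using std.experimental.allocator (parameters picked arbitrarily for illustration):

    import std.experimental.allocator : make, dispose;
    import std.experimental.allocator.building_blocks.free_list : FreeList;
    import std.experimental.allocator.mallocator : Mallocator;

    void main()
    {
        // Small allocations are served from a free list backed by malloc;
        // the GC is never involved.
        FreeList!(Mallocator, 0, 64) alloc;
        int* p = alloc.make!int(42);
        scope(exit) alloc.dispose(p);
        assert(*p == 42);
    }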


Yes. Not much of the library uses the GC.


No one does manipulation of D code as strings, and no one is really telling you to IMO.

I do, however, work on a large-ish codebase with many professional developers which makes extensive use of metaprogramming. We also don't have particularly big issues moving from version to version, so how old is your hearsay?

C++ doesn't even have proper modules yet; it still lags behind D in more regards than just metaprogramming.


It certainly has; I'm using them in Visual Studio 2022's C++20 mode, and you don't need D's hack of importing them inside functions.


Which hack would that be?

So Visual Studio has an implementation that not many people use; what of the other compilers? I have yet to even compile a project that actually uses them, and they must have spent a significant proportion of my entire life trying to get them into the standard.


It is in the standard, update yourself.

Naturally not every compiler is yet fully C++20 compliant, but they will get there. VC++ is there, the next release of GCC is looking quite good, and Clang, well, has lots of issues anyway now that Apple and Google resources went away and not everyone has the same love for upstream.

As for the hack, go dive into the D forums from around five years ago: since import statements in D fully parse the source of the respective modules, the hack to speed up compilation was to move import statements into function bodies so that they are only processed if that code is actually parsed.

If I am not mistaken, Andrei was one of the people suggesting this.


I'm aware it's in the standard now; my point is that it's taken them a very long time to get to this state, and what they do have isn't all that impressive. Concepts have possibly taken my entire life, although I don't know when the first proposal was.

That is a damp squib; of all the things I view as hacky in D, that is not one of them.

Local imports are favoured for stylistic reasons anyway; they don't actually make all that much difference for code that isn't dead. What's wrong with importing things where they are used?

IIRC Andrei proposed the local imports, but it was only a matter of removing an error message because they were banned at the time.


No matter how long C++ takes to adopt something, the language at least knows what it wants to be, and has an ecosystem to keep it going for decades that will outlive us.

As for what is wrong with local imports: they make it impossible to track down module dependencies unless one goes grepping for imports all over the place.


> As for what is wrong with local imports: they make it impossible to track down module dependencies unless one goes grepping for imports all over the place.

Nobody is making anyone use them. It's a personal preference.

As for "impossible", `import` is a keyword, so it's trivial to find them all using, yes, grep. D was designed to be more greppable than C/C++. For example, it's really hard to find all the cast expressions in C, but since `cast` is a keyword, it's trivial in D.

(Cast expressions are all potential bugs, so being able to find them all in a code review is a good thing.)


You call that a hack? The reason for it was "locality", i.e. it makes it clear what dependencies a function has. Compiling faster was just a side effect.

> five years ago

Is an eternity in the software business.


> Compiling faster was just a side effect.

The location of an import has nothing to do with compilation speed. The only time it would ever make a thing compile faster is if 1) it is local inside a template function, 2) that template is never actually used, AND 3) the module is never imported anywhere else in the program.

That does happen - I use this pattern for optional dependencies which aren't even looked at if not actually used - but it isn't generally the case. You have to actively think about it to fulfill all three requirements.
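
A sketch of that pattern (the names are made up):

    void logPretty(T)(T value)
    {
        // These imports are only processed if logPretty is actually
        // instantiated somewhere; an unused optional dependency costs nothing
        // (provided the modules aren't imported elsewhere in the program).
        import std.conv : to;
        import std.stdio : writeln;
        writeln(value.to!string);
    }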


It is a hack, given that other languages don't need such "solutions".

In those 5 years, D's competition has further reduced the need to actually bother with D, slowly taking away all the features that made D special.


Meanwhile D still exists and has all those features.

If it were a hack, people would only do it for its hackiness rather than as a style preference. The former is simply not true in industrial usage of D. In fewer words: it's bullshit.


The features alone are useless without an ecosystem.

What industrial usage?

A couple of companies that haven't yet left the community, because each year something new is going to be the Thing that brings everyone to D?

Let's see how D manages to survive when the core team is no longer calling the shots.


> something new

What languages don't have something new added on a regular basis? Yes, we improve D consistently and regularly.


As noted elsewhere, it seems your experience is somewhat outdated: the releases of the LLVM D Compiler (one of the two compilers worth using for production builds, the other being GDC) are buffered against the bugs introduced in DMD (which is more stable than it used to be, although there are still regressions), and there is a fork-based GC available for Linux; but as the GC will only ever trigger on allocation, don't use it and it won't collect.

> While C++ is not by any means a great meta-language, it's improved considerably since that time.

C++ has also painted itself into a corner multiple times, with features that, despite being technically an improvement over the status quo, are severely lacking in utility. C++ screwed up "constexpr if" big time by always introducing a scope (in D, a scope costs you just an extra pair of {}'s on the rare occasion you actually need one), which means you can't conditionally insert declarations (i.e. variables, structs/classes, functions).
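
A sketch of the kind of thing D's scope-less `static if` allows (the Channel struct is made up):

    struct Channel(bool threadSafe)
    {
        ubyte[] data;

        // static if does not introduce a scope, so this member really is
        // conditionally declared.
        static if (threadSafe)
        {
            import core.sync.mutex : Mutex;
            Mutex lock;     // only exists in the threadSafe instantiation
        }
    }

    static assert( __traits(hasMember, Channel!true,  "lock"));
    static assert(!__traits(hasMember, Channel!false, "lock"));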

> but beyond the novelty you'd hardly find a mature or reliable codebase written by a team of professionals using hacks like [string manipulation and mixins].

They are a wonderful hack when you need them and nothing else will do what you want. This is not unlike resorting to macros in C++, except that it's hygienic, unlike macros.

I'm not claiming the project is mature, and I'm only one person, but "reliable" is definitely out there. The most heinous set of string mixins I've ever written[1] has definitely got to be the code for generating wrappers to call the OpenCL object property querying functions (clGetDeviceInfo & friends). You pass a size and a void pointer to the address of the return object, and you have to call the function once, twice or more (depending on the type of the queried property) to figure out how much memory you need to allocate before calling it again (the pattern is sketched below).

The important thing is that the interface[2] you use to drive this code generation is very clean, and the return on investment for getting the generic case correct is large.

[1]: https://github.com/libmir/dcompute/blob/master/source/dcompu...

[2]: https://github.com/libmir/dcompute/blob/master/source/dcompu...
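
A sketch of that two-call pattern with a stand-in query function (getInfo below is a placeholder, not the real OpenCL API or the linked code):

    // Stand-in for a clGetDeviceInfo-style C API.
    int getInfo(size_t bufSize, void* buf, size_t* neededSize)
    {
        static immutable value = "example device";
        if (neededSize !is null)
            *neededSize = value.length;
        if (buf !is null && bufSize >= value.length)
            (cast(char*) buf)[0 .. value.length] = value[];
        return 0;
    }

    string queryName()
    {
        size_t needed;
        getInfo(0, null, &needed);            // first call: how big is the result?
        auto buf = new char[](needed);
        getInfo(buf.length, buf.ptr, null);   // second call: fill the buffer
        return buf.idup;
    }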


About 'constexpr if' introducing a scope: they did it voluntarily, against the very good (and fun) proposal of Andrei Alexandrescu, so there must be arguments against a 'scopeless constexpr if', even though I admit that I don't understand what they could be... After all, #if also doesn't introduce a scope!


> they did it voluntarily

That really undersells the 'static if considered' paper, which has got to be the most offensive standards paper I've read.

> so there must be arguments against a 'scopeless constexpr if'

I've yet to find any. What I don't understand is: you hardly ever need to introduce a scope with D's `static if`, and in the very rare cases you do, it takes all of two keystrokes; but with C++ you are completely unable to get rid of the scope.


> That really undersells the 'static if considered' paper, which has got to be the most offensive standards paper I've read.

Indeed. Never read something stinking so much of sour grapes from NIH-suffering old farts.


Bjarne wrote the rejection of mine, Andrei's, and Herb's proposal to include static if:

https://isocpp.org/files/papers/n3613.pdf


An awful paper


Been parsing JSON at compile time with std.json for years; nothing ever broke. It wasn't even designed for this.
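
A sketch of what that looks like, assuming std.json keeps working under CTFE as described:

    import std.json : parseJSON;

    // parseJSON is ordinary D code, so it can run during compilation too.
    enum threads = parseJSON(`{"name": "demo", "threads": 4}`)["threads"].integer;
    static assert(threads == 4);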


> Buggy implementations, releases that constantly break valid source code and make it a pain to work with dependencies, and a poor/unscalable garbage collector.

I had the misfortune to have to use D professionally for a little over a year, and I can confirm all of the above are sadly true. The garbage collector was a headache from day one, and was never not a problem.


May I enquire when this was, and for what applications?


It was writing back-end systems code around 2017. I had to help develop a whole specialized production and deployment/restart system just to get around memory issues in production.


D gives you the power of Lisp macros in a native language, and without shipping a compiler in its runtime library.


You can get virtually the same effect by marking functions `constexpr` in C++. Yes, you need to do it explicitly, and yes, you can't call functions that aren't marked `constexpr` inside, but it's more or less the same - just opt-in.

In fact, since C++20, you don't need any template metaprogramming for compile-time sorting - std::sort is constexpr! [0]

Demo: https://godbolt.org/z/6M1Wqc31W

[0]: https://en.cppreference.com/w/cpp/algorithm/sort


I’m worried about the compile-time performance though. C++ compilers currently use direct tree-walking on the AST to interpret code, which is going to be horrendously slow once you go past a certain number of elements.


I’m not sure what GCC and MSVC are doing, but Clang [1] at least is in the process of developing a bytecode interpreter for running constexpr code more efficiently.

It seems like it hasn’t been actively worked on in quite some time, unfortunately, based on the commit logs.

[1] https://clang.llvm.org/docs/ConstantInterpreter.html


Same code in Nim:

    import algorithm

    const a = [ 3, 1, 2, 4, 0 ]
    static: echo(sorted(a))


Not sure why this was downvoted; it was a valid comparison from someone who is highly involved in Nim development. Nim compiles to C (by default; other backends are available) and the `static:` means it is done at compile time.

The article itself is a D implementation of something done in C++ to show how D does it, and how much simpler it is. The Nim example is right in line with this.
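
For reference, a minimal D sketch in the same spirit (not the article's exact code):

    import std.algorithm.sorting : sort;

    enum a = [ 3, 1, 2, 4, 0 ];
    enum b = () {
        auto t = a.dup;
        sort(t);               // plain std.algorithm sort, executed via CTFE
        return t;
    }();
    static assert(b == [0, 1, 2, 3, 4]);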


Nim can run almost any code (that doesn’t interface with the C FFI) at compile time, since in addition to a compiler they have a bytecode VM that executes a large subset of the language (https://nim-lang.org/docs/nims.html). Really cool stuff; the only other language that I’ve seen doing this approach is Jai, which is sadly still in closed development. (Jai’s CTE is much grander in scope, in that it would be able to do anything normal code can do.)

C++ (gcc and clang) and D’s compile-time evaluation is a bit more limited in scope, and still uses a tree-walk interpreter, although D might also migrate to a bytecode VM someday (https://dlang.org/blog/2017/04/10/the-new-ctfe-engine/)


D's CTFE doesn't allow global variables, pointer arithmetic, or monkey business like converting integers to pointers; the rest is supported.



Good catch, forgot about Zig. Although it doesn’t have a fast bytecode VM dedicated for running comptime, Andrew has stated that it will achieve CPython-level performance once the self-hosted compiler is finished, so things might become better.

https://github.com/ziglang/zig/issues/4055


> the only other language that I’ve seen doing this approach is Jai

Many (most?) Lisps have had full arbitrary CTE since ... I'm not sure, probably before C existed?


I thought most Lisps were interpreted…


Common myth. Many have an interpreter but also compile to machine code. A couple compile to the JVM (Clojure and ABCL). One, ECL, compiles to C. Writing a Lisp compiler in Lisp isn't much different than writing an interpreter. See the compiler in Paradigms of Artificial Intelligence Programming for an example.


> Writing a Lisp compiler in Lisp isn't much different than writing an interpreter

Writing a lisp compiler in lisp is a very different task from writing an interpreter. Especially if one expects to bootstrap.


The idea of static meaning “fully evaluate this at compile time” in D is a good one.


Obligatory C++ comparison: C++ has had constexpr since C++11 and constinit since C++20. They mean more or less the same thing when applied to variables ("must be evaluated at compile time"), but constinit does not imply constness.


D evaluates until it hits something it can't manipulate at compile time, whereas constexpr requires things to be annotated as such, so the difference is slightly more subtle.
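
To make the contrast concrete: in D an ordinary, unannotated function runs at compile time simply because it is used in a compile-time context. A sketch:

    // No constexpr-style annotation anywhere.
    int fib(int n) { return n < 2 ? n : fib(n - 1) + fib(n - 2); }

    enum f10 = fib(10);        // enum initializer forces CTFE
    static assert(f10 == 55);

    void main()
    {
        auto r10 = fib(10);    // same function, run at runtime here
    }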


While true, the Circle C++ compiler works the way D does, although it is not standard, so D does win in expressiveness.


Another example of compile-time evaluation: one of my favourite things about the Guile REPL is the ,opt[imize] command. You run it on a piece of code and it shows you the result of macro expansion, inlining, DCE, some other things, and most notably partial evaluation.

One example of how I use it can be found here: https://git.sr.ht/~bjoli/goof-loop/#speed in the first example, where I contrast ,expand (macro expansion) and ,opt (the source->source optimizer).


After spending time with Elixir, which has hygienic macros and the whole language available to you at compile time, this is no longer interesting to the old C++ dev in me.


You'll have lots of fun in Haskell should you decide to learn it.


D has many cool features. I always wonder why it is not (more?) popular?




