The first new build of Circle, a new C++20 compiler, since April 2022 is online (github.com/seanbaxter)
122 points by davikr on Jan 22, 2023 | 78 comments



Yes yes yes. Everything about this document screams good sense and logic.

It's clear why Carbon looks the way it does: they want to do for C++ what Kotlin did for Java, using the same strategy. But it won't work. They understood Kotlin's strategy as binary compatibility at the per-source-file level, but that's only half the story. The quiet monster of Kotlin's success is j2k, a source translation tool built into the IDE. It rewrites a Java file into Kotlin and then applies all the intelligence in the IDE plug-in to clean up the results, meaning even idioms get translated properly.

Jetbrains could do that because they were developing the IDE plug-in alongside the compiler from day one, because Java is quite easy to parse, and because Kotlin deliberately constrained itself to Java semantics even when suboptimal.

I had high hopes for Carbon when I first heard about it, but that number of proposals without even having a compiler is crazy. How do you port existing code? You can only use it for newly added code, presumably, and only if your new code doesn't interact with any C++ that uses the bits the Carbon team doesn't like? That's light years from the kind of interop that made Kotlin a success.

Sounds like this guy actually understands the problem in depth. I hope he's able to attract customers, though it's a tough sell to make your company codebase depend on a language maintained by only one guy.


Truly an amazing project. One person implementing the entire C++ standard, and then countless new, useful features on top of it. Circle did the interpreted, compile-time pass idea before any of the other new systems languages.

Carbon and C++Next both seem very directly "inspired" by Circle, but neither of those efforts seem to have actually produced anything yet.

Ideally, Circle would be open source. But I understand wanting to hold out for some amount of corporate sponsorship first.


He seems like a wonderboy. His linkedin has a recommendation: "Cannot say enough good things about this guy--probably the most capable programmer I've ever met."


Absolutely, also Sean made enough money in his previous life to be able to work on Circle full-time (while living in NYC), truly living the dream.


100x programmer.


> Circle did the interpreted, compile-time pass idea before any of the other new systems languages.

constexpr is in C++11, Jai is from 2014, and Zig 0.1.1 is from 2017.

This project looks quite cool though.


I think the first take on the subject was IBM, with their PL/I Checkout and Optimizing compilers. They integrated an almost-full PL/I language interpreter into the macro pre-processor stage. I was using it in the early '80s.


With the small difference that Circle can use the complete language at compile time. constexpr is still far from that, and it also gave birth to constinit and consteval.


Jai gets to use the entire language.

constexpr in C++ 11 is indeed very limited, arguably even more so than Rust's const functions today.
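To make "very limited" concrete, a minimal sketch (not from the thread, compiler-agnostic):

    // C++11: the body of a constexpr function must (essentially) be a
    // single return statement, so iteration has to be written as recursion.
    constexpr int factorial(int n) {
        return n <= 1 ? 1 : n * factorial(n - 1);  // OK in C++11
    }

    // Local variables, loops, and mutation only became legal in constexpr
    // functions with C++14; this version is ill-formed as C++11.
    constexpr int factorial14(int n) {
        int result = 1;
        for (int i = 2; i <= n; ++i) result *= i;
        return result;
    }

    static_assert(factorial(5) == 120, "evaluated at compile time");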


Except that almost no one has access to Jai.


Sure, I'm not commending Jai in general, but it's a clear existence proof on this particular matter.


Well, Lisp and Dylan also did it first.


D has had compile time function execution since 2007 or so.


Don’t forget about cpp2/cppfront — Herb Sutter has been in direct contact with Sean and it seems their discussions have had some impact on both their projects.

cpp2 is currently my favorite contender in this arena


> One person implementing the entire C++ standard, and then countless new, useful features on top of it.

Isn't it done on top of LLVM/Clang++?


LLVM? Yes. Clang? No.

The front-end compiler and standard library are written from scratch, but it uses LLVM as the backend. The standard library is not yet complete, but it's amazing how it uses Circle's unique features to implement certain classes, like variant and tuple, in a very elegant and efficient manner; those are absolute monstrosities of complexity in clang/g++/MSVC.
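For reference, a simplified sketch of the conventional technique behind that complexity: standard library tuples are typically built as a recursive inheritance chain, one base per element (this illustrates the status quo, not Circle's code):

    #include <cstddef>

    // One base class per element, indexed to disambiguate repeated types.
    template<std::size_t I, typename T>
    struct tuple_leaf { T value; };

    template<std::size_t I, typename... Ts>
    struct tuple_impl;

    template<std::size_t I>
    struct tuple_impl<I> {};  // recursion terminates on an empty pack

    template<std::size_t I, typename T, typename... Rest>
    struct tuple_impl<I, T, Rest...>
        : tuple_leaf<I, T>, tuple_impl<I + 1, Rest...> {};

    template<typename... Ts>
    struct tuple : tuple_impl<0, Ts...> {};

Circle's member pack declarations reportedly let the whole thing be written as a single pack of data members, which is where the elegance claim comes from.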


Of all the wannabe replacements for C++ that popped up in 2022, Circle is definitely the one that has the most going for it.

Everything else requires rewriting everything, and they can only support a limited subset of C++ features, so if the goal is full compatibility with existing code they aren't going to achieve it anyway.

For full rewrites, we already have enough alternatives with more maturity years behind them.


I hope Circle is the thing that makes the committee realise there is an alternative to their attitude of 'we're not versioning files, we're not giving you opt-in features, we're not giving you new keywords, we're not fixing demonstrably bad and wrong behavior'. But I doubt it. I've been in and out of C++ for over 20 years, and every time I'm 'back', it's the same old tired story.

I'm more interested in Val than the rest, but, Val isn't really a C++ replacement (in terms of syntax, at least), it's a language that offers C++ interop. To quote an old joke 'How do you get to Dublin? Well, firstly, I wouldn't start here'. I'm not sure whether any new language in 2023 should start with C++ interop as one of its main goals.


It does look highly doubtful. On the one hand, there are things like the 10'000 and 1_blah notations that obviously break backwards-everything, plus completely novel mini-languages, like the stuff inside lambda []. On the other hand, there is the overpowering insistence on the C++ abstract machine with its lifetimes, which has resulted in monsters like bit_cast and has no resemblance to actual hardware, living or deceased. So the feeling is more of an ideology-driven development than any practical consideration.


> we're not giving you new keywords

Since when? Last time I looked C++ had almost a hundred keywords. C++ 20 added a bunch, including "requires" and "concept" - both ordinary English words which, oops, too bad, now your software is incompatible because it used the wrong identifier.
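A minimal illustration of that breakage:

    // Valid C++17: "concept" and "requires" are ordinary identifiers here.
    // Both became keywords in C++20, so this code no longer compiles.
    int concept = 1;
    bool requires(int x) { return x > 0; }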


It’s extremely difficult to get new keywords into the language. This is a very large part of the reason we can only have new functionality in the library.


I suspect they're talking about co_*


This lays groundwork for opt-in safety per file, but safety should affect API design. Once you get done doing this to every file and refactoring every public API, have you really spent less effort than would have gone into a rewrite? If you only get halfway there, is the new version going to blow up any less often?


Depends if the goal is to improve the safety of existing C++ code, or fully rewrite into something else.

Until the likes of LLVM, GCC, CUDA, V8 and co get rewritten into something else, improving existing C++ codebases is still an issue.


I agree that safety should affect API design, which means that there is at least one major piece missing after the rewrite. Either the ability to switch to an even stricter type system (e.g. Rust) or a ~zero cost binding generator to a stricter language (again, e.g. Rust).

However, before reaching that point, piecewise refactoring is something of considerable value. There are huge C++ codebases lying around that I believe would benefit immensely from a rewrite in a safer language, even if it's not as safe as Rust (or Scala, or F#, etc.)

Firefox and Chromium have already paid the entry price for piecewise refactoring into Rust, so they're probably not the best candidates. But there are many other examples that undoubtedly are, e.g. LLVM.


C++ templates cannot be reused or rewritten piecemeal in a new language; it's practically impossible. There's a reason why the Rust folks were very careful to separate out generics from the more flexible case of macros (which come in both rewrite-based and procedural varieties).


Out of the box, they can't, but I can imagine mitigations for that, if you have robust enough static analysis in your new language. In many cases it's fairly trivial to generate a piece of .cpp that simply instantiates a template for a specific type and can be linked against (see the sketch after the caveats below).

Of course:

- this doesn't cross library boundaries;

- this assumes robust enough static analysis, which is not possible in the general case, since template instantiation is undecidable in the general case, so you need to restrict your new language to more reasonable semantics;

- let's not even get started on the interaction between templates and C-style macros; at some point, you have to throw in the towel.
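As promised above, a hedged sketch of that instantiate-and-link mitigation; my_vector and its grow() method are hypothetical stand-ins:

    // instantiations.cpp: pin down concrete instantiations of a template so
    // that a non-C++ language can link against plain, unmangled symbols.
    #include "my_vector.h"  // hypothetical template library

    // Explicit instantiation definitions: force the compiler to emit the
    // code for these specializations into this translation unit.
    template class my_vector<int>;
    template class my_vector<double>;

    // An unmangled entry point for the foreign language to bind to.
    extern "C" void my_vector_int_grow(my_vector<int>* v, int n) {
        v->grow(n);  // hypothetical member function
    }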


Scala and F# are safer than Rust since they're on a VM.


Well, you can compile Rust to WASM. But yes, point taken, out of the box, Scala and F# have a safer execution mode.


>> Scala and F# are safer than Rust since they're on a VM.

Do you have any evidence to back up your claim?

Both the JVM (Scala's VM) and the CLR (F#'s VM) have a history of vulnerabilities:

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=java+hotspo...

https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=dotnet+clr

Running on a VM is not a safety guarantee.


In the relevant sense of safety, they are safer. The commenter's claim was not about vulnerabilities, since those apply to implementations, not to languages.

The semantics of F#/Scala are memory safe because they're defined to be dynamic. The semantics of Rust are not memory safe. A subset of Rust is provably memory safe, but none that uses, e.g., much of the standard library, which is unprovably memory safe against the language's own semantics.


Given that F# has unmanaged pointers (nativeptr), wouldn't it also be true that only a subset of it is provably memory safe, just like Rust?


It seems weird to treat stuff that Scala must rely on but you can't write in Scala as part of the implementation, and yet not do the same for stuff Safe Rust has to rely on from Unsafe Rust.


Breaking down the full argument is very subtle, sure.

But, at the outset, we can simply observe that the "safety" Rust provides is memory safety, which is a non-issue in a dynamic language.

To say that Rust is as safe as, or safer than, a dynamic lang is false. It's strictly less safe, since by construction it is a language where allocation is managed by the programmer.

The whole argument about "safety" in the sense Rust addresses it is between languages of "manual allocation". To imagine that any of these languages is "safer" is to have misunderstood a great deal. Rust is about enabling an otherwise-C programmer to limit the impact of manual memory handling; it is *not* about preventing, e.g., use-after-free errors in javascript. There are no such errors there by construction.


The same "by construction" safety applies to Safe Rust as to the dynamic GC'd languages you're discussing. You can't write a use-after-free error in Safe Rust, the same way you can't write one in Javascript.

But, Safe Rust also chooses to construct safety that those GC'd languages don't have.

The reason Javascript doesn't have data races for example is that it doesn't have concurrency - can't have a race with only one runner. Java does have data races (with limited but still unsettling consequences), Go's data races are really bad, like actual Undefined Behaviour bad in some cases. Whereas Safe Rust doesn't have data races - you can write concurrent Safe Rust but you can't write a data race.
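For contrast, a C++ analogue of the kind of program Safe Rust rejects at compile time; this compiles cleanly and is undefined behaviour:

    #include <thread>

    int counter = 0;  // shared, with no synchronization

    int main() {
        // Both threads mutate `counter` concurrently: a data race. The
        // compiler accepts it without complaint; Rust's borrow checker
        // would reject the equivalent unsynchronized mutable sharing.
        std::thread a([] { for (int i = 0; i < 100000; ++i) ++counter; });
        std::thread b([] { for (int i = 0; i < 100000; ++i) ++counter; });
        a.join();
        b.join();
    }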

The difference you're grappling for probably feels intuitively like it should exist, but I assure you it does not.


You are right that the argument boils down to what we take "by construction" to mean.

I will resolve this, simply, by saying whatever advertisement Rust makes for itself, is better made by any "dynamic" language.

And if we're inclined to trade PR against PR, Rust loses. Managed memory has better PR.

And so those repeating the Rust agitprop against managed languages are simply "NPCs for the wrong ideology".


I would articulate things differently.

Theoretically, memory-safety between Rust and managed languages is equivalent.

However, memory-safe languages (i.e. not Managed C++) running on a memory-safe VM have one more layer of defense in depth than Rust. This matters in case there is a bug in the compiler or any of the dependencies. Additional layers may include, for instance, Unix and containers.

"dynamic" languages, though? That's entirely orthogonal to memory-safety and not very good for general software safety.


Well, it is the same 30% that Rust also cannot help much with.


As far as I understand, both these languages are subject to data races whereas (safe) Rust is not.

Now, Scala is a JVM language and so presumably you get Java's rules, you aren't Sequentially Consistent but the damage is constrained - however human programmers still can't successfully reason about this state in non-trivial programs.


There are plenty of other races that Rust doesn't provide any help with, and those are much more relevant in distributed computing workloads.


Your first question ignores the crux of the matter: from scratch rewrites of software that is actually worth rewriting are always failures in the making. While the rewrite is ongoing, the original must undergo continued maintenance and this (a) doubles the workload of the team and (b) presents an ever moving target for the rewrite. This means that the effort that goes into piecemeal improvement is the minimal effort possible that can succeed.

Successful piecemeal rewrites will result in gradual improvement, so if you don't see that gradual improvement over time, something is very wrong with the project.


Kotlin managed it in some cases with clever forwards-compatible type system tricks, so it's not impossible.


Except it does so by compiling into the same bytecode as Java; javap will just output Java-like code from class files, and some features like coroutines cannot be called from existing Java code without wrappers that set up the runtime semantics expected by Kotlin libraries.


Yes, but so what? When they can implement a feature in a forwards-compatible way they do; when they can't, they don't, and it's caveated as such. That's why nullable types and other Kotlin-specific type stuff are encoded using annotations.

It's the same strategy here. Carbon is (was?) also intended to compile to the C++ ABI, and so is Circle; that's why it says it doesn't run on Windows (commercially a huge error: many of the biggest C++ codebases that could benefit are running on Windows, like game engines, and Windows doesn't impose a specific C++ ABI anyway). But there's no way to take existing code files and switch them to the new language in Carbon, at least not yet. Kotlin was carefully designed at every step of the way such that every Java program can be expressed in Kotlin without change, even if that meant compromising on some things. Seems like the Carbon guys are being sucked in by the lure of safetyism and just want to make their own version of Rust that happens to compile to the Itanium C++ ABI.


I'd like to target Windows. Windows does impose a specific C++ ABI, but it's completely undocumented. It's really just whatever Visual C++ does. C++ ABI is very complex, especially vtable layout and RTTI and EH. I am looking to get Microsoft's assistance on targeting Windows. It's a priority, but they don't want to help yet.


Isn't that what COM is for? Win32 is compiler agnostic.


https://itanium-cxx-abi.github.io/cxx-abi/abi.html See all the stuff there? Exceptions, vtable layout, RTTI, mangling, etc. There's a Windows equivalent for all of that. That's what I want access to.


Right I know what a C++ ABI is but I still don't get it. To write C++ apps on Linux you need to match the ABI because there are lots of raw libraries that are just straight compiled C++. On Windows it's not like that outside of the VC++ runtime itself. Everything is C APIs or COM, and wrappers around that. MS never documented their C++ ABI because as far as they're concerned it's not something anyone needs to know, in order to write Windows apps or have components/libraries interop with each other. They never expose raw C++ objects in Windows. When they have C++ APIs they come as source or headers.


It is only forward compatible, it is stuck with decisions made when Java 8 was current, and there is no roadmap to keep the language in sync with the underlying progress of the platform, as Kotlin tries to be everywhere.

Loom, value types, SIMD, Panama,... are all features that are starting to be an issue to expose as Kotlin features.

It is working on Android, because Google is pushing it as Java 8 => Kotlin, not as Java XYZ <=> Kotlin.


Loom doesn't need to pose any issues for Kotlin. If you target only the JVM, you just use virtual threads directly and ignore the coroutines stuff. If you target other runtimes you use them. If you use a KMP library, just surround stuff marked async with runBlocking {}.

SIMD/Panama stuff is just new APIs. No problem there. Kotlin maybe even wins because Panama is super verbose.

JVM value types seem never to arrive. Stuck in dev hell for years, maybe. If/when they ever do see the light of day, Kotlin has value types, and so they can just be compiled to a JVM value type to get the benefits. You'd need to opt in to an ABI break, but no big deal.

So the binary compatibility part hardly seems an issue even looking forward a long way into the future.

It's probably harder for C++ because the language there seems to evolve quicker than Java!


What Kotlin calls value types aren't the real deal, just syntax sugar for a class with a single field, compiled down to the underlying type where possible.

With KMP taken to the extreme, it is better for Kotlin to just do its own thing with Kotlin/Native, except that it is so far behind that JetBrains is using Rust instead for Fleet services.

Java is expected to have language level support for some of those features, will Kotlin catch up?


First, the acknowledgement: to produce this amount of work single-handedly, the guy is an undisputed genius.

I do not want to go through the whole article, but I have this question about:

fn ParseAsInt()

The goal of these new languages like Rust, Carbon, Go, etc. is to introduce "better" alternatives to C and C++ and move developers over. Nothing is wrong with that. Now to syntax: I completely understand syntax constructs that actually add value and deal with new or changed concepts like memory management, concurrency safety, etc. But what is the point of changing already-established syntax just for the fuck of it? What is fucking wrong with int X()? Why change something that does not need changing and increase the impedance of getting into a new language?


I think the author is simply showing that Circle can implement Carbon syntax using its metaprogramming capabilities on top of C++. It is not making a statement that the syntax itself is desirable.


That's right. I'm fine with normal C++ syntax. But other people have other opinions. And there is an allure of getting to a CFG so that non-compiler tooling could build a parse tree and do useful transformations. That's a good goal.

I'm basically looking to evolve on multiple fronts at once. If there's an interest in new syntax, put resources. If there's interest in a borrow checker (I'm sure there is), put resources there. Just move up the field however you can.


Guess it simplifies parsing, e.g. the infamous "most vexing parse" etc.
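The textbook instance, for anyone who hasn't hit it:

    #include <string>

    int main() {
        // Most vexing parse: this does not declare a std::string built
        // from a temporary; it declares a *function* named s that takes an
        // unnamed pointer-to-function parameter and returns std::string.
        std::string s(std::string());

        // Brace initialization (C++11) is unambiguous:
        std::string t{std::string()};
    }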


Convenience should be for the users of a tool, not its creator. And if it is fn, then why not function instead? These single/double-character constructs and modifiers in my opinion reduce readability. When reading Rust code, the apostrophe modifier, for example, is hard to see.


Tools are themselves "users" of syntax, and if you design a language without regard for their convenience at all, you end up in situations where e.g. code completion doesn't work because the syntax forces a forward reference (think of stuff like SELECT .. FROM ..), performance issues etc. And to the end users, the convenience should be considered as an aggregate metric: it's not just about writing code, but also about reading code, compiling code, debugging code etc. You have to optimize for UX across the board, which sometimes necessitates compromises in some of those areas.

And the declaration syntax for C++ is so bad that, in the most general case, you can't even reliably tell if it's a declaration or not without knowing the compiler that'll process it! Consider:

   template<size_t N = sizeof(void*)> struct a {
       template<int> struct b {};
   };

   template<> struct a<sizeof(int)> {
       enum { b };
   };

   enum { c, d };

   struct test : a<> {
       friend int main() {
           b<c>d; // declaration or expression?
           &d;    // error or ok?
       }
   };
The first commented line above parses either as:

   b<c> d; // local variable
or as:

   ((b < c) > d); // expression
depending on whether sizeof(int)==sizeof(void*) on that particular implementation. Consequently, the following &d is either legal or not, depending on whether it's trying to take the address of a local or of an enum constant.

That alone is, to me, sufficient reason to believe that changing it to Pascal-style "name: type" declarations, which are unambiguous, benefits both tools and humans, even if it's slightly more verbose.


>"That alone is, to me, sufficient reason to believe that changing it to Pascal-style "name: type" declarations, which are unambiguous, benefits both tools and humans, even if it's slightly more verbose."

I am not against it. It is a different language altogether though. We were talking about "fixing" C++ while it is still being C++.


As it happens, C++ already adopted the "auto f() -> T" function declaration syntax many years ago, in addition to "T f()", so this is arguably just a more streamlined take on that. I don't see how it makes it less C++, since conceptually everything is the same.


I shall admit my defeat ;)


>simpler by disabling features that contribute to complexity, like multiple inheritance and function overloading.

Seeing function overloading show up surprises me. The only thing I can think of is having different semantics between different overloads, but removing overloads doesn't remove the issue. You just now have a poorly named function that isn't an overload.


If what you meant was actually a polymorphic function, just write the polymorphic function. If that's not what you meant then stop using the same name.

I think the contrast between string contains in Rust and C++ is illustrative. Overloading means you only get whatever parameter types the stdlib provides in C++.


Function overloading is orthogonal to the ability to define additional methods on existing types. C# has both, for example.

Overloading is a don't-repeat-yourself thing - if you have a bunch of functions that do fundamentally the same thing with different types, encoding those types in the function name when they're already explicit in the arguments is simply redundant.

Then there's an issue with ABI stability. Adding a new function argument breaks any existing compiled code even if there is a default value, but adding a new overload and redirecting the old one to call the new one with the defaults is fine.
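A sketch of that overload-forwarding pattern; frobnicate is a made-up name:

    // The old single-argument entry point keeps its mangled symbol, so
    // binaries compiled against it keep working; new callers pick up the
    // two-argument overload.
    void frobnicate(int id, int flags) {
        // ... new implementation ...
    }

    // Changing the signature to frobnicate(int id, int flags = 0) instead
    // would alter the symbol and break existing compiled callers, because
    // default arguments are substituted at the call site.
    void frobnicate(int id) {
        frobnicate(id, /*flags=*/0);
    }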


> if you have a bunch of functions that do fundamentally the same thing with different types

That's where you should use parametric polymorphism, that's my point.

C++ 23 defines 3 overloads of string.contains(), with the parameter variously a char, a char*, and a string_view, enabling name.contains("Jim"), name.contains("Steve"sv), and name.contains('Q'). But if you need a fourth, too bad.

Rust doesn't have overloading, so it defines string.contains() as polymorphic over the parameter type Pattern, which enables name.contains("Jim"), name.contains('J'), name.contains(char::is_uppercase), name.contains(['B', 'o', 'b']), name.contains(|c| { my_fun(c, 206, local_var) }), and so on and so forth.

One reason C++ doesn't do this is addressed in Circle, by offering "interfaces", which are approximately equivalent to Rust's traits or the C++0x Concepts (as Sean firmly points out, these were very different from Stroustrup's C++ 20 Concepts).

The problem is, what does 'Q' have in common with "Jim" ? In conventional C++ polymorphism we want a base class and of course these are basic types, they don't have a base class, much less one which has the necessary commonality.

With interfaces, we can say what we care about isn't some hypothetical "base class" of string_view and char, but instead a common interface they both have dedicated to string matching, and now we're writing polymorphic software again.


Just to be precise, traits and overloading are a form of ad-hoc polymorphism, not parametric.


Hmm, imagine I invent a new string matching predicate doop() which I want to define. My doop predicate says that the thing we're matching should occur both at the beginning and at the end, and these occurrences should not overlap. That is, there should exist some partition of our string such that string.doop(pattern) implies all of: s1 + s2 = string, s1.starts_with(pattern), and s2.ends_with(pattern).

In C++ I need to overload doop() exactly three times, once for each of the three types we identified. I can't do it any other way, that's how it must be defined.

But in Rust I just write it once, in terms of Pattern, I don't need to understand how Pattern is actually implemented, that's opaque to me, I just use this Pattern trait and it all works.


In C++ you would write it once, in terms of some concept or other. In practice you would implement exactly the same number of functions you would in Rust.

edit: to elaborate: some functionality can be implemented generically (i.e. parametrically) in terms of some other concepts, recursively. At some point the concepts need to map to actual concrete implementations, and then you use ad-hoc polymorphism. This is the same in Rust and in C++ [1].

Additionally, in C++, even when you can implement some functionality parametrically, it is sometimes useful to (partially) specialize functions and types to take advantage of optimizations (for example, it is possible to implement std::copy generically, but it is often specialized for contiguous iterators to trivially copyable types so it can call memcpy).

[1] Rust does of course have a more principled handling of these concepts, which are often underdefined in C++.
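A hedged sketch of what that division of labor can look like (not the commenter's godbolt code; needle_size() is a made-up helper): the ad-hoc part is confined to a tiny overload set, and doop() itself is written exactly once:

    #include <cstddef>
    #include <cstring>
    #include <string_view>

    // The ad-hoc layer: map each concrete needle type to its length.
    inline std::size_t needle_size(char)                { return 1; }
    inline std::size_t needle_size(const char* s)       { return std::strlen(s); }
    inline std::size_t needle_size(std::string_view sv) { return sv.size(); }

    // The parametric layer, written once. doop(s, n): n occurs at both the
    // start and the end of s, and the two occurrences do not overlap,
    // i.e. some split s = s1 + s2 exists with s1.starts_with(n) and
    // s2.ends_with(n). That holds exactly when s.size() >= 2 * len(n).
    template<typename Needle>
    bool doop(std::string_view s, Needle n) {
        return s.starts_with(n) && s.ends_with(n)
            && s.size() >= 2 * needle_size(n);
    }

With this, doop("abcab", "ab") and doop("aa", 'a') come out true, while doop("aba", "aba") is false since the two occurrences would fully overlap; all three needle types share one definition.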


I see. So, for example, Microsoft's STL is doing this wrong in stl/inc/xstring where it has, as I described, exactly three overloads of each such predicate with a separate implementation for each of the three types.

Do you have an example I can look at where it's done the way you imagine it "should" be done in C++?


Proof by implementation: https://godbolt.org/z/WYjPjsTWY

The reason the standard specifies the three overloads, of course, is that chars and char literals inherited from C are funny, and those begin/end pairs are dangerous. But that's a quirk of C++ string types and has nothing to do with parametric and ad-hoc polymorphism in C++.

edit: also the three overloads in MSVC probably forward to the same implementation over string_view (except possibly for char of course that can be implemented more efficiently).

edit2: I use concepts in my implementation just because. It would work just fine without them.


Whilst that's clever I don't think it's actually practical at all.

I believe that your hypothetical approach, ignoring the fact that it's not practical, is parametric polymorphism. We can invent more types of "needle" and use the same "needle" concept for other predicates; thus if we have N needles and M predicates, that's N+M work, not N*M work as with the overloads.

However, what is actually practical, and thus what is done, is just ad hoc polymorphism via the overloading we looked at, if WG21 wants to expand it that's N*M implementation cost.

I do not believe we can claim something is ad-hoc polymorphism solely because somewhere there's different implementation code for type A and type B, as with your Godbolt example; if you believe that's ad-hoc polymorphism, surely you end up thinking std::sort is also ad-hoc polymorphism, which seems like nonsense.


The begin/end overloads are ad-hoc polymorphism. The contains function is parametrically polymorphic.

My "hypothetical" approach is bog-standard generic programming in C++ as it has been practiced for almost 30 years, since Stepanov originally codified it.

ADL was literally designed to make this sort of stuff work.


It's hypothetical because you yourself admit the standard library implementations don't actually do this here. The real C++ string contains() does in fact only work for the three specific types implemented, and is not generic, and that's not merely an oversight it's how it necessarily works even though it's worse. Circle's Interfaces would offer a much nicer approach, which makes sense because they do the same thing as Rust's traits for this scenario.

You're right that in some places C++ does the generic thing, I pointed at std::sort already - there are lots of examples of varying quality. The way we got here is that I pointed out overloads are the Wrong Thing™ and that's why it makes sense Carbon would want to outlaw them, and why no, outlawing them does not lose the nice property where the API feels generic - you can do parametric polymorphism and still have the generic API.

The main thing outlawing overloading gets rid of is actually functions/ methods which quietly have more than one distinct purpose. These functions are foot guns. Use your words, call the two (or more) things by different names reflecting the difference in intent and then fewer developers will mistakenly invoke the wrong meaning.

I don't like ADL either, but it's pretty far off topic.


Yes, interfaces open that up, and to a lesser extent choice types open that up. If your choice type supports all the alternatives from the original overload set, then you can reduce to one function.


Now, solve package management and I'll be ready to be back on board the C++ train!


I usually hate Microsoft tools, but vcpkg works really well. Easy enough to use from a couple of build systems, too.

Can't comment on Conan, but I hear it's nice.


One impossible task at a time :)


vcpkg and conan are gaining momentum. vcpkg was just integrated into CLion!




