That bibliography is amazing. The best papers also have great bibliographies where I can dive in and learn mind-altering ideas. Boring papers have boring bibliographies that just echo a list of the big papers in a field.
I really want to like Carp, but I somehow feel that the Rusty memory model does not fit the Lisp philosophy... the whole borrowing story is built on a basis of mutability.
I wish for something more Clojurish based on immutable data. Then one can exploit the power of inferred type linearity/affinity to transparently build safe "transients" and other cool stuff. Maybe someday I should write such a Lisp myself on top of my C++ immutable data structures [1] :-)
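Not Carp or Clojure, but a rough Rust sketch of that transient idea: `Rc::make_mut` decides at runtime (via the refcount) what an inferred linearity/affinity analysis could decide at compile time, namely mutate in place when the value is uniquely owned and copy only when it is shared.

```rust
use std::rc::Rc;

// Sketch of "transients via linearity": mutate in place when uniquely
// owned, copy first when shared. Rc::make_mut checks the refcount at
// runtime; a linear/affine type system could prove it statically.
fn main() {
    let mut a = Rc::new(vec![1, 2, 3]);

    // `a` is the only owner, so this mutates the vector in place.
    Rc::make_mut(&mut a).push(4);

    let b = Rc::clone(&a);          // the data is now shared
    let mut c = Rc::clone(&a);
    Rc::make_mut(&mut c).push(5);   // shared, so this clones before mutating

    assert_eq!(*b, vec![1, 2, 3, 4]);
    assert_eq!(*c, vec![1, 2, 3, 4, 5]);
}
```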
One of the good LISPs back in the day was PreScheme, which finally let it be a C alternative efficiency-wise. It was also used in the first verified LISP. Such a style might also be good for bootstrapping, or just for making more flexible ways to code C.
Looking at PreScheme, Carp doing a safe C alternative without a GC is a nice evolution of these LISPs. Next cool possibility: use a Rust-like LISP with an OS project like Mezzano for the lowest-level stuff. Might even start with Redox OS, just LISPifying its parts, starting with the kernel and shell.
Why do you want a Rust with Lisp syntax? I don't see the benefit of such a "Lisp" if you take away the core benefit of metaprogramming at runtime, which depends on the equivalence of code and data.
Safe software can also be written in languages other than Rust. Ada is still the industry standard for safe programming today. Even Lisp can be used to write safe software, since Lisp's memory management takes care of possible pointer problems.
Rust shines in the field Mozilla developed it for -- safe Internet browsers. However, safe Internet browsers could also be written in Ada and Lisp. The Lisp version would just not be as performant as Ada's and Rust's.
The phrase was ambiguous. I meant a LISP with a safe memory model like Carp's. There have already been LISPs, past and present, for OSes. There are also projects that restrict the power of languages to get better efficiency in low-level work. So, more of the latter with Rust's memory model might be advantageous. The former, with their flexibility, could be built on top of that same language, with a low-latency GC for anything done dynamically or anything the analyzer just couldn't understand well enough to know it's safe. Might lead to a great performance boost on top of the safety benefits.
"Safe software can also be written in other languages than Rust. Ada is still industry standard of safe programming today."
You're right that Ada was a language systematically designed for safety that people could be using right now. I gave them some resources on that here:
The problem: Ada does not have temporal safety or flexible support for race-free concurrency. The first doesn't exist in Ada at all: they have to do unsafe deallocation when tricks like memory pools aren't enough. The second, Ravenscar, is like a tome in its restrictions compared to the basic rules of Rust's borrow checker. Rust smashes Ada so hard on the "safety without a GC" concept that I told Yannick Moy at AdaCore to get someone on achieving parity pronto. For SPARK if nothing else, since it would then have provably-safe sequential procedures whose composition was also safe with Rust's method. That would be awesome for projects such as IRONSIDES DNS.
"Rust shines in the field where Mozilla developed it for -- safe Internet browsers."
Rust shines anywhere you want safe systems code without a GC and with concurrency. That's its main selling point, not language design. There are tons of applications for that, as the huge ecosystem is showing us by replacing all sorts of components in other languages with safe, fast alternatives. Redox OS did an OS in it. In the high-assurance sector, Robigalia was wrapping seL4 so one can have safe apps in Rust on a formally-verified separation kernel. bluejekyll upthread wrote a DNS server (TrustDNS) in it. Rust can do about anything an analyzable subset of C can do.
"The Lisp version would just not be as performant as Ada's and Rust's."
This is correct. My recommendation to use one close to the metal, like PreScheme with Rust-style safety, that outputs C for LLVM attempts to address that. All the macros expand until the resulting code is straightforward, and translating that to C should be easy, with LLVM doing the heavy lifting. One might also use techniques from projects such as Chicken Scheme, a whole-program optimizer, and/or a superoptimizer. The speed differences might be so minimal that nobody even cares. They already are for several LISPs on most applications, but I'm talking about systems code.
Your point is solid for most architectures. On embedded systems, however, you nearly always need some sort of lifetime/ownership model. If using an RTOS, you have to play by its API rules. If bare-metal C/C++, you have to code it from scratch. But it's always there.
Compiler support for this pattern is a huge plus in my book.
> does not fit the Lisp philosophy... all the borrowing story comes from a basis mutability
I found this to be a confusing statement until I realized you were talking about Clojure. Common Lisp condones mutation, and a borrowing model could be helpful for optimizing memory allocations.
You are right that a lot of Lisps consider themselves more procedural than functional, and I was extrapolating from Clojure there. (In my defense, the Scheme community is also very functional and immutability-oriented.)
Bigloo has a special place in my heart. I spent a few years building Scheme bindings to various C graphics and networking libraries. I can't quite put my finger on it, but I found it very enjoyable. Getting C-like performance from Scheme is just wonderful.
> A lisp with type inference and performance is quite rare.
That's quite a befuddling thing to say, in every aspect.
The popular SBCL is likely the fastest general Lisp out there (gets close to or equal to C when declared properly), and type inference is certainly part of those speed wins.
This actually sounds like exactly what I’ve been wanting! I use Clojure and it’s great, but I’ve been wanting to use something to interop with some C++ code but while I like C++ more than most, I’d rather write in Clojure (or something like it). Awesome!
That’s quite true, but from the user’s perspective it mostly doesn’t matter, at least in my mind. For people interested in the internals—an increasing share of the user base—and for the people working on the compiler, it most definitely is a concern.
One place I can see this being an issue for users is debugging. I haven't tried Carp so I don't know how much of a practical issue this is. But if what is actually being compiled is a C program, how do I tie issues in the compiled C program all the way back to my source in Carp?
The borrow checker is the reason to use Rust, in spite of its annoyances (clunky syntax, limited type inference, non-interactive programming, long compilation times, etc.), so it's always nice to see someone trying to provide the upsides of Rust without the downsides.
That being said, your website is disappointingly terse regarding how Carp recovers the advantages of Rust in the context of a Lisp derivative. What makes this particularly suspicious is the fact that, in the past, Lisp programmers have claimed to recover the advantages of other statically typed languages in a Lisp setting, but there have always been huge caveats, like “the system can be very easily circumvented, reducing the benefits of static typing to nothing”.
The main reason why I feel confident that Rust is a safe language isn't just the fact that rustc seems to catch many bugs in practice. That alone wouldn't distinguish it from the countless static analysis tools for C and C++ that have existed for decades. The main reason why, at least in my opinion, Rust is such a trustworthy language in spite of its ginormous complexity, is the fact that its core developers, in particular Niko Matsakis, do a terrific job of communicating the process by which Rust's features are conceived and designed. When you read Matsakis' blog [0], it becomes patently clear that Rust's developers take into account the kind of corner cases that in many other languages are only discovered months or years after the feature has been implemented [1][2].
Other languages that inspire similar confidence in me are:
(0) Standard ML. It has a formal proof of type safety.
(1) Pony. It has a formal proof of type safety.
(2) Haskell, as specified in the Haskell Report (i.e., without the crazy GHC-specific language extensions), because its type system is similar to Standard ML's, and there are no reasons to believe the differences introduce any type safety holes.
(3) OCaml, again without the crazy extensions (GADTs, first-class modules, etc.), for more or less the same reasons as Haskell.
While I do agree with the fact that one of the best features of Rust is the communication, I’d tend to find the comparison of the two a little unfair. Carp is very young and in an entirely different position than Rust.
That being said, this is why I want to start to write more blog posts about Carp. I had to wade through the deep waters alone, with occasional help in the Gitter channel. Now I want to share that experience. As such, the blog post goes in a different direction than, say, Matsakis’.
In a perfect world, someone who cares about communication and is good at it will be on the core team of the language at some point. In the meantime, we’ll have to make do with me.
> Carp is very young and in an entirely different position than Rust.
If anything, Carp's is an advantageous position relative to Rust: You don't risk breaking many other people's code by fixing any type safety holes you find along the way. Which you will, not because you're stupid, but rather because type system design is applied formal logic, and formal logic is frigging hard for humans.
> While I do agree with the fact that one of the best features of Rust is the communication
It's not just that they communicate something, but rather what they communicate. Most language designers that have a user community do a reasonable job of explaining how new language features solve existing problems users face. However, they usually do a poor job of explaining how new language features interact with previously existing ones in all possible cases, and the main reason for this is that language designers have trouble anticipating these interactions to begin with. In other words, language designers don't understand their own designs! Matsakis' blog shows that Rust's developers do a much better job than other language designers in this regard.
I’m not sure what you’re trying to tell me here, so maybe this comment is going in the wrong direction entirely:
I think we are on the same side of the fence. Reasoning about your designs—and communicating your reasoning openly, not fearing scrutiny but rather embracing it—are important in any project.
But you compare an introductory article in which I try to explain a language to prospective users or other interested parties with the notes, musings, and writings of someone working on a compiler, aimed at an entirely different set of interested parties. Sure, in the end we’re all programmers and should all be interested in both, but trying to cover both at the same time will only result in mental overload.
I’m also in no position to talk about these things. I provide the occasional compiler bug fix, but I’m not the principal driving force behind the compiler. I build tools and libraries with Carp, and see whether it breaks in interesting ways, or try to come up with use cases that don’t yet exist. In other words, I’m just a user.
> in the past, Lisp programmers have claimed to recover the advantages of other statically typed languages in a Lisp setting, but there have always been huge caveats, like “the system can be very easily circumvented, reducing the benefits of static typing to nothing”.
It seems to me that the benefits of static typing only apply to accidental mistakes (e.g. using a pointer to a character as though it were an integer, or a pointer to a struct, or whatever), and thus that a system with deliberately-circumventable static typing is just fine.
> It seems to me that the benefits of static typing only apply to accidental mistakes (e.g. using a pointer to a character as though it were an integer, or a pointer to a struct, or whatever), and thus that a system with deliberately-circumventable static typing is just fine.
Static typing is useful to enforce the integrity of abstractions across large systems. For instance, using (language-enforced) abstract data types, you can confidently say that a 50 KLOC program won't destroy the invariants of a complicated data structure, because the only place where this could hypothetically happen is a 1500 LOC module, and these 1500 LOC have been verified to the death. Elsewhere, the internal representation of this data structure isn't accessible. Before anyone claims this can be done using object-orientation: No. Object-orientation allows the creation of ill-behaved impostors that are indistinguishable from well-behaved objects unless you use expensive dynamic checks or even more expensive whole-program (and hence non-modular) analyses.
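A minimal Rust sketch of that kind of language-enforced abstraction (the `Sorted` type is made up for illustration): the representation is private to its module, so only that module can possibly break the invariant, no matter how large the rest of the program is.

```rust
// A hypothetical sorted list whose "always sorted" invariant is
// protected by a module boundary: `items` is private, so outside
// code can only go through `insert`, which maintains the invariant.
mod sorted {
    pub struct Sorted {
        items: Vec<i32>, // private: not readable or writable from outside
    }

    impl Sorted {
        pub fn new() -> Self {
            Sorted { items: Vec::new() }
        }

        pub fn insert(&mut self, x: i32) {
            // Keep the vector sorted on every insertion.
            let pos = self.items.partition_point(|&y| y < x);
            self.items.insert(pos, x);
        }

        pub fn as_slice(&self) -> &[i32] {
            &self.items
        }
    }
}

fn main() {
    let mut s = sorted::Sorted::new();
    for x in [3, 1, 2] {
        s.insert(x);
    }
    assert_eq!(s.as_slice(), &[1, 2, 3]);
    // s.items.push(0); // rejected at compile time: `items` is private
}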
Static typing is also useful to enforce the exhaustiveness of case analyses. Disregarding memory safety issues, which are largely a nonproblem in high-level languages, the vast majority of bugs in computer programs arises from failing to identify corner cases or fully comprehend their complexity. Algebraic data types allow you to substitute ad hoc case analyses with induction on datatypes, for which mechanical exhaustiveness checks are possible, and, in fact, actually performed in practice.
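For illustration, a tiny Rust example of such a mechanical exhaustiveness check (my own toy example, not from the parent): removing either `match` arm, or adding a new variant to the datatype, makes the program stop compiling rather than failing at runtime.

```rust
// An algebraic data type plus `match` lets the compiler verify that
// every case is handled.
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

fn area(s: &Shape) -> f64 {
    match s {
        Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
        Shape::Rect { w, h } => w * h,
        // Omitting either arm is a compile-time error, not a latent bug.
    }
}

fn main() {
    println!("{}", area(&Shape::Rect { w: 2.0, h: 3.0 }));
}
```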
Static typing is also useful as an aid to program verification. Program properties of interest can be classified into two groups: safety properties and liveness properties. Safety properties assert that “bad states are never reached”, and are largely covered by type soundness (for non-abstract types) and type abstraction a.k.a. parametricity (for abstract types). Liveness properties assert that “good states are eventually reached”, and, while types provide less support for verifying liveness properties than safety ones, at least induction on datatypes provides an easy-to-use tool to verify that non-concurrent (but possibly parallel) algorithms will terminate.
All of these benefits fly out of the window if static typing can be circumvented.
All mistakes are accidental. Seldom do people write a line of code thinking, "I'm making a mistake." If your language encourages people to casually write unsafe code, any time you spent on safety guarantees was wasted because those guarantees disappear in all of that code and all of the code that touches it. And that's a lot of space for mistakes.
I'd argue that even the crazy GHC features are mostly trustable. Virtually every language extension, and even a lot of the optimizations, are explained in papers, including design tradeoffs, proofs of correctness, and interactions with other extensions.
That still doesn't mean that it's a good idea to enable -XIncoherentInstances, though.
>What makes this particularly suspicious is the fact that, in the past, Lisp programmers have claimed to recover the advantages of other statically typed languages in a Lisp setting, but there have always been huge caveats, like “the system can be very easily circumvented, reducing the benefits of static typing to nothing”
Common Lisp programmers use static typing for gaining speed, not for any kind of added "safety". For safety we have very good package isolation, really strict typing, typecasing, multimethods, conditions and restarts, and the friendly runtime overseeing the code execution like a god and helping the programmer as a good loyal friend would do.
These features exist for expressiveness, not safety reasons. Although it must be noted that these features make it hard to verify the correctness of programs in a modular fashion. Typecasing makes language-enforced abstraction essentially impossible.
> conditions and restarts
These features exist for debuggability, not safety reasons. Safety means ruling out delimited classes of “bad” behaviors by (language) design.
> friendly runtime overseeing the code execution
Now this is a safety feature, but it is kind of incompatible with the zero-overhead needs of a low-level language.
Correct me if I'm wrong, but you can't be sure whether programmers used a static analysis tool? Maybe they used it just enough to make sure the code compiles? And if all you get is a binary, you can't even run those tools yourself. That's my problem with the argument: "The language has issues, but you can run static analysis tools" and "it takes some self-discipline". If you take over someone's code, you're back to square one.
Rust, on the other hand, could be called Trust. The borrow checker is for everyone.
> Correct me if I'm wrong, but you can't be sure if programmers used a static analysis tool?
I could be sure that they used a static analysis tool, for example, if I watched them use it. But that alone is not enough: the static analysis tool has to be sound, and most static analysis tools for C and C++ deliberately aim for less than soundness.
Scheme is a lisp, and I'm pretty confident that two of the biggest selling points of lisp are dynamic typing and a strong distaste for (if not an outright prohibition of) mutation.
I put it as a question because, if you take those two things away, wouldn't it just be Rust with parentheses?
The selling point of lisp is macros, and macros alone. Different dialects will have different characteristics: Common Lisp is dynamically typed and usually makes heavy use of mutation, others like Lux are purely functional and statically typed.
They work pretty well in Crystal[1]. Some things are certainly a bit harder than they are in lisp/scheme, but you can get a lot done with them. In Crystal this is (in my experience) due to the lack of reflection support, and thus the inability to call dynamically looked-up methods. Add that in and I'm pretty sure you'd remove >=90% of the current limitations, but even without it, you can get a lot done.
The only requirement to make macros work in a statically typed environment is to create a suitable type to represent the source code, and to provide the operations to transform said type. For instance, Rust has a macro system that's pretty nice to work with.
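A toy sketch of that in Rust (illustrative only, not from the parent comment): the macro receives source tokens and expands into ordinary, statically typed code before type checking runs.

```rust
// A declarative macro that operates on source tokens and expands into
// plain code, which is then type checked like anything else.
macro_rules! square_all {
    ($($x:expr),* $(,)?) => {
        vec![$(($x) * ($x)),*]
    };
}

fn main() {
    let squares = square_all!(1, 2, 3 + 1);
    assert_eq!(squares, vec![1, 4, 16]);
}
```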
I find it really interesting, and I would definitely like to try it out (the article is great, by the way). I am not that good at C or Haskell, so this would be a great way to improve on that too, and that would actually be my main objective. So, noob question: how would I go about adding some C networking libs to Carp? Should I directly call the code from Carp (and is that possible at this time), or should I embed it in the language core and go the Haskell route? I am new to this but would really love to embark on the adventure, for learning purposes.
It seemed they don't distinguish between mutable and immutable references(?). Does Carp not handle the issue of shared mutability, e.g. iterator invalidation, the way Rust does?
Correct, not at the moment. Overall the type system is less expressive but also less dependent on annotations. Differentiating between immutable and mutable refs is probably coming though, it’s a useful distinction for sure.
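For anyone unfamiliar with the issue being discussed, here is the classic Rust illustration (plain Rust, not Carp): the shared/mutable reference distinction is what turns iterator invalidation into a compile-time error.

```rust
fn main() {
    let mut v = vec![1, 2, 3];
    for x in &v {          // the iterator holds a shared (&) borrow of `v`
        if *x == 2 {
            // v.push(4);  // uncommenting this is a compile error:
                           // cannot borrow `v` as mutable because it is
                           // also borrowed as immutable
        }
    }
    v.push(4);             // fine: the shared borrow ended with the loop
    println!("{:?}", v);
}
```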
Very excited about Carp. In general I like the simplicity of Lisp but ironically I'm not a fan of dynamic typing. Carp may be a real innovation. A bit surprised though that the Carp C runtime functions are not namespaced. With names like `IO_println` there's a risk of clashes. Unless the compiler is doing some tricks to hide names?
It's not enough :-) Ask the OCaml folks, they're going through this pain right now because their stdlib modules are called `Array`, `List`, and so on. They're planning to put them all under `Stdlib`, so e.g. `Stdlib.Array` and so on, but it's going to be a big effort with a lot of pain.
The main problem will arise when users create their own libraries; suppose some people create an `Option` library and then later you want to add a standard option type to Carp: it will be painful. Better to namespace your stuff under `Carp` from the beginning, so e.g. `Carp_IO_println`, `Carp_Option_map`, and so on.
I wonder how this compares to the Python ecosystem, whose standard library is also quite large. (Anyone remember their famous "batteries-included" mantra?)
Python 2 didn't have a stdlib namespace, or "package" in Python speak. But when they created their clean-slate approach with Python 3, they also decided not to introduce a stdlib package. External Python packages just avoid the stdlib package names, and everything seems fine.
So, is there something in the Python language that makes the missing stdlib namespace a non-issue?
Or is it just the Python community which doesn't care about (or plays down) this issue?
As a Python programmer, I would consider code which shadows the names of built-ins to be poorly written although it's permissible in small scopes (e.g., a single block).
I suspect the uppercase module names are a Haskell influence. Clojure uses `/`, right? So it might look like e.g. `(carp/io/println "Hello, World!")`. Come to think of it, importing names like `(use carp/io/async)` would be pretty cool.
I'm guessing it wouldn't be the same language then. Most languages are pretty tied to their own data structures, and that's especially true of Lisps because code is data. After all, Clojure is not Scheme.
You might be looking for C++ :-P As I mention here, it is impressive that you can actually implement these structures as libraries.
Exactly. Someone who doesn’t see the value in a GC-less language is not going to see the value here.
I personally think “functional persistent” data structures, as a language default, trade a lot of runtime performance in order to achieve some guarantees like thread-safety and functions not having side effects. Every mutation to every array or dictionary is treated like a transaction in a MVCC database, with the garbage collector in charge of cleaning up the records of every past version that wasn’t actually necessary to keep around, because no multiversion concurrency was actually necessary in that case. I encourage languages to experiment with other methods of achieving similar benefits.
I think limiting side effects and shared mutable state is very important, but at a local level, imperative code that mutates a data structure is highly readable and performant. Certain algorithms practically require it. Certain hardware practically requires it.
Functional languages let you think in terms of abstract “compound values,” but in practice these are backed by tons of little heap-allocated objects, partly so that lots of sharing can be done when the data structure is copied on every change.
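To make the sharing concrete, a minimal Rust sketch (my own example) of a persistent cons list: each "modified" version is one new heap node pointing at the old, shared tail, which is exactly the kind of little heap-allocated object described above.

```rust
use std::rc::Rc;

// A persistent cons list: "changing" the list means allocating one small
// node and sharing the rest, so every old version stays valid.
enum List {
    Nil,
    Cons(i32, Rc<List>),
}

fn push(head: i32, tail: &Rc<List>) -> Rc<List> {
    // O(1): one new heap node; the old list is untouched and shared.
    Rc::new(List::Cons(head, Rc::clone(tail)))
}

fn sum(list: &List) -> i32 {
    match list {
        List::Nil => 0,
        List::Cons(head, tail) => head + sum(tail),
    }
}

fn main() {
    let empty = Rc::new(List::Nil);
    let xs = push(1, &empty);  // [1]
    let ys = push(2, &xs);     // [2, 1]
    let zs = push(3, &xs);     // [3, 1] -- shares the [1] tail with `ys`
    assert_eq!(sum(&ys), 3);
    assert_eq!(sum(&zs), 4);
    // All three versions coexist; reclaiming unused ones is the
    // collector's (here: the refcount's) job.
}
```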
I realize that almost every language could be described as a subset of C++, but I read articles like this and think...
Deterministic language that has type inference, C interop, and uses ownership to govern object lifetimes? We have that. It's C++11. auto with std::unique_ptr and std::move().
The problem is that, while in many ways an improvement over traditional coding practice, the subset of C++ associated with the Core Guidelines (GSL stands for "(Core) Guidelines Support Library") has turned out to be a dead end when it comes to memory safety. In fact, it formalizes the dependency on intrinsically unsafe elements like native pointers, std::shared_ptr, etc. I mean, with "regular" C++ you're not actually obligated to use those intrinsically unsafe elements. With the Core Guidelines you are.
SaferCPlusPlus[1] is an alternative subset of C++ that doesn't have that same problem. It achieves memory safety by simply excluding the intrinsically unsafe elements and providing memory-safe alternatives.
> All compile time checks.
Not quite in reality. One small issue with the GSL, for example, is that its not_null pointer class does a run-time check on every pointer dereference[2]. SaferCPlusPlus can enforce "not null"ness at compile-time.
A bigger example, for instance, is the situation where you want to allow multiple threads to simultaneously modify different parts of an array. With SaferCPlusPlus, this is straightforward and safe [3]. With the GSL/Core Guidelines, less so.
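For comparison, here is roughly how the same pattern looks in Rust (not SaferCPlusPlus or the GSL): `split_at_mut` plus scoped threads hand disjoint halves of one array to two threads, and the compiler checks that the accesses cannot overlap.

```rust
fn main() {
    let mut data = [0u64; 8];
    let (left, right) = data.split_at_mut(4); // two non-overlapping &mut slices

    std::thread::scope(|s| {
        s.spawn(|| {
            for x in left {   // this thread owns the left half exclusively
                *x += 1;
            }
        });
        s.spawn(|| {
            for x in right {  // this thread owns the right half exclusively
                *x += 2;
            }
        });
    }); // both threads are joined here, so the borrows of `data` end

    println!("{:?}", data);
}
```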
I don't know how much these technical considerations factor into (or will factor into) popularity of adoption. I don't know how big the intersection is of the sets of developers who take code safety seriously and those who remain interested in C++.
I only have a pretty basic understanding of how it works, but I thought the checks didn't rely on the compiler. They're special templates that fail when their conditions aren't met.
And when I say it didn't take off - I mean that I haven't come across a single large project that has taken up using the GSL. Hope I'm wrong - maybe I just haven't looked in the right place?
No, the GSL is a kind of stop-gap solution until the standard catches up.
For example, std::string_view and the ongoing design of std::array_view are based on gsl::span. As yet another example, gsl::byte is no longer needed in C++17 thanks to std::byte.
The GSL asserts are there until code contracts[0] get into the standard.
Magical types like gsl::owner allow clang-tidy and the VC++ checkers to apply Rust-like tracking of memory usage.
As for not taking off: having originally been a Microsoft proposal, it is surely used by the Office and Windows teams, especially given that they already use SDL (Security Development Lifecycle) macros as well.
I've been waiting for a non-Clojure lisp to make some headway. Immutability is mostly a fad: look at how incredibly complicated Clojure's implementation is. It's not worth sacrificing elegance just to attract the true believers.
Honest question: is immutability complicated in Clojure because it's hard, or because it's running in a virtual machine with no meaningful support for it?
The Erlang VM has immutability deeply baked in, and it just works(™).
Erlang is typically used for problem domains (e.g. fault-tolerant servers, network brokers and routers, message queues) where immutability fits the problem pretty well. If you want fault tolerance, you typically need to store all your state externally anyway so you can replicate it, and make all your logic stateless so you can restart and re-run it if a node fails.
There are some problem domains - e.g. GUI software, web browsers, simulations, many of the more complicated parsing, data extraction, or graph-traversal algorithms - that are inherently mutable, and Erlang is not used very much in these domains. C++ still rules the roost there, even though many of its practitioners hate its shortcomings.
I would argue that immutability is actually not a great thing in Erlang, to be honest. It makes it hard to reason about what gets copied and what gets shared on the global heap (which may be a source of contention), makes code unnecessarily verbose, etc. I think it's one of those features (like lazy evaluation in Haskell) that language practitioners tend to advocate without necessarily understanding what tradeoffs they are making in the process.
Except that isn't true in Erlang (or other BEAM languages). Binaries larger than some threshold are managed on a shared heap. I think a couple of other things have fairly recently been promoted to being managed that way too.
I've seen academic papers with no date and it drives me crazy. I'm sure there's some reason for it related to the publication process, but it can be hard to vet a paper without knowing more about the context in which it was written.
[0] HN Discussion. https://news.ycombinator.com/item?id=14248419
[1] http://home.pipeline.com/~hbaker1/LinearLisp.html