Have you ever worried that by programming in this way, you are methodically giving Anthropic all the information it needs to copy your product? If there is any real value in what you are doing, what is to stop Anthropic or OpenAI or whomever from essentially one-shotting Zed? What happens when the model providers 10x their prices and also use the information you've so enthusiastically given them to clone your product and use the money that you paid them to squash you?
Isn’t it widely assumed Microsoft used private repos for LLM training?
And even with a narrower definition of stealing, Microsoft’s ability to share your code with US government agencies is a common and very legitimate worry in plenty of threat model scenarios.
Ha, I did not see your post before making mine. You are correct in your assessment of the blame.
Moreover, I view relying on compiler optimization as an anti-pattern in general, especially for a low-level language. It is better to write the optimal solution directly and not be dependent on the compiler. If there is a real hotspot that you have identified through profiling and you don't know how to optimize it, then you can run the hotspot through an optimizing compiler and copy what it does.
This problem is largely due to slow compilation. It is possible to write a direct-to-machine-code compiler that compiles at greater than one million lines per second. That is more code than I am likely to write in my lifetime. A fast compiler with no need for incremental compilation is a superior default and can always be adapted to add incrementalism when truly needed.
Yes, this is fine for basic exploration but, in the long run, I think LLVM taketh at least as much as it giveth. The proliferation of LLVM has created the perception that writing machine code is an extremely difficult endeavor that should not be pursued by mere mortals. In truth, you can get going writing x86_64 assembly in a day. With a few weeks of effort, it is possible to emit all of the basic x86_64 instructions. I have heard aarch64 is even easier but I only have experience with x86_64.
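To give a sense of how small the first step is, here is a minimal, hypothetical sketch in Lua 5.3+ (the function names and the two-instruction "program" are mine, not from the comment above): it encodes `mov rax, 42; ret` as raw x86_64 bytes. Actually executing them still needs an executable mapping (mmap/VirtualAlloc) or an object-file wrapper, which this sketch leaves out.

```lua
-- Hypothetical sketch: encode `mov rax, 42` and `ret` as raw x86_64 bytes.
-- mov rax, imm64 => REX.W prefix (0x48), opcode 0xB8 + rd (rax = 0), then
--                   the 64-bit immediate in little-endian order.
-- ret            => 0xC3
local function mov_rax_imm64(n)
  return string.char(0x48, 0xB8) .. string.pack("<I8", n)
end

local function ret()
  return string.char(0xC3)
end

local code = mov_rax_imm64(42) .. ret()

-- Print the encoding as hex; running it would need an executable mapping
-- or an object-file wrapper around these bytes.
for i = 1, #code do
  io.write(string.format("%02X ", code:byte(i)))
end
io.write("\n")  --> 48 B8 2A 00 00 00 00 00 00 00 C3
```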
What you then realize is that it is possible to generate quality machine code much faster than LLVM does and using far fewer resources. I believe both that LLVM has been holding back compiler evolution and that it is close to, if not already at, peak popularity. As LLMs improve, the need for tighter feedback loops will necessitate moving off the bloat of LLVM. Moreover, for all of the magic of LLVM's optimization passes, it does very little to prevent the user from writing incorrect code. I believe we will demand more from a compiler backend than LLVM can ever deliver.
The main selling point of LLVM is that you gain access to all of the targets, but to me this is a weak argument in its favor. Firstly, one can write a quality self-hosting compiler using only on the order of 20 distinct instructions, so adding new backends should be trivial. Moreover, the more you are thinking about cross-platform portability, the more you are worrying about hypothetical problems as well as the problems of people other than yourself. Get your compiler working well first on your machine and then worry about other machines.
I agree. I've found that, for the languages I'm interested in compiling (strict functional languages), a custom backend is desirable simply because LLVM isn't well suited for various things you might like to do when compiling functional programming languages (particularly related to custom register conventions, split stacks, etc.).
I'm particularly fond of the organisation of the OCaml compiler: it doesn't really follow a classical separation of concerns, but it emits good quality code. E.g. its instruction selection is just pattern matching expressed in the language; various liveness properties of the target instructions are expressed on the virtual IR (as they know which one-to-one instruction mapping they'll use later, as opposed to doing register allocation strictly after instruction selection); garbage collection checks are threaded in after the fact (calls to caml_call_gc); and its register allocator is a simple variant of Chow et al.'s priority graph colouring (expressed rather tersely: ~223 lines, ignoring the related infrastructure for spilling, restoring, etc.).
--
As a huge aside, I believe the hobby compiler space could benefit from someone implementing a syntactic subset of LLVM IR capable of compiling real programs. You'd get test suites for free and the option to switch to stock LLVM if desired. Projects like Hare, which already use a small backend (QBE), would probably be a good fit for such an idea.
Hare is a very pleasant language to use, and I like the way the code looks vs something like Zig. I also like that it uses QBE for the reasons they explained.
That said, I suspect it’ll never be more than a small niche if it doesn’t target Mac and Windows.
If only it were just about emitting bytecode to a file and then calling the linker... you also have the problem of debug information, optimizer passes, the amount of testing required to prove the output bytecode is valid, etc.
I suspect this is wrong. If you are correct, that implies to me that LLMs are not intelligent and are just exceptionally well tuned to echo back their training data. It makes no sense to me that a superior intelligence would be unable to trivially learn a new language syntax and apply its semantic knowledge to the new syntax. So I believe that either LLMs will improve to the point that they will easily pick up a new language, or we will realize that LLMs themselves are the dead end.
There is some intelligence. It can figure stuff out and solve problems. It isn't copy-paste. But I agree with your point. They are not intelligent enough to learn during inference, which is the main point here.
You are talking about the future. But if we are talking about the future, the bitter lesson applies even more so. The superintelligence doesn't need a special programming language to be more productive. It can use Python for everything and write bug-free, correct code fast.
I don't think your ultimatum holds. Even assuming LLMs are capable of learning beyond their training data, that just leads back to the purpose of practice in education. Even if you provide a full, unambiguous language spec to a model, and the model were capable of intelligently understanding it, should you expect its performance with your new language to match the petabytes of Python "practice" a model comes with?
Further to this, you can trivially observe two more LLM weaknesses:
1. LLMs are bad at weird syntax even with a complete description, e.g. writing Standard ML and similar languages, or any esolangs.
2. Even with lots of training data, LLMs cannot generalise their output to a shape that doesn't resemble their training, e.g. ask an LLM to write any nontrivial assembly code like an OS bootstrap.
LLMs aren’t a “superior intelligence” because every abstract concept they “learn” is done so emergently. They understand programming concepts within the scope of languages and tasks that easily map back to those things, and due to finite quantisation they can’t generalise those concepts from first principles. I.e. it can map python to programming concepts, but it can’t map programming concepts to an esoteric language with any amount of reliability. Try doing some prompting and this becomes agonisingly apparent!
This is weirdly confident and defensive at the same time. If C++ is so great, why does it need to be so ardently defended and its very obvious problems handwaved away?
I take a very different view of the trajectory of languages given the current trends in software development. The more people rely upon agentic coding processes, the more they will demand fast compilation, because build times will increasingly become a significant bottleneck on product velocity. The faster the LLMs get, the more important it is for the tools they use to be fast. Right now, I think we are still in an uncanny valley where LLMs are slow enough that slow tooling does not seem that bad, but this is likely to change. People will no longer be satisfied asking their agent to make a change and coming back in a minute or an hour. They will expect the result nearly instantaneously. C++ (and Rust) compile times are too slow for the agent to iterate within the human reaction window, so I believe that one of two things will happen over the next few years: LLM progress will stall out, or C++ and Rust will nosedive in popularity.
I honestly think that a 2GB limit for a code base, excluding non-code assets, would be perfectly reasonable. If it exceeds that, it should be split into separate modules.
C/C++ compilers can get huge compile-time speedups by compiling translation units as unity files. For the AAA game engine I work on, the compiler can use 8GB+ per unit.
Just splitting up code might allow the compiler to use less memory, but compile time will increase hugely.
This is true; whether it matters is context-dependent. In an embedded program, this may be irrelevant, since your program is the only thing running, so there is no resource contention or need to swap. In a multi-tenant setting, you could use arenas in the same way as a single static allocation and release the arena upon completion. I agree that allocating a huge amount of memory for a long-running program on a multi-tenant OS is a bad idea in general, but it could be OK if, for example, you are running a single application like a database on the server, in which case you are back to embedded programming, only the "embedded" system is a database on a beefy general-purpose computer.
That sounds great. We can focus on planning the obsolescence of the auto industry rather than having the industry continue to extract rents on society. Of course this will be a political nightmare, but it does seem like truly the sooner the better.
I don't love a good deal of Lua's syntax, but I do think the authors had good reasons for their choices and have generally explained them. Even if you disagree, I think "without good reasons" is overly dismissive.
Personally though, I think the distinctive choices are a boon. You are never confused about what language you are writing because Lua code is so obviously Lua. There is value in this. Once you have written enough Lua, your mind easily switches in and out of Lua mode. Javascript, on the other hand, is filled with poor semantic decisions which for me, cancel out any benefits from syntactic familiarity.
More importantly, Lua has a crucial feature that JavaScript lacks: tail call optimization. There are programs that I can easily write in Lua, in spite of its syntactic verbosity, that I cannot write in JavaScript because of this limitation. Perhaps this particular JS implementation has TCO, but reading the release notes, I doubt it.
I have learned as much from Lua as I have from Forth (Smalltalk doesn't interest me) and my programming skill has increased significantly since I switched to it as my primary language. Lua is the only lightweight language that I am aware of with TCO. In my programs, I have banned the use of loops, as sketched below. This is a liberation that is not possible in JS or even C, where TCO cannot be relied upon.
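To make the "no loops" claim concrete, here is a small, hypothetical Lua sketch (the function names are mine): the accumulator parameter replaces the loop variable, and because the recursive call is a proper tail call, the stack does not grow however many iterations there are.

```lua
-- Loop version.
local function sum_loop(n)
  local acc = 0
  for i = 1, n do
    acc = acc + i
  end
  return acc
end

-- Loop-free version: the accumulator replaces the loop variable, and
-- `return go(...)` is a proper tail call, so the stack stays flat.
local function sum_rec(n)
  local function go(i, acc)
    if i > n then return acc end
    return go(i + 1, acc + i)
  end
  return go(1, 0)
end

print(sum_loop(10), sum_rec(10))  --> 55      55
print(sum_rec(1000000))           -- no stack growth, thanks to TCO
```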
In particular, Lua is an exceptional language for writing compilers. Compilers are inherently recursive and thus languages lacking TCO are a poor fit (even if people have been valiantly forcing that square peg through a round hole for all this time).
Having said all that, perhaps as a scripting language for Redis, JS is a better fit. For me though Lua is clearly better than JS on many different dimensions and I don't appreciate the needless denigration of Lua, especially from someone as influential as you.
> For me though Lua is clearly better than JS on many different dimensions and I don't appreciate the needless denigration of Lua, especially from someone as influential as you.
Is it needless? It's useful specifically because he is someone influential: someone might say "Lua was antirez's choice when making Redis, and I trust and respect his engineering, so I'm going to keep Lua as a top contender for my project because of that," and him being clear about his choices and reasoning is useful in that respect. And to whatever extent you think his influence means he has a responsibility to be careful about what he says, that same influence is a reason he should definitely explain his thoughts on it, then and now.
Formally JavaScript is specified as having TCO as of ES6, although for unfortunate and painful reasons this is spec fiction - Safari implements it, but Firefox and Chrome do not. Neither did QuickJS last I checked and I don't think this does either.
ES is now ES2025, not ES6/2015. There are still platforms that don't even fully implement enough to shim out ES5 completely, let alone ES6+. Portions of ES6 require buy-in from the hosting/runtime environment and aren't even practical for some environments... so I feel the statement itself is kind of ignorant.
Also look at Hedgehog Lisp. The bytecode compiler (runs on a PC) is separate from the interpreter, i.e. there is no REPL. But it means that the interpreter is only about 20KB of code. It's quite practical. It's not Scheme but rather is a functional Lisp (immutable data including AVL trees as the main lookup structure) and it is tail recursive. https://github.com/sbp/hedgehog
> I think the distinctive choices are a boon. You are never confused about what language you are writing because Lua code is so obviously Lua. There is value in this.
This. And not just Lua: having a different kind of syntax for scripting languages or very high-level languages signals that it is something entirely different, and not C as in a systems programming language.
The syntax is also easier for people who don't intend to make programming their profession, but simply want to get something done. It used to be the case in the old days that people would design simple programming languages for beginners: the ActionScript/Flash era, and even HyperCard before that. Unfortunately the industry is no longer interested in that, and if anything intends to make everything as complicated as possible.
You’re wrong in the way in which many people are wrong when they hear about a thing called “tail-call optimization”, which is why some people have been trying to get away from that term in favour of “proper tail calls” or something similar, going back at least as far as R5RS[1]:
> A Scheme implementation is properly tail-recursive if it supports an unbounded number of active tail calls.
The issue here is that, in every language that has a detailed enough specification, there is some provision saying that a program that makes an unbounded number of nested calls at runtime is not legal. Support for proper tail calls means that tail calls (a well-defined subgrammar of the language) do not ever count as nested, which expands the set of legal programs. That’s a language feature, not (merely) a compiler feature.
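To make that concrete with a hypothetical Lua sketch (Lua's reference implementation guarantees proper tail calls; the names are mine): both functions below compute the same count, but only the first is a legal program for arbitrarily large inputs, because its recursive call is in tail position and never counts as a nested active call.

```lua
-- Tail position: the call is the entire result of `return`, so it never
-- counts as a nested active call and the frame can be reused.
local function depth_tail(i, limit, acc)
  if i >= limit then return acc end
  return depth_tail(i + 1, limit, acc + 1)  -- proper tail call
end

-- Not tail position: the addition runs after the call returns, so every
-- level stays active and deep inputs hit the stack limit.
local function depth(i, limit)
  if i >= limit then return 0 end
  return 1 + depth(i + 1, limit)
end

print(depth_tail(0, 10000000, 0))  -- fine: constant stack space
-- print(depth(0, 10000000))       -- error: stack overflow
```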
I still think that the language property (or requirement, or behavior as seen from within the language itself) that we're talking about in this case is "unbounded nested calls", and that the language spec doesn't (and shouldn't) assume that such a property will be satisfied in a specific way, e.g. by switching the call to a branch, as TCO usually implies.
Unbounded nested calls as long as those calls are in tail position, which is a thing that needs to be defined—trivially, as `return EXPR(EXPR...)`, in Lua; while Scheme, being based around expressions, needs a more careful definition, see link above.
Otherwise yes. For instance, Scheme implementations that translate the Scheme program into portable C code (not just into bytecode interpreted by C code) cannot assume that the C compiler will translate C-level tail calls into jumps and thus take special measures to make them work correctly, from trampolines to the very confusingly named “Cheney on the M.T.A.”[1], and people will, colloquially, say those implementations do TCO too. Whether that’s correct usage... I don’t think really matters here, other than to demonstrate why the term “TCO” as encountered in the wild is a confusing one.
Cheney on the MTA is a great paper/algorithm, and I'd like to add (for the benefit of the lucky ten thousand just learning about this) that it's a pun on a great old song: Charlie on the MTA ( https://www.youtube.com/watch?v=MbtkL5_f6-4 ). The joke is that in both cases it will never return, either because the subway fare is too high or because you don't want to keep the call stack around.
Because that's a description of the intended behavior, and I reason about a language as an abstraction that allows one to express an expected behavior ignoring the implementation details.
I know it's not universal: some languages in their infancy lack a formalization and are defined by their reference implementation. But a more theoretical approach has allowed languages like C to thrive for years.
I sort of see what you are getting at but I am still a bit confused:
If I have a program that, based on the input given to it, runs some number of recursions of a function, and two compilers for the language, one (A) with PTC and one (B) without: can I compile the program using both of them, no matter what the actual program is? As in, is the only difference that you won't get a runtime error if you exceed the max stack size?
That is correct, the difference is only visible at runtime. So is the difference between garbage collection (whether tracing or reference counting) and lack thereof: you can write a long-lived C program that calls malloc() throughout its lifetime but never free(), but you’re not going to have a good time executing it. Unless you compile it with Fil-C, in which case it will work (modulo the usual caveats regarding syntactic vs semantic garbage).
I think features of the language can make it much easier (read: possible) for the compiler to recognize when a function is tail call optimizable. Not every recursion will be, so it matters greatly what the actual program is.
It is a feature of the language (with proper tail calls) that a certain class of calls defined in the spec must be TCOd, if you want to put things that way. It’s not just that it’s easier for the compiler to recognize them, it’s that it has to.
(The usual caveats about TCO randomly not working are due to constraints imposed by preexisting ABIs or VMs; if you don’t need to care about those, then the whole thing is quite straightforward.)
I don't think you're wrong per se. This is a "correct" way of thinking of the situation, but it's not the only correct way and it's arguably not the most useful.
A more useful way to understand the situation is that a language's major implementations are more important than the language itself. If the spec of the language says something, but nobody implements it, you can't write code against the spec. And on the flip side, if the major implementations of a language implement a feature that's not in the spec, you can write code that uses that feature.
A minor historical example of this was Python dictionaries. Maybe a decade ago, the Python spec didn't specify that dictionary keys would be retrieved in insertion order, so in theory, implementations of the Python language could have returned the keys in whatever order they liked.
But the CPython implementation did return all the keys in insertion order, and very few people were using anything other than the CPython implementation, so some codebases started depending on the keys being returned in insertion order without even knowing that they were depending on it. You could say that they weren't writing Python, but that seems a bit pedantic to me.
In any case, Python later standardized that as a feature, so now the ambiguity is solved.
It's all very tricky though, because for example, I wrote some code a decade ago that used GCC's compare-and-swap extensions, and at least at that time, it didn't compile on Clang. I think you'd have a stronger argument there that I wasn't writing C--not because what I wrote wasn't standard C, but because the code I wrote didn't compile on the most commonly used C compiler. The better approach to communication in this case, I think, is to simply use phrases that communicate what you're doing: instead of saying "C", say "ANSI C", "GCC C", "Portable C", etc.--phrases that communicate which implementations of the language you're supporting. Saying you're writing "C" isn't wrong, it's just not communicating a very important detail: which implementations of the compiler can compile your code. I'm much more interested in effectively communicating which compilers can compile a piece of code than in pedantically gatekeeping what's C and what's not.
Python's dicts for many years did not return keys in insertion order (from when Tim Peters improved the hash in, iirc, 1.5 until Raymond Hettinger improved it further in, iirc, 3.6).
After the 3.6 change, they were returned in insertion order. And people started relying on that - so at a later stage, this became part of the spec.
There actually was a time when Python dictionary keys weren't guaranteed to be in the order they were inserted, as implemented in CPython, and the order would not be preserved.
You could not reliably depend on that implementation detail until much later, when optimizations were implemented in CPython that just so happened to preserve dictionary key insertion order. Once that was realized, it was PEP'd and made part of the spec.
I'm saying it isn't very useful to argue about whether a feature is a feature of the language or a feature of the implementation, because the language is pretty useless independent of its implementation(s).
It wouldn't be the first time a spec has gone too far, beyond its purview.
C's "register" variables used to have the same issue, and even "inline" has been downgraded to a mere hint for the compiler (which can ignore it and still be a C compiler).
inline and register still have semantic requirements that are not just hints. Taking the address of a register variable is illegal, and inline allows a function to be defined in multiple .c files without errors.
I'd love to hear more how it is, the state of the library ecosystem, language evolution (wasn't there a new major version recently?), pros/cons, reasons to use it compared to other languages.
About tail calls: in other languages, I've sometimes found converting a recursive algorithm to a flat iterative loop with an explicit stack/queue to be effective. But it can be a pain, and less elegant or intuitive than TCO.
Lua isn't my primary programming language now, but it was for a while. My personal experience on the library ecosystem was:
It's definitely smaller than many languages, and this is something to consider before selecting Lua for a project. But, on the positive side: With some 'other' languages I might find 5 or 10 libraries all doing more or less the same thing, many of them bloated and over-engineered. But with Lua I would often find just one library available, and it would be small and clean enough that I could easily read through its source code and know exactly how it worked.
Another nice thing about Lua when run on LuaJIT: extremely high CPU performance for a scripting language.
In summary: A better choice than it might appear at first, but with trade-offs which need serious consideration.
Yeah, you can usually write a TCO-based algorithm differently, without recursion, though it's often a messier implementation... In practice, with JS, I find that depending on whether I know I'm going to wind up more or less than 3-4 calls deep, I'll optimize or not to avoid the stack overflow.
Also worth noting that some features in JS may rely on application/environment support and may raise errors that you cannot catch in JS code. This is often fun to discover and painful to try to work around.
Does the language give any guarantee that TCO was applied? In other words, can it give you an error if the recursion is not of tail-call form? Because I can imagine writing a recursion and relying on it being TCO-optimized when it's not. I would prefer it if a language had some form of explicit TCO modifier for a function. Is there any language that has this?
At least in Lua the rule is simply "the last thing a function does", which is unambiguous: `return f()` is always a tail call and `return f() + 1` never is.
No, in `return f() + 1` the last thing is the +, which can't run until it knows both values. (Reverse Polish notation is clearer, but humans prefer infix operators for some reason.)
JS has required proper tail calls (PTC) for a decade now. Safari's JavascriptCore and almost every implementation except v8/spidermonkey (and the now defunct chakra) have PTC.
v8 had PTC, but removed it because they insisted it MUST have a new tail call keyword. When they were shot down, they threw a childish fit and removed the PTC from their JIT.
> More importantly, Lua has a crucial feature that Javascript lacks: tail call optimization. There are programs that I can easily write in Lua, in spite of its syntactic verbosity, that I cannot write in Javascript because of this limitation. Perhaps this particular JS implementation has tco, but I doubt it reading the release notes.
> [...] In my programs, I have banned the use of loops. This is a liberation that is not possible in JS or even c, where TCO cannot be relied upon.
This is not a great language feature, IMO. There are two ways to go here:
1. You can go the Python way, and have no TCO, not ever. Guido van Rossum's reasoning on this is outlined here[1] and here[2], but the high level summary is that TCO makes it impossible to provide acceptably-clear tracebacks.
2. You can go the Chicken Scheme way, and do TCO, and ALSO do CPS conversion, which makes EVERY call into a tail call, without the language user having to restructure their code to make sure their recursion happens at the tail (a small sketch of the difference follows below).
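For concreteness, here is a small, hypothetical sketch of the difference in Lua (this shows the shape of the transformation, not how Chicken actually compiles, which goes through C and "Cheney on the MTA"): in direct style the recursive call is not in tail position, while after CPS conversion every call is a tail call because the pending work is carried in the continuation.

```lua
-- Direct style: the multiplication happens after the recursive call
-- returns, so the call is NOT in tail position and the stack grows.
local function fact(n)
  if n == 0 then return 1 end
  return n * fact(n - 1)
end

-- After CPS conversion: every call is a tail call; the pending
-- multiplication is carried in the continuation k (a heap-allocated
-- closure) instead of on the stack.
local function fact_cps(n, k)
  if n == 0 then return k(1) end
  return fact_cps(n - 1, function(r) return k(n * r) end)
end

print(fact(10))                                --> 3628800
print(fact_cps(10, function(r) return r end))  --> 3628800
```

The cost, as the "Cheney on the MTA" discussion above suggests, is that the stack frames don't disappear; they become closures that the garbage collector has to manage.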
Either of these approaches has its upsides and downsides, but TCO WITHOUT CPS conversion gives you the worst of both worlds. The only upside is that you can write most of your loops as recursion, but as van Rossum points out, most cases that can be handled with tail recursion, can AND SHOULD be handled with higher-order functions. This is just a much cleaner way to do it in most cases.
And the downsides to TCO without CPS conversion are:
1. Poor tracebacks.
2. Having to restructure your code awkwardly to make recursive calls into tail calls.
3. Easy to make a tail call into not a tail call, resulting in stack overflows.
I'll also add that the main reason recursion is preferable to looping is that it enables all sorts of formal verification. There's some tooling around formal verification for Scheme, but the benefits of eliminating loops are felt most in static, strongly typed languages like Haskell or OCaml. As far as I know, Lua has no mature tooling whatsoever that benefits from preferring recursion over looping. It may be that the author of the post I am responding to finds recursion more intuitive than looping, but my experience contains no evidence that recursion is inherently more intuitive than looping: which is more intuitive appears to me to be entirely a function of the programmer's past experience.
In short, treating TCO without CPS conversion as a killer feature seems to me to be a fetishization of functional programming without understanding why functional programming is effective, embracing the madness with none of the method.
EDIT: To point out a weakness to my own argument: there are a bunch of functional programming language implementations that implement TCO without CPS conversion. I'd counter by saying that this is a function of when they were implemented/standardized. Requiring CPS conversion in the Scheme standard would pretty clearly make Scheme an easier to use language, but it would be unreasonable in 2025 to require CPS conversion because so many Scheme implementations don't have it and don't have the resources to implement it.
EDIT 2: I didn't mean for this post to come across as negative on Lua: I love Lua, and in my hobby language interpreter I've been writing, I have spent countless hours implementing ideas I got from Lua. Lua has many strengths--TCO just isn't one of them. When I'm writing Scheme and can't use a higher-order function, I use TCO. When I'm writing Lua and can't use a higher order function, I use loops. And in both languages I'd prefer to use a higher order function.
EDIT 3: Looking at Lua's overall implementation, it seems to be focused on being fast and lightweight.
I don't know why Lua implemented TCO, but if I had to guess, it's not because it enables you to replace loops with recursion, it's because it... optimizes tail calls. It causes tail calls to use less memory, and this is particularly effective in Lua's implementation because it reuses the stack memory that was just used by the parent call, meaning it uses memory which is already in the processor's cache.
The thing is, a loop is still going to be slightly faster than TCOed recursion, because you don't need to move the arguments to the tail call function into the previous stack frame. In a loop your counters and whatnot are just always using the same memory location, no copying needed.
Where TCO really shines is in all the tail calls that aren't replacements for loops: an optimized tail call is faster than a non-optimized tail call. And in real world applications, a lot of your calls are tail calls!
I don't necessarily love the feature, for the reasons that I detailed in the previous post. But it's not a terrible problem, and I think it makes sense as an optimization within the context of Lua's design goals of being lightweight and fast.
I do not think your compiler argument in support of TCO is very convincing.
Do you really need to write compilers with limitless nesting? Or is nesting, say, 100,000 deep enough, perhaps?
Also, you'll usually want to allocate some data structure to create an AST for each level. So that means you'll have some finite limit anyway. And that limit is a lot easier to hit in the real world, as it applies not just to nesting depth, but to the entire size of your compilation unit.
TCO is not just for parse trees or ASTs, but in imperative languages without TCO this is the only place you are "forced" to use recursion. You can transform any loop in your program to recursion if you prefer, which is what the author does.