
Yep. These things have been solved by massive investments. The question is, can WASM as a language (not an implementation) do something JavaScript can't?

Wasm can do 64-bit integers, SIMD and statically typed GC classes.

JS could have had support for SIMD and 64-bit ints by now, and progress was actually being made (mostly through the asm.js experiments), but it was deprioritized specifically to work on WASM.

WASM can even do 32-bit integers, which JavaScript can't, so it uses floats instead.

JS has had typed arrays like Int32Array for a while. The JS engines will try to optimize math done with them as integer math rather than float math, but yeah, you still can't use an integer directly outside of array math.

The answer to that is no. But innovating at the language level was never a goal for WASM; quite the opposite: keep it as simple as possible so it can be compiled and run anywhere.

That is like Linux on a laptop. When you buy a laptop, you pay for Windows anyway.

Not necessarily. I bought a laptop with Linux preinstalled, and it's the best thing to do if you buy one with the intent of using Linux on it.

That's what I thought when I bought a Dell XPS. Probably the worst laptop I've owned.

There are lots of good options that come with Windows preinstalled.


I had some dramatic episodes. Surprisingly, FedEx is very bad in Germany too. Once I had to retrieve a laptop from a FedEx station to prevent it from going back to China. The trick is, it was addressed to a different name. The truck just went by the door for three days, "nobody home". That was a situation.

It's not FedEx; fluent usage of "c/o" when receiving post and packages in Germany is an absolute necessity. A colleague from my workplace once ordered a modem from Vodafone hoping he'd have the internet set up by the end of the week; in the end the setup took over a month and almost led to his mental breakdown.

GNU Taler seems to be alive https://www.taler.net/en/

I wonder, at which point it is worth it to make a language? I personally implemented generics, slices and error propagation in C… that takes some work, but doable. Obviously, C stdlib goes to the trash bin, but there is not much value in it anyway. Not much code, and very obsolete.
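
Just to illustrate the flavor, here is a minimal sketch of what the slices and error propagation can look like (hypothetical names, not my actual code; the generics are a longer story):

    #include <stddef.h>

    /* a slice: pointer plus length, the usual fat-pointer trick */
    typedef struct { const char *at; size_t len; } slice_t;

    /* error propagation: check a return code, bail out early;
       same spirit as Rust's `?` or Zig's `try` */
    #define TRY(expr)                 \
        do {                          \
            int _rc = (expr);         \
            if (_rc != 0) return _rc; \
        } while (0)

    int parse_header(slice_t s) {
        if (s.len < 4) return -1; /* too short */
        return 0;
    }

    int parse_packet(slice_t s) {
        TRY(parse_header(s)); /* propagates the error code upward */
        /* ... parse the body ... */
        return 0;
    }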

Meanwhile, a compiler is an enormously complicated story. I personally never ever want to write a compiler, cause I already had more fun than I ever wanted working with distributed systems. While idiomatic C was not the way forward, my choice was a C dialect and Go for higher-level things.

How can we estimate these things? Or let's have fun, yolo?


> Meanwhile, a compiler is an enormously complicated story.

I don't intend to downplay the effort involved in creating a large project, but it's evident to me that there's a class of "better C" languages for which LLVM is very well suited.

On purely recreational grounds, one can get something small off the ground in an afternoon with LLVM. It's very enjoyable and has a low barrier to entry, really.


Yes, this is fine for basic exploration but, in the long run, I think LLVM taketh at least as much as it giveth. The proliferation of LLVM has created the perception that writing machine code is an extremely difficult endeavor that should not be pursued by mere mortals. In truth, you can get going writing x86_64 assembly in a day. With a few weeks of effort, it is possible to emit all of the basic x86_64 instructions. I have heard aarch64 is even easier but I only have experience with x86_64.
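
To make that concrete: the entire "emit machine code and run it" loop fits in a few lines of C. A Linux x86_64 sketch (my assumption; macOS and other W^X-enforcing systems need extra ceremony):

    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    int main(void) {
        /* hand-encoded x86_64: mov eax, 42 ; ret */
        unsigned char code[] = { 0xb8, 0x2a, 0x00, 0x00, 0x00, 0xc3 };

        /* map a page that is both writable and executable */
        void *buf = mmap(NULL, 4096, PROT_READ | PROT_WRITE | PROT_EXEC,
                         MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (buf == MAP_FAILED) return 1;
        memcpy(buf, code, sizeof code);

        int (*fn)(void) = (int (*)(void))buf;
        printf("%d\n", fn()); /* prints 42 */
        return 0;
    }

Everything a backend does is, in the end, elaboration on those six bytes.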

What you then realize is that it is possible to generate quality machine code much faster than LLVM and using far fewer resources. I believe both that LLVM has been holding back compiler evolution and that it is close to if not already at peak popularity. As LLMs improve, the need for tighter feedback loops will necessitate moving off the bloat of LLVM. Moreover, for all of the magic of LLVMs optimization passes, it does very little to prevent the user from writing incorrect code. I believe we will demand more from a compiler backend than LLVM can ever deliver.

The main selling point of LLVM is that you gain access to all of the targets, but for me this is a weak point in its favor. Firstly, one can write a quality self-hosting compiler with O(20) instructions. Adding new backends should be trivial. Moreover, the more you are thinking about cross-platform portability, the more you are worrying about hypothetical problems as well as the problems of people other than yourself. Get your compiler working well first on your machine and then worry about other machines.


I agree. I've found that, for the languages I'm interested in compiling (strict functional languages), a custom backend is desirable simply because LLVM isn't well suited to various things you might like to do when compiling functional programming languages (particularly related to custom register conventions, split stacks, etc.).

I'm particularly fond of the organisation of the OCaml compiler: it doesn't really follow a classical separation of concerns, but it emits good-quality code. E.g. its instruction selection is just pattern matching expressed in the language; various liveness properties of the target instructions are expressed for the virtual IR (as they know which one-to-one instruction mapping they'll use later, as opposed to doing register allocation strictly after instruction selection); garbage collection checks are threaded in after the fact (calls to caml_call_gc); and its register allocator is a simple variant of Chow et al.'s priority graph colouring (expressed rather tersely: ~223 lines, ignoring the related infrastructure for spilling, restoring, etc.)

--

As a huge aside, I believe the hobby compiler space could benefit from someone implementing a syntactic subset of LLVM IR that is capable of compiling real programs. You'd get test suites for free, plus the option to switch to stock LLVM if desired. Projects like Hare are probably a good fit for such an idea.


>Adding new backends should be trivial.

Sounds like famous last words :-P

And I don't really know about faster once you start to handle all the edge cases that invariably crop up.

Case in point: gcc


It's the classic pattern where you redefine the task as only 80% of the original.

That is why the Hare language uses QBE instead: https://c9x.me/compile/

Sure, it can't do all the optimizations LLVM can, but it is radically simpler and easier to use.


Hare is a very pleasant language to use, and I like the way the code looks vs something like Zig. I also like that it uses QBE for the reasons they explained.

That said, I suspect it’ll never be more than a small niche if it doesn’t target Mac and Windows.


If only it were just about emitting byte code into a file and then calling the linker... you also have the problem of debug information, optimizer passes, the number of tests required to prove the output byte code is valid, etc.

>On purely recreational grounds, one can get something small off the ground in an afternoon with LLVM. It's very enjoyable and has a low barrier to entry, really.

Is there something analogous for those wanting to create language interpreters, not compilers? And preferably for interpreters one wants to develop in Python?

Doesn't have to be literally just an afternoon; it could even be a few weeks. But something that will ease the task for PL newbies? The tasks of lexing and parsing, I mean.


https://craftinginterpreters.com/introduction.html

AST interpreter in Java from scratch, followed by the same language in a tight bytecode VM in C.

Great book; very good introduction to the subject.


There are quite neat lexer and parser generators for Python that can lower the barrier to entry. For example, I've used PLY now and then for very small things.

On the non-generated side, lexer creation is largely mechanical - even if you write it by hand. For example, if you vaguely understand the idea of expressing a disjunctive regular expression as a state machine (its DFA), you can plug that into skeleton algorithms and get a lexer out (for example, the algorithm shown in Reps' "Maximal-Munch Tokenization in Linear Time" paper). For parsing, taking a day or two to really understand Pratt parsing is incredibly valuable. Then, recursive descent is fairly intuitive to learn and implement, and Pratt parsing is a nice way to structure your parser for the more expressive parts of your language's grammar.
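
To make the Pratt idea concrete, here is a toy expression parser, in C rather than Python (the technique carries over directly). The grammar is just integers, '+', '*' and parentheses; the binding powers are made up for the example:

    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>

    static const char *p; /* cursor into the input string */

    static void skip(void) { while (isspace((unsigned char)*p)) p++; }

    static int lbp(char op) { /* left binding power of an operator */
        switch (op) {
        case '+': return 10;
        case '*': return 20;
        default:  return 0;   /* anything else ends the expression */
        }
    }

    static long expr(int rbp);

    static long nud(void) {   /* "null denotation": the prefix position */
        skip();
        if (*p == '(') { p++; long v = expr(0); skip(); p++; return v; }
        return strtol(p, (char **)&p, 10);
    }

    static long expr(int rbp) {
        long left = nud();
        skip();
        while (lbp(*p) > rbp) {         /* bind while the operator is stronger */
            char op = *p++;
            long right = expr(lbp(op)); /* recurse with the new binding power */
            left = (op == '+') ? left + right : left * right;
            skip();
        }
        return left;
    }

    int main(void) {
        p = "1 + 2 * (3 + 4)";
        printf("%ld\n", expr(0)); /* prints 15 */
        return 0;
    }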

Nowadays, Python has a match (pattern matching) construct - even if its semantics are somewhat questionable (and potentially error-prone). Overall, though, I don't find Python too unenjoyable for compiler-related programming: dataclasses (and match) have really improved the situation.


I am a big fan of Ragel[1]. It is a high-performance parser generator. In fact, it can generate different types of parsers; very powerful. Unfortunately, it takes a lot of skill to operate. I wrote a parser generator generator to make it all smooth[2], but after 8 years I still can't call it effortless. A colleague of mine once "broke the internet" with a Ragel bug. So, think twice. Still, for weekend activities I highly recommend it, just for the way of thinking it embodies.

[1]: https://www.colm.net/open-source/ragel/

[2]: https://github.com/gritzko/librdx/blob/master/rdx/JDR.lex


Is this the same Ragel that Zed Shaw wrote about in one of his posts back in the day, during the Ruby and Rails heyday? I vaguely remember that article. I think he used it for Mongrel, his web server.

https://github.com/mongrel/mongrel


The worst part of designing a language is the parsing stage.

Simple enough to do it by hand, but there’s a lot of boilerplate and bureaucracy involved that is painfully time-wasting unless you know exactly what syntax you are going for.

But if you adopt a parser generator such as Flex/Bison you'll find yourself learning and debugging an obtuse language that has to be forcefully bent to your needs, and I hope your knowledge of parsing theory is up to scratch when you're faced with shift-reduce conflicts or have to decide whether LR or LALR(1) or whatever is most appropriate for your syntax.

Not even PEG is gonna come to your rescue.


How is the boilerplate etc. related to the syntax? Not clear. I would have thought you first decide the syntax and only then start the work.

But I've never created an interpreter, let alone a compiler.


Thanks to those who replied.

> I wonder, at which point it is worth it to make a language?

AT ANY POINT.

Nothing exists that could yield more improvements than a new language. It is the ONLY way to make a paradigm (shift) stick. It is the ONLY way to turn "discipline" into "normal work".

Example:

"Everyone knows that it is hard to mutate things":

* Option 1: DISCIPLINE

* Option 2: you have "let" and you have "var" (or equivalent), and you remove the MILLIONS of times somebody somewhere must think "does this var mutate or not?"

"Manually managing memory is hard":

* Option 1: DISCIPLINE

* Option 2: No need: for TRILLIONS of objects across ALL the codebases with any form of automatic memory management, across ALL the developers and ALL their apps, very close to 100% never worry about it

* Option 3: And now I can do this with more safety, across threads and such

---

Making actual progress with a language is hard, because there is a fractal of competing things in sore need of improvement, and a big subset of users is anti-progress, preferring to suffer decades of C (for example) rather than some gradual progress with something like Pascal (where a "string" exists).

Plus, a language needs to coordinate syntax (important) with the std library (important) with how frameworks will end up (important) with compile-time AND runtime outcomes (important) with tooling (important).

Badly miss any of these and you blew it.

But there is no other kind of project (apart from an OS, filesystem, or DB) where the potential positive impact will extend as far into the future.


At the point you want to interface with people outside of your direct influence. That's the value of a language — a shared understanding.

So long as only you use your custom C dialect, all is fine. Trouble starts when you'd like others to use it too or when you'd like to use libraries written by people who used a different language, e.g. C.


This actually started off with Christoffer (the C3 author) contributing to C2 but not being satisfied with the development speed there, wanting to try his own things and move forward more quickly. Apparently, together with LLVM, it was doable to write a new compiler for what is a successor to C2.

I personally implemented generic types, slices and error propagation in C… that takes some work, but it is doable. At which point, the question is: what else should a new language offer to make it worth it?
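
The "generic types" part can be approximated with C11's _Generic, for instance. A tiny sketch (hypothetical names, not my actual implementation):

    #include <stdio.h>

    static int    max_int(int a, int b)          { return a > b ? a : b; }
    static double max_double(double a, double b) { return a > b ? a : b; }

    /* one "function name", dispatched on argument type at compile time */
    #define maximum(a, b) _Generic((a), \
        int:    max_int,                \
        double: max_double)(a, b)

    int main(void) {
        printf("%d %g\n", maximum(2, 3), maximum(2.5, 1.5));
        return 0;
    }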

One day I will compare that to Zig, if I figure out what to measure.


Just read AWS or Cloudflare outage postmortems and you will see: they are still there, in the happy land.

Did you read "Parsing JSON is a minefield"?

Probably, we need a formal data format, because JSON is just a notation. It does not mandate the bit width of numbers, for example, or whether ints are different from floats. Once there is such formal model, we can map it 1:1 between representations.
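
As a sketch of what such a formal model pins down that JSON does not (hypothetical C declarations, not the actual RDX types): every value kind and every number width is explicit, so two implementations cannot disagree about them.

    #include <stddef.h>
    #include <stdint.h>

    /* every kind is enumerated; "number" is not one ambiguous thing */
    typedef enum { V_INT64, V_FLOAT64, V_STRING, V_ARRAY } vkind_t;

    typedef struct value {
        vkind_t kind;
        union {
            int64_t i64;                                /* exact 64-bit integer */
            double  f64;                                /* IEEE 754 binary64 */
            struct { const char *at; size_t len; } str; /* UTF-8 bytes */
            struct { struct value *at; size_t len; } arr;
        } as;
    } value_t;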

I am writing this because I work on a related topic https://replicated.wiki/blog/args.html


I am a bit saddened by the fact that people get obsessed with syntactic innovation, or even less than that. Don't we have plenty of urgent problems around us?

People have a problem and are trying to solve it. We are not all required, nor able, to solve whatever the world’s most urgent problem is today.

In this case they are formatting JSON in an easier-to-read way. It's not an alternative to CRDTs; it is a totally different issue.


What can I say. I want all problems in my life to be like that.
