I started working through Crafting Interpreters, building up a language's syntax and grammar from scratch. A lot of work and 75 pages of lex/parse logic later, we finally have an AST... one that we can debug and inspect by looking directly at its sexp representation.
It was the ah-ha moment for me... why not express the source code directly as that AST? Most languages require a lot of ceremony and custom rules just to get to this point. Sexps are a step ahead (inherently simpler) since they already parse as an unambiguous tree structure. It's hard to unsee - reading any non-Lisp language now feels like an extra layer of complexity hiding the real logic.
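To make that concrete, here's a rough sketch of what you get for free in any R7RS Scheme or Racket: the standard `read` procedure already hands back the nested list you would otherwise spend a hand-written parser building (the `square` example is just an illustration, not from the book).

    ;; `read` returns the source as a nested list: the parse tree for free.
    (define ast
      (read (open-input-string "(define (square x) (* x x))")))

    ast          ; => (define (square x) (* x x))
    (car ast)    ; => define
    (caddr ast)  ; => (* x x)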
Much of the complexity and error reporting that exists in the lexer or parser in a non-Lisp language just gets kicked down the road to a later phase in a Lisp.
Sure, s-exprs are much easier to parse. But the compiler or runtime still needs to report an error when you have an s-expr that is syntactically valid but semantically wrong, like these (a sketch of that later-phase check follows the examples):
    (let ())
    (1 + 2)
    (define)
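For instance, a minimal sketch of the kind of arity check a later phase still has to perform (the `check-define` helper is hypothetical, not taken from any particular implementation): the reader happily accepts `(define)`, so something downstream must reject it.

    ;; Hypothetical later-phase check: the reader accepted (define),
    ;; but the expander/compiler must still reject it as malformed.
    (define (check-define form)
      (unless (and (pair? form)
                   (eq? (car form) 'define)
                   (>= (length form) 3))
        (error "malformed define:" form)))

    (check-define '(define x 1))  ; ok
    (check-define '(define))      ; error: malformed define: (define)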
Kicking that down the road is a feature because it lets macros operate at a point in time before that validation has occurred. This means they can accept as input s-exprs that are not semantically valid but will become so after macro expansion.
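A small illustration, sketched with portable syntax-rules: an infix macro whose input, taken literally, is exactly the kind of "semantically wrong" form from above. `(1 + 2)` on its own would try to apply 1 as a procedure, yet it becomes perfectly valid once the macro has rearranged it.

    ;; (infix (1 + 2)) receives a form that is not valid Scheme on its own;
    ;; after expansion it is just (+ 1 2).
    (define-syntax infix
      (syntax-rules ()
        ((_ (a op b)) (op (infix a) (infix b)))
        ((_ x) x)))

    (infix (1 + 2))        ; => 3
    (infix ((1 + 2) * 4))  ; => 12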
But it can also be a bug, because it means later phases in the compiler and runtime have to do more sanity checking, and program validation is woven throughout the entire system. It also makes the definition of what "valid" code is fuzzier for human readers.
> later phases in the compiler and runtime have to do more sanity checking
But they always have to do all the sanity checking they need, because earlier compiler stages might introduce errors of their own, or propagate errors they neglected to check for.
> program validation is woven throughout the entire system
Also normal and unavoidable.
Insofar as processing has logical phases and layers, validation aligns with those layers: the compiler driver ensures that input files can be read and have the proper text encoding, the more language-specific lexer detects mismatched delimiters and unrecognized keywords, and so on. Combining phases, e.g. building a symbol table on the fly to detect undeclared identifiers before parsing is complete, is a deliberate choice that improves performance but increases complication.
> because earlier compiler stages might introduce errors of their own, or propagate errors they neglected to check for.
Static analyzers for IDEs need to handle erroneous code in later phases (for example, being able to partially type check code that contains syntax errors). But, in general, I haven't seen a lot of compiler code that redundantly performs the same validation that was already done in earlier phases. The last thing you want to be doing during optimization and code generation is re-implementing your language's type checker.
Those rules do help reduce runtime surprises though, to be fair. It's not like they exist for no purpose. They directly represent the language designer making decisions about what counts as a valid representation in that language. Rule #1 of building robust systems is making invalid state unrepresentable, and that's exactly what a lot of languages aim to do.
Note that this approach has been reinvented with great industry success (definitions may differ) at least twice - once in XML and another time with the god-forsaken abomination that is YAML, both times without the lisp engine running in the background, which is what actually makes working with ASTs a reasonable proposition. And I’m not what you could call a lisp fan.