Hacker News new | past | comments | ask | show | jobs | submit login

Ben, could you go a little into detail on what you had in mind with the term rewriting system? The sin/cos example seems a little underwhelming, since these should be evaluated at compile-time in the first place.

They seem like a variant on C macros, but without the restriction on function syntax. Maybe one could implement Python's with statement in userland, which is very nice.

It also seems to me that you could replace C++ templates, yet you do have generics. How come? What's the difference?

My intuition is that ambiguities are at least possible, so what about precedence, which AST term is evaluated first? I'm thinking of some situation where two subtrees match the rule, but you end up with different results depending on which is transformed first. E. g. if you were to define a cross product rule, and apply it to a × b × c.

What about infinite loops in rule sets? What about recursion?

What about a situation where multiple rules could be applied? Is there any way to debug the transformations? Is that what the `using` scopes are for? If the compiler was to detect ambiguities, that seems pretty expensive, because it would need to match every rule to every subtree, right?

This thing seems very, very powerful, I just find it a bit hard to grasp what you can do with it in practice, and what the limitations are. I would be very interested to hear what you ran into so far.




I think this thread captures a more interesting application of term rewriting: https://twitter.com/bendmorris/status/1041130408463687681

>My intuition is that ambiguities are at least possible, so what about precedence, which AST term is evaluated first? I'm thinking of some situation where two subtrees match the rule, but you end up with different results depending on which is transformed first. E. g. if you were to define a cross product rule, and apply it to a × b × c.

This is absolutely a potential issue, and the examples I have up now are quite bad. This is why it's important for rules to be strictly scoped. The only rules that can affect an expression are the ones you've explicitly brought in with a "using" block, or those that are defined for the types you're directly working with. Given the scoping, I think overlapping rule applications will be uncommon in practice.

>What about infinite loops in rule sets? What about recursion?

There's a sanity limit on the number of times an expression can be rewritten, and it'll show an error in that case which displays the transformations that triggered the limit (including the first, last, and any non-repeating transformations going backward from the last - so it should be very clear.)

>Is there any way to debug the transformations?

This is definitely an area for improvement. I'm planning to (1) add enforced "rewrite this" metadata that will fail to compile if the expression isn't rewritten, and (2) enable dumping not only the final AST but also all of the transformations that occurred.


It seems you've independently rediscovered extensible programming which had seen much research in the past and again a few years back. There have been a number of languages that did same thing as in your tweet (e.g. [1]) but most aren't developed anymore.

[1]: http://seed7.sourceforge.net/examples/declstat.htm


Neat, another one for my list. Seed7 allows the extension of the grammar [1]. The compiler understands a simple EBNF which doesn't distinguish between different non-terminals. The semantic checks and the transformation into a known AST are deferred to another stage.

[1]: http://seed7.sourceforge.net/manual/syntax.htm


Does the compiler build the initial AST based on a grammar, before the transformations happen? That would mean you can't expand the grammar with user-defined rules. The reason you can "implement for loops in user land" is because they're already part of the grammar, is that correct?

How abstract is that initial AST? Can you use these rewriting transformations to expand the grammar?


That is correct. I think there are tradeoffs in making the language too extensible; in this case it's changing some of the details but not the high-level semantics of how the language works. If every library looked completely different lexically, it would be a nightmare to reuse code. Users of Rust macros may already start to feel this; many libraries are implemented as essentially DSLs.

With that said, down the line I plan to add procedural macros, and those will likely be lexical (they take a series of tokens as input, which doesn't need to parse into valid AST.) If I do go that route, such macros would have to be invoked explicitly.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: