> I remember being taught to use yacc in our compiler course because "writing it by hand is too hard". But looks like Ruby joins the growing list of languages that have hand-written parsers, apparently working with generated parsers turned out to be even harder in the long run.
I've been writing parsers for simple (and sometimes not so simple) languages ever since I was in middle school and learned about recursive descent parsing from a book (I didn't know it was called that back then; the book had a section on writing an expression parser and I just kept adding stuff) - that was in the 90s.
I wonder why yacc etc. were made in the first place, since to me they always felt more complicated and awkward to work with than writing a simple recursive descent parser that works on the parsed text directly or builds whatever structure you want.
Was it resource constraints that no longer really existed by the 90s, but that in previous decades ended up shaping how parsers were meant to be written?
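For readers who haven't seen one, here is a minimal sketch of the kind of expression parser described above, evaluating the text as it parses. The grammar, the `Parser` class, and the `evaluate` helper are all assumptions made up for illustration, not anything from the book mentioned.

```python
import re

# One token: an integer or a single operator/parenthesis, with leading spaces skipped.
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    # expr := term (("+" | "-") term)*
    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.take() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    # term := factor (("*" | "/") factor)*
    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.take() == "*":
                value *= self.factor()
            else:
                value /= self.factor()
        return value

    # factor := NUMBER | "(" expr ")" | "-" factor
    def factor(self):
        tok = self.take()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok == "-":
            return -self.factor()
        if tok.isdigit():
            return int(tok)
        raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    parser = Parser(text)
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value

print(evaluate("1 + 2 * 3"))    # 7
print(evaluate("(1 + 2) * 3"))  # 9
```

Each grammar rule becomes one method, so "just adding stuff" means adding another method or another branch in `factor`.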
Parser generators will tell you whether the grammar given to them is well-formed (according to whatever criteria the parser generator uses).
When hand-rolling a parser, there could be accidental ambiguities in the definition of your grammar, which you don't notice because the recursive descent parser just takes whatever possibility happened to be checked first in your particular implementation.
When that happens, future or alternative implementations will be harder to create because they need to be bug-for-bug compatible with whatever choice the reference implementation takes for those obscure edge cases.
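A small sketch of how this can play out, assuming a made-up ambiguous rule `expr := expr "-" expr | NUMBER`. Both hand-written parsers below accept the same inputs, but each silently settles the ambiguity its own way. The function names are hypothetical and error handling is omitted.

```python
def parse_right(tokens):
    """Reads the rule as: expr := NUMBER "-" expr | NUMBER (right-recursive)."""
    head, rest = int(tokens[0]), tokens[1:]
    if rest and rest[0] == "-":
        return ("-", head, parse_right(rest[1:]))
    return head

def parse_left(tokens):
    """Reads the rule as: expr := NUMBER ("-" NUMBER)* (loop, left-leaning tree)."""
    tree = int(tokens[0])
    i = 1
    while i + 1 < len(tokens) and tokens[i] == "-":
        tree = ("-", tree, int(tokens[i + 1]))
        i += 2
    return tree

def evaluate(tree):
    # A tree is either an int or a ("-", lhs, rhs) tuple.
    if isinstance(tree, int):
        return tree
    _, lhs, rhs = tree
    return evaluate(lhs) - evaluate(rhs)

tokens = "7 - 2 - 1".split()
print(parse_right(tokens), evaluate(parse_right(tokens)))  # ('-', 7, ('-', 2, 1)) 6
print(parse_left(tokens), evaluate(parse_left(tokens)))    # ('-', ('-', 7, 2), 1) 4
```

Neither parser reports a problem. A second implementation only finds out which reading is "correct" by comparing its results against the first one.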
> When hand-rolling a parser, there could be accidental ambiguities in the definition of your grammar, which you don't notice because the recursive descent parser just takes whatever possibility happened to be checked first in your particular implementation.
Is that a problem? Just use a grammar formalism with ordered choice.
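To make "ordered choice" concrete, here is a rough PEG-style sketch. The combinator names `lit`, `choice`, `seq` and the `matches` helper are made up for illustration. With ordered choice, the first alternative that matches wins and the rest are never tried at that position, so the order is part of the grammar itself rather than an implementation accident.

```python
def lit(s):
    # Match the literal string s at pos; return the new position or None.
    def parse(text, pos):
        return pos + len(s) if text.startswith(s, pos) else None
    return parse

def choice(*alts):
    # Ordered choice: commit to the first alternative that succeeds.
    def parse(text, pos):
        for alt in alts:
            end = alt(text, pos)
            if end is not None:
                return end
        return None
    return parse

def seq(*parts):
    # Sequence: every part must match, one after the other.
    def parse(text, pos):
        for part in parts:
            pos = part(text, pos)
            if pos is None:
                return None
        return pos
    return parse

def matches(parser, text):
    return parser(text, 0) == len(text)

short_first = seq(choice(lit("a"), lit("ab")), lit("c"))
long_first = seq(choice(lit("ab"), lit("a")), lit("c"))

print(matches(short_first, "abc"))  # False: "a" wins, then "c" fails against "b"
print(matches(long_first, "abc"))   # True
print(matches(short_first, "ac"))   # True
```

The result is always deterministic, but swapping two alternatives can change which inputs are accepted, so the order has to be written down as part of the language definition.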
My hot take is that the allure of parser-generators is mostly academic. If you're designing a language it's good practice to write out a formal grammar for it, and then it feels like it should be possible to just feed that grammar to a program and have it spit out a fully functional parser.
In practice, parser generators are always at least a little disappointing, but that nagging feeling that it _should_ work remains.
Edit: also academic in the other sense: if you have to teach students how to do parsing and also need to teach formal grammars, then killing two birds with one stone is very appealing.
It is not academic. It is very practical to actually have a grammar and thus the possibility to use any language that has a parser generator. It is very annoying to have a great format but no parser and no official grammar for it, and to be stuck with whatever tooling exists, because you would have to come up with a completely new grammar to implement a parser.
I fully agree that you need to have a grammar for your language.
> and thus the possibility to use any language that has a parser generator.
See, this is where it falls down in my experience. You can't just feed "the grammar" straight into each generator, and you need to account for the quirks of each generator anyway. So the practical, idk, "reusability"... is much lower than it seems like it should be.
If you could actually just write your grammar once and feed it to any parser generator and have it actually work then that would be cool. I just don't think it works out that way in practice.