
"Part of the motivation here is to avoid writing two parsers."

Have you considered implementing a parser generator instead, or taking the higher-order route and constructing the parser from higher-order primitives? Either approach gives you a lot of flexibility to produce different trees for different problems, and saves a bit of time and a lot of code. Languages are mostly sequences, alternations, repetitions and characters anyway.
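To make the higher-order route concrete, here is a minimal sketch in Python. All the names (`char`, `seq`, `alt`, `many`, `fmap`) are made up for illustration and are not from any particular library. It shows one grammar built from sequence, alternation, repetition and character primitives, then reused to produce two different results.

```python
from typing import Any, Callable, Optional, Tuple

# A parser takes (text, position) and returns (value, new_position),
# or None if it does not match at that position.
Parser = Callable[[str, int], Optional[Tuple[Any, int]]]

def char(pred: Callable[[str], bool]) -> Parser:
    """Match a single character satisfying pred."""
    def parse(s, i):
        if i < len(s) and pred(s[i]):
            return s[i], i + 1
        return None
    return parse

def seq(*parsers: Parser) -> Parser:
    """Match each parser in order; the value is the list of their values."""
    def parse(s, i):
        out = []
        for p in parsers:
            r = p(s, i)
            if r is None:
                return None
            v, i = r
            out.append(v)
        return out, i
    return parse

def alt(*parsers: Parser) -> Parser:
    """Ordered choice: return the first alternative that matches."""
    def parse(s, i):
        for p in parsers:
            r = p(s, i)
            if r is not None:
                return r
        return None
    return parse

def many(p: Parser) -> Parser:
    """Match p zero or more times; the value is the list of matches."""
    def parse(s, i):
        out = []
        while True:
            r = p(s, i)
            if r is None or r[1] == i:  # stop on failure or on an empty match
                return out, i
            out.append(r[0])
            i = r[1]
    return parse

def fmap(p: Parser, f: Callable[[Any], Any]) -> Parser:
    """Transform the value of a successful parse with f."""
    def parse(s, i):
        r = p(s, i)
        return None if r is None else (f(r[0]), r[1])
    return parse

# Toy grammar:  expr := number (('+' | '-') number)*
digit = char(str.isdigit)
number = fmap(seq(digit, many(digit)), lambda v: int(v[0] + "".join(v[1])))
op = alt(char(lambda c: c == "+"), char(lambda c: c == "-"))
expr = seq(number, many(seq(op, number)))

# Same grammar, two different outputs: a flat tree, or the evaluated value.
expr_tree = fmap(expr, lambda v: ("expr", v[0], [tuple(pair) for pair in v[1]]))

def _evaluate(v):
    total = v[0]
    for sign, n in v[1]:
        total = total + n if sign == "+" else total - n
    return total

expr_value = fmap(expr, _evaluate)
```

Calling `expr_tree("1+22-3", 0)` gives a tree plus the end position, and `expr_value("1+22-3", 0)` gives the number, without touching the grammar itself.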



Yes, I'm thinking of writing my own parser generator / meta-language for the Oil language. Now that I have experience with OSH/bash, I can see what I need to support.

On the other hand, I know how to write the parser by hand for Oil. Now that I've written the parser for OSH, it's not too much work, and not too much code. The thing that might push me over the edge is handling "broken" code the way Microsoft's Roslyn does, but I may put that off for v2 and just get the shell working.

The first half of my blog is kind of about how nothing off the shelf like ANTLR or yacc will work. I will have to write my own.

There are posts there about parsing in a single pass, undecidable parsing, four mutually recursive sublanguages, lexer modes/lexical state, using two tokens of lookahead vs. arbitrary lookahead, and Pratt parsing for expressions.

http://www.oilshell.org/blog/
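For anyone unfamiliar with lexer modes, here is a rough sketch of the idea in Python. It is not the OSH lexer: the two modes (`OUTER`, `DQ`) and the token names are hypothetical, and letting the rule table switch modes by itself is a simplification. The point is only that the same characters produce different tokens depending on the current mode.

```python
import re

# Each mode has its own ordered rule list: (pattern, token kind, next mode).
# A next mode of None means "stay in the current mode".
RULES = {
    "OUTER": [  # ordinary command words
        (re.compile(r'"'), "DQ_OPEN", "DQ"),
        (re.compile(r"\s+"), "SPACE", None),
        (re.compile(r'[^\s"]+'), "WORD", None),
    ],
    "DQ": [  # inside a double-quoted string
        (re.compile(r'"'), "DQ_CLOSE", "OUTER"),
        (re.compile(r"\$[A-Za-z_]\w*"), "VAR", None),
        (re.compile(r'[^"$]+'), "LITERAL", None),
        (re.compile(r"\$"), "LITERAL", None),  # a lone '$' is just text
    ],
}

def tokenize(text):
    """Return a list of (kind, text) tokens, switching modes as rules dictate."""
    mode, pos, tokens = "OUTER", 0, []
    while pos < len(text):
        for pattern, kind, next_mode in RULES[mode]:
            m = pattern.match(text, pos)
            if m:
                tokens.append((kind, m.group()))
                pos = m.end()
                if next_mode is not None:
                    mode = next_mode
                break
        else:
            raise SyntaxError(f"unexpected {text[pos]!r} at {pos} in mode {mode}")
    return tokens
```

With input `echo "hi $name" $x`, the space inside the quotes becomes part of a `LITERAL` and `$name` becomes a `VAR`, while outside the quotes the space is a `SPACE` token and `$x` is lumped into a `WORD`, because this toy `OUTER` mode has no variable rule.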

Oil won't be as complicated as OSH/bash, but it at least needs lexer modes and Pratt parsing.
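As an illustration of the Pratt parsing part, here is a minimal sketch in Python for arithmetic expressions. It is not taken from OSH or Oil; the operator table, the binding-power numbers and the tuple-based tree are all assumptions chosen for the example.

```python
import re

TOKEN = re.compile(r"\s*(\d+|[-+*/^()])")

def tokenize(src):
    """Split src into number and operator tokens, skipping whitespace."""
    src = src.rstrip()
    pos, out = 0, []
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if not m:
            raise SyntaxError(f"bad character {src[pos]!r} at {pos}")
        out.append(m.group(1))
        pos = m.end()
    return out

# (left binding power, right binding power) for each binary operator.
# right > left makes an operator left-associative; right < left, right-associative.
BINARY = {"+": (10, 11), "-": (10, 11), "*": (20, 21), "/": (20, 21), "^": (31, 30)}
PREFIX_BP = 25  # unary minus: tighter than '*', looser than '^'

class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def parse(self, min_bp=0):
        """Parse an expression whose operators all bind at least as tightly as min_bp."""
        tok = self.next()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        if tok.isdigit():
            left = int(tok)
        elif tok == "-":
            left = ("neg", self.parse(PREFIX_BP))
        elif tok == "(":
            left = self.parse(0)
            if self.next() != ")":
                raise SyntaxError("expected ')'")
        else:
            raise SyntaxError(f"unexpected token {tok!r}")

        while True:
            op = self.peek()
            if op not in BINARY:
                break
            left_bp, right_bp = BINARY[op]
            if left_bp < min_bp:
                break  # the operator belongs to an enclosing call
            self.next()
            left = (op, left, self.parse(right_bp))
        return left

def parse_expr(src):
    parser = Parser(tokenize(src))
    tree = parser.parse()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Precedence and associativity live entirely in the `BINARY` table, so adding an operator is one line rather than a new grammar rule, which is the main appeal for expression sublanguages.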

I actually thought about trying a bounty / crowdsourcing for a meta-language that can express Oil in the fewest lines of code (with no dependencies). My claim is that no existing meta-language can handle it.



