
Regular expressions can be as robust as you need them to be, just like any other kind of code. They are a DSL to create lexers, and they are exactly as robust (or hacky) as if you wrote the same lexer by hand.
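To make the "regex as a lexer DSL" point concrete, here is a minimal sketch of a regex-driven lexer (token names and rules are illustrative, not from the original comment):

```python
import re

# Each alternative is one token rule; named groups recover the token type.
TOKEN_RE = re.compile(r"""
      (?P<NUMBER>\d+)
    | (?P<IDENT>[A-Za-z_]\w*)
    | (?P<OP>[+\-*/=])
    | (?P<WS>\s+)
""", re.VERBOSE)

def lex(text):
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "WS":          # drop whitespace tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Writing the equivalent by hand means re-implementing the same alternation and longest-match logic character by character, which is exactly the "same lexer, same robustness" claim above.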


C code can be as robust as you need it to be. So why bother with formal verification, safe C coding standards, Rust, etc?

The answer is that it can be robust, but the effort required to do that is so large that in practice it usually isn't.


Are you arguing that the effort required to make a regex robust and correct is larger than the effort required to make some hand-rolled character-by-character based lexer robust and correct?

Because that sounds counter-intuitive to me. A regex is a higher level DSL for lexing.


That's exactly what I'm arguing. Especially because it's very unlikely that you'd write an XML/HTML parser yourself instead of using somebody else's well-tested library.


OK, but these are two separate questions.

Of course you should use an existing library if it solves the exact problem you have. Don't waste time re-implementing the wheel unless you are doing it for educational purposes. Whether such a library uses regexes or not under the hood would be irrelevant as long as it works and is well tested.

But I would certainly like to hear an argument why you think a regex is less robust than a similar manual character-by-character matcher.
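To make the comparison concrete, here is one token rule (a signed integer) written both ways; function names are illustrative, not from either commenter:

```python
import re

NUMBER_RE = re.compile(r"[+-]?\d+")

def match_number_regex(s, pos=0):
    """Return the end index of a signed integer at pos, or None."""
    m = NUMBER_RE.match(s, pos)
    return m.end() if m else None

def match_number_manual(s, pos=0):
    """Hand-rolled character-by-character version of the same rule."""
    i = pos
    if i < len(s) and s[i] in "+-":   # optional sign
        i += 1
    start_digits = i
    while i < len(s) and s[i].isdigit():
        i += 1
    # require at least one digit after the optional sign
    return i if i > start_digits else None
```

Even this tiny pair can drift apart: in Python 3, `str.isdigit()` accepts characters like the superscript "²" that `\d` does not match, so the two "equivalent" matchers already disagree on some inputs. That kind of subtle divergence is one way a hand-rolled matcher can end up less robust than the regex it mirrors.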





