
Regular expressions can be as robust as you need them to be, just like any other kind of code. They are a DSL to create lexers, and they are exactly as robust (or hacky) as if you wrote the same lexer by hand.
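To make the "regex as a lexer DSL" point concrete, here is a minimal sketch of a regex-driven lexer (token names and rules are illustrative, not from the original comment):

```python
import re

# Each alternative is one token rule; named groups recover the token type.
TOKEN_RE = re.compile(r"""
      (?P<NUMBER>\d+)
    | (?P<IDENT>[A-Za-z_]\w*)
    | (?P<OP>[+\-*/=])
    | (?P<WS>\s+)
""", re.VERBOSE)

def lex(text):
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "WS":          # drop whitespace tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

Writing the equivalent by hand means re-implementing the same alternation and longest-match logic character by character, which is exactly the "same lexer, same robustness" claim above.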


C code can be as robust as you need it to be. So why bother with formal verification, safe C coding standards, Rust, etc?

The answer is that it can be robust, but the effort required to do that is so large that in practice it usually isn't.


Are you arguing that the effort required to make a regex robust and correct is larger than the effort required to make some hand-rolled character-by-character based lexer robust and correct?

Because that sounds counter-intuitive to me. A regex is a higher level DSL for lexing.


That's exactly what I'm arguing. Especially because it's very unlikely that you'd write an XML/HTML parser yourself instead of using somebody else's well-tested library.


OK, but these are two separate questions.

Of course you should use an existing library if it solves the exact problem you have. Don't waste time re-implementing the wheel unless you are doing it for educational purposes. Whether such a library uses regexes or not under the hood would be irrelevant as long as it works and is well tested.

But I would certainly like to hear an argument why you think a regex is less robust than a similar manual character-by-character matcher.
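To make the comparison concrete, here is one token rule (a signed integer) written both ways; function names are illustrative, not from either commenter:

```python
import re

NUMBER_RE = re.compile(r"[+-]?\d+")

def match_number_regex(s, pos=0):
    """Return the end index of a signed integer at pos, or None."""
    m = NUMBER_RE.match(s, pos)
    return m.end() if m else None

def match_number_manual(s, pos=0):
    """Hand-rolled character-by-character version of the same rule."""
    i = pos
    if i < len(s) and s[i] in "+-":   # optional sign
        i += 1
    start_digits = i
    while i < len(s) and s[i].isdigit():
        i += 1
    # require at least one digit after the optional sign
    return i if i > start_digits else None
```

Even this tiny pair can drift apart: in Python 3, `str.isdigit()` accepts characters like the superscript "²" that `\d` does not match, so the two "equivalent" matchers already disagree on some inputs. That kind of subtle divergence is one way a hand-rolled matcher can end up less robust than the regex it mirrors.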





