Regular expressions can be as robust as you need them to be, just like any other kind of code. They are a DSL to create lexers, and they are exactly as robust (or hacky) as if you wrote the same lexer by hand.
Are you arguing that the effort required to make a regex robust and correct is larger than the effort required to make some hand-rolled character-by-character lexer robust and correct?
Because that sounds counter-intuitive to me. A regex is a higher level DSL for lexing.
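To make the "regex as a lexer DSL" point concrete, here is a minimal sketch of a regex-driven tokenizer. The token names and toy grammar (numbers, identifiers, single-character operators) are made up for illustration, not taken from the discussion:

```python
import re

# Each named group is one token type; the regex *is* the lexer spec.
TOKEN_RE = re.compile(r"""
      (?P<NUMBER>\d+)
    | (?P<IDENT>[A-Za-z_]\w*)
    | (?P<OP>[+\-*/=])
    | (?P<WS>\s+)
""", re.VERBOSE)

def lex(text):
    tokens = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at {pos}: {text[pos]!r}")
        if m.lastgroup != "WS":  # skip whitespace tokens
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens
```

For example, `lex("x = 42 + y")` yields `[("IDENT", "x"), ("OP", "="), ("NUMBER", "42"), ("OP", "+"), ("IDENT", "y")]`. The tokenization rules live in one declarative pattern rather than being spread across control flow.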
That's exactly what I'm arguing. Especially because it's very unlikely that you'd write an XML/HTML parser yourself instead of using somebody else's well-tested library.
Of course you should use an existing library if it solves the exact problem you have. Don't waste time re-implementing the wheel unless you are doing it for educational purposes. Whether such a library uses regexes or not under the hood is irrelevant as long as it works and is well tested.
But I would certainly like to hear an argument why you think a regex is less robust than a similar manual character-by-character matcher.
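For comparison, here is what the "similar manual character-by-character matcher" looks like for the same toy grammar (numbers, identifiers, single-character operators — all illustrative). It enforces the same rules as a regex-based tokenizer would, just with more index bookkeeping to get right by hand:

```python
# Hand-rolled character-by-character lexer for a toy grammar.
def lex(text):
    tokens = []
    i = 0
    while i < len(text):
        c = text[i]
        if c.isspace():               # skip whitespace
            i += 1
        elif c.isdigit():             # NUMBER: one or more digits
            j = i
            while j < len(text) and text[j].isdigit():
                j += 1
            tokens.append(("NUMBER", text[i:j]))
            i = j
        elif c.isalpha() or c == "_": # IDENT: letter/underscore then word chars
            j = i
            while j < len(text) and (text[j].isalnum() or text[j] == "_"):
                j += 1
            tokens.append(("IDENT", text[i:j]))
            i = j
        elif c in "+-*/=":            # OP: single-character operators
            tokens.append(("OP", c))
            i += 1
        else:
            raise ValueError(f"unexpected character at {i}: {c!r}")
    return tokens
```

Both versions produce identical token streams for valid input; the question in the thread is which one is easier to make (and keep) correct as the grammar grows.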