
Sure, but pragmatically, "regular" isn't super useful unless you're designing a language yourself. If you're parsing something, not understanding "regular" is going to hurt a lot less than attempting to use PCRE to do something it can't.



Regularity comes with guarantees about space and time complexity. PCRE tosses those away.


Indeed. To support backreferences, "regex" libraries are forced to use algorithms that can be very slow in the worst case.
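To make "backreferences" concrete, here is a minimal sketch using Python's `re` module. The pattern and the helper name `is_a_b_a` are made up for illustration. The pattern `(a+)b\1` accepts exactly the strings a^n b a^n, which is a textbook non-regular language, so no finite automaton can recognize it:

```python
import re

# Hypothetical example pattern: \1 must repeat whatever group 1 captured,
# so the engine has to remember an unbounded amount of input.
A_B_A = re.compile(r"(a+)b\1")

def is_a_b_a(text: str) -> bool:
    """Return True if text is some run of a's, a 'b', then the same run of a's."""
    return A_B_A.fullmatch(text) is not None

print(is_a_b_a("aabaa"))  # True: "aa" + "b" + "aa"
print(is_a_b_a("aaba"))   # False: the two runs differ in length
```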

The sad thing is that the libraries use the same algorithms even if the expression doesn't contain backreferences. A while ago, Stack Overflow had a brief outage because of regular expression performance, although the expression that caused it didn't even use backreferences or other non-regular features:

http://stackstatus.net/post/147710624694/outage-postmortem-j...

In contrast, Google Code Search - when it still existed - supported regular expression searches over the world's public codebases. One key ingredient making this possible was to only use proper regular expressions:

https://swtch.com/~rsc/regexp/regexp4.html


Regular is actually incredibly useful because it allows you to confidently parse or validate (large amounts of) untrusted data without having to worry about DoS attacks.

I'd argue that this is useful, even if you use it as part of a parser that doesn't just parse regular languages, as you'll have a slightly easier time reasoning about the time and space complexity of the construction.
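As a rough sketch of why the complexity is easy to reason about: a regular language can be checked with a table-driven DFA that does one lookup per input character and keeps a single state variable. The language here, `[0-9]+(\.[0-9]+)?`, and all the names are assumptions picked for illustration:

```python
# Hypothetical DFA for the regular language [0-9]+(\.[0-9]+)?
START, INT, DOT, FRAC, DEAD = range(5)
ACCEPTING = {INT, FRAC}

# Missing entries mean "go to DEAD".
TRANSITIONS = {
    (START, "digit"): INT,
    (INT, "digit"): INT,
    (INT, "dot"): DOT,
    (DOT, "digit"): FRAC,
    (FRAC, "digit"): FRAC,
}

def char_class(ch: str) -> str:
    """Map a character to the input class the DFA cares about."""
    if "0" <= ch <= "9":
        return "digit"
    if ch == ".":
        return "dot"
    return "other"

def is_decimal(text: str) -> bool:
    """One table lookup per character: O(len(text)) time, O(1) extra space."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)), DEAD)
        if state == DEAD:
            return False  # no input suffix can recover from DEAD
    return state in ACCEPTING

print(is_decimal("3.14"))  # True
print(is_decimal("3."))    # False
```

However hostile the input, the loop body runs at most once per character, which is the property that makes untrusted data safe to feed in.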



