Hacker News new | past | comments | ask | show | jobs | submit login

What would be a use for finding a minimal discriminating regex? Perhaps understanding the difference between boys' and girls' names?



Performance. For example if you are syntax highlighting a programming language with hundreds of keywords, then using a regexp like "(kwd1|kwd2|...|kwdn)" is not very efficient. An optimized regexp can do the same matching much faster.


You would be much better off just generating the DFA of the straightforward regex and minimizing that DFA. This is simpler and faster to generate and faster to execute. Furthermore, in a parser for a programming language you do not want to match some things in one list but not in this other list, you want to match some things in this list and nothing else.


I know this isn't what you're asking, but I imagine in this case it's because that's one of the challenges of regex golf. Matching regex, as short as possible.


It's a subset of the set cover problem, which is a well-known NP-hard problem.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: