
> unicode all violate the identifier rules […] library to sanitize this input

Unicode standardises identifier rules. The default identifier syntax from UAX #31 (requirement UAX31-R1) can be checked with a regex:

    perl -E'say "my_identifier" =~ /\A \p{XID_Start} \p{XID_Continue}* \z/x'

I guess that's not what you mean, so an explanation with some details is needed here.


Yes, Unicode is fine. What is broken is that nobody cares. Not even Perl 5 cares about non-identifiable identifiers.

Btw, your snippet is not enough for identifier validation. Much more is needed to satisfy the identifier security guidelines.
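
To illustrate the kind of gap meant here, a minimal Python sketch (not what any particular language actually implements): a mixed-script lookalike passes a plain XID-style check. It assumes Python's `str.isidentifier()` as a stand-in for the regex above (it also accepts a leading underscore), and `scripts_used` is a hypothetical helper that guesses the script from the first word of the character name, which is only a crude approximation of the real Script property.

```python
import unicodedata


def scripts_used(identifier: str) -> set:
    """Crude script guess: the first word of each character's Unicode name.

    This is an assumption-laden shortcut; a real check would use the
    Script / Script_Extensions properties, which the stdlib does not expose.
    """
    scripts = set()
    for ch in identifier:
        if ch == "_" or ch.isdigit():
            continue  # ignore underscores and digits for this rough check
        name = unicodedata.name(ch, "")
        if name:
            scripts.add(name.split()[0])
    return scripts


latin = "paypal"
spoof = "p\u0430ypal"  # second letter is CYRILLIC SMALL LETTER A

print(spoof.isidentifier())  # True: syntactically a valid identifier
print(latin == spoof)        # False: yet a different identifier
print(scripts_used(latin))
print(scripts_used(spoof))
```

So the syntax check alone accepts both names, and only an extra step such as a mixed-script test tells them apart.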

But here we just need a simple library that strips terminal escape sequences. Unicode is a bit harder. Only Java, Rust and I have done that.
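
For the escape sequence part, a rough Python sketch of what such stripping could look like. The names `ANSI_ESCAPE`, `CONTROL_CHARS` and `strip_escapes` are made up for this example, and it only assumes the common 7-bit CSI, OSC and two-character escape forms, so treat it as a starting point rather than a complete sanitizer.

```python
import re

# 7-bit escape sequences, tried in this order.
ANSI_ESCAPE = re.compile(
    r"""
      \x1b\[ [\x30-\x3f]* [\x20-\x2f]* [\x40-\x7e]   # CSI: params, intermediates, final byte
    | \x1b\] [^\x07\x1b]* (?: \x07 | \x1b\\ )        # OSC: terminated by BEL or ESC \
    | \x1b [\x40-\x5f]                               # other two-character escapes
    """,
    re.VERBOSE,
)

# Remaining C0/C1 control characters and DEL, keeping only tab and newline.
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f-\x9f]")


def strip_escapes(text: str) -> str:
    """Remove escape sequences, then any leftover control characters.

    The second pass matters: without it, a stray ESC left behind by the
    first pass could join the following text into a new escape sequence.
    """
    return CONTROL_CHARS.sub("", ANSI_ESCAPE.sub("", text))


print(strip_escapes("\x1b[31mred\x1b[0m"))  # red
```

Note that this also drops carriage returns, which seems the safer default for untrusted output, since they can be used to overwrite what was already printed on a line.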



