> some people do consider using the same code point for apostrophe and citations problematic (it's definitely annoying when doing word segmentation)
And those people are a menace. "Ball bearings" is a single word with a space in the middle. The only way you're going to get reliable word segmentation is with a natural language parser and a lexicon with an entry for "ball bearing". At that point, you're already recognizing "aren't" as a word; the punctuation isn't really relevant.
On the other hand, if you're not particularly upset about messing up space-including words, there's no real reason to be upset about apostrophe-including ones either.
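To make the lexicon point concrete, here's a rough sketch of greedy longest-match segmentation. The lexicon contents and the `segment` function are made up for illustration, and this is nowhere near a real natural language parser: it can't tell the noun "ball bearing" from "the ball bearing down on him". It just shows that once you look words up in a lexicon, the space and the apostrophe get handled by the same mechanism.

```python
import re

# Hypothetical toy lexicon; a real one would be far larger.
LEXICON = {"ball bearing", "ball bearings", "ice cream", "aren't"}
MAX_WORDS = max(len(entry.split()) for entry in LEXICON)


def segment(text):
    """Greedy longest-match word segmentation against LEXICON."""
    # Naive first pass: runs of letters, keeping internal apostrophes.
    tokens = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    words = []
    i = 0
    while i < len(tokens):
        # Try the longest multi-token lexicon entry first.
        for n in range(min(MAX_WORDS, len(tokens) - i), 1, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate.lower() in LEXICON:
                words.append(candidate)
                i += n
                break
        else:
            # No multi-token entry starts here; keep the single token.
            words.append(tokens[i])
            i += 1
    return words


print(segment("The ball bearings aren't here"))
```

Here "ball bearings" comes out as one word because the lexicon says so, and "aren't" comes out as one word without the segmenter caring which code point the apostrophe is.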