Maybe the questioner is also in full control of how the HTML is created and doesn’t need a parser that handles every possible HTML edge case.
It seems that even the conceptually very simple example given by the questioner is impossible.