Hacker News new | past | comments | ask | show | jobs | submit login

Or, you just use the DOM API to manipulate the structure. You don't implement a parser because one is already provided by the tooling - you use it to go from known-valid text to a data structure (here, DOM), and do your operations there.

You shouldn't do "escaping" and string concatenation. That's just parsing and unparsing while cutting corners, which is how you get injection bugs.

> The only situation where a parser makes sense over simple escaping routines is if you actually intended to accept a subset of the language that you're injecting into rather than plain text

And that's exactly what you're doing. With escaping, you're taking a serialized form of some data, and splice into it some other data, massaged in a way you hope will make it always parse to string when something parses this later. It's going to eventually bite you; not necessarily with XSS - web template breakage is another common occurrence.

Working in string space is tricky, dangerous, and dumb - parsing, working on the parsed representation, and unparsing at the end, is how you do it correctly and safely.

(Another way to put it: plaintext is a wire format; you don't work in it if the data is structured.)

Note that the API may look like you're doing text - see JSX - but it internally goes through a parsing stage, and makes it impossible for you to do stupid things that break or transform the program, like working in string space lets you.




If you don't want your users to produce HTML, then why would you use the DOM API to parse their text into an HTML data structure? Then you'd have code that's capable of producing <script> tags or who knows what else from untrusted user input and you now have to explicitly filter out tag types. Alternatively you can implement the middle bit of a compiler and map nodes to a new, safe data structure that you spit out at the end, but in the scenario we're discussing the user input was supposed to be unstructured text. HTML content is in most cases a malicious edge case, not expected data.

If you instead escape the user-provided unstructured text by replacing the very well-known set of special characters that could create tags, you know your users cannot produce active code, only text nodes.

It's the principle of least power: if you don't need users to access anything other than unstructured text then why feed their input into a parser that produces a data structure that represents code? Make illegal states unrepresentable by just escaping the text nodes as they're saved!


The problem isn't with what the user can do, but with what your code can. If you bork your escaping, which is context-dependent, then user data can turn into arbitrary HTML, complete with script tags. If you keep an abstract tree representation, and add the user-provided data by passing it vetbatim to "set text content" method on a node, then there's no possible way the user input can break it. That is exactly what it means to make illegal states unrepresentable!

Working on the data structures after parsing makes it impossible to accidentally break the structure itself. Like, maybe your string escaping is perfect, but if you do:

  $content = $templatePrefix + $sanitizedString + $templateSuffix;
Then you're still vulnerable to trivial errors in your template breaking the structure and creating an exploitable vulnerability, despite the $sanitizedString being correct. If you instead work at parsed level and do:

  $result = $template.findNode("#foo").setText($unsanitizedInput)
Then there's just no way this can break (except bugs in the HTML parser and DOM API in general, which are much less likely to exist, and much easier to find and fix).


I think we've been talking past each other.

  $result = $template.findNode("#foo").setText($unsanitizedInput)
This is not parsing the user input, this is letting the native API escape the input for you, which is exactly what I'm advocating for. See my note above:

> escape the HTML using the language's standard library or the web framework's tooling.

This is what parsing the user input would look like with the DOM API:

    const newDiv = document.createElement('div');
    newDiv.innerHTML = untrustedUserInput;
    // Do some work to attempt to sanitize the new HTML elements
    document.body.appendChild(newDiv);
To me this is definitively a Bad Idea™, and I thought this was what you were advocating for.

What you actually proposed is just escaping the HTML, not parsing user input, with the only twist being that you prefer to inject user input into your templating system imperatively with something resembling the DOM API instead of declaratively with something resembling JSX. That's fine, but not relevant to the question of what method we use to sanitize the untrusted input that we're injecting. On that front it sounds like we're in agreement that parsing user input is a terrible idea.




Join us for AI Startup School this June 16-17 in San Francisco!

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: