
I really liked the blog post and was interested to see the end results but...

> I hope to publish another blog post describing [speculative parsing] thoroughly, along with details on the performance improvements this feature would bring.

Aren't there any preliminary results on the perf improvements this would bring?

I understand that the percentage of the time that speculative parsing succeeds (instead of having to roll back to sequential parsing) can't be known easily, since it would have to be measured across a large number of real-world web pages,

but I'd love to see just 2 simple examples:

* One full of `document.write`, so we can have an idea of the time lost in case of failed speculation

* One 'embarrassingly parallel' DOM tree where the speculation should pay off the most.

That would give us a good idea of worst case / best case results.
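For anyone who wants to try this themselves, here is a minimal sketch of a generator for two such test pages. Everything in it is an assumption on my part: the function names are made up, the "worst case" page writes an unclosed `<div>` from every script on the guess that unbalanced output is what forces a rollback, and the "best case" page uses inline scripts that never call `document.write`, so anything parsed past them should be kept. Whether a given engine actually behaves that way would have to be measured.

```python
def make_document_write_page(n_scripts: int) -> str:
    """Hypothetical worst case: every script writes an unclosed <div>."""
    parts = ["<!DOCTYPE html>", "<html><body>"]
    for i in range(n_scripts):
        # Unbalanced output is assumed to invalidate whatever was parsed ahead.
        parts.append('<script>document.write("<div>");</script>')
        parts.append(f"<p>static {i}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


def make_speculation_friendly_page(n_scripts: int) -> str:
    """Hypothetical best case: scripts do work but never touch the parser."""
    parts = ["<!DOCTYPE html>", "<html><body>"]
    for i in range(n_scripts):
        # No document.write, so speculation past this script should stand.
        parts.append(f"<script>var x{i} = {i} * 2;</script>")
        parts.append(f"<section><h2>Section {i}</h2><p>static {i}</p></section>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

Writing both strings to files and timing page load in a build with and without speculation would give a rough worst case / best case pair, though not a real-world number.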




Presto (the old Opera engine) had a mode called "delayed script execution" which would speculate even harder than the Gecko/Servo approach. As I understand it (and I might well be wrong), it would continue treebuilding after a script tag, and then, if the script ran a document.write, it would check whether the output was something that wouldn't affect subsequent tokenisation (or, presumably, alter the treebuilding state) and, if so, patch the tree in place rather than throwing it away and restarting.
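To make the patch-or-restart idea concrete, here is a toy sketch. It is not Presto's algorithm, which I can't vouch for; it only illustrates the decision described above. The tree is modelled as a flat list of strings, the names `is_self_contained` and `resolve_write` are hypothetical, and the "does this disturb the parser?" check is a deliberately conservative stand-in that only accepts plain text or strictly balanced tags.

```python
import re

# Matches one complete start or end tag; anything else counts as suspicious.
TAG = re.compile(r"<(/?)([a-zA-Z][a-zA-Z0-9]*)[^<>]*>")


def is_self_contained(fragment: str) -> bool:
    """True if the fragment is plain text or strictly balanced tags."""
    leftover = TAG.sub("", fragment)
    if "<" in leftover or ">" in leftover:
        return False  # dangling "<div", comments, stray ">" and so on
    stack = []
    for closing, name in TAG.findall(fragment):
        name = name.lower()
        if not closing:
            stack.append(name)
        elif not stack or stack.pop() != name:
            return False
    return not stack  # void elements like <br> are rejected (conservative)


def resolve_write(speculative_tree: list, script_index: int, written: str):
    """Decide what to do once the script at script_index has run."""
    before = speculative_tree[: script_index + 1]
    after = speculative_tree[script_index + 1 :]
    if is_self_contained(written):
        # Keep the nodes built ahead of time and splice the output in.
        return "patched", before + [written] + after
    # Discard everything built after the script; a reparse would start here.
    return "rollback", before + [written]
```

With a tree of `["<script>", "<p>after</p>"]`, writing `<b>hi</b>` is spliced in place, while writing `<div>` discards the speculative `<p>after</p>` node.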

It was never debugged enough to be turned on on desktop, but I think it was considered useful enough to ship on mobile in the days when mobile performance was significantly worse than it is today. In particular, on a network where request latency is high, having to pause treebuilding for scripts to download can significantly increase the time until it's possible to paint anything at all on the screen.
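A back-of-the-envelope model of that latency argument, with made-up numbers. It assumes no preload scanner, so a paused parser discovers each script only after the previous one has arrived, and it ignores script execution time and bandwidth. The function name and parameters are hypothetical.

```python
def first_paint_ms(script_latencies_ms, parse_ms, pause_for_scripts):
    """Toy estimate of when the tree is complete enough to paint."""
    if pause_for_scripts:
        # Treebuilding stops at every script until its download finishes.
        return parse_ms + sum(script_latencies_ms)
    # Treebuilding never waits, so only the parse time matters.
    return parse_ms


latencies = [300, 300, 300]  # three scripts on a high-latency network
blocking = first_paint_ms(latencies, 50, pause_for_scripts=True)   # 950
delayed = first_paint_ms(latencies, 50, pause_for_scripts=False)   # 50
```

The gap grows with every additional blocking script, which is presumably why the mode looked attractive on slow mobile networks.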


My memory of DSE is that it parsed the entire document prior to doing any script execution; it didn't merely run the script when it became available. That said, I probably touched it only marginally more recently than you.


With things like this, sometimes they do the work without preliminary perf numbers because other browsers have already done it successfully. IIRC, at least one production browser already does off-main-thread parsing.


Do you know which one? Is it speculative parsing like this one or does it stop at scripts?


Pretty sure all WebKit-based browsers (including Chrome) do this. I'd guess they block if needed, but I'm not sure.


FWIW, WebKit never turned on the threaded HTML parser (and deleted it after the fork [1]), and Blink removed it a bit ago too [2]. In Chrome, we measured it across a number of sites and found that the time saved by background tokenization wasn't benefiting real-world content enough to justify the cost, and that in some situations it actually made things worse.

That's not to say Servo shouldn't try it of course! Part of having a healthy multi-browser ecosystem is each browser trying lots of ideas and coming up with new solutions to problems that were encountered in other implementations.

[1] http://trac.webkit.org/changeset/162260/webkit (the number cited here is a joke :P)

[2] https://groups.google.com/a/chromium.org/forum/#!topic/loadi... (see also the design doc linked from the email)


I guess one of the goals of Rust is to make these sorts of optimisations viable (the compiler provides safety nets). It will be nice to see what they can do.


Gecko has had off-the-main-thread HTML parsing since Firefox 4. Docs: https://developer.mozilla.org/en-US/docs/Mozilla/Gecko/HTML_...

WebKit/Blink put fewer parts of the HTML parser off the main thread, so negative results should not be taken to apply to more comprehensive approaches.


I don't get the joke. Were these defines never used? Did the commit removing the threaded html parser happen before?


I'd wager the feature required fewer than 8.8 million lines of code :-)


Gecko/Firefox.



