Speaking as one of the original three authors of Google Docs (Writely), but zero involvement in this project (I left Google in 2010): I'm seeing a lot of comments asking how JavaScript-on-Canvas could possibly outperform the highly optimized native code built into the browser engines. It's been a long time since I've really been involved in browser coding, but having written both Writely and, farther back, several native-app word processing engines, here are some thoughts.
Word processors have extremely specific requirements for layout, rendering, and incremental updates. I'll name just two examples. First, to highlight a text selection in mixed left-to-right / right-to-left text, it's necessary to obtain extremely specific information regarding text layout; information that the DOM may not be set up to provide. Second, to smoothly update as the user is typing text, it's often desirable to "cheat" the reflow process and focus on updating just the line of text containing the insertion point. (Obviously browser engines support text selections, but they probably don't expose the underlying primitives the way a word processor would need. Similarly, they support incremental layout + rendering, but probably not specifically optimized in the precise way a word processor would need.)
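To make the second example concrete, the "cheat" is roughly this (a hypothetical sketch; Docs' actual internals will look nothing like these names):

```js
// Hypothetical line-scoped incremental relayout: on a keystroke, re-measure
// only the line containing the caret, and fall back to a wider reflow only if
// that line's height changed (and therefore everything below it has to move).
function handleKeystroke(doc, caretLine, ch) {
  const line = doc.lines[caretLine];
  line.insert(doc.caretOffset, ch);

  const oldHeight = line.height;
  layoutLine(line, doc.style);         // re-wrap / re-measure just this line

  if (line.height === oldHeight) {
    repaintLine(line);                 // cheap: redraw one line of text
  } else {
    reflowFrom(doc, caretLine);        // rare: heights shifted, relayout below
  }
}
```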
Modern browser engines are amazing feats of engineering, but the feature set they provide, while enormous, is unlikely to exactly match the exacting requirements of a WYSIWYG word processor. As soon as your requirements differ even slightly from the feature set provided, you start tipping over into complex workarounds which impact performance and are hell on developer productivity and application stability / compatibility.
This is loosely analogous to CISC vs. RISC: browsers are amazing "CISCy" engines but if your use case doesn't precisely fit the expectations of the instruction set designer then you're better off with something lower-level, like Canvas and WASM. (I don't know whether Docs uses WASM but it would seem like a good fit for this Canvas project.)
Frameworks in general suffer from this problem. If you've ever had to fight with an app framework, or orchestration framework, or whatever sort of framework to accomplish something 5% outside of what the framework is set up to support, then you understand the concept.
Also, as noted in many comments here, browser engines have to solve a much more general problem than Docs, and thus have extra overhead.
I'd like to chime in here as someone who has worked on optimizing the execution of your code :) Google Docs specifically was one of the subjects of a particular performance push when I was working on SpiderMonkey within Firefox, and I got to see how it behaves under the hood pretty well.
The thing that stood out to me the most was the giant sparse array (a regular js-native array) being used to store layout information, presumably. It really messed with our internals because SpiderMonkey didn't expect those to be used in fast paths, and it was really lazy about trying to optimize for them.
Anecdotes aside.. I wanted to endorse your entire comment :) I remember thinking to myself how terrible it was to have to piggyback a document layout engine on top of HTML layout and these awful JS abstractions, and how much better and more performant it would be to do a proper layout engine - either in JS or compile-to-wasm, and have it run its own rendering logic.
In particular for large documents where you were making changes to early parts of the document, a single keystroke could invoke this _cascade_ of sparse array fetches and mutations and DOM rearrangements and all sorts of fireworks.
However, I can't claim credit (or blame, but I would argue mostly credit) for that code. There have been three generations of the Docs editor that I know of:
1. The original, which I was involved in, was an unholy mess perched shakily atop contenteditable. As such, it contained no layout or rendering code (but did all sorts of horrid things under the hood to massage the HTML created by the various browser contenteditable engines and thus work around various problems, notably compatibility issues when users on different browsers are editing the same document). Originally launched in 2005.
2. In the early 2010s, an offshoot of the Google Sheets team launched a complete rewrite of the Docs engine which did its own editing, layout, and rendering using low-level DOM manipulation. This was more robust, supported layout features not available in contenteditable (e.g. pagination), and generally was a much better platform. My primary contribution to this effort was to incorrectly suggest that it was unlikely to pan out. (I was worried that the primitives available via the DOM would be insufficient; for instance, to deal with mixed-directional text.)
3. This canvas-based engine, which I learned about a few hours ago when this post popped up on HN.
I don't know whether #3 is an evolution of #2 or a complete rewrite; for all I know there was another generation in between. But I imagine you were looking at some iteration of #2.
You're right. This was a few years ago, so well after 2010.
And yes, I'd say credit as well for the layout code, not blame. I wasn't knocking the code - for that era sparse arrays + DOM stuff were pretty common approaches and there didn't exist better web tooling than that.
It's only been the last few years, I'd say, where the optimization quality (on the engine side) and API support has been good enough to justify this sort of approach of just plumbing your own graphics pipeline on top of the web.
That was a spidermonkey issue. I treat that experience more as a lesson in how obscure corner cases left as perf cliffs never stay obscure corner cases, and always get exercised, and you can't afford to ignore them for too long.
With a canvas-based engine, the editor is no longer relying on the contenteditable spec right?
For the majority of use cases, do you think contenteditable + view layer which precisely updates the HTML is still viable?
More specifically, what do you think about open-source libraries like ProseMirror (https://prosemirror.net/) or Slate.js (https://github.com/ianstormtaylor/slate) which do that (ProseMirror uses its own view library on vanilla javascript, Slate uses React)?
I understand that if you have really long documents or spreadsheets (I imagine the latter is more frequent), you could maybe solve rendering performance problems with virtualization, which canvas gives you more flexibility for?
> With a canvas-based engine, the editor is no longer relying on the contenteditable spec right?
Correct. In fact, contenteditable went out the window a decade ago when the "#2" engine (low-level DOM manipulation) was launched.
My experience with contenteditable is ~12 years stale at this point, so the only thing I'll try to say is that I expect it would work well up to a certain level of ambition, and no further. As I say above regarding frameworks: they're great so long as your requirements fit within the expectations of the framework, but you quickly hit a wall if you need to stray outside of that. For Docs, the desire for a paginated view/edit mode was an example; there was simply no sane way of squeezing pagination into a contenteditable-based engine.
My experience with modern contenteditable suggests that it does work pretty well, overall, though I've not been using it for something as layout-heavy as Docs -- I've worked on the VisualEditor for mediawiki, which has different requirements.
A canvas-based document editor with any sort of international ambitions has a fairly high bar to clear for reimplementing basic features. The browsers really do handle a lot of useful things for you in contenteditable, like the upthread-mentioned RTL issues, and complex IME input methods.
If you have a lot of HTML-rendering inherently required, strong internationalization requirements, and no need for something like page-based layout... contenteditable has advantages, particularly when comparing the up-front work required.
The sparse array is likely to be a protobuf. I ran into this issue with Firefox when working on Google Inbox; it was one of the reasons the Firefox version was delayed, because there was degenerate performance with sparse arrays. (I'll note, various conspiracy theories on HN thought it was a deliberate attempt to hamper FF, when in reality it was an unintended consequence of using an old protobuf format which never caused a problem until protobufs with huge extension fields were used in a specific way in the codebase, so the problem was discovered late.)
Protobufs can be stored in array format. In that format, each field number is basically its index in the array. Extension fields in protobufs typically grab high-numbered slots. So if you have a proto with one field (id = 1) and one extension field (e.g. id = 10000000), you now have an array that looks like [undefined, stuff, ... 999999 ..., stuff], and various array operators seem to reify this into a real array in older versions.
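Roughly, the layout looks like this (field ids and values made up for illustration):

```js
// Rough illustration of the "array format": the field number is used directly
// as the array index, so one high-numbered extension field leaves a huge hole.
const msg = [];
msg[1] = "stuff";             // regular field, id = 1
msg[10000000] = "extension";  // extension field with a very high id

console.log(msg.length);      // 10000001, but only two slots actually hold data
// Engines normally keep this as a sparse (dictionary-mode) array under the
// hood; trouble starts when something forces it to be treated as ~10 million
// dense slots.
```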
> various conspiracy theories on HN thought it was a deliberate attempt to hamper FF
I remember those being fairly rampant.
I wonder if a technical blog post about the issue would have silenced some of the conspiracy theories.
Regardless, there's a lesson in there somewhere. Never attribute to malice that which is adequately explained by degenerate performance of a browser pushed to its limits?
Yeah, it's primarily the fault of overzealous pushing of limits. At the time we were using WebWorkers/SharedWorkers, bleeding-edge CSS "compositor" behavior to achieve smooth 30-60fps animations, and lots of other stuff I don't remember. It was very easy to get off the 'golden path' of performance. Small changes in CSS or DOM structure, for example, would destroy layout/paint performance and require days of debugging.
Add on top of that, that Inbox was developed using a shared codebase for 3 platforms (Web, Android, iOS), the non-UI code was written in Java, while the UI code was written in JS, Java, and Objective-C respectively.
The shared "business logic" layer was cross compiled, and it was the protobuf runtime for GWT inside that was causing trouble. We "optimized" it by making it run on top of pure arrays instead of OO objects. This was a feature of GWT called 'JSO's (JavaScriptObjects) that let you pretend that raw JS objects had methods on them that they didn't, like Extension Methods in other languages.
All was good until, IIRC, a utility function was introduced that did Object.keys(some protobuf array). This returns a sparse array on V8, but a reified real dense array on SpiderMonkey at that time, and so if you were unlucky enough to have a high extension field in your protobuf, you'd end up creating an array with a billion entries in it.
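Something like this hypothetical helper is all it takes to hit the cliff:

```js
// Hypothetical utility of the kind described: iterate over the populated
// fields of a proto stored in array format.
function forEachField(protoArray, fn) {
  // Per the language spec this only yields the indices that actually exist
  // ("1", "10000000"), but, as described above, SpiderMonkey at the time
  // ended up reifying the sparse array to service the call -- millions of
  // slots materialized just to enumerate two keys.
  for (const key of Object.keys(protoArray)) {
    fn(Number(key), protoArray[key]);
  }
}
```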
It was hard to foresee this because Inbox was built out of so many interacting systems. Ideally, the GWT Protobuf Compiler runtime would have had integration tests for Firefox that exercised iteration over sparse arrays with high extension field numbers, but it didn't, which means the problem languished until it was discovered in Inbox. GWT Protobuf was probably a 20% project of someone at the time, implementing the minimal features they needed.
Also, debugging it was a nightmare, because as soon as Object.keys(big sparse array) was encountered, the Firefox debugger would essentially freeze/die, and we couldn't get information out. Single-stepping through a ginormous bit of code after bisecting was how I tracked it down, because when I tried to console.log(Object.keys(big sparse array)) it would die.
I'm not blaming Firefox; I'm not sure the JS specification even says what the right thing to do is with things like Object.keys(sparse array), maybe it was unspecified/vague behavior? I'm just pointing out that there was absolutely no malice, and no desire to block Inbox from running on FF, or IE10 or WebKit for that matter. It's always basically a matter of launch schedules, late-discovered bugs, and triage.
SpiderMonkey's dictionary object representation leaves a lot of room for improvement. The issue you cite here isn't specifically related (it sounds like it could have been fixed with a one- or two-line change), but I can describe one of my (still standing) pet peeves about the implementation of objects in SpiderMonkey:
Dictionary objects are what we call objects that have fallen off the happy path of tracked property names and become degenerate associative maps from keys to values. They use a representation where the key-mapping for the object's bound names is kept in a linked entry hashtable (a hashtable where the entries form a doubly linked list) that hangs off of the hidden type of the object. Every lookup for a property (including array indexes) involves first pulling this hashtable out, then looking up the property on the hashtable to obtain a shape, which gives the _offset of the property on the original object_, and then using that offset to look up the value on the original object.
All said and done, there were about half a dozen to a dozen distinct heap accesses, and pollution of about 6-7 cache lines, just to retrieve a single property on an object that had gone into dictionary mode (which is what sparse arrays would become).
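In pseudo-JS, the chain of dependent loads looks roughly like this (a conceptual model only, not actual SpiderMonkey code):

```js
// Conceptual model of a dictionary-mode property read, as described above.
// Each step is another dependent heap load, touching a separate cache line.
function dictionaryModeGet(obj, key) {
  const hiddenType = obj.group;            // 1. object -> hidden type
  const table = hiddenType.propertyTable;  // 2. hidden type -> linked hashtable
  const entry = table.lookup(key);         // 3. hash the key, walk the entries
  const shape = entry.shape;               // 4. entry -> shape
  const offset = shape.slotOffset;         // 5. shape -> slot offset
  return obj.slots[offset];                // 6. finally read the value back off
                                           //    the original object
}
```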
Fixing the object representation was on my long-term todo-list for a while. It is a very time-consuming task because all the JITs and other optimization layers were aware of it, so any changes to it would involve adjusting a ton of other code.
> I'm not blaming Firefox; I'm not sure the JS specification even says what the right thing to do is with things like Object.keys(sparse array), maybe it was unspecified/vague behavior? I'm just pointing out that there was absolutely no malice, and no desire to block Inbox from running on FF, or IE10 or WebKit for that matter. It's always basically a matter of launch schedules, late-discovered bugs, and triage.
One thing you learn working on any sort of a public facing project a lot of people use is that people, especially the most emotionally invested people, will assign motivations to you personally that have no external reference points except their interpretation of events.
I've encountered that working at Mozilla, but thankfully largely been sheltered from direct consequences. You've arguably worked on even more public projects.
There's no need to pollute your commentary with defences that aren't owed.
So I've experienced Docs getting, hmm, sad once the doc you're editing gets beyond something like 30-50 pages.
Does this change mean that I can look forward to being able to write hundreds or thousands of pages in a Google Doc without it getting periodically non-performant?
I sat by Steve when Writely joined Google. [Hi Steve!]
I sat by the Google Page Creator team when they were a thing. Regardless of performance, it's a miracle that a WYSIWYG editor can be written on top of the DOM at all, let alone a performant one. They had to work multiple miracles a day just to get bulleted lists to work somewhat reliably.
I have no doubt whatsoever that a Canvas-based editor can be faster and easier to maintain. I don't know how well it'll handle accessibility issues, though. I expect they'll have to do a lot of tedious work to get screen readers and the like to be happy.
I have nothing to add to the discussion, other than I’ve been using Writely since ~2005-2006 and wanted to say thanks for all the fish!
It was super handy before I had a laptop for regular use. I used it at public libraries for projects in my last year of high school. It helped me develop a habit of having a third-space workplace that was away from home and school.
The "floating workspace" aspect has always driven at least as much usage as the "collaboration" aspect. That came as a complete surprise to us, but it turned out to be very important to adoption. At some point I think we determined that the average document had something like 1.1 collaborators.
The name Writely reminds me of a side project I worked on around 2010. I was not satisfied with the performance of Google Docs and its competitors at the time like EditGrid and thought (naively, as it turned out) I'd be able to develop a faster alternative.
I had no idea what I was doing and thinking that using JavaScript to manipulate the DOM was going to be slow, I chose ActionScript and Flash as the language and runtime to develop the project in. I wrote a client-side expression parser and formula engine, and managed to develop a functioning spreadsheet UI with resizable rows and columns, copy and paste with Excel-like animations, cell references etc.
The problem that I ran into was text-rendering when there was a lot of text on the screen. The application would consume a lot of memory and the page would slow down to a crawl when scrolling. I couldn't really find a way to speed up the performance and stopped working on the application after some time. That's when I realized the incredible amount of work that went into Google Docs and other web-based spreadsheets. :)
You are correct. I meant Google Sheets instead of Google Docs. I thought I'd tackle the word processor part once I got the spreadsheet to a usable state. I do think Google Docs suite is used as an umbrella term to refer to all the Google collaboration tools like Docs, Sheets, Forms etc.
There is no substitute for building one yourself, but The Craft of Text Editing book has a lot of accumulated wisdom. It is Emacs-centric, but the basics are the same.
One of my first programming projects as a teenager back in terminal-type days was to write my own text editor for the Atari ST. I was super happy with it, and sold three (3) copies of it! That made me very happy at the time.
Of course there was that time that I messed with the save/load code and destroyed the text files of one of my customers. Not so happy with that! Saved it by writing a fix system, and that actually led to being hired at that guy's company for my first "real" job. ;)
Ahahaha! You said company. ;) It was just me, a teenage kid, selling to people who came through the store I worked at selling computers (the store sold the Atari line).
I called it DEdit, because every programmer wants to grab a single letter title.
I'd love to have a good answer for you, but I learned the basics all the way back in the '80s. Seems like I've seen references posted occasionally on HN, hopefully someone has a good link.
I don't think there's any way better than building a text editor from scratch; you have to understand the exact problem before reading other people's solutions.
A good open-source example of this type of problem is CodeMirror (a code-editing widget for the web). To achieve syntax highlighting and everything else, it basically fakes every single aspect of text editing in a browser context - even the cursor and text highlighting - replacing the native versions with pixel-placed DOM elements driven by JS. It receives raw events from an invisible textarea and does everything else itself.
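The core trick is roughly this (a minimal sketch; applyEdit and renderVisibleLines stand in for the editor's own document model and renderer):

```js
// Minimal sketch of the hidden-textarea pattern: keep an invisible <textarea>
// focused so it receives raw keyboard/IME events, then render everything
// (text, cursor, selection) yourself with absolutely positioned elements.
const input = document.createElement("textarea");
input.style.cssText = "position:absolute; opacity:0; width:1px; height:1px;";
document.body.appendChild(input);
input.focus();

input.addEventListener("input", () => {
  applyEdit(input.value);   // feed the raw characters into your own document model
  input.value = "";         // the textarea itself never displays anything
  renderVisibleLines();     // reposition the fake cursor/selection/line elements
});
```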
This is just about the worst possible use-case for the DOM: you get almost none of the benefits, and still get most of the costs.
Edit: To be clear, I'm not saying this to rag on CodeMirror. It's a marvel that they got it working as well as it does, and it's been around for a long time- possibly longer than the canvas API has been widely available. It's just that doing things this way requires a pile of hacks and I can see a strong argument for just cutting out the middle-man and doing 100% custom rendering.
Distribution. It is so, so, so much harder to get most users to install an app, especially for casual purposes ("please add your notes to this doc") which is often how people first come to Docs.
That's a common belief but there are a lot of exceptions that mean we should question how true this really is.
1. Mobile-first, mobile-only apps.
2. Minecraft or really any game.
3. The proliferation of Electron apps that are basically downloadable versions of the website.
4. Apple's own suite of apps. Keynote is pretty darn popular.
In the case of a user who is really, really unmotivated to comment on a doc, sure. Then every click, every second counts because the user doesn't really have a fixed need to complete the task to begin with. For most other things, users are willing to download apps and may even prefer it.
It's also worth considering that Writely/Docs never really supplanted Word and is still rather feature poor even after a decade of continuous development, perhaps because they keep having to rewrite the rendering engine. If Docs was a downloadable app with a simple web-side static renderer + commenting engine, it might have obtained features that could offset any loss of casual users due to needing a download to collaborate. Especially if the download was fast, tight and transparent.
I think the major advantages for me are easy bookmark-ability and sharing via URLs with people who may not have the app. You can bodge workarounds for those things in an offline app, but now the bookmarks aren't in my browser or I have to copy them to it, or the people I share the doc with are just looking at a browser-rendered viewer or something.
While Docs hits 95% of my needs there's still that 5% and I suspect most of those are held back by the current implementation architecture. Hopefully moving to a Canvas based system will enable them to more easily add complex features.
On the other hand, allowing for a native desktop app could have caused it to end up in the same state as Microsoft's apps on the web (e.g. the web version of PowerPoint is pretty awful), which would have undermined a key differentiator between Docs and Microsoft's suite.
I am not sure what has held gSuite back all these years, but the pandemic seems to have brought them out of their slumber.
What I don't understand is why google doesn't just do what they do all the time and add 20 new APIs for it and strongarm everyone else into having to implement them too
Kids these days usually don't understand exactly how fast hardware actually is. ;)
"I have 200 million entries in a table I need to compress. I'm gonna write a flume job! I estimate it will take 5 minutes to start the job and a half hour to run! Then I'll spend a few days figuring out how to shard it so it actually finishes."
"Sounds good. But I also have this bit of Java code here that does the same compression on my desktop in about 30 seconds. Would you like that instead? You could convert it to C++ if that would make you feel better."
yellowbrick.com did some neat stuff pushing the query algebra down into the flash storage firmware.
Does the challenge of word processors extend to these Chromium-based IDEs like VS Code? I wonder if that can also be optimized by going to a non-DOM-based approach?
I wrote the terminal canvas renderers in VS Code that have been called out a few times here. Initially I implemented a canvas renderer using just a 2d context to draw many textures, which sped things up "5 to 45 times"[1] over the older DOM renderer.
Since then I moved on to a WebGL renderer[2], which was mostly a personal project. It's basically the first canvas renderer but better in every way, since it works by organizing a typed array (very fast) and sending it to the GPU in one go, as opposed to piecemeal and having the browser do its best to optimize/reduce calls. This was measured to improve performance by up to 900% in some cases over the canvas renderer, but actually much more than that if, for example, the browser has GPU rendering disabled and tried to use the canvas renderer on the CPU.
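The gist of the batching, heavily simplified (not the actual xterm.js code; a real renderer draws textured quads from a glyph atlas rather than points):

```js
// Heavily simplified: pack per-cell attributes for the whole viewport into one
// Float32Array, upload it with a single bufferData call, and issue one draw
// call, instead of many small per-glyph canvas calls for the browser to chew on.
const FLOATS_PER_VERTEX = 8;

function renderFrame(gl, cells) {
  const data = new Float32Array(cells.length * FLOATS_PER_VERTEX);
  let i = 0;
  for (const cell of cells) {
    // position, glyph-atlas coords, and color per cell (this layout is made up)
    data[i++] = cell.x; data[i++] = cell.y;
    data[i++] = cell.u; data[i++] = cell.v;
    data[i++] = cell.r; data[i++] = cell.g; data[i++] = cell.b; data[i++] = cell.a;
  }
  gl.bufferData(gl.ARRAY_BUFFER, data, gl.DYNAMIC_DRAW); // one upload to the GPU
  gl.drawArrays(gl.POINTS, 0, cells.length);             // one draw call
}
```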
My opinion here is that canvas is a great technology, capable of speeding things up significantly and getting close to native app performance. It comes with very real trade offs though:
- Cost of implementation and maintenance is much higher with canvas. This is particularly the case with WebGL; there have been very few contributions to xterm.js (the terminal frontend component) in the WebGL renderer because of the knowledge required.
- Accessibility needs to be implemented from scratch using a parallel DOM structure that only gets exposed to the screen reader. Supporting screen readers will probably also negate the benefits of using canvas to begin with since you need to maintain the DOM structure anyway (the Accessibility Object Model DOM API should help here).
- Plugins/extensibility for webapps are still very possible but requires extra thinking and explicit APIs. For xterm.js we're hoping to allow decorating cells in the terminal by giving embedders DOM elements that are managed/positioned by the library[3].
More recently I built an extension for VS Code called Luna Paint[4] which is an image editor built on WebGL, taking the lessons I learned from working on the terminal canvas renderer to make a surprisingly capable image editor embedded in a VS Code webview.
Okay, if anyone else felt ashamed after reading "Accessibility needs to be implemented from scratch", raise your hands with me. SW engineers suffer from assuming everyone is like them and that there are no corner cases. My Mom was recently sued for violating the ADA because her real estate website didn't work well enough with screen readers.
I’ve never worked somewhere that treated a11y as a feature, with the commensurate resources put towards implementing it. It’s always ignored and then sometimes maybe worked on as an afterthought, during a hackathon or whatnot.
In other words, even when engineers are aware of it and inclined to do something about it, mgmt still has to care, and I’ve just never seen that once.
Do you think the large performance benefits can be achieved for any general web app (e.g. if I rewrite my Vue app's render functions to use a canvas instead of the DOM), or are the benefits of canvas mainly for niche workloads?
Definitely niche workloads or when the performance benefit from a UX perspective is worth the cost of implementation. Start out with virtualizing the DOM so only the visible parts are showing, if the framerate isn't acceptable after that then consider switching to canvas.
Using the terminal as a case study, its DOM renderer needs to swap out many elements per row in the terminal every frame (the number depends on text styles and character widths) and we want to maintain 60fps. It's also not just a matter of maintaining the frame rate: more time spent rendering means less time processing incoming data, because they share the main thread, which means commands will run slower.
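For reference, the "virtualize first" suggestion is basically this (a bare-bones sketch, not production code):

```js
// Bare-bones DOM virtualization: only keep DOM nodes for the rows scrolled
// into view, plus a spacer sized to the full content so the scrollbar is honest.
function renderVisible(container, viewport, allRows, rowHeight) {
  const first = Math.floor(container.scrollTop / rowHeight);
  const count = Math.ceil(container.clientHeight / rowHeight) + 1;

  viewport.style.height = allRows.length * rowHeight + "px"; // total scroll height
  viewport.textContent = "";                                 // drop old rows
  allRows.slice(first, first + count).forEach((row, i) => {
    const el = document.createElement("div");
    el.textContent = row;
    el.style.cssText =
      `position:absolute; top:${(first + i) * rowHeight}px; height:${rowHeight}px;`;
    viewport.appendChild(el);
  });
}

// container.addEventListener("scroll", () =>
//   renderVisible(container, viewport, rows, 20));
```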
In my experience, no - at least with current iterations. For very specific things, such as a text editor where the DOM isn't really prepared to deal with the way the editor has to be structured, probably yes, if you know what you're doing - but for most things, not really. And to get the same functionality you would need to implement a lot of things yourself (even if it worked functionally, it wouldn't have the same accessibility unless you built that yourself, and I'm not sure how fully you can emulate it).
From what I understand DOM is pretty shit in terms of performance because it needs to support so much legacy crap (eg. float layouts) - so even using sandboxed WebGL (which adds overhead over native APIs which your browser would use to render) you can still be much faster.
> Supporting screen readers will probably also negate the benefits of using canvas to begin with since you need to maintain the DOM structure anyway (the Accessibility Object Model DOM API should help here)
The fact that the DOM elements are invisible (don't affect layout) should eliminate the majority of the performance cost, right?
They can't be display: none as that would mean the screen reader can't access them. To do this properly and help low vision people, you need to make sure the textarea is synced with the cursor position and that all the text is positioned roughly where the text on screen is. By doing this the screen reader will correctly outline the element being read.
There may also be additional costs like the string manipulation required to build the row text in the terminal, this is nothing that can't be optimized but then that's more memory and cache invalidation to worry about.
In 2009, I joined Mozilla and started working on the Bespin[1] project, which Ben Galbraith & Dion Almaer had brought to Moz. Bespin was built with a canvas-based renderer. Bespin was way faster than other browser-based code editors at the time.
Then the Ajax.org/Cloud9 folks came along with their Ace editor[2], which was DOM-based and still very fast. We ended up merging the projects. edit to add: and switching to DOM rendering
Rik Arends[3] was one of the Ajax.org folks and he's been working on a WebGL-based code environment called Makepad[4], which is entirely built in Rust and has its own UI toolkit. He's complained a lot about how difficult it is to make a performant JS-based editing environment.
My point in all of this is just that there are absolutely tradeoffs in performance, accessibility, ease-of-development, internationalization, and likely other aspects. If raw performance is what you're going for, it's hard to beat just drawing on a canvas or using WebGL. Google Docs needs to worry about all of that other stuff, too, so I'll be interested to see how this shapes up.
It's funny — Google's approach here reminds me of the Netscape/Mozilla XUL tree[0] element.
For those unfamiliar, the XUL tree is a performant 1990s-era virtualized list that is able to render millions to tens of millions of rows of content without slowdown since it gives you the option of rendering internally in Firefox rather than through the DOM.
I still don't completely understand why Mozilla is/was planning to axe[1][2] it since there's no web-based HTML5/JS replacement (the virtualized "tree" is implemented in C++, iirc) and it's still being actively used in places.{xul/xhtml}[3] and the Thunderbird/SeaMonkey[4][5] products.
It's interesting that both Google's canvas bet (Flutter, Docs, etc.) and the Mozilla XUL tree are basically trying to solve a nearly identical problem (DOM nodes are expensive and DOM manipulation is slow) ~20-25 years apart.
> I still don't completely understand why Mozilla is/was planning to axe[1][2] it since there's no web-based HTML5/JS replacement (the virtualized "tree" is implemented in C++, iirc) and it's still being actively used in places.{xul/xhtml}[3] and the Thunderbird/SeaMonkey[4][5] products.
XUL is a maintenance burden and exacts a development tax on new features (having to make the Servo CSS engine support XUL so that it could be uplifted to Firefox was extremely annoying). It's also full of security problems, as it's written in '90s C++ that nobody is around to maintain properly. Getting rid of it is an inevitability.
Thanks for the reply! I'm a big fan of your work. I can only imagine the nightmare of trying to implement two separate XUL <=> HTML/CSS flex/box models in Servo/Rust.
For readers that are unaware, there is also a great blog post breaking down some of these points in finer detail [0][1].
I guess my question is — are there replacements planned for any of the legacy yet performant XPCOM interfaces / XUL elements like nsITreeView/tree? My tl;dr understanding of XUL trees is that the DOM is and always has been too slow to render millions of scrollable rows in a performant manner (bookmarks, thunderbird, etc.). Would it not be possible to re-implement the XUL tree logic in Rust, for example? Is the goal to completely get rid of all non-standards compliant elements in the long-run?
It seems like there will always be some custom elements necessary for a native desktop interface which can never be integrated into HTML...
"I’ve talked about this before, but things like panel, browser, the menu elements (menu, menupopup, menucaption, menuitem, menulist) don’t have HTML equivalents and are important for our desktop browser experience. While we could invent a way to do this in chrome HTML, I don’t think the cost/benefit justifies doing that ahead of the rest of the things in our list." [2]
..., yet I don't see much discussion about this anywhere.
I'm particularly interested because I'm currently working on a XULRunner project where a <tree> is central to the user interface (millions of rows, image column, embedded data, must run on macOS/Windows/Linux/*BSD, etc.), and it's a little alarming that there is an open bugzilla ticket that did not initially mention either the performance or the ecosystem implications (essentially kill Thunderbird, kill SeaMonkey more than it already has been) of its removal.
I think the one part I have trouble with is that implementing a native looking/performant cross-platform desktop UI is still a nightmare and XUL could have potentially been a fantastic desktop-focused superset/companion of/to HTML.
I mean, you probably won't like this answer, but I don't think you should be writing a XUL-based app in 2021 if you want it to be useful, as opposed to for fun. XUL is 25-year old legacy technology, and using it is an exercise in retrocomputing.
Sorry, I should have clarified a bit more — I'm writing a cross-platform desktop application that has a preact[0] frontend (+ a Go backend) using `firefox --app application.ini`.
I have been experimenting with performant lists (which is why I brought up the XUL tree — it's currently central to the interface, though not the final implementation for sure) — I'm currently only using the XUL window/menubar elements in order to populate the native macOS menubar.
I am a fullstack web dev in my day job, so my goal here is to write a fast, easily extendable UI that I can quickly iterate upon using modern html/js/css/etc.
I love gecko and used to write XUL add-ons many years ago, so I'm already familiar with JS code modules, XPCOM, XUL, the internal browser architecture etc.
Basically, I'm now using XULRunner (`firefox --app application.ini` as previously mentioned — will eventually be stubbed into a native macOS .app/OS program) as a replacement for Electron / Chromium Embedded Framework[1].
I'm basically doing the same thing as Positron[2]/qbrt[3].
While you're definitely correct, I enjoyed working with Komodo IDE which was (is?) built around XUL even up to a few years ago. A neat API for extending and hacking around on it, with quite nice discoverability. I'm sad to see it go, but it makes perfect sense as to why!
I really doubt anyone is going to revive those old legacy widgets. That style of widget is flawed and predates MVC-style design; you'll find it's nearly impossible to present non-string data with that tree. Most applications now will want to show arbitrary widgets within the table, and for that they'll use standard HTML/CSS. I would expect you can do something comparably fast by using a virtualized list in HTML, along with IndexedDB.
I recently wrote a somewhat performant react-virtualized[0] list for a project at work, though it's definitely a bit trickier in plain HTML/JS.
As far as the XUL virtualized tree goes, a couple of Mozilla engineers wrote some examples using plain html/javascript + DOM node manipulation[1]. While promising, I can't imagine that this implementation could ever be as fast as the compiled C++ one[2].
tangential but amusing — the old chrome://global/content/config.{xul/xhtml} used a XUL tree and rendered 0 DOM nodes to display its treechildren whereas the new about:config renders upwards of 4500 <tr> DOM nodes by default
ignoring the fact that you can no longer sort by specific columns (name, status, type, value, etc), you can really feel how slow the new implementation is if you click the "Show Only Modified Preferences" button — the DOM update feels incredibly sluggish whereas both searching and sorting columns in the old xul tree always felt snappy and instantaneous
I use the modern Firefox HTML5/JS dev tools on a daily basis and love the featureset that they provide, though it is equally shocking to compare the feel to that of the old DOM Inspector[0] (and Venkman[1], the old JS debugger), which was a XUL add-on for DOM inspection that used to run in Firefox, Thunderbird, and SeaMonkey.
What feels snappy and instantaneous in DOM Inspector feels somewhat muddy and laggy in the modern devtools.
While I greatly appreciate the amount of features that Mozilla has integrated into the modern (post-firebug) devtools over the years, it is a little sad that the next generation will never get to experience just how fast some narrow aspects of web development used to be.
Thanks for the shoutout to the legacy DOM Inspector (which I used to maintain) and Venkman (which I have both admired and used in anger). When Mozilla decided to put together a devtools team for Firefox 4, I was disappointed when I realized that they weren't going to take any effort to make sure the fruits of their labor sidestepped any of the performance issues that Firebug had exhibited for most of its lifetime. I do want to quibble, though, about the suggestion that this is a matter of XUL+JS versus HTML+JS. I say this even as someone with strong feelings about what a joy XUL was in comparison, and a long-lasting bitterness over the decision (among many) that Mozilla made in mishandling its own future.
WebKit's Web Inspector has for a long time gone with HTML, and in all its incarnations I've ever tried out, it has always been snappier than either Firebug or the devtools that ship with Firefox.
When making comparisons like this, it's important to keep in mind that you're comparing/contrasting teams and their output, and it's not just a matter of the building blocks they're using. Some teams do better work than others.
Similarly, my understanding is that the TreeStyleTab extension was forced to migrate in the same manner and it can get very sluggish as well when you have a lot of tabs open.
> Why can't XUL vs DOM just be the same data with a fast C++ API and a slow JS API?
This is a great question! I'm not really qualified to answer it, but I'll give it a try.
My understanding is that the XUL tree is fast because it implements the XPCOM C++ nsITreeView[0][1][2] interface.
If you're writing a XULRunner program ...
(Firefox "is distributed as the combination of a Gecko XUL runtime — libxul, other shared libraries, and non-browser-specific resources like those in toolkit/ — plus a Firefox XUL application — mostly just the files in Contents/Resources/browser/, plus the 'firefox' stub executable that loads Gecko and points it at a XUL application", see [3])
..., XPCOM[4] allows you to invoke those implemented interface methods directly from JavaScript.
XPCOM is a technology that, since the removal of XUL/XPCOM addons, is inaccessible to everyone except for Mozilla devs and those who write XULRunner programs using `firefox --app /path/to/application.ini`.
So, some XUL elements (like <tree>) implement an XPCOM interface that invokes native C++ (or rust, python, java, etc.) code, which is statically compiled directly into the Gecko XUL runtime.
Modern HTML5 elements, in general, must utilize the native interpreted browser DOM/JavaScript and cannot choose to implement/satisfy an arbitrary internal XPCOM interface. While I'm sure that Mozilla has figured out a way to make these elements fast (C++, Rust, I have no idea), you are always bounded by the limitations of the DOM.
So, my understanding is that, because we are relying on standards-compliant HTML5 elements which mutate the DOM, we cannot specify and implement new XPCOM interfaces ("with a fast C++ API") that could theoretically bypass the DOM — we /must/ rely on the "slow JS API."
It may sound stupid, but this was the feature I tried to add to ACE and couldn't. And I spent the last decade trying to invent a drawing API that would let me do this effect.
It basically folds (hides) the implementation code of every method in the file, giving an API-like view, but with a smooth animation shrinking the text instead of instantly disappearing it.
Can someone explain why https://makepad.dev/ is extremely slow and "unusable" on the Microsoft Edge browser but runs smoothly on Chrome?
Is it because of bad WebGL perf, or JavaScript perf in general?
Did you try using dev/canary Edge? It should pretty much be the same rendering engine and JS engine as Chrome. Definitely report this to the Edge team if you have the time (very easy to do from the dev version of Edge). In my experience, they are very responsive to bug reports and feature suggestions.
Back in the day I made it work on Edge on an Xbox. That was Microsoft's own browser + JS engine. Nowadays it's just Chrome though. If it has problems, I'd be highly surprised.
Can you give a bit more flavor to "not very fast"? I'm on a measly chromebook and scrolling, selecting text, expanding directories, everything is smooth and high framerate.
Is there a specific operation that is not fast? Opening it for the first time took a few seconds but afterwards it was pretty buttery.
Makepad kinda has a minimum GPU spec. It's aging out for the people who don't have it, but some people still don't have GPUs that can blit their screen with a solid color.
> ... it's hard to beat just drawing on a canvas or using WebGL.
Both of these APIs perform quite poorly for what they're doing.
To compete with native, the web platform needs simple low-level APIs that do not have a lot of Javascript marshalling overhead and other performance cliffs. You can always build a more convenient library above low-level interfaces, but the opposite is not true.
It is, but with luck version 1.0 will arrive at the end of the year and then there is the whole adoption rate.
Now given that WebGL support is still hit and miss, and the only way to debug is to rely on native GPGPU debuggers, while having the pleasure of differentiating between the browser's own rendering code and the application's, that shows how easy it is to do 3D on the Web.
>Both of these APIs perform quite poorly for what they're doing.
When it comes to Canvas, do you mean that it actually performs poorly when putting pixels on the screen using putImageData, or do you mean that it does that fine but it performs poorly when it comes to drawing vector graphics? In either case, do you know why it performs poorly?
Personally, I would be happy if Canvas just let you put raw pixel data on the screen and did that as well as possible. I have never felt any need for its vector graphics features. To me, they seem too high-level for what Canvas is supposed to be. But I guess things are different when it comes to using the graphics card, since from what I understand it is actually optimized for drawing polygons.
It performs poorly in either case. For one, Javascript APIs have significant marshalling overhead.
Secondly, Canvas is often "hardware-accelerated", which can make some things faster, but also slower because this kind of immediate-mode drawing doesn't match the GPU interface well. It's particularly slow at vector graphics. Some effects would require pixel readback, which is slow for the same reason.
> Personally, I would be happy if Canvas just let you put raw pixel data on the screen and did that as well as possible.
Drawing cached Bitmaps is relatively fast in Canvas, if you don't need too many calls. Getting arbitrary data into such a Bitmap is slow, so if you want to do it every frame you may run into issues.
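Concretely, the fast and slow paths look something like this (drawExpensiveThing is just a placeholder for whatever you rasterize up front):

```js
// Fast path: rasterize once into an offscreen canvas and blit it each frame.
const sprite = document.createElement("canvas");
sprite.width = sprite.height = 64;
drawExpensiveThing(sprite.getContext("2d")); // done once, up front

function fastFrame(ctx, x, y) {
  ctx.drawImage(sprite, x, y);               // cheap cached-bitmap blit
}

// Slow path: regenerating and re-uploading the pixels every frame.
function slowFrame(ctx, pixels /* Uint8ClampedArray */, w, h) {
  const img = new ImageData(pixels, w, h);   // arbitrary CPU-side data...
  ctx.putImageData(img, 0, 0);               // ...pushed to the canvas each frame
}
```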
> Secondly, Canvas is often "hardware-accelerated", which can make some things faster, but also slower because this kind of immediate-mode drawing doesn't match the GPU interface well. It's particularly slow at vector graphics. Some effects would require pixel readback, which is slow for the same reason.
One would assume/hope that specifying "bitmaprenderer" for the context type would give you a regular immediate-mode CPU rasterizer. Is that not the case?
> Getting arbitrary data into such a Bitmap is slow, so if you want to do it every frame you may run into issues.
To expand on this, doing that ("putting raw pixel data on screen") anywhere is slow if it's modified regularly. There just doesn't exist a fast CPU-buffer-to-display pipeline anymore; that died out years & years ago. So that one at least isn't a JS/web limitation, it's more a modern graphics architecture one. You just can't bit-bang pixels yourself anymore, not reasonably efficiently anyway. In theory that'd be possible on unified memory architectures (read: mobile devices & integrated graphics), but GPUs don't like to publish their swizzled texture formats, so you still don't get to, even there.
> One would assume/hope that specifying "bitmaprenderer" for the context type would give you a regular immediate-mode CPU rasterizer. Is that not the case?
No, that's something else.
> To expand on this...
I almost wrote something like that, but then I considered that I haven't really benchmarked this. Streaming data from CPU onto the GPU is certainly possible and graphics APIs do have hints for such usage. You also don't need to convert to a texture to get arbitrary data on the screen, a trivial shader can do that for you.
If your data/transformations naturally live in RAM/CPU, that may well be the most efficient thing to do.
I think this is a problem of the mainstream conception, which sees the future browser mainly as a monopolized, walled garden (as opposed to GNU/Linux, which nowadays does everything one would want from a computer and more), with the canvas being a kind of framebuffer.
Back when I first read about the canvas, IIRC there was no fancy CSS, no fancy custom elements, and making a simple doodle element or the famous Doodle Jump as a webapp was ... - well, I guess there was Flash. So if you think of HTML and the DOM as a GUI toolkit, it filled an important void (and continues to do so), but nowadays no one wants to use (standardized) HTML anymore, so...
If you look into tk (or nowadays tkinter) you basically see the same with the Canvas-class (I think you can't draw anything custom at all in tk easily)!
I don't build extensions or work on much front end web lately but this reads like Google wants more control over their stuff. The web is becoming less open.
> By moving away from HTML-based rendering to a canvas-based rendering, some Chrome extensions may not function as intended on docs.google.com and may need to be updated.
> If you are building your own integrations with Google Docs, we recommend using Google Workspace Add-ons framework, which uses the supported Workspace APIs and integration points. This will help ensure there will be less work in the future to support periodic UI implementation changes to Docs.
This is basically putting an API on top of an API as far as I'm concerned. The web renders markup and executes javascript to produce an experience. Putting an API on top and using canvas to render your content creates a more closed system.
For those more deep in web technology, I'd like to know if there are reasons to move to canvas for strictly technical merits.
Performance of DOM-based rendering is very problematic and not unified across browser implementations. Canvas rendering will likely increase the performance of Google Docs and make the UX more unified across platforms. Google Docs is really an application built on the web platform. HTML DOM rendering was never intended to give developers the control they need to build fully featured, highly performant applications; we just shoehorned things until they sort of worked. I think this is a positive thing: UX will be better, and the integration APIs will become much cleaner and not depend on the structure of how they design the UI. Concerns will be separated and the end result is something much cleaner, more performant, and more supportable.
I think this is the real reason for the change as well. A few years ago Visual Studio Code underwent a similar change where rendering the terminal moved from using DOM to canvas. I never noticed a huge difference between the two methods but I imagine using canvas gave them a lot more flexibility in addition to being more performant.
Here's an article that talks about the switch to canvas for VS Code. The "5 to 45" times faster part really sticks out to me. Kind of surprised it took Google this long to do this with Docs.
I personally didn't even bother checking out VS code because it was based on electron, and so I figured the performance just wouldn't be there because it's doing all of this awkward web stuff while trying to be an IDE.
I was completely wrong though. Using it, it really doesn't feel like a web app at all. It's really shocking and impressive. It feels like a text editor. Perhaps I should learn more about what they're doing.
The major problem I have with electron apps is that they eat memory for breakfast, lunch, and dinner.
This happens with jvm applications as well, but you can limit the max heap size and force the garbage collector to work more, trading off speed for the ability to run more apps side by side.
AFAIK, you can't limit the memory used in electron apps, and they don't respond by sharing heap with their child processes. With enough extensions to make it usable, vscode easily eats GB of memory.
I like lsp. But I don't need vscode to do the rendering.
The majority of the problems with "Electron" are actually just problems with the development style used by the types of people who publish and consume packages from NPM.
We've gone from a world where JS wasn't particularly fast, but it powered apps like Netscape, Firefox, and Thunderbird just fine (despite the fact that the machines of the era were nothing like what we have today) and most people didn't even know it, to a V8-era world where JS became crazy fast, to the world we're in now where people think that web-related tech is inherently slow, just because of how poorly most apps are implemented when they're written in JS.
If you want to write fast code, including for Electron, then the first step is to keep your wits as a programmer, and the second step is to ignore pretty much everything that anyone associated with NPM and contemporary Electron development is doing or that would lead you to think that you're supposed to be emulating.
I agree, and this is something I have also witnessed in my own Electron project, where care was taken to write fast and memory-efficient code. It doesn't really use that much memory compared to native applications when running; I've done comparisons.
I also feel that the problem is more with the style of javascript development rampant these days, where not a lot of care is taken to write memory-efficient, or even just efficient, code.
This of course has a lot to do with the rise in people studying to become (mostly) web developers without any deeper degree in CS or understanding of how computers really work.
This isn't entirely the fault of Electron though, but of the convenient data types exposed in a web environment. Beyond the baseline memory of running Chromium, you could use various tricks to keep memory very low, such as minimizing GC pressure (e.g. declaring variables up front, not within loops), using array buffers extensively, shared array buffers to share memory with workers, etc.
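A couple of those tricks, sketched out (buffer sizes are arbitrary, and SharedArrayBuffer needs cross-origin isolation in current browsers):

```js
// Preallocate typed storage once instead of allocating per message/frame, and
// reuse scratch objects so the GC has less short-lived garbage to chase.
const scratch = new Float64Array(1024);            // reused work buffer
const pool = [];                                   // trivial object pool
function acquire() { return pool.pop() || { x: 0, y: 0 }; }
function release(obj) { obj.x = obj.y = 0; pool.push(obj); }

// Share memory with a worker instead of structured-cloning big payloads.
const shared = new SharedArrayBuffer(1024 * 1024);
const counters = new Int32Array(shared);
// worker.postMessage(shared);  // the worker sees the same bytes, no copy
```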
Behaviorally they aren't the same thing, so that's not a straightforward mechanical translation. For example if the variable is an object instance then if escape analysis can prove the variable doesn't outlive the loop, then it can be put on the stack & then yes there wouldn't be any benefit to the suggested change. Although deoptimization makes stack allocation more complicated, so JS engines are more conservative here than say JVM runtimes.
But it's really easy for escape analysis to fail, it has to be conservative. So you can end up heap allocating a temporary object every loop iteration quite easily.
Not sure on any particular guide, but I learned a lot from the old #perfmatters push from Chrome, getting a deeper understanding of what the JS engine does when you create an object, where it lives, how it interacts with the garbage collector and so on would be a good thing to learn about. Also it's generally only worth considering optimization for things that store a lot of data like arrays/maps. I don't see why these techniques wouldn't be good in the long term.
I definitely agree that it's easier to make webapps that consume much more memory than it is using a lower-level language like C++, unless you're being careful.
I just in the past month upgraded my main work laptop from 16 GB to 40 GB (8 GB soldered + 32 GB SODIMM). So your point is granted, but on the other hand, DDR4 prices have collapsed ~50% from 2018 (I couldn't believe it either, given all of the other semiconductor issues).
I'd assume a significant cost-benefit tradeoff. For all its flaws, the DOM rendering algorithm is at least "document-like," so there's a lot of wheel-reinventing to do going from just using the DOM to a custom document layout implementation underpinning a canvas-targeted rendering algorithm.
Yes, at OrgPad, we are writing our own rich text editor, and if you want to use the DOM approach, you don't have much choice but to do it like this. You can see the WIP demo here (in Czech but it is quite visual): https://www.youtube.com/watch?v=SkFJ1zcRjQY
It is also written in ClojureScript. Some of the reasoning is here (in English, but 3 hours long): https://www.youtube.com/watch?v=4UoIfeb31UU
> I imagine using canvas gave them a lot more flexibility in addition to being more performant.
I'm perplexed because I don't expect canvas rendering to be faster - or necessarily more flexible - because the web is document-first: HTML and CSS were/are all built around describing and styling textual content, and computer program source code files are invariably textual content files. Browsers all have heavily optimized fast paths written in native code for rendering the DOM to the screen with the full flexibility of CSS's styling features, so applications switching to canvas rendering will first have to contend with reimplementing at least the subset of CSS that they're using for their editor - and that has to run as JavaScript (or WASM?) - and I just don't understand how that could possibly be faster than letting the DOM do its thing.
I appreciate that DOM+CSS rendering is not designed-around monospaced text editing or with specific support for typical text-editor and IDE features which do indeed throw a wrench into the works[1], but I think a much better approach would be to carve-out the cases where the current DOM and rendering model is insufficient or inappropriate for those specific applications' purposes and find a way to solve those problems without resorting to canvas rendering.
That said, is this change because Google wants to use Flutter for a single codebase for Google Docs that would work across iOS, Android, and the web? Flutter does have a HTML+DOM+CSS rendering mode, but it's horrible (literally thousands of empty <div> elements in their hello-world example...)
[1] e.g. an HTML/DOM document is strictly a unidirectional acyclic tree structure, and CSS selectors are also strictly forwards-only (e.g. you cannot have an HTML element that spans other elements, you cannot isolate individual text characters, you cannot select a descendant element to style based on its subsequent siblings, or an ancestor's subsequent siblings), and the render-state of a document is also strictly derived from the DOM and so does not allow for any feedback loops unless you start to use scripts, which means you can't select elements to style based on their computed styles (unlike, for example, WPF+XAML, where you can bind any property to another property - something I think XAML implements horribly...). I appreciate that this makes certain kinds of UI/UX work difficult (if not impossible in some cases), but in the use-case of an editor I just don't see these as being show-stopper issues.
>I'm perplexed because I don't expect canvas rendering to be faster
...yet it is. Really.
Even though DOM paths are heavily optimized, they are extremely flexible, and that flexibility creates a wall in possible performance optimizations. In a context like a word processor, precision is more important than on your regular website (and across browsers!), so you end up implementing little hacks everywhere, pushing half a pixel here and another 1.5 pixels there.
A purpose built engine that writes directly to the framebuffer of a canvas without dealing with legacy cruft has the potential to be a lot faster - if you know what you are doing. Google has no shortage of devs who know what they are doing so here we are.
They aren't that optimized, this small team changed Chromium's DOM to have better cache utilization and more coherent access patterns with data-oriented/SoA and got 6X speedup in some animation use cases:
> Google has no shortage of devs who know what they are doing so here we are.
They also have no shortage of devs who advance crazy ideas that somehow gain adoption... like starting a new general-purpose programming language in 2007 without generics or a package manager.
The web is (or at least was) document-first, yes, but Google Docs is an extremely heavily-featured WYSIWYG word processing and desktop publishing application that happens to be distributed on the web (in addition to other platforms). The fact that you're (sometimes) using Google Docs to generate a simple document that could easily be represented with simple HTML does not imply that Google Docs itself is a natural candidate for being implemented with simple web APIs like DOM.
Now, I think if the contentEditable API were significantly more robust and consistent across browsers, it could have been viable to build extremely complex WYSIWYG editors using the DOM. Most of the popular rich text editor libraries for the web are essentially compatibility layers around the contentEditable API that attempt to normalize its behavior across browsers and present a more robust API to the developer. These libraries are popular and do work pretty well, but based on my experience with them it's no surprise that an app as popular and extensive as Google Docs would constantly bump into the limitations of this approach. (My impression is that Google Docs never used contentEditable and instead wrote their own layout and editing engine that manually rendered out DOM, and they're now changing that to render out to canvas.)
> My impression is that Google Docs never used contentEditable and instead wrote their own layout and editing engine that manually rendered out DOM, and they're now changing that to render out to canvas.
Back before Google owned Google Docs, it was a non-Google company and website called Writely, and their website was basically a document-hosting system tied to a fairly stock `contentEditable` editor.
This was around 2005 - back when every web-application development client would insist that users have WYSIWYG/rich-text editors - of course they had no idea how WYSIANLWYG (what you see is absolutely nothing like what you'll get) those WYSIWYG editors really were.
> HTML and CSS were/are all built-around describing and styling textual content, and computer program source code files are invariably all textual content files.
HTML and CSS are fairly well optimized, but dynamic HTML and the DOM were an afterthought. If you could throw out a lot of the guarantees about DOM behavior, you could make a much faster browser, but you'd also break the web.
At the end of the day, after the browser does all of its highly optimized processing of the DOM, HTML, and CSS, it is issuing drawing commands that are the same as the ones you make on canvas. Canvas skips the in-between steps.
If you're in a situation where you know you want this text at this location on the page, it may be simpler to just draw what you want versus trying to arrange a DOM that will cause the browser to draw what you want. Especially if you're already doing pagination, at which point you're already doing the text breaking and layout anyway, and you're just trying to tell the browser, in a high-level language, to give you the same low-level results that you already have in hand.
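To illustrate (a minimal sketch, not Docs' actual code; the canvas dimensions and font below are made up), once your own layout pass has decided where a run of text goes, painting it is a single imperative call with no DOM construction or style resolution involved:

```javascript
// Hypothetical page-sized canvas; dimensions and font are illustrative.
const canvas = document.createElement('canvas');
canvas.width = 816;    // ~8.5in at 96dpi
canvas.height = 1056;  // ~11in at 96dpi
document.body.appendChild(canvas);

const ctx = canvas.getContext('2d');
ctx.font = '16px Georgia';   // whatever your own layout engine decided
ctx.fillStyle = '#202124';
// The layout pass already computed the line break and the (x, y) baseline,
// so drawing is just this call.
ctx.fillText('Text positioned by my own layout pass', 96, 128);
```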
It looks like they're just doing this for text within a page, BTW. I looked at the sample document and the page scroller is DOM, and the individual pages are canvas of text, overlaid with an SVG containing the images.
The big question I have is how they manage to deal with stuff like IME (input method editors) and how they manage to work with the keyboard on mobile (looks like they don't do mobile though).
> The big question I have is how they manage to deal with stuff like IME (input method editors) and how they manage to work with the keyboard on mobile (looks like they don't do mobile though).
A common technique used in web-based editors for other content types (like online video editors, online image editors, etc.) works by creating a hidden <textarea> or <input type="text"/> and giving that element focus - and then updating the manually-rendered content in response to normal DOM events like 'input', 'change', and 'keydown' (if necessary - the 'input' event should be preferred, ofc). Because a "real" DOM element with native IME and soft-keyboard support is being used to process user input, there's little to no degradation of the user experience.
...though the user does lose the ability to do things like drag text-selection handles. An alternative approach is to make the textarea very visible and position it directly on top of the manually-rendered content, using as much of the browser's built-in support for styling input elements and input text to match the manually-rendered content as closely as possible - while hiding the manually-rendered content underneath to avoid confusing the user. There may be a toggle to let the user choose between "simple edit with live preview" (i.e. hidden textarea) and "edit mode". This technique isn't confined to the web: lots of desktop software (especially in the days before WPF, JavaFX, etc.) that needed to let the user precisely edit text within a design surface would just instantiate a native textbox widget directly on top of the text's location in the design surface. It wasn't just 2D art software that did this; at least a few WYSIWYG-ish HTML editors (prior to contentEditable) did it too. I actually wish this technique would come back (despite its clunkiness), simply because Markdown+Preview is far, far better than a WYSIWYG contentEditable widget where an inadvertent mouse-click or drag creates a `float` disaster - or where elements aren't closed correctly and end up breaking the entire website layout...
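A minimal sketch of that hidden-input technique, assuming a canvas-rendered editor; the editor-model functions (`insertTextAtCaret`, `repaintDirtyLines`, `showCompositionPreview`) are hypothetical placeholders, not any real editor's API:

```javascript
// Visually hidden but focusable textarea that receives keystrokes and IME composition.
const input = document.createElement('textarea');
input.style.cssText = 'position:fixed; top:0; left:0; width:1px; height:1px; opacity:0;';
document.body.appendChild(input);

const canvas = document.querySelector('canvas');
canvas.addEventListener('mousedown', (e) => {
  e.preventDefault();  // keep the browser from moving focus elsewhere
  input.focus();       // route keyboard/IME input to the hidden field
});

input.addEventListener('input', () => {
  insertTextAtCaret(input.value);  // hypothetical: update the editor's own document model
  input.value = '';                // the textarea is only a conduit, not the source of truth
  repaintDirtyLines();             // hypothetical: repaint the affected lines on the canvas
});

// Composition events let the canvas show the IME's in-progress text.
input.addEventListener('compositionupdate', (e) => showCompositionPreview(e.data));
```

Because the focused element is a real textarea, the OS IME and the mobile soft keyboard attach to it as usual; the app only has to echo the result onto the canvas.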
> because the web is document-first: HTML and CSS were/are all built-around describing and styling textual content
They were built to display static textual content. Moreover, they were built to display static textual content on 90s-era computers in a single rendering pass. IIRC two-pass rendering didn't appear until some improvements around tables in the early 2000s.
For that, yes, they are quite fast. Anything else? Nope.
Document display and document editing are rather different tasks. The DOM was built for the display of static documents. Dynamism was slowly added over the years through JS, and eventually CSS (animations, transformations, etc). But the underlying purpose of the browser rendering engine has remained the same, which is to display static documents. It's not surprising that a client built from the ground up around the concept of displaying static documents doesn't do a good job of allowing users to edit documents in a WYSIWYG kind of way. That has never been its job!
I like the hacking mindset to make something work even if the odds are against it but the better approach would be to fix the DOM APIs and to do the necessary performance work instead of basically throwing all the responsibility on some library and the web developer.
Microsoft Word, Pages and Open Office don't seem to be bottlenecked by rendering performance like Google Docs. Perhaps the browser is the wrong platform for document editing.
I believe this 100%. After using google office for years (just because it's free and cloud-based), I recently tried MS Word and Excel at work. The difference was mind-blowing. I forgot just how functional and straightforward MS Office is compared to the clunky, barebones Google options.
If I wanted a desktop-first, cloud-backed solution, what would be the most future-proof and durable? Can I use Open Office across OSes? What would be the best cloud backup service these days? (just a general question to readers)
I also prefer desktop-first, cloud-backed solutions, but I have quite the opposite experience. Working with MS Office has been a pain and I've been a happy Google Docs user for about 10 years. My wife who isn't an especially technical person also finds Google Docs quite a lot more intuitive and laments when she has to use MS Office products for work (she is a consultant for Microsoft including their 365 line of business and her whole firm makes pitch decks in Google Slides before converting them to MS Office to present at Microsoft meetings--IIRC for the Azure and other b2b lines of business they don't even bother with MS Office). Note that my wife and I (like most of our age group) grew up on MS office, so it's not a question of familiarity.
Google Docs just built a better product and MS Office still hasn't caught up. I wonder if this is because or in spite of the browser target?
Google Docs seems so bare-bones. I recently couldn't find a way to format a series of chunks of text within a Google Doc as code, and I'm pretty sure that it simply doesn't support styles for anything but headings and body text. It just doesn't seem to be the same kind of tool as Word.
Copy a few cells from a Google sheet and paste it in an email, then do the same with Excel. Collaborate on building out a document from scratch with 10 people in Google sheets vs Excel.
Excel is a monster, and much more powerful than Google sheets in many ways, but in my experience, Google docs apps are a little better for collaboration, and they integrate a little tighter with each other.
Google docs is their document editor. Sheets is a part of GSuite.
I've also never had trouble pasting a spreadsheet selection into a Word document. Email is a nightmare in general though.
I'm not sold on collaboration personally. I've had to do it a bunch since the pandemic began and I've found it to be an anti-pattern. One of the big inconsistencies is that cells in Sheets don't update while being edited during collaboration, which is not great if you have a spreadsheet-heavy workflow. Docs is no replacement for that, though, because its auto-formatting is draconian and always seems to reset its preferences. When editing docs we spend more time formatting them than creating the content.
> I'm not sold on collaboration personally. I've had to do it a bunch since the pandemic began and I've found it to be an anti pattern.
How much of this is really related to technology? I do a lot of writing in both Word and Google Docs and see different sets of problems for both products. Having a group of people jump into either and expecting a good product (and experience getting there) is unrealistic.
With the pandemic, I think people have been trying lots of things without understanding what will be most effective. At least early on, there was a feeling that people had to be seen to be productive. It's nothing like real remote work.
For important docs, I still come back to having individuals write their content and only then does one person attempt to assemble it. The individuals often need their own independent reviews and consultation anyway before they have a decent draft. In some ways it improves visibility and helps with keeping folks on schedule too.
Google sheets is the specific example that I hate. In my experience, it's often laggy and clunky. You can't even scroll smoothly: the window MUST snap to row/column lines. When I realized that google sheets has such a laughable shortcoming, I knew I needed to get out of google office eventually.
I think copying some cells from Excel into Outlook, which I guess is the comparable transaction, works pretty well - what doesn't work for you? Maybe I am just missing out on some amazing functionality by not using Google Docs.
Personally, I like it better sometimes for having less features. MS Word has such a massive number of formatting features that interact in complex ways that there's plenty of ways for your document to end up formatted in a weird way and to be very difficult to figure out exactly where the switch is to make it not do something. I think one time I had a document where the entire doc was highlighted in yellow, and it took me over an hour of fiddling with various formatting boxes to figure out how to turn it off. Any word processor that doesn't have the capability to do that has some appeal to me.
I haven't seen a word processing document in a professional setting for many years now (didn't realize it until just now). Who uses a word processor these days? Writers certainly don't use that garbage.
I use text editors so I can think about the content and if it is going to get prettied up with fonts it goes into a target system that supports markdown (confluence, git, email, etc..). If you are flummoxing around in a word processor or sending around formatted docs that aren't PDF I fully expect people to be looking at you sideways.
I hate to inform you that, yes, writers do indeed use “that garbage”. I’m married to an author who regularly uses Scrivener to write. But anytime she has to send anything to anyone she has to convert to a Word document and send that out. Everyone uses Word that she interacts with. (Though author friends of hers might also use Scrivener for their writing)
Writers who understand git, let alone Markdown, are going to be extremely rare. You’re in a bubble if you haven’t encountered how dependent the writing field is on Word documents.
Unfortunately I do agree with this. I think a lot of tech isn't a matter of "what's the best?" but instead "what's the least bad?". I don't think Office is perfect but I think it's a lot less bad than google. I don't think MacOS is great but it's a lot better than windows for certain things, and vice versa. IMO unless software puts the user first in allowing customization and control, the best we can ever get is good instead of great.
I would recommend LibreOffice over OpenOffice, but yes (for both).
And you can of course back up to your cloud service of choice. The main benefit of Google Docs, O365, etc. is real-time collaboration. But there is no reason why a desktop app couldn't support real-time collaboration with a suitable backend service.
The only time I've ever seen real-time Google Docs collaboration has been during meetings which should have been an email. Total waste of everyone's time. Not to mention the horrible UX of people constantly moving their cursor around and moving text around. I'd suggest that pass-the-baton style collaboration would be a much better UX if you absolutely must collaborate real-time on creating a document. Which I find the premise to be incredibly dubious to begin with.
Even if actual realtime collaboration is rare, there are other collaboration features that are missing in most desktop equivalents, like getting notified of changes, being able to mention people in comments, etc. that I do see used quite a bit.
But my experience is that real-time collaboration is useful. In particular, immediately after emailing a doc to multiple people it is not at all unusual for more than one person to be actively looking at, commenting on, and maybe changing the document at the same time.
I have had the exact opposite experience—I've used Google Docs for 10 years now, and in every way it manages to exceed Microsoft Office in usability. You're right that Google Docs can sometimes feel a little barebones, but it makes up for it by being very easy and straightforward to use. In 10 years of using Google Docs, I can count on one hand—across probably tens of thousands of documents—the number of times I've been missing something so critical to my work that I've needed to use an Office product.
(That said, I'm really excited about the recent changes Microsoft is making for Excel, with LET and LAMBDA, and I look forward to trying it out again in the future. Maybe this is the thing that finally gets me to switch! I've also enjoyed doing some more ~fancy~ graphic design in Pages on Mac, but overall the clunkiness was just so frustrating that I can't in good faith recommend it to anyone)
I prefer LibreOffice over Open Office, but I believe both are cross-platform (Linux, Windows, macOS). Then, I'd just use Dropbox or similar to save the files to for cloud storage. The only downside is no real-time collaboration. You can also look into Collabora, but I don't have any experience with it.
If you don't require Linux support or if the web is tolerable for Linux, I personally recommend the Microsoft Office suite. There's the obvious compatibility concern because nearly everyone uses those, they have real-time collaboration built in for both desktop and the web, comes with OneDrive storage, and will obviously be extremely future-proof. I cannot recall a single time any of the apps have crashed on me on both Windows and macOS, so I think it's pretty "durable".
IMHO HTML documents backed by a versioning system (probably fossil or pijul rather than the overly complex git) are the way forward for documents where content is much more important than presentation.
While “text in a VCS” is a great option, it’s obviously far less usable than something like Google Docs, and you still don’t get real-time collaboration, which can be really nice.
Yeah... I'm wondering though, Fossil is based on SQLite - a database - and databases are designed to solve the issues arising when multiple users try to change the same data. (Also, fossil by default works in "autosync" mode.) So it should be "easy(er)" to make a real-time collaboration tool based on Fossil ?
P.S.: By researching this, I've stumbled on a (barebones) alternative to Google Docs : HackMD/CodiMD/HedgeDoc :
https://demo.hedgedoc.org/
The best approach for a desktop first cloud-backed solution is possibly to have a VDI with Windows (on AWS for example), and use Microsoft Remote Desktop from your preferred physical computer to access it.
I have multiple desktop Macs in my various homes but I only use them for web browsing and RDP to the same Windows VDI.
A free OneDrive account is enough, plus Office 2016+'s autosave function, with the added bonus of a cloud version of Word for editing your document collaboratively on the go.
It was indeed a very strong marketing move over... decades to convince people, even smart people, that document editing can be a web-based thing. Actually, now that the browser is so ubiquitous that GUIs sit on top of it (think Electron), it is time to ask the very obvious question: since everyone seems to agree that a universal GUI is needed (proof: the browser), is the browser the right universal GUI?
Not being heavily biased by any vendor, but really, is there anything better than XAML to describe user interfaces, that is also cross-platform and does not have the burden of DOM? Please - share examples.
Absolutely not; but the web has become the behemoth it is through an absurd amount of money and engineering work. Chrome (well, Chromium) has 34 million lines of code now[1].
If we assume any competing universal GUI platform will need a similar amount of engineering effort, there's a very small list of companies in the world who have the resources to fund an effort like that. And Apple, Microsoft and Facebook have very little strategic incentive to care. (React Native notwithstanding). Google is trying with Flutter - but we'll see.
I wonder if maybe the right direction is up. WASM is already supported by all major browser engines. I'd love to see a lower-level layout & rendering API for the browser, exposed to WASM. We could do to the DOM what Vulkan did to OpenGL. And like OpenGL, if it were designed right, you should be able to reimplement the DOM on top of it in (native WASM) library code.
Then the universal GUI of the future could be the gutted out shell of a web browser (we'd just need wasm + the low level layout engine), running libraries for whatever UI framework you want to use, written in any language you like. A UI environment like that would be small, portable and fast.
That smells suspiciously like the Linux desktop story. There was X, a minimal windowing system. Then there were dozens of desktop environments built on that… there was almost no way to have a consistent experience for a really, really long time.
Yeah, but the web isn’t very consistent already. The main set of common elements are buttons, links, form elements and scroll bars. Just about everything else is done custom on every webpage you visit.
I don’t think we should get rid of the common UI elements (if anything we need more of them & better APIs for them). But what Google docs, and flutter seem to really want is a simpler, more primitive way to create a layout out of those UI elements. Buttons and scrollbars are great. We need something more primitive than the DOM and CSS. Houdini is a solid start here.
Well, it’s clearly what the Google docs team wants. And it would yield higher performance for other similarly complex web apps (eg Figma). And allow native UI development in more languages (Blazor). It also looks to be the sort of thing the Flutter team want for web builds. And it could work well for the base system of chromeOS too.
For whatever reason, Google invests hundreds of millions each year into chrome, and trusts their engineers’ leadership on how to make it succeed. The question in my mind is if browser engineers themselves decide to push in this direction.
Chrome has been pushing Houdini [1] for years. It doesn't have special WASM integration right now AFAICT but it is basically a lower level layout & rendering API for the browser.
I've looked at Houdini again and I'm not convinced.
First, because it's more like OpenGL 3 (add more powerful APIs) than Vulkan (clean room design).
Second, it seems mostly abandoned. The page you cited lists multiple sub-proposals that have "No signal" even from the Chrome team. All mentions of Houdini I can find on developers.google.com are from 2018. I can't find anything about Houdini integration with WebAssembly, which is what I'd expect if development was ongoing.
Overall, I'm seeing everything I would expect to see in the timeline where Mozilla has no intention of ever implementing Houdini, and Google has decided it's not worth pursuing beyond what's already implemented.
The killer feature of Google Docs is the real-time collaboration. People willingly gave up a lot of editing and layout functionality to get that. It was so much better than sending drafts of documents back and forth in email.
I feel the need to argue that the browser is not the browser engine. An app sitting in a chrome tab is significantly different than an app built on electron, they just share some rendering code paths.
Electron apps have shown that you can use a browser's rendering engine to make high quality apps distributed on multiple platforms. They also have the benefit of persistence, filesystem access, hooks into native code should you need them (not WASM - mind you), you can implement true multithreading and explicit SIMD optimizations. You don't have memory limitations, and you don't have to worry about browser sandboxing, malicious or well intentioned extensions that break the experience, etc.
The browser is not the same platform as electron. I would guess that Google Docs would function much better in electron than on the web.
> An app sitting in a chrome tab is significantly different than an app built on electron, they just share some rendering code paths.
That isn't really true, Electron is basically a thin veneer over the Chrome browser, with NodeJS tacked on the side. Just take a look at the source code.
> Electron apps have shown that you can use a browser's rendering engine to make high quality apps distributed on multiple platforms.
Electron has shown that you can use a re-skinned browser and NodeJS to ship applications on all platforms capable of running Chrome. That ranges somewhere between "acceptable tradeoff" and "absolute overkill", depending on the application.
> You don't have memory limitations, and you don't have to worry about browser sandboxing, malicious or well intentioned extensions that break the experience, etc.
You still do have almost all of the limitations of a web browser in your rendering code, and you have none of the features of the web browser outside of it. The bridge between the two is inefficient.
Yeah, I'm wondering why Google isn't building a desktop version of their office apps in Electron. I can practically hear the collective sigh of relief upon those landing in users' laps.
> It was indeed a very strong marketing move for... decades to convince people, like smart people, that document editing can be a web-based thing.
I think this is overly reductive. There was a technical problem driving some of this; namely - document collaboration sucked (to some degree still does).
Moving documents online was a tradeoff - making the editor web-based solves a bunch of problems but causes some other ones; desktop-based, cloud-backed editing didn't exist (not that it's perfect now) at a time when you could already get useful collaboration done with web-based editors.
I'm not saying this was the only thing going on, but reducing it to just "marketing" misses the mark, I think.
The way that word processors are designed, essentially as very smart linked lists of objects, would've actually allowed for document collaboration very early on. We can speculate about dozens of reasons why this did not happen, but I guess it was for strategic reasons. But it will happen, and is happening.
That is about right. IMHO the desktop office processor is far from dead; actually, I would imagine a comeback of desktop UIs because they are so much easier to get right, especially when you have complex forms (which all business software has) or custom GUIs (such as those in software like Blender, Photoshop, Lightroom, etc.).
The question is whether people really needed the collaboration feature so much, or as much as it was praised for decades... when it shows that source code (which IS one very important kind of content) is being developed not collaboratively in real-time in the browser, but with the aid of various version control systems (CVS, SVN, Git, etc.) that are neither real-time nor collaborative in the sense that Google Docs is.
So the whole collaboration thing is fun to have, a great thing to demo, but perhaps not the killer feature.
The question is whether other features were more important and thus got implemented in the office packages instead - such as enterprise integration capabilities and a very powerful, well-crafted WYSIWYG experience that is only possible with a custom-built engine.
Let's be honest - the most complex apps typically running on an average desktop OS are the browser and the word/spreadsheet processor. Back in the day the browser was not a VM and was not that complex. And as OpenOffice showed, this is not very easy to get right. As WPS Office (the Chinese office suite) showed, even if the presentation layer is fast/correct, it is not really that easy to (originally) come up with it or to integrate it with other enterprise services.
One may wonder whether MS Office was created to run best on Windows, or whether Windows was made to let MS Office - and the integration of all this mandatory software that constitutes the modern enterprise - run well... (again, trying to be as unbiased as possible)
> The question is whether people really needed the collaboration feature so much, or as much as it was praised for decades... when it shows that source code (which IS one very important kind of content) is being developed not collaboratively in real-time in the browser, but with the aid of various version control systems (CVS, SVN, Git, etc.)
This is a good point. I don't think realtime collaboration is so important, but multiple author collaboration is. And "track changes" is a sort-of good-enough solution, but painful.
I've had good luck collaborating on documents (research papers) using LaTeX and source control, but that assumes (a) participants are comfortable with both and (b) the storage format is amenable to revision control. Most word processing doesn't work well like this because you can get the document into a broken state in ways that are hard to recover from, and many of the users have no mental workflow map for "source control".
TeX/LaTeX or org-mode/Markdown-type approaches have an advantage here for complicated collaboration.
These days a lot of collaborative stuff is being done outside of spreadsheets and word processing docs; the lines are blurrier and the collaboration is broader. In the "old days" a wiki might have done the trick for this, but people want richer environments too. Not sure what the answer really is.
Microsoft Word and Pages have both had web apps for years that are 'bottlenecked by rendering performance' (I would put it as 'clearly would be improved by better rendering performance', as you're noting).
Google Docs is worth it for the collaboration, but if you are writing for yourself, or writing anything serious, it is simply not good enough - though I don't think performance is the issue.
This mode of argument seems odd to me. Google is announcing a solution to the problems they were having with the platform. Wouldn't the criticism "Perhaps the browser is the wrong platform for document editing" only be appropriate if Google was complaining that they have been unable to fix the problems?
The fact that, while developing for a given platform, you can encounter problems and fix them, doesn't seem to imply that there's something wrong with your choice of platform.
The browser is the wrong platform for anything that isn't an HTML document, and not only for performance reasons, but perhaps much more importantly : for interface reasons.
For instance : in your typical windowed program, when you press "Alt", it's supposed to show the Menu, which you can then quickly navigate using keyboard shortcuts. You can't do that properly inside the browser because it's going to conflict with the browser's own Alt-Menu.
Based on inspecting the DOM of the read-only preview document they link to, my guess is that they will be using traditional DOM elements for much of the editing UI. There appear to be many empty DOM elements that are there to hold various toolbars and other UI elements. And for what it's worth, there seem to be empty DOM elements intended to be read by screen readers.
I hope so. Just one example, but when you use the API to export HTML, nested lists aren't actually nested.. they just inject increasing padding on subsequent LI tags. This is ridiculous and causes big issues for me, but I'm sure they had to do it for formatting purposes. So hopefully they can give us semantic HTML now that it's not coupled to the editor.
> For those more deep in web technology, I'd like to know if there are reasons to move to canvas for strictly technical merits.
Performance. My company switched from DOM to canvas (and then to WebGL) for a document-centric app a long time ago for performance reasons. Drawing to a canvas is much faster than updating the DOM. You also get better control over how it displays. With the DOM you have to worry a lot more about differences in how different browsers render the same markup, although that is less of an issue than it used to be.
There are downsides too though. Besides making it much more difficult for extensions to modify things, you also have to build your own spell check, because there isn't a browser API for that. However, I think google docs was already using google's own spellchecker.
In this case, I think the switch is most likely entirely based on technical merits, rather than some way of asserting more control.
So with that in mind, the fact that the team behind one of Google's most interactive pieces of software has to throw up their hands and say "DOM is too slow, we gotta roll our own" should be a wakeup call for everyone working on Chrome and other browsers, but mostly for Google itself.
When you escape the DOM, you're going to be doing pretty much everything yourself. And for someone like Google, that might be worth the absolutely insane amount of effort, but what about everyone else? You're Google, Chrome has 60%+ market share. Why isn't the plan here to systematically start improving DOM performance, or create APIs to more directly modify how elements are laid out and created? Why do all of this work to benefit only Google Docs?
We've had years (decades!) of articles and talk about how the DOM is slow (including a bunch from Google), so why not improve it? Why give up and waste all this time on a custom solution? Why not create something that is *actually* capable of handling the complexity of modern, highly interactive applications, including Google's own products?
You can say it's Flutter, but that's yet another effort to escape the DOM, rather than actually improve it.
Maybe this has been the plan behind the Google Docs team, to push people on the browser side and other Google teams to start seriously looking at what to do with the DOM, if so, I hope this actually has the intended effect. We all deserve a better, more performant web.
DOM performance has already improved leaps and bounds after millions of dollars of engineering and countless hours of effort. Same with Javascript. At some point you have to accept that the DOM has fundamental design flaws, its specification and requirements are the problem yet cannot be radically changed because of backward compatibility concerns. Browsers have spent decades optimizing everything possible, one should start by acknowledging that before tiredly trotting out magical performance improvements as the answer.
> Why isn't the plan here to systematically start improving DOM performance,
There are already many steps taken to improve DOM performance over these years. However, DOM is designed for documents. The performance can never be good enough when it is abused for non-document usage.
> or create APIs to more directly modify how elements are laid out and created? Why do all of this work to benefit only Google Docs?
Because other browser engines are unlikely to adopt these APIs just so that Google Docs can have better performance. Not to mention that these new APIs will take years to be present in every user's devices.
> There are already many steps taken to improve DOM performance over these years. However, DOM is designed for documents. The performance can never be good enough when it is abused for non-document usage.
JavaScript was originally designed for simple tweaks, but we've significantly expanded and improved the language over the years to adjust it for what it's used for *today*. I don't see why DOM is special. Sure it was designed to handle small, unchanging documents, but it's used for much more now, just the same as JavaScript. Also it's worth noting we're talking about Google Docs here, so it looks like DOM fails even at its intended use-case (I'm saying this *mostly* jokingly).
> Because other browser engines are unlikely to adopt these APIs just so that Google Docs can have better performance. Not to mention that these new APIs will take years to be present in every user's devices.
We wouldn't have fetch, canvas, async/await, PWAs, websockets, etc. if those things had to be available immediately and/or be guaranteed to be adopted. I'd rather it take years to get improvements, but eventually have them, than not doing anything and still be talking about how bad X, Y, or Z is 10 more years from now.
I'll take FLIP animations as a specific example. If I want a box to animate from one part of the page to another (where its position in the DOM hierarchy changes), we have to do all kinds of crazy gymnastics around when you read the DOM, when you write to it, how you update it, etc. And even then, you're unable to do this without JavaScript animations (if your box contains content and it happens to change size, we'd have to do a reverse scale animation on the content).
This is stuff that's trivial in iOS and Android, and commonly used. In the web land, we're stuck doing this poorly both from a development point of view, and with bad performance, resulting in poor end user experience.
The FLIP hack has been talked about for 6+ years [1], and yet here we still are, unable to simply move and animate a box from one place in the tree to another. Want nice drag and drop interactions? Good luck. Limited animations, or slow, and often both.
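For readers who haven't run into it, a rough sketch of the FLIP dance described above (First, Last, Invert, Play); the element and class names are placeholders:

```javascript
const box = document.querySelector('.box');

// First: record where the element currently is.
const first = box.getBoundingClientRect();

// Last: make the real DOM change (here, reparenting), then measure again.
document.querySelector('.new-parent').appendChild(box);
const last = box.getBoundingClientRect();

// Invert: transform the element back to where it used to be.
const dx = first.left - last.left;
const dy = first.top - last.top;
box.style.transform = `translate(${dx}px, ${dy}px)`;
box.getBoundingClientRect(); // force the inverted position to be committed

// Play: enable a transition and remove the transform, animating to the new spot.
box.style.transition = 'transform 200ms ease-out';
box.style.transform = '';
```

Note that both `getBoundingClientRect()` calls are forced synchronous layouts, which is exactly the read/write choreography being complained about.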
Why are we getting articles from Google about how it's bad to change the size of something on the screen [2], instead of seeing improvements to the underlying APIs that cause it to be slow in the first place? If a hacky JavaScript-based solution is able to make this performant, surely a native API would do better.
The DOM has to evolve to support interactive apps in a performant way, or risk being replaced by custom things like the canvas or WASM, that are not easy for machines to parse, that won't have nearly as much consideration for accessibility and extensibility. That aren't as easy to enforce good usage of, or share knowledge about. It should not be "DOM is slow, oh well", or "DOM is slow, lets drop it", or "DOM is slow, so lets build a JS scaffolding around it (VDOM)." It should be "DOM is slow. What are the contexts in which it matters most, and how can we improve the APIs such that it can natively do those things performantly and easier?" Be that better selection APIs, animation APIs, better ways to read/write to styles, the list is endless. The DOM is slow, but it does not have to be slow. We *choose* not to make significant improvements to it, and one can come up with plenty of reasons why or excuses for it.
My point is that we should choose to improve it, because the alternative will lead us down a worse path. Years of neglect has led us here, where Google, a browser vendor themselves, has to give up on DOM because it's bad. This is fundamentally messed up.
I would not consider myself super deep in web technology, but rendering to canvas allows programmers to have pixel perfect control over the look of their applications across all devices. Currently, web developers need to "reset" lots of default rendering behaviors in every major browser to ensure that their applications look the same.
After building lots of specialized UI components within the HTML standard, a programmer may ask themselves if they might as well write their own UI library. Specifically, lots of specialized applications have UI components which have no corresponding HTML standard. For example, in a spreadsheet, a cell may have a clickable triangle in its upper-right corner that should display a comment bubble. Should a programmer create that in CSS or write a specialized library?
Do your users care about your app looking the same or do they care about their browser looking and acting like a browser?
Moving to canvas is sure to break many features such as text selection, adblocking, and accessibility. All in the name of more control over the pixels? Are you truly doing it for the users?
Very good point, users do not care that a padding is rounded up or down when they switch from their laptop to their desktop, as long as the application is usable, understandable and visually competent.
But mindspace is much more important for interaction, so if their laptop is a Mac, their brain will be in "Mac mode", and "Linux" or "Windows" mode when on their other device. Respecting the platform's conventions will allow them to keep their cognitive load due to "fiddling" to a minimum.
I don't think the GP was talking about the app looking consistent with the rest of the platform, but the exact opposite: the app looking the same whatever the platform. Using Flutter allows them to have the app look the same on the browser as on mobile.
This means that at best it will look "Mac mode" for everyone, including Windows users; at worst it will look foreign to everyone.
My second paragraph was maybe a bit unclear, I meant that the user would expect platform conventions to be respected over application conventions. (especially if we consider platforms with different primary interactions)
Having a program look the same on a big screen with mouse+keyboard input as well as on a small touchscreen is a recipe for having a bad user experience on both.
I might be wrong, but I'm pretty sure Google Docs is already using a completely custom implementation of word wrapping, text layout, text selection, cursor placement, etc. It's not like it's just a <textarea /> with some CSS styles. Likewise, they seem to already have completely separate DOM elements that are invisible to normal browsers but can be read by screen readers. Based on the DOM on the new read-only preview document they link to, it looks like they will continue to use traditional DOM elements for some of the editing UI (just not the actual WYSIWYG editing area) and for screen readers.
Canvas still sucks for rendering fonts. Getting accurate font metrics is still hacky[1] and more advanced methods are still experimental[2]. Though I'm sure with Google Docs moving to canvas, some of this will be expedited. But it's yet another one of those things about web tech that anyone with half a brain would tell you should have been there from day one.
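Concretely, the metrics gap looks something like this: `measureText()` has always given you a width, while the bounding-box fields a text layout engine actually needs arrived much later and unevenly across browsers (a sketch, not a complete metrics implementation):

```javascript
const ctx = document.createElement('canvas').getContext('2d');
ctx.font = '16px serif';

const m = ctx.measureText('Typography');
console.log(m.width);                                  // reliably available everywhere
console.log(m.actualBoundingBoxAscent ?? 'missing');   // newer TextMetrics field
console.log(m.actualBoundingBoxDescent ?? 'missing');  // newer TextMetrics field

// The classic workaround is to render the string into an offscreen element or
// canvas and inspect offsetHeight / pixel data to recover ascent and descent.
```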
> [...] rendering to canvas allows programmers to have pixel perfect control over the look of their applications across all devices.
Canvas-based fingerprinting due to rendering differences is a thing, so using the canvas is not pixel-perfect either.
Creating a UI library atop that is a lot of work, though to be fair it is certainly manageable for Google. Remember, a UI library is not just about putting things on the screen, but about sanely defining layouts, interactions, accessibility...
> Specifically, lots of specialized applications have UI components which do not have a corresponding HTML standard
I think that is the entire point behind Web Components [0], and if one really, really doesn't want more DOM elements for their visuals, then the CSS Paint API, and in fact the whole Houdini initiative [1], should be pursued instead, at least at Google's scale.
Besides these technical points, interoperability should be considered. Web browsers do a lot of work to match user expectations in behaviour to their native operating systems, as well as web conventions. (The classic example infractions being: links that cannot be control-clicked or middle-clicked to open on new tabs because they are not actual links but elements with click handlers; or not being able to scroll with page up/down, arrows, or middle-click, because the page reimplements scrolling in an unsemantic way)
Making your own UI toolkit is bound to all those problems, and those are user-facing problems that will affect the often-ignored long tail of users with unconventional setups.
Look at Flutter for Web, for example [2], it definitely feels entirely different from a regular website, even if it were to look the same. The scrolling does not respect my system settings, interaction is limited to only the most-common method, image scaling is subtly different.
And as Google themselves observed, extensibility and the user agent should be considered before making such a decision, but it appears they consider them a liability, to the detriment of the user.
> Canvas-based fingerprinting due to rendering differences is a thing, so using the canvas is not pixel-perfect either
Luckily Firefox asks you if you want to use the HTML5 canvas API. You can specifically whitelist some pages to use the canvas API and stop it from running by default on all your other browsing. Also: you may want a dedicated browser just for Google's whole ecosystem so they can't track you across the web. I have a Chromebook just for that, which has its own unique canvas fingerprint completely separated from my main workstation PC's fingerprint.
> Luckily Firefox asks you if you want to use the HTML5 canvas API.
Really? Where? I know you can disable it in about:config, but not that it was asked of the user. I have used Nightly for years and never seen such a dialog.
Unless you did mean about:config, but I don't know how that works with a whitelist either.
Did you feel the same when Maps went from HTML to Canvas?
There is no factual basis for your claims. The simplest and most obvious answer is performance. Docs these days can take 5-10s to fully open, especially with comments and annotations.
I think the panic over this is a little unfounded. For Docs this makes perfect sense. My concern, I guess, is what the impacts will be if this becomes the normal way of doing development and people build HTML replacements that work using canvas.
* How many browser functions will break?
* Will middle click still work?
* Will screen readers still work?
* Will search engines/ctrl+f still work?
One of the awesome things about web browsers is you get so many features for free on every website. Making the text bigger on a website mostly just works everywhere while on traditional apps it only does if the app has a specific setting for it.
I don't think it'll become the normal way. If anything, the opposite is true, people making Electron apps to get to access the power of web development, instead of making native apps.
It's definitely not easy; it requires you to re-implement everything from scratch. It only makes sense in cases where performance is paramount. I could imagine applications being migrated to the web doing it though, like Photoshop and the like.
But I absolutely don't see normal web development migrating over, it's just way too much pain for little value. Development, debugging, testing, etc. Everything becomes much harder.
I'm assuming that we will eventually have powerful HTML like frameworks built on top of canvas. So for the end developer, its just as simple as using GTK or HTML.
No factual basis? Moving from DOM produced content to canvas produced content closes off your ability as a user to see what is happening.
As for Google Maps, it is not speedy or snappy by any stretch of the imagination. In fact, it is pretty abysmal in terms of page performance on the web.
It's not about the web being more or less open, it's about the browser playing the role of a distributed application run time. I'd argue that Google Docs is basically not a part of the web, it just incidentally happens to run in the same browser as the web does for logistic reasons.
Web apps have been around for well over a decade but some people are still struggling with the idea that a web browser can display hypertext documents and also run applications, plus a whole universe of hybrid things which lie in between these two extremes.
Not everything a browser displays has to fit in the "page" paradigm.
Web 2.0 enabling web apps was always a myth. There have been web "apps" since the 1990s with CGI. XMLHttpRequest merely allowed for the moving of some of that logic to the client side.
In that respect, we've been working around the "page" paradigm since the web was practically born. It was a flawed analogy because, even back then, screen sizes and display tech varied among users. Designers still approach web design as if they are designing for print. I've always maintained that if the web were based on a vector technology (think PostScript, but obviously not PostScript) we would be in a much better place both design-wise and accessibility-wise. Content would flow in a much more controlled manner with much less room for browser interpretation and second-guessing. But people were still clinging on to the write once run anywhere (ahem, Java) naivety of the day. And likewise they really thought that you could divorce presentation from semantics and... have something that just worked? I guess? Just sprinkle on some afterthought CSS tech crap and no one will ever notice that the entire thing is flawed at a fundamental level.
>Google wants more control over their stuff. The web is becoming less open.
That's because the old problem "web-document vs web-application" hasn't been solved properly. HTML was designed for documents. It wasn't designed for applications. No wonder as applications become more sophisticated they try to squeeze out HTML/DOM where possible.
The part I don’t understand is how in the world a renderer written in JavaScript can outperform their own Chrome C++ code. With Edge being a Chrome clone and Safari also being a performant browser, what are they worried about?
For specific apps, or parts of apps, yes. Doing less is how you make things faster, highly agreed. And sometimes canvas allows you to do that, and then your app is much faster.
The problem is that in many cases, moving to canvas eventually turns into having a UI framework that renders to canvas, which turns into a layer of abstractions that handle keyboard and mouse events for you, including stuff like hover, which means suddenly you're tracking element position on your raster surface and thinking about z-indexes and event bubbling to parent elements...
I think this is part of the reason why individual apps that start using canvas and that can genuinely cut down on complexity by doing so tend to be able to get real speed improvements, but app frameworks like Flutter tend to perform so poorly. Eventually your cross-platform GUI toolkit like Flutter ends up being just another browser engine written in WASM. And in that scenario your approach becomes strict downside.
One good example: the browser doesn't expose an accessibility engine other than the DOM. So what I see apps eventually end up doing is either writing their own accessibility engine that doesn't work with programs like JAWS, or rendering out to a hidden DOM. For something like a game, you can get away with that; maybe you don't even provide an accessibility layer at all. For a web component or a chart, a lot of your rendering might be unrelated to accessibility. But you get away with those kinds of shortcuts because it's a targeted, specific use. For a big UI toolkit, it's harder to do that, and then, surprise, suddenly you have all the overhead of updating a DOM tree plus the overhead of updating a canvas.
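A sketch of that "hidden DOM mirror" variant, with made-up structure; the point is that every canvas paint gets a DOM twin that only assistive technology sees:

```javascript
// Offscreen container kept in the accessibility tree (not display:none).
const a11yRoot = document.createElement('div');
a11yRoot.style.cssText =
  'position:absolute; width:1px; height:1px; overflow:hidden; clip:rect(0 0 0 0);';
document.body.appendChild(a11yRoot);

function renderParagraph(ctx, para) {
  // 1. Paint the text yourself on the canvas.
  ctx.font = para.font;
  ctx.fillText(para.text, para.x, para.y);

  // 2. Mirror it as a real element for screen readers.
  const p = document.createElement('p');
  p.textContent = para.text;
  a11yRoot.appendChild(p);
}
```

Which is exactly the "both overheads at once" situation: every change is applied twice, once to the canvas and once to the mirrored tree.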
When people talk about getting raw access to the graphics layer, I think it's important to understand there's a difference between apps that are genuinely reducing complexity vs the theoretical canvas-backed "universal web framework" that people sometimes talk about as just around the corner.
"Slapping some rectangles on a raster surface" can also be done in HTML/CSS, with the right options (set 'overflow' content to be clipped with no reflow).
VS Code feels performant enough, with its complex functionality IMO exceeding Google Docs, and yet I don't think it is using canvas. I believe it comes down to strategic design that avoids unnecessary layout and reflow events in the UI.
That said, the UI of VS Code (the desktop app) only needs to run in Chromium. And generally Google Docs could be a different enough beast that it can’t take advantage of the same tricks—hard to say from the outside.
> VS Code feels performant enough, with its complex functionality
VS Code has an entire dedicated team that only works on VS Code. They can spend resources on trying any trick in the book to make something performant. Whenever actual performance is required, well, they ditch DOM and go for canvas: https://code.visualstudio.com/blogs/2017/10/03/terminal-rend...
And while sufficiently complex, it actually displays significantly less complex information than required by a regular document that will have any number of fonts, layouts, inline images and tables, references to other documents, etc.
> I believe it comes down to strategic design that avoids unnecessary layout and reflow events in the UI.
Yup. And it's nearly impossible to do any amount of "strategic design" because if you as much as glance at a document, it will repaint and reflow: https://csstriggers.com
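The classic shape of the problem, as a sketch: interleaving reads and writes forces a synchronous reflow on nearly every iteration, while separating the phases lets layout be computed once:

```javascript
// Thrashing: each offsetHeight read after a style write forces a fresh layout.
for (const row of document.querySelectorAll('.row')) {
  row.style.width = row.offsetHeight + 'px';
}

// Batched: read everything first, then write.
const rows = [...document.querySelectorAll('.row')];
const heights = rows.map((r) => r.offsetHeight);                 // read phase
rows.forEach((r, i) => { r.style.width = heights[i] + 'px'; });  // write phase
```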
I stand corrected, seems like they’re using canvas for at least the integrated terminal and maybe (?) the editor. Makes all the more sense for Google Docs to follow suit. I’m not against apps moving entirely to canvas by the way, as long as the regular non-webapp sites don’t start doing this just because they can.
The editor is all DOM-based apart from the minimap. More pixels could definitely be pushed faster by rewriting it in canvas, but it would be quite the undertaking when you consider accessibility, backwards compatibility, Monaco extensibility, etc., with the end result just being an improved scrolling experience.
But VS Code still only has to support monospaced code plus some popups and sidebars, not a mix-and-match of fonts and all their variants in complex layouts, line heights, paragraph spacing, column layouts, images (including floats), etc., etc.
JavaScript is actually quite fast these days, especially if one has a compiler in the flow to narrow it to the set of operations that are known to be high-performance.
And Google would be paying a lot of that cost anyway if the DOM is the render target, because what they gain in the render algorithm being precompiled assembly they lose in the JavaScript layer pushing the wrong abstraction around to trigger all that C++ code.
Do I have Google's engineers to develop it and full control over the implementation of the JS engine?
Am I allowed to compile the JavaScript to assembly?
Because if yes to all of these, then in the abstract, as a thought experiment, I can create an implementation in the JavaScript language with machine code that is byte-for-byte compatible with Chrome written in C++. Step one is write a C++ compiler in JavaScript... ;)
... but more importantly, I don't know how the question is relevant to the question of whether a JavaScript implementation of render commands into a canvas might be faster than a JavaScript implementation of layout declarations that have to play a bunch of games to get desired results from a C++ renderer. The gains from C++ render performance start to get lost if the renderer is making a bunch of wrong guesses about what should be rendered and when.
That's kinda a weird question, since it would obviously depend on what is executing the JavaScript for the browser written entirely in JavaScript. Chrome doesn't ship C++ code to your machine, they compile the C++ to native code for your particular hardware and operating system (presumably with a great many differences and performance tweaks between each compilation target).
In general, a lot of DOM nodes can reduce performance, and with canvas you have much more control over what gets rendered and how. I would assume this is also nothing new or "special"; AFAIK Google Sheets has been using canvas under the hood for years, with some DOM for nicer UX.
"A lot of dom nodes" is not inherently problematic if you do not insist on twiddling them individually in a JS for-loop. Other than that, it's just HTML - and plain HTML/CSS rendering is blazing fast.
That seems like a moderately weak argument. DOM tables are slow, but if you know what you are doing the DOM is otherwise an insanely fast interface. You can get sub-nanosecond response speed from DOM calls in Firefox (faster than a billion operations per second).
It would be interesting to see the performance differences in numbers. The performance impact of a canvas based approach can be approximated from measuring the performance of heavy SVG animations on GPU load.
Raw calls to DOM mean literally nothing when you have to layout those nodes.
And yes, tables are slow. "If you know what you're doing" routinely becomes "let's reinvent virtual lists on an interface that doesn't have a single API to make this pleasant or performant in any conceivable way".
> For those more deep in web technology, I'd like to know if there are reasons to move to canvas for strictly technical merits.
We’ll see the results, but let’s be honest - the web is a bad fit for apps. It was never designed for them, and it has tons of hacks and layers to make them possible, which makes them messy and slow.
I bet it's a feeling of, "I used to be able to right click -> view source on any web page and see source code I could understand and learn from. I can't do that anymore, which makes it less open."
Which has some truth to it, but it applies almost equally to the current HTML-rendered Google Docs I'd bet.
The DOM gives a developer very little control over when something should be repainted (or how repainting occurs) relative to the compositing options available when controlling one's own canvas. Moving the rendering engine to canvas allows the docs team more control over the optimizations of layout and rendering they can do (especially cross-platform; there are a hundred hundred mutually-incompatible bugs and quirks in Firefox, Chrome, IE, etc.'s layout and content rendering algorithms that make cross-platform high-performance very hard to guarantee at the DOM layer of abstraction).
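As a sketch of the kind of control that means in practice (the `ctx` and `drawRegion` below are hypothetical placeholders): the application decides what is dirty and repaints exactly once per frame, rather than letting the engine decide when and what to invalidate:

```javascript
let dirty = [];

function invalidate(rect) {
  dirty.push(rect);
  if (dirty.length === 1) requestAnimationFrame(paint); // schedule one repaint per frame
}

function paint() {
  const regions = dirty;
  dirty = [];
  for (const r of regions) {
    ctx.clearRect(r.x, r.y, r.w, r.h); // ctx: your canvas 2D context
    drawRegion(r);                     // hypothetical: redraw just this rectangle
  }
}
```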
It is not unreasonable to request that devs building on top of your platform use an API. They also aren't requiring it, they are simply suggesting it to avoid future breakages when they change the internals of how their application works.
Requiring that they maintain this compatibility would be like requiring the maintainer of an OS library to maintain the contract of a private method because my app relies on grepping their code base to parse the contents of the method.
When you don't have a defined set of public interactions with your app, every change is a breaking change.
I did a small project where I drew a large animated graph with SVG. Despite careful optimization, it ran at a mediocre frame rate and kept spinning up my fans. I know about another project (whiteboard with notes) that ran into the same issues.
PowerPoint, Miro and Figma all run in the Canvas. I don't blame Google for doing the same.
> this reads like Google wants more control over their stuff. The web is becoming less open.
Absolutely agreed. I'd also be surprised if they don't try to roll out the same for search results, ostensibly for the purpose of improving performance, but actually to thwart ad-blockers.
The only thing that's kept everyone from doing that, so far, is accessibility. If not for that, all the major ad platform companies (FB, Twitter, Google, and so on) would already be all-in on Canvas.
Accessibility concerns make it both expensive to develop a UI that renders to canvas and necessary to ensure the content can still be processed and understood by a program (else how will assistive tools read it?), which opens the door to ad blockers again, defeating the purpose of the whole exercise.
We literally have blind people to thank for the Web remaining as open as it has, for this long.
Only that it's a really obvious move for them, for a few reasons (including that making it the norm for stopping ad blockers and trackers forces all their would-be competitors to play catch-up on basic taken-for-granted stuff like displaying text on a screen) and accessibility is the only thing I'm aware of that attacks both the difficulty of the project and the feasibility of the desired outcome, sufficiently to explain why they've not at least given it a shot.
I worry about the same thing, but in Google's defense, docs is a wysiwyg document editor, not a webpage for displaying info. It's meant to help users create and edit documents. It has different needs than HTML.
Replying to myself though, haha. I use rikaikun, an extension that adds popup translations to Japanese words. It obviously won't work on Google Docs drawn in canvas, which sucks if someone posts a link to a Google Docs doc. It will also hurt Braille support, etc. Maybe they'll need a "fallback to HTML" button.
I also wonder how they'll support CJK IMEs. Generally when sites try to do their own input, the languages that need an IME get second-class support. You can see an example by looking at the Qt WASM examples, which draw the entire app in canvas using WebGL. They don't support anything but English.
I have a running theory - Google doesn't create a product unless it captures data in a unique manner compared to their other products. Maybe they have moved past this phase. Maybe I just don't see the collection happening here.
I think this is actually one of the few cases where canvas-based rendering makes sense as a replacement for the DOM.
Docs is a full-featured application which already has to re-implement a lot of DOM-like features in order to fulfill its primary function (things like text formatting and layout, spell checking, etc). Doing it all in canvas doesn't sound significantly more difficult than what they're already doing.
The loss of extensibility is regrettable, but probably worth the trade-off for better performance.
If this starts happening with more traditional websites, then yes I'd agree that'd be a bad thing. We're certainly not there yet though.
This is a great example of a comment written by a developer or some otherwise technical person who is so used to thinking and speaking in terms of trees that they can't see the forest, let alone the village it's situated next to and the people inhabiting it.
Google Docs may be an app, but a big part of the app, and confirmed to be one of the reasons for the migration here, is the part that renders the document to the screen. Only the sorts of techno-fetishists found on this site and in programmer circles would be able to make the argument you just have and not recognize the perverseness of what you're saying and what this decision by Google really means.
Documents are the quintessential use case for the Web, full stop, and they're the direct object of whatever the verb form of "Google Docs" would be. Google's decision here reflects the belief that, despite this, the Web is not suited for documents. We're not talking here about the mere act of the "app" chrome at the edges (surrounding the edited document itself) being switched over to use some more perfect framework better suited for e.g. painting interactive widgets, etc. No, it's right there in the announcement: "we’ll be migrating the underlying technical implementation of Docs from the current HTML-based rendering approach to a canvas-based approach".
Google is completely dropping the ball here and leading us down the path towards the picture painted in the top comment; what this is is just shy of an abrogation of their responsibility to act as a steward for the Web and the intention that it best serve users (rather than overpaid frontend developers trafficking in flavour-of-the-month fads, frameworks, etc and who already disproportionately receive attention under the status quo).
If the Docs team has identified deficiencies in the underlying standards-based model for—let's repeat it—presenting documents, then they are perfectly situated for translating that into feedback about how to improve things so that the Web is better fit for that use case. Even if that meant the Docs team going off into a corner, identifying what the problems with the Web are down to its fundamentals (DOM, etc), and then emerging with an entirely new approach for how to lay down bits so they might be better interpreted by the viewer running on the end user's computer, and then get the Chrome team to bake native support into Blink while disregarding every other vendor's possible objection to this act of steamrolling the standards process, then that would still be better in the long-term than what Google is doing here.
"Displaying documents" and "displaying editable documents" are two completely different beasts. The web browser has never dealt well with displaying editable documents, the closest standard that exists is contentEditable and pretty much everyone agrees that it sucks and is not fit for complex use cases.
Why can't Google work on improving `contenteditable`? It would benefit so many, and the problems are well-known. It's probably even used by Google somewhere.
I've heard there was actually a vocal push from Chrome for Docs to use contenteditable and even with motivated Chrome engineers they couldn't make it competitive. They basically could never fix it for Safari.
You could easily burn five years trying to fix `contenteditable` and get nowhere. There's too much legacy and impedance mismatch. Better to start fresh at this point.
Ok, and then the new version is only supported by chrome. So they still need to keep their old rendering engine up to date. Will the other big browsers ever go along with supporting it? Or will that work be lamented as google trying to own the web and just updating standards that benefit their bottom line?
All web pages are already editable, via JavaScript. I'm not sure the distinction you are making is meaningful. The only thing we are missing is a good editing UI.
> Documents are the quintessential use case for the Web, full stop, and they're the direct object of whatever the verb form of "Google Docs" would be.
You seem to be basing your whole point on the misplaced idea that the web is great at displaying, nay was specifically made to display Google Docs' class of content. This is incorrect. At its core, Google Docs is a typesetting engine with semantics and features that don't align well with HTML+CSS. Typesetting is about graphics, the web (and HTML) is about content. They're very different use cases.
> [...] what this is is just shy of an abrogation of their responsibility to act as a steward for the Web
Google has no obligation, either moral or legal, to be a technical leader. They're simply a market player. The only way in which I care about this change is how it affects me as a user and it's nowhere near as catastrophic as you make it sound imho.
> Google's decision here reflects the belief that, despite this, the Web is not suited for documents.
I don’t see how you jumped to that conclusion. They’re not changing the way documents are stored, they’re only changing how documents are rendered.
Who’s relying on the render API? What functionality do you think you’re losing, as a web user?
The web of HTML+CSS is good for some kinds of documents - static documents, but it has never been good at editing documents, and it has never been good at all documents, and it has never been the best platform for high performance applications like editors or games.
> this is just shy of an abrogation of their responsibility to act as a steward for the Web and the intention that it best serve users
Ignoring the problems with your assumption that Google should act as a steward for today’s Web (Google’s mission statement doesn’t mention preserving HTML. And it’s better if public, not private for-profit entities are our stewards) -- this decision is being made to best serve users, no? As a user, I want Docs to render faster, don’t you?
I’m not quite seeing your reasoning why this affects the long term health of the Web. I can’t help but note that many large web apps have transitioned to canvas rendering for performance, and the internet is still growing. There is, in fact, a problem with rendering HTML+CSS in a performant way when editing things, and it might be too late, and there might be too much legacy to fix it... maybe. It’s still pretty good at what it does, and not likely to go away.
If that's your point of view then no, the web has never been suited to spreadsheets, it was only suited to PowerPoint-like presentations when Opera had some nice markup for that functionality, and it is only suited for Word-like documents, with ugly hacks for equations.
In the end, you end up making Google's argument just stronger.
Google Docs produces printable documents, not hypertext. Print was never a priority of HTML, and I guess it shouldn't be either. Just look at the complexities of DocBook, LaTeX, or Office Open XML.
> Documents are the quintessential use case for the Web, full stop
In 1996, yes. And you can still use the Web that way. Nobody's stopping you. Much of the Web is still used that way, and that's not going to change.
But in 2021, the web serves documents, and applications. And HTML/CSS was never intended as a way to build applications.
It's why people came up with Flash, Java Applets, ActiveX, and all that other shit. And the web today wouldn't be one whit better off if the vendors of those technologies took your advice, and rolled their functionality into the core web. In fact, it would be in many, many ways worse off.
This is not a response to the point being made at the point where the quoted text originates, even though superficially it looks like it is. Please re-read the comment you're responding to.
Actually, it responds to the entire[1] train of thought that follows that statement. Please re-read what I am saying, and consider it in context. You are suggesting that Google work on updating the standards to make their desired use case work better. I am suggesting that Google's desired use case will never be served by document-oriented standards, and that is why it shouldn't mess with those standards.
Again, the world would not be better off if ActiveX were rolled into web standards back in the oughts.
Leave document standards responsible for displaying documents, and don't let the needs of application people get in the way of that.
[1] Well, I don't respond to your point that Google Docs is intended to display documents - because, as another commenter points out, it's incredibly obvious that its most important function is to edit documents.
Thank you for articulating this.
For the last decade we have had frontend developers trying to force the web to be an app delivery platform. We have bent over backwards to accommodate this. We have turned browsers into a giant pile of hacks essentially emulating their own operating systems. And for what? As soon as they're able, they say oh, the browser is a giant bundle of hacks now, let's implement our own app delivery system on top of it.
Next time we get a document delivery format and people start trying to make it something it's not, because oh gosh it's easier to install the app, we need to push back and say hey, maybe clicking a few buttons in an installer isn't the worst thing in the world.
So, because you don't want binary blobs to be deployed over the web, you would like users to download and install binary blob native applications... And that makes things better in what way, for whom, exactly?
Let's start with accessibility, because this is a field I am forced to be an expert in. Native "binary blobs" have access to the operating system accessibility APIs, and consequently any native app is a lot more likely to be accessible out of the box than something rendered to a canvas. People have spent literally decades building out APIs for the native platforms [0] [1] [2].
To think that a native app is equivalent to a canvas-rendered app is to not understand either.
It's viable to make accessible apps in the DOM, though - it's just canvas apps that can't be made accessible, and I agree with the GP that Google should not be trying to circumvent the DOM for the Docs interface.
The example that Google shows in their post has accessibility features [0], although currently these need to be switched on with a keyboard shortcut (⌘+Option+Z).
Pay attention to `document.getElementById('docs-aria-speakable')`.
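The general pattern (and this is only a guess at what that element is for, not Docs' actual code) is an offscreen live region that mirrors in plain text whatever the canvas is currently showing, so assistive technology has something real to announce:

    // Generic aria-live mirror for a canvas-rendered UI; names and styling invented.
    const speakable = document.createElement('div');
    speakable.id = 'docs-aria-speakable';        // id mentioned in the parent comment
    speakable.setAttribute('aria-live', 'polite');
    speakable.style.position = 'absolute';
    speakable.style.left = '-10000px';           // visually hidden, still exposed to screen readers
    document.body.appendChild(speakable);

    function announce(text) {
      speakable.textContent = text;              // e.g. the paragraph under the cursor
    }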
For starters, a binary blob on your computer can be prevented from "phoning home" to collect personal data much more effectively than a web app can. Additionally, the vendor is often forced to compete with their own old software because they can't always force you to run the newest binary blob. This prevents the wholesale loss of features from products you rely on at the whim of the vendor.
> For starters, a binary blob on your computer can be prevented from "phoning home" to collect personal data much more effectively than a web app can.
In theory, it can be. In practice, the average user doesn't do a damn thing to do so, and installing a random native blob from the internet is far more dangerous than a webapp running in a browser sandbox.
> Additionally, the vendor is often forced to compete with their own old software because they can't always force you to run the newest binary blob.
This is also how we get people running years-old software instead of deploying security patches.
I understand that a power user may prefer these trade-offs. Most users are not power users.
Surely, you realize that Docs needs to be connected to a server in order to perform its basic use case of collaborative editing... right?
You must also realize that an arbitrary native blob can and will happily connect out to whatever server its authors want it to.
I can't say that 'there are a few techies who aren't happy that Google doesn't ship docs as a native application' is a very compelling reason for Google to try to change core web standards.
You know what I'm tired of? Slow apps. Google Docs is a great platform but performance has always been slow compared to native Office. I welcome this change. Draw the document quickly and make me more productive.
We are working on a full-featured Google Docs alternative - https://writer.zoho.com . We are already leveraging canvas technology to render documents across devices, except on the web, which still uses the DOM.
Having a single codebase that renders across all platforms is our long-term goal, and that does require rendering to a cross-platform canvas backend - like Skia.
I'm assuming Google Docs is already on that direction as well. Would be great if someone from Google can clarify the technical internals.
I love Zoho's suite of free tools. When I first got started in the web development business 10+ years ago, I used their invoicing tool and thought it was perfect for a freelancer like I was at the time.
How do you deal with the fact that fonts render differently on different platforms and it might look off on one of them, because the user isn’t used to the Canvas font rendering? Or is this not an issue?
I do like seeing the new Writer interface consistent across multiple applications, and it's very intuitive to use.
I do find this interface slow to load even semi-large documents, especially when large tables are involved. That and a few formatting/editing quirks are the only real complaints I have. Otherwise I find Writer a great alternative.
I consider neither Zoho nor Google Docs to be vendor lock-in since you can always download the documents as docx and continue working on them locally.
I'm gonna give Zoho Writer a spin since Google Docs is missing some vital features for me, most notably the option to have custom page designs (font/colors/footers) that you can centrally update and apply to a bunch of documents. If anyone has another Google Docs alternative, I'm all ears.
Vendor lock-in is an inability to use another vendor without substantial switching costs. [1]
Switching from Google Docs to .docx has significant costs, because their compatibility with MS Office is far from ideal. That said, using .docx is yet another form of vendor lock-in. As you might know, OOXML is not a true open standard: it was recognized as such by ISO only due to Microsoft's shenanigans. That's why it is not fully supported by any other software, and trying to use alternative applications results in numerous compatibility issues.
So your best option is to use LibreOffice, because it is libre software, capable, available from several vendors, and based on a true open standard, OpenDocument.
We develop an app that has end-to-end encryption. In my experience, people who demand it rarely understand what it means and how many inconveniences true E2EE introduces to a product.
Anyway, if you are seriously worried about the privacy of your documents, just use a locally installed LibreOffice and send encrypted files to your contacts.
(googler, opinions are my own. I know nothing about this project).
I really hope this fixes the large-document problem. As someone that has to deal with large specs (1000+ page MS Word documents), Google Docs does a not-so-great job of handling them. I also understand this is why many writers don't like Google Docs to write the entire book in, as things start slowing down when you get into the hundreds-of-pages. I don't know if this is a javascript limit issue or a rendering issue, but if it's rendering, Canvas should hopefully help.
Switching to canvas sounds like a very roundabout way to handle an issue related to the DOM growing out of hand. Surely unmounting pages that you're not currently looking at should work?
Of course there could be many reasons but we're left to speculate...
You can ditch the off-screen rendered pages but you are still stuck with the entire 1000-page data structure. Should your current line be on page 997 or 998? You have to calculate the size of every line of every page to answer that.
Efficient 3-dimensional position or range queries (line/column/page) are a pretty well-studied problem. You don't need to query every point of the space to answer anything.
It's a task more commonly found in computational geometry (3D range query == find all the points in a data set enclosed by a cube).
There are numerous data structures that are well suited for various geometric queries like ranges/lookups (interval trees, quadtrees) as well as more text-oriented operations like cut/copy/paste/insert/merge/etc (like ropes).
I'm not familiar with the operations required to put a cursor at the right place in a document, but knowing how much research has gone into storing similar data and looking up what you need efficiently the idea of "going through all the text every time" is a big code smell.
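As a hedged illustration of what those data structures buy you (nothing here is from Docs; the page height and the API names are made up), a Fenwick tree over per-line heights answers "how tall is everything above line i?" in O(log n), so finding the page for a given line never requires re-summing the whole document on every keystroke:

    // Fenwick (binary indexed) tree over per-line heights.
    class HeightIndex {
      constructor(n) { this.n = n; this.tree = new Float64Array(n + 1); }
      update(i, delta) {                 // line i (0-based) grew/shrank by delta px
        for (let j = i + 1; j <= this.n; j += j & -j) this.tree[j] += delta;
      }
      prefix(i) {                        // total height of lines [0, i)
        let sum = 0;
        for (let j = i; j > 0; j -= j & -j) sum += this.tree[j];
        return sum;
      }
    }

    const PAGE_HEIGHT = 1056;                  // assumed page height in px
    const heights = new HeightIndex(100000);   // a 100k-line document
    heights.update(42, 18);                    // record each line's height as it is laid out
    // Which page does line 80000 start on? One O(log n) query, no full rescan.
    const page = Math.floor(heights.prefix(80000) / PAGE_HEIGHT);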
But I once read that most performance problems come from all the changes that are cached for a document, and that you can fix the problem by copying the text of an old document into a new one.
This doesn't let me hope for huge performance gains.
I've looked at some of the public spec documents (specifically the windowing system for Flutter desktop) that Google has hosted in GDocs, and that was also the first time I noticed performance degradation. If I had to guess, I'd say there's no better way around this performance barrier.
The Web started as a document display and delivery framework.
Now it has turned into an application delivery framework, which uses the old document-rendering abilities to display the UI, with many quirks and workarounds, because DOM was never intended as a performant dynamic medium.
With canvas and WebGL taking more and more, it will turn back into an X terminal, with more advanced network capabilities.
If most important sites switch to canvas rendering, what used to be the browser can be radically simplified and made lightweight and much faster. The 'legacy' full-web sites can be shown using a plugin that does all the fancy and heavyweight HTML 5 and CSS 3 stuff.
Also, with webassembly running JITted platform-independent code delivered over the network, and displaying it in a platform-independent graphical client (née browser), the promise of Java from 30 years ago will finally come true — sans Java proper, though.
If a developer wants to “remove all user choice” on the web it’s already very easy for them to do that. The only new concern here is that developers will be tempted to use tools/frameworks for unrelated reasons (maybe convenience, cross-platform support, etc.) and those tools/frameworks will just happen to also remove user choice.
Except it's not that easy. Web developers have been trying to remove user choice all the time - disabling right-click menus, preventing copying on websites, forcing the page to look blank unless you turn JavaScript on - and yet it's fairly easy to get around most of this using extensions.
That's because how websites display information in the browser is pretty standardised. Before the Canvas element, your choice was basically HTML/CSS or nothing (unless you did something incredibly strange like rendering output to a data:image/png URL and updating an image tag with it).
The Canvas element, on the other hand, doesn't force a standard way of displaying information. It's essentially your dynamic data:image/png render output method on steroids, and users can't even use it unless they have JavaScript on. As it is, many sites are still usable even without JavaScript, including Hacker News.
The canvas element is essentially a bitmap that you can draw on using JavaScript. As far as I know, it doesn't introduce any features that increase the developer's ability to control the user over simply generating bitmaps on the server. (There are some fingerprinting techniques that use canvas and have bad privacy implications, but I believe that's a separate topic.) The only difference is that the bitmaps can be changed dynamically by client-side code. Everything you mention: disabling right-click menus, preventing text copying, making the site not work without JavaScript etc. can be done with canvas or without canvas. Moreover, the canvas tag has been well-supported for over 10 years.
I don’t think the open nature of the web gets in the way. Those developers could quite easily generate a screenshot of the website they’re building and ship the screenshot instead of the HTML/CSS/JS.
a) It would be completely inaccessible to blind users;
b) Sometimes devs want to be able to give users some choice, like copying text, and this is impossible with your proposed method.
These two reasons alone mean it's undesirable for devs to use these, and they're problems that Canvas addresses - albeit by letting the devs have that choice, but not the users.
There are several other ways to prevent copying text or making it difficult, some of which probably don't impact screen readers (although I'm skeptical that many developers would be intent on preventing the copying of text while also caring about screen readers). And, of course, if the developer wants to allow copying text, that's trivial. How does canvas give a developer more ability to choose which things the user can and cannot do?
Why, I thought I was describing an anti-utopia, a kind of a dark cyberpunk future, because half of the dark cyberpunk future predictions from 1980s have come true already.
What do I “own” when I open a site that is a tiny HTML doc and 300kB of minified JS, plus 100 fetch calls for JSON data to render? At that point, use the DOM, use a canvas or a literal paint brush, I don’t care.
Your choices are still there. Go to whatever website you want. But I think users will want to go to the best websites, the ones with the highest production value, and those may not be built the same way we built them 20 years ago.
Most people don't come to the web for choice; they come for the apps.
> With canvas and WebGL taking more and more, it will turn back into an X terminal, with more advanced network capabilities.
Not really, Canvas is just that, a blank canvas. The issue is that it's up to the developer to implement everything on top. It's very hard for me to believe that Canvas would be more efficient than DOM for creating text documents. Obviously there are limitations to CSS or the DOM. But then wouldn't SVG be a better alternative than straight out drawing pixels on screen?
I'm wondering if at some point it makes sense to compile servo to webassembly and let it render to a canvas.
Producing a good user experience with pure canvas rendering is a lot of work. A tree structure to manage UI elements makes this easier. Flutter is going in a similar direction here.
The interesting thing would be that the engine wouldn't need to conform to web standards anymore, while providing a guaranteed consistent cross browser experience. One could probably strip away lots of old dom concepts that make the dom slow today.
> With canvas and WebGL taking more and more, it will turn back into an X terminal, with more advanced network capabilities.
For some specialized apps this will be great, but for apps in general this is not a good idea at all. If it was, we'd be using Java applets today. HTML+CSS based apps are highly constrained. But that's a positive, not a negative, because it is the constraints that makes HTML based apps predictable, consistent and usable.
It makes a hell of a lot of sense, even from a QA perspective alone. Canvas is going to be far more predictable as a rendering target than the wide swath of browsers they manage now.
The DOM certainly has a place, but if you've looked at Google docs rendered output you'd notice it left web standards behind a long long time ago. This is an application which happens to be served on the web, not a traditional website.
VS Code already shows why Google would do this. They moved the Terminal from DOM nodes to canvas and got somewhere like a 20x render improvement (after already having spent a lot of time optimizing the DOM implementation). The previous jank all but went away too.
I can't speak for Google, but my company does 3D medical rendering (BioDigital). To test that our 3D engine is appropriately rendering the content, we use Cypress with image diffing. I'm sure Google has a way more advanced QA process for docs, but that'd be an option.
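For what it's worth, the core of an image-diff check over a canvas is small enough to sketch in plain JavaScript. This is only the general idea (it assumes the canvas uses a 2D context and that a baseline image is already loaded); real setups wrap it in a test framework rather than hand-rolling it:

    // Counts pixels that differ between a rendered canvas and a baseline image.
    function diffAgainstBaseline(canvas, baselineImage, tolerance = 0) {
      const w = canvas.width, h = canvas.height;
      const off = new OffscreenCanvas(w, h);
      const octx = off.getContext('2d');
      octx.drawImage(baselineImage, 0, 0);
      const a = canvas.getContext('2d').getImageData(0, 0, w, h).data;
      const b = octx.getImageData(0, 0, w, h).data;
      let differing = 0;
      for (let i = 0; i < a.length; i += 4) {       // RGBA, 4 bytes per pixel
        if (Math.abs(a[i] - b[i]) > tolerance ||
            Math.abs(a[i + 1] - b[i + 1]) > tolerance ||
            Math.abs(a[i + 2] - b[i + 2]) > tolerance) differing++;
      }
      return differing;   // fail the test if this exceeds some threshold
    }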
Maybe because I am out of loop with web development, but this sounds like a huge change to me. So instead of HTML elements, there will be just one big <canvas> element, and everything will be rendered there? How will text selection work for example? Will Google Docs basically implement a GUI library like GTK, which renders everything itself, but for the web?
Yup, AFAIK Flutter web is already rendering its components in canvas instead of the DOM [1]. Text selection works, but it's kind of jarring, and it's opt-in instead of opt-out: if the developer forgot to enable text selection in an element, the user can't select and copy the text in that element at all. It might be an indicator that Google will continue with this approach going forward.
Not necessarily. Only the text rendering is done in canvas, which suggests they aren't switching to Flutter, because in Flutter everything is in a canvas. But if Google, a major player in the W3C and WHATWG, decides it doesn't want to use the DOM anymore and goes to great lengths to re-implement a major feature in canvas, that might indicate the approach is beneficial enough that they'll double down on this route in the future.
If they were simply dissatisfied with a DOM feature, they could influence the standard to make the feature they want happen, like they always do.
As to why they don't want to use the DOM and instead use canvas, at the great expense of re-implementing a lot of basic text handling and accessibility features, one can only guess. My take is that moving to canvas will benefit Google because it can effectively kill DOM-based ad blockers, so it's only logical to start experimenting with canvas-only web applications.
I'm not sure but what you are suggesting is not outside the realm of possibility. There are already projects like https://makepad.dev/ which implement their GUI on top of WebGL. (Which is arguably more challenging than implementation on top of Canvas 2D.)
Ran very smoothly for me. My only complaint is that right clicking doesn't offer a context menu, and that scroll bars don't allow snapping back to a previous position on Windows.
It would be interesting to see a project that allows integration of these native features into an otherwise canvas-like environment. Copying all the minutiae of OS features and quirks seems like an unending task.
I expected scrolling to feel janky and "non-native" but it was fine. Text rendering, on the other hand... it was a bit of a blurry mess, compared to Chrome's native rendering (on Ubuntu 21.04).
Ran well for a few minutes on my iPhone, but then froze up, and again after a refresh. Otherwise very responsive and crisp. Obviously not yet optimized for touch devices; you have to tap and hold the scrollbar to scroll, and the native iOS options didn’t come up after text selection. Very cool, and while imperfect according to other comments, this is not bad at all.
If you look at the bottom left of the screen, it is live compiling/rendering a 3D particle swarm animation while you edit the code, so really not so surprising that it makes the machine work a bit.
On my laptop (a Core i7 Chromebook - fast but not crazy by HN standards) the fan runs for a half second at first load - presumably during the compilation step - then shuts off immediately.
Perfect example of the pitfalls this approach has. I'm on a Samsung Galaxy Tab S7+ and visited that site using Chrome for Android. I can't enter text as the on-screen keyboard is never activated.
Easy to fix, but these are the sorts of edge cases a canvas-only solution is going to have to deal with.
And it puts the entire structure behind a giant blackbox. Yucky.
As if fate were trying to prove a point, I recently struggled with a random-ass Google Docs document that I just wanted to download and import into my own spreadsheet software. There probably was a Google-approved way of doing it, but if there was, they intentionally made it hard. I ended up copying the HTML somehow. I guess that will no longer be possible in the near future.
> And it puts the entire structure behind a giant blackbox. Yucky.
Technically, sure, but the reality is that very few people are interacting directly with the HTML content of a Google Doc. What it does behind the scenes is really only relevant to the developers working on the app.
Well it also breaks all context menu functionality. No more right click -> copy text unless the website specifically implements it. Any custom context menu you use is now useless.
They couldn't stop you: either the interaction is handled in JS (in which case you can programmatically call the JS) or it is handled in WASM (in which case the interaction is actually still handled in JS, and exposed WASM functions are called).
This tech is called CanvasKit, and it scares the shit out of me. Google has, in this case, carefully written a proprietary custom API to take the role of Web Extensions (albeit with far more controls/restrictions/less flexibility) for Docs, but as more and more web properties switch to using Flutter with CanvasKit, we'll see less and less of the web be hypertext, less of the web be scrapeable, less of the web have good accessibility, less of the web be able to use extensions.
To me, pushing pixels in people's faces is not the web. The web implies hypertext, the web implies user-agents that fulfill the user's desired agencies. Developers switching to Canvas obstructs everything good, unique, & empowering about the web, converts it to the same terrible awful anti-user mess that everything else in tech is.
> The web implies hypertext, the web implies user-agents that fulfill the user's desired agencies.
That's entirely valid for many things people put on the web, but clearly not all. A document editor is not hypertext with agencies fulfilled by the user-agent.
The web now supports a wide range of user experiences, and the "text + links" model is great for a subset of those, but not all. The fact that some apps are moving to a different model to me does not imply any existential threat to apps/sites that do fit well as hypertext.
I see a lot of comments in the thread essentially arguing that UPS using jets to ship packages means they're soon going to take our cars and bicycles away from us.
Just imagine the sheer glee on the faces of the marketing department when their SEO experts tell them there is a technology that allows them to push pixels in peoples' faces while making it harder for users to engage in "unauthorized content use" and harder to block ads.
The web is about to get a whole lot more user hostile.
> there is a technology that allows them to push pixels in peoples' faces while making it harder for users to engage in "unauthorized content use" and harder to block ads.
That was exactly true in the Flash days and yet it didn't take over the world or kill the open web.
Canvas is more capable than Flash, it’s supported by everything already, it’s well sandboxed, it’s apparently faster than the alternatives, the technique in question now has the backing of a powerful industry leader, and the underlying technology is an open standard so it’s even got moral high ground.
And it’s possible we only managed to escape Flash-hell thanks to the fortuitous self-interest of Apple, so let’s not be complacent by assuming we’ll escape the next trap.
Having said all that, I actually like the idea of separating web applications from web documents and I don’t at all begrudge web application developers from doing what it takes to improve their applications. I just know it will be abused. The best I can hope for is that it leads to some re-focus on HTML as a stable and finished document language.
It already does. The old DOM renderer draws _everything_ by hand: the selected-text highlights are absolutely positioned divs, and so is the cursor, etc. So switching to canvas is not a big change in that regard.
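Roughly speaking (the container and coordinates below are invented, not Docs' actual markup), that "by hand" approach looks like this: the app's own layout code decides where the selection is, and a div is positioned over that spot.

    // Hand-rolled selection highlight: an absolutely positioned div laid over the text.
    function drawSelection(overlay, rect) {
      const highlight = document.createElement('div');
      Object.assign(highlight.style, {
        position: 'absolute',
        left: rect.x + 'px',
        top: rect.y + 'px',
        width: rect.width + 'px',
        height: rect.height + 'px',
        background: 'rgba(66, 133, 244, 0.3)',
        pointerEvents: 'none',           // clicks fall through to the text underneath
      });
      overlay.appendChild(highlight);
    }

    const overlay = document.createElement('div');   // stand-in for the editor's overlay layer
    overlay.style.position = 'relative';
    document.body.appendChild(overlay);
    drawSelection(overlay, { x: 120, y: 48, width: 200, height: 18 });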
(googler, opinions are my own, I know nothing about this project).
I will bet that this will ship with accessibility support. If it doesn't, maybe they will launch with it being opt-in, as they flesh out a11y. But if I were to put money on it, it will ship with a11y functionality.
The good thing here is that Google already has experience with accessibility on canvas UIs. Flutter, which uses canvas on the web[0] and OpenGL on phones, has accessibility support[1]. Hopefully there was some good knowledge sharing there.
The fact that there isn't a word about accessibility in this blog post is beyond disheartening. I would really like to know how they're doing it. Is it the Accessibility Object Model (which AFAIK, isn't fully baked)? Some shadow DOM thing? Accessibility is a foundational feature of the web. Without it, it is 100% broken.
It's also a bit disheartening to read that someone would consider accessibility an optional thing that could be shipped later.
Great, but will it ship with ad block support? Or will I be forced to look at ads, where high value individuals (most of HN) are selling their attention for pennies (being able to view a Youtube video or whatever)?
The question is, who is sending binary WASM blobs? That ad data needs to come from an ad server, and if network requests to the ad servers are blocked, the ads will not load.
It might, but it doesn't have to be served from a different website. YouTube content is indistinguishable from ads since Google serves them both from the same domain. pi-hole does not work with YouTube ads, and it hasn't for years.
Needing to reverse engineer and recompile the YouTube app is the only way to block YouTube ads with the mobile client. It's the same situation with everything being a JavaScript canvas with custom rendering. The bar for preventing ad-infested content from being shown becomes much higher than simply removing an HTML element when everything is byte-compiled.
I feel like this is not talked about enough, because the implications are heartbreaking. This will be the end game for web ads: WASM + a single canvas for ads and content + companies with the infrastructure to host ads on the same domain as the content. There will no longer be a single content blocker that is automatically compatible with a large fraction of web content if such a combination of tech becomes widely adopted. YouTube Vanced is essentially one of those specialized content blockers, where YouTube is a service lucrative enough to wall off with its own specialized ads implementation.
As mentioned earlier in this thread, I would guess that the only reason the ad agencies haven't won out and pushed WASM/canvas rendering everywhere is because of the necessity of screen readers.
I see, that is a valid point. I suppose Google could dynamically include ad content in the WASM blob before serving it to the browser, but I would agree if you said this is pure speculation.
Yeah, and it's possible today to route that data through a first-party server and randomize the file paths too. So I would say it's plausible, but moving to canvas doesn't automatically make that part worse.
They won't be able to sell the service in Europe if they ignore web accessibility. The US also has similar laws, or at least people can sue a company if it doesn't meet the accessibility requirements.
It is an interesting possibility. Obviously people wouldn't want it to hurt their SEO (for public-facing websites). However, Google is okay with displaying a different page to Googlebot (for example, if you have a SPA, it is okay to prerender it and serve that as long as the content is the same). So if your HTML-based content is the same as your canvas-based content, maybe Google would be fine with it, and you could get both good SEO and an ad-blocker-free experience...
That's a good point. It's not clear how to square the circle of making your content sufficiently accessible that it can be indexed by Google properly, but ad blockers can't get a grip on it. I'm not saying it's impossible, but it's not clear. Any website that wants to go to 100% pure WASM canvas code is going to have serious problems with getting indexed properly. Anything they do to scramble ad blockers will scramble the indexing as well.
It makes sense for google docs because the contents of the documents shouldn't be indexed (mostly). And of course there's the special case here of Google having both the document and the web engine so they'll always be able to index the public documents, even if nobody else can. (I'll pencil in the antitrust hearings for this for 2027 or so.)
Eh, I'll believe the slippery slope when I see it. Right now I think this move makes total sense since the dom is currently just being used as a rendering target and has to implement all its document-y behavior itself. Nothing about Google Docs uses the dom semantically and it already has to implement all its own accessibility support itself because without a lot of help screen readers would be useless.
Seems far-fetched that normal webpages would use canvas as the primary method to render text. Something like Google Docs makes sense because it's a complex application and consistency is important. For these types of sites, if they choose to put ads on, I would not try to block them, because the effort the developers put into building them clearly justifies displaying ads.
Additionally, I can see future ad blockers also getting more sophisticated by finding a way to block already rendered elements on a canvas without having access to the representational elements (reverse engineering).
Sure this impacts google docs free users, but the blog post is on google workspaces, which already has 0 ads. Google never has and never will advertise on a google workspace account.
Absolutely. All it's going to take is somebody porting a browser to WASM. They just need to "tunnel" accessibility concerns to the "outer browser" until all of that can be completely implemented within the "inner browser".
This is exactly the reason I don't use an ad blocker. In the long run, it will make everything worse, and I still probably won't be able to block ads. I'm clearly on the losing side of this though.
"Ad blockers" need a new name. I don't consider uBlock Origin an "ad blocker". It's a proper user agent (like a browser is supposed to be) that gives the user control over what is allowed to happen on their machine. That just happens to have the side effect of not allowing most ads to render because the way they're typical delivered. It won't matter if everyone starts using canvas unless they also stop using my machine to fetch random third-party content, which they won't, because the "ad" industry is now really the "tracking" industry.
The very few times I've seen ads over the last decade or so, they were standard images hosted directly by the site. That's all ads should be. The contract between the site and advertiser should be their problem, my machine isn't obligated to be their mediator.
Ever since WebAssembly was first introduced, I thought it was obvious[1][2] that the end game for those that want to control the web is sending opaque binary blobs of code that only use the browser for the canvas tag's framebuffer. Font rendering and layout can all easily be accomplished by embedding libraries like freetype. This breaks all forms of user control - like ad blockers - and turns the web back into cable TV.
I didn't expect the first move towards canvas-only rendering to be traditional wordprocessors. As usual, this is merely the first step. Once the tech has been normalized by developers, transitioning regular webpages will be a fait accompli.
Just for what it’s worth, “word processors” are a logical starting point specifically because the APIs for text manipulation in the DOM are abysmal, and every in-browser WYSIWYG text editor is a steaming pile of hacks. Some of them work well despite this, but, it’s a horrible set of code to maintain and very, very hard to make a good user experience. One of the very hardest things to do well in the browser, actually.
Thus I wouldn’t necessarily interpret Google Docs making this move as the start of an inescapable trend. It may be, or it may just be that certain kinds of applications (like Docs and Games) can work a lot better by providing their own rendering engine.
A million times this - having worked in this space, nobody wants to do this. I understand the cynicism, but the affordances provided by the browser for things like accessibility are very painful to give up. This isn't some dark pattern attempt to cut out whatever "openness" remains on the web, or to convert the browser into a display-streaming client for some sinister DRM reason.
In my past job, we rewrote our spreadsheet rendering to use canvas and gained massively in simplicity and maintainability. And there was no obfuscation angle to it - we even shipped source maps! Handling accessibility and text input was hard (we ended up adopting a hybrid-DOM model where some things like input fields were still native ones, or shadowed by native ones), and even then it was still easier than dealing with browser rendering.
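As a sketch of that hybrid model (function and element names invented; this is not the actual product code): the grid is painted to canvas, but the cell currently being edited is shadowed by a real <input> positioned exactly over it, so focus, IME input, and copy/paste keep their native behaviour.

    // Shadow the active cell with a native input; commit and repaint on blur.
    function beginCellEdit(container, cellRect, initialValue, onCommit) {
      const input = document.createElement('input');
      input.value = initialValue;
      Object.assign(input.style, {
        position: 'absolute',
        left: cellRect.x + 'px',
        top: cellRect.y + 'px',
        width: cellRect.width + 'px',
        height: cellRect.height + 'px',
      });
      input.addEventListener('blur', () => {
        onCommit(input.value);    // repaint the cell on canvas with the new value
        input.remove();
      });
      container.appendChild(input);
      input.focus();
    }

    // Usage (assuming `gridContainer` is the positioned element the canvas sits in):
    // beginCellEdit(gridContainer, { x: 96, y: 40, width: 120, height: 24 }, 'Q2 total',
    //               (value) => repaintCell(value));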
This is simply a reasonable way to work around the DOM being trash. The way to fix this trend would be to reimagine the presentation layer of the browser as something other than a stack of hacks over hypertext, but so far nobody seems to have a good solution.
> The way to fix this trend would be to reimagine the presentation layer of the browser as something other than a stack of hacks over hypertext, but so far nobody seems to have a good solution.
About a decade ago I had the start of a Eureka moment on how to do this (back then — https://medium.com/space-net/spacenet-51aca95d49a2, nowadays https://treenotation.org/). It seems to me we've missed a sort of fundamental universal notation of the universe, which you can think of as "two-dimensional binary". I predict we will soon see a Cambrian Explosion of new formats and languages that are simpler and more interoperable with each other, and some will have the opportunity to build new great languages for rendering stacks.
Hey, nice! I have a notes file experimenting with a similar tree-style minimalist data notation; it's definitely an area that needs more exploration.
I like what you've done with it, I'll need to take a closer look.
While I was thinking about it, I also decided to look into prior art, which took me all the way back to Landin's "The Next 700 Programming Languages" from 1966. It introduced the hypothetical language ISWIM, the first to use indentation-based syntax, and inspired the ML and Haskell families of languages (https://www.cs.cmu.edu/~crary/819-f09/Landin66.pdf)
It wasn't exactly what I was looking for, but it's fascinating to see how long we've been thinking about these kinds of issues.
Yes, and surprisingly there are not too many that use any indentation based syntax (about 1 in 50). However, of the ones that do (abc, aldor, boo, buddyscript, cobra, coffeescript, csl, curry, elixir, f-sharp, genie, haml, haskell, inform, iswim, literate-coffeescript, livescript, madcap-vi, madcap, makefile, markdown, miranda, nemerle, net-format, nim, occam, org, promal, python, restructuredtext, sass, scss, spin, stylus, xl-programming-language, yaml) there are some real gems.
Very nice! The nice thing about these kinds of notations is that they last. Simplicity is timeless. I can immediately grok exactly what you are doing here. :)
Common question. You cannot do column and matrix operations natively against S-expressions (you need to parse the whole thing first). This turns out to be very important and useful. S-expressions need to be traversed in 1-D linear order.
When you two-dimensionalize S-Expressions and remove the parens, you get Tree Notation.
The rules for the basic notation are really simple. It's just basically a grid (visualize a spreadsheet), where each line is a row and each word is a cell. Indent a line to indicate a child node/row.
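To illustrate just those rules (this is a toy based only on the description above, not the real Tree Notation libraries), a parser fits in a few lines:

    // Each line is a row, each word a cell; one extra level of indentation
    // nests a row under the previous, less-indented one.
    function parseTree(text) {
      const root = { cells: [], children: [] };
      const stack = [{ node: root, indent: -1 }];
      for (const line of text.split('\n')) {
        if (!line.trim()) continue;
        const indent = line.length - line.trimStart().length;
        const node = { cells: line.trim().split(/\s+/), children: [] };
        while (stack[stack.length - 1].indent >= indent) stack.pop();
        stack[stack.length - 1].node.children.push(node);
        stack.push({ node, indent });
      }
      return root;
    }

    const doc = parseTree('title Hello\nbody\n color red\n size 12');
    // doc.children[1].children[0].cells -> ['color', 'red']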
Thanks - that’s what I was looking for! Looks like a potential improvement on s-expressions to me in terms of a readable syntactic kernel.
For what it’s worth, I have been interested in the concept of 2d languages for quite some time, and I’m not sure what your definition is.
To me this seems to have a 1D topology, i.e. it's still left-to-right, top-to-bottom, and there are no loops or semantics that require reading in other directions.
I contrast this with something like a circuit diagram or a LabVIEW program.
This is a dark pattern. WebAssembly is binaries with no source. There is no denying that. It closes the web, and it isn't free software. It is a move back to proprietary, for reasons that all the Googlers and other WASM devs here can't admit.
The DOM isn't trash. It's good enough. We just need to stop making it do things it was never meant to do. That's why things like Electron make a little bit of sense. Not perfect, but sensible.
People are trying to make the DOM be a WYSIWYG text editor. Your point is dead on.
The response is to write an opaque, vendor-specific "DOM". Which is the complete opposite of what HTML was supposed to be. But the browser is now a platform and not a browser, so the metaphors are horribly mixed to the point where I'm not even sure what a DOM is supposed to be anymore.
The DOM was always a WYSIWYG editor – albeit, for HTML, not text. Though, back in the day, HTML was a subset of text with image and table support; none of this fancy `line-height: 1.5em;`…
I wish the web had evolved in a more documenty way.
Hmmm... maybe I should have chosen a different acronym.
What you expected to see isn't always what you got.
WYETSIAWYG
Rendering across browsers and operating systems was a trainwreck until maybe 5 years ago (over a decade of polyfill is finally gone). And even with an ecosystem like ElectronJS + Bootstrap, Win10 MacOS and Linux all look slightly different.
So yes and no to "DOM was always...", but more "No."
I read this as sometimes you need to use an exit hatch like Canvas instead of a bunch of hacks to make a WYSIWYG editor strictly with the DOM. Google seemed to come to the same realization and that is why they are using Canvas
Oh no. They're google. They make a godforsaken browser. They can pay people to do nothing more than think about code. By all means. They can launch a satellite with css if they want.
I'd agree. I think the DOM is fantastic, *unless* you are doing advanced document manipulation, in which case it is trash (for that use case only). This is what I'm assuming the author meant.
I spent many years working on that sort of thing, a decade ago.
The problem is that the escape hatch from document to application is completely broken. The DOM is fine for documents, but it's completely awful for applications. Admittedly, some things on the web that should be documents insist on being applications instead, but many things on the web aren't documents.
For these things, using the DOM is painful. When the pain becomes great enough, the biggest available escape hatch is pure native 2D rendering (canvas), which is a nightmare for accessibility and affordance to end-users. It would be awesome to have something else to escape to.
That's the entire point. The DOM is meant for structured web pages, not web apps. If you want to build complex web apps, it is better to use your own renderer... which is exactly what Google is doing with Docs.
Furthermore, actually implementing your own text renderer and input engine in WASM is a different horrible nightmare, because you've now forfeited any chance of supporting IMEs (e.g. touch keyboards, pinyin/romaji/kanji input, iPadOS Scribble), you need to implement your own text selection inputs, your text won't show up in the accessibility tree, and you can't copy-paste without additional, browser-specific tweaks. The only reason to actually do this is if you can't use HTML.
Source: I am a Ruffle developer and have had to do exactly this to support text input in old Flash games.
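For anyone curious what the workaround tends to look like (a sketch with invented helper names, not Ruffle's actual code): a hidden input is given focus so the browser and IME treat it as the real text field, and its composition events are mirrored onto the canvas.

    // Stand-ins for the app's own text pipeline (invented names):
    function renderCompositionPreview(text) { /* draw the in-progress text on canvas */ }
    function commitText(text) { /* insert into the document model and repaint */ }

    const imeProxy = document.createElement('input');
    imeProxy.style.position = 'absolute';
    imeProxy.style.opacity = '0';
    imeProxy.style.left = '-10000px';   // real apps move this under the caret so the IME popup appears there
    document.body.appendChild(imeProxy);
    imeProxy.focus();

    imeProxy.addEventListener('compositionupdate', (e) => {
      renderCompositionPreview(e.data);   // show the in-progress kana/pinyin on canvas
    });
    imeProxy.addEventListener('compositionend', (e) => {
      commitText(e.data);                 // insert the composed text
      imeProxy.value = '';
    });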
> The APIs for text manipulation in the DOM are abysmal
I develop an NLP labelling interface (which is similarly document-focused) in JS all day and can attest to this.
Here are some specific examples:
- To work out where some text is being laid out, e.g. to display some UI around it, the Selection/Range/Rect APIs will help, but Rects are always viewport-relative, so you'll need to convert them to element-relative coordinates to position your UI, which sucks (see the sketch after this list).
- Firefox supports multiple DOM Ranges per Selection (i.e. it allows non-contiguous text selection), but nearly all other browsers support a single range.
- You can't truncate text across multiple lines (the CSS line clamp spec is an experimental fix)
- If you want to pop up an interface while some text is selected, you'll need to capture and reimplement selection because as soon as another item is focused the selection will no longer display.
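Here's roughly what the conversion from the first bullet looks like in practice (the container id is made up; it's assumed to be a positioned, scrollable element):

    // Convert a Range's viewport-relative rect to coordinates inside a container.
    function rectRelativeTo(container, range) {
      const r = range.getBoundingClientRect();     // viewport coordinates
      const c = container.getBoundingClientRect();
      return {
        left: r.left - c.left + container.scrollLeft,
        top: r.top - c.top + container.scrollTop,
        width: r.width,
        height: r.height,
      };
    }

    const container = document.getElementById('doc');   // hypothetical editor container
    const sel = window.getSelection();
    if (container && sel.rangeCount > 0) {
      const pos = rectRelativeTo(container, sel.getRangeAt(0));
      // absolutely position a tooltip at pos.left / pos.top inside the container
    }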
Teach Markdown in schools. Who needs word processors anyway? Need something fancier than headings, lists, block quotes, etc.? Learn LaTeX. Text rendering can be fantastic in the browser.
I'm half joking, but I think the world would be a better place if folks had fewer, more precise ways of making words look the way they want
I love markdown, all my blog posts, TILs, book notes are in it. Even my presentations are written in markdown. But this comment is so obtuse and user hostile. Millions of people know how to use word processors. Let's not take that away from them or even imply that they're doing it wrong. They're fine. Software can always be improved, like the Docs team is attempting here. But we can't overnight tell people that their skills are obsolete because some developers somewhere like Markdown and Latex.
> Just for what it’s worth, “word processors” are a logical starting point specifically because the APIs for text manipulation in the DOM are abysmal, and every in-browser WYSIWYG text editor is a steaming pile of hacks.
If that's the case, it probably would have been better to standardize some better APIs.
I had this instinct as well but I think this is about rendering, not necessarily editing. The document link is not to something editable. Granted, perhaps these will merge.
A dystopia. You can't control-F in the page. Can't copy and paste a word you don't know into a translator or dictionary. Can't share text unofficially - like copy a sentence into a email, etc.
This is 100% the reason why I decided not to use Flutter back when I was shopping.
It probably would have reduced my developer effort considerably compared to basically every other option that made it to my short list, but I can't in good conscience build something that isn't accessible.
Thank you for saying "claims to". Copy pasting a response I made elsewhere:
This is making a mockery of accessibility.
It's the equivalent of having a company that actively discriminates against everyone not passing an internally developed test to check whether you are "neurotypical", and then responding to criticism by pointing at the wheelchair ramp you installed on one of your entrances.
If I can't copy-paste words from your website to put them into a translator, if I can only use your website with one of the two corporate-affiliate-selected screen readers (not even platform-agnostic ones), if I can't look at your website from anything that doesn't have 4+ cores and a 4G or better connection... saying accessibility is a "first class citizen", or even a citizen at all, is laughable at best.
I said "claims to" because I don't know how I can verify if it is. The other person responding to me correctly mention how they suggest tools to scan for accessibility issues on iOS and Android, but they don't say how to do that on the web... have you managed to somehow figure out what is not currently working, exactly?
Regarding performance: Chrome uses Skia to render the DOM, why do you think rendering Flutter, also based on Skia, should be so much worse?
The thing is, when you're shopping for GUI toolkits, "try it and find out" isn't a great decision-making rubric. Partially because there are a zillion of them and you simply can't try them all, and partially because learning one is expensive and time-consuming, so, unless you genuinely enjoy that kind of thing, you kind of want to pick one in one shot.
Meaning that, in a slate of options where one of them can definitely do something you need, and another one where the official story seems to be ¯\_(ツ)_/¯ and the examples you've seen seem to indicate that it's at least something you need to go out of your way to accomplish, there really isn't any injustice in deciding to expend no further mental energy on the latter.
That said, if you want to spend your own time digging in and finding out, be my guest.
Flutter is not competing with general GUI toolkits, it's competing with fully crossplatform toolkits, of which there's only a couple of real contenders for real-world apps: React Native and maybe Sciter.
Nothing else is production-ready and can run on the web, iOS, Android and desktop OSs (Linux, Windows, MacOS).
It's not competing with pure web frameworks. For this reason, yes, I think that if you're going to criticize it, you absolutely need to know what you're talking about... if Flutter claims it can handle accessibility, and you're going to argue so eloquently on the Internet that it can't, it's on you to actually prove that rather than just assume it.
BTW if you know of any other good UI Toolkits that really can run anywhere like Flutter, using the same code base, please let me know.
No, I just looked at the documentation and saw that it isn't supported on all the platforms I'd be targeting.
For example, that page you link is specific to iOS and Android targets. Nary a mention of what to do when you're targeting the Web. But it's also, at least as of when I did my comparison, explicitly not supported on at least some desktop targets.
> It's the equivalent of having a company that actively discriminates against everyone not passing an internally developed test to check whether you are "neurotypical", and then responds to criticism by pointing at the wheelchair ramp it installed on one of its entrances.
> If I can't copy-paste words from your website to put them into a translator, if I can only use your website with one of the two corporate-affiliate-selected screen readers (not even platform-agnostic ones), if I can't look at your website from anything that doesn't have 4+ cores and a 4G or better connection... then saying accessibility is a "first class citizen", or even a citizen at all, is laughable at best.
You're talking about very different issues of accessibility, one of which is the financial accessibility of client technology and reliable web connection.
That is attacked through very different means. For example, FB being free and ad funded addresses that issue; but this is a business model and not a technological standard. Society subsidizing technology is another possibility. But these very serious issues of access and the solutions to negotiate with them are very different from purely technical solutions to access.
HyperText Markup Language is accessible, doesn't require a powerful computer, and can be copy-pasted into a translator.
If your website is unreadable in view-source:, it's probably inaccessible in some way. It's not hard to make an accessible website. If it's hard to make that accessible website look the way you want? Tough luck; hire a competent web designer, or scale back your artistic vision, because you sure ain't sacrificing the ability for people to actually use your website for some pretty colours, are you now?
We have to be careful not to throw the baby out with the bathwater here, though. Obviously accessibility is an important issue for both commercial and ethical reasons, but if you're talking about writing laws and severe penalties for breaking them then you will always have to choose what minimum standards you will always require and then accept that it might be unreasonable for a software developer to go further.
For example, suppose the entire purpose of a certain office application is to draw some clever visualisation that helps the office workers to understand complex relationships in their business data. Maybe that presentation style has been chosen based on years of experience and saves a lot of time and avoids a lot of mistakes compared to a simple text report or table of figures. However, maybe it also doesn't work for someone whose vision is too limited to usefully see or interact with the visualisation.
Perhaps you could present the same data and relationships in a different way, a format that would be more amenable to sound- or touch-based interfaces for those with severely limited or no vision. In reality, that might mean writing a second entirely separate application, one that might cost more than the original to implement and support, for an audience that will usually be very small and often empty.
So where should the line be drawn? In an ideal world we want to be as inclusive as possible regardless of anyone's individual limitations, but we also want to provide the most effective presentation possible to those who don't have the same limitations. In general, doing both might be prohibitively expensive, so how do you make realistic, ethical decisions in this space, and given that every software application is different, how do you codify the standards you want to make mandatory for accessibility reasons?
Flutter for web is not ready, and the Flutter devs should seriously remove those demos because they're ruining the reputation of Flutter for Android, iOS and desktop, which is an absolute delight to work with. I had never developed apps for mobile, being a backend developer, but now I have developed a few in record time. Flutter for mobile is amazing and it's going to change the game.
I am not really sure about Flutter for web though.
I honestly believe they WILL implement copy paste...
... along with tracking exactly what you copied, when, where you hovered your mouse, inserting all sorts of features into your copied piece of “plain text”
... along with the ability to force-turn off copying from server side whenever it benefits them.
You WILL use ChromeOS (by another name) and you WILL like it
I mean, that's how MacPaint was going to work. (They eventually ditched “edit text once it's written” entirely, for fear that people would use MacPaint for word processing, but the code was all written.)
I had these exact complaints about web apps many years ago, and I was dead wrong. Nobody cared that web apps were slow, had/has horrible UX compared to native apps, and feel janky by comparison. These demos elicit similar feelings. Now, they're not as fast as I like, but they feel better than most webapps already. (It reminds me of Flash, actually.)
If the Flutter dev experience is good enough, this will eat traditional web apps slowly. In a way, the web was designed for this sort of scenario: a delivery mechanism agnostic of the client implementation. We've gone from pushing server-generated HTML to megabytes of JS + SPAs, and now a WASM/Canvas hybrid seems inevitable.
It all comes down to how well these perform on mobile devices, what the dev experience is like, and how the end product feels. There's still a lot up in the air, but, I'm honestly impressed with the progress shown by Flutter here. There's still a host of considerations unaddressed (including accessibility) that I hope they have an approach for.
SPAs to Flutter is a much smaller step conceptually than native to Web. Most of the world is using a Blink-based browser of some sort already. Updates are pushed aggressively.
> What has flutter to offer that a regular webapp lacks?
Google branding, novelty, more sane dev UX. I'm not saying those are good reasons to choose it, I'm just extrapolating from how the industry works.
Let's be honest: end users basically just put up with whatever devs release at this point. It doesn't matter if the end product murders battery life, doesn't fit the target platform, and steals their data. It's not like switching to some less-than-web-native framework is somehow a bridge too far for end users.
Your last point made me think: for all the flak the gamer demographic gets for being over-entitled and toxic, they seem to actually end up with nice things, like programs that run at the performance you'd expect from a modern computer and don't swamp you with creepily targeted ads.
Great point. It is striking to me how similar many of the modern languages are for various platforms (thinking of Rust, Kotlin, and Swift here), but it is still quite difficult to share business logic among web, Android, and iOS devices because of language differences.
If Dart/Flutter is good enough for all of those platforms, that is quite the value proposition.
Seems like it's fast (on my machine at least), I can select text, and Ctrl+F works. I totally understand the urge to complain and doomsay, but could we at least stick to the example at hand? I'm sure if you went through this doc you'd find things to complain about. That would be more relevant to the discussion.
After trying it for a few seconds, I noticed that X select-to-copy was broken, as well as the browser spellcheck feature that my daughter relies on heavily in Google Docs (where a text area can be spellchecked in French or English depending on the active spellcheck).
I didn't spend more than a few seconds on it, but those were the features I immediately noticed.
This is going to be a painful transition for us since she uses both of those a great deal with her schoolwork.
Since the doc was read-only, I couldn't test IME/XCompose - hopefully those work at least.
And they speak about plugins being broken in TFA. I'm not surprised that your daughter's plugin doesn't work today, because they announced the change today. Presumably, those plugins can be rewritten to adapt to canvas. Just give them time.
Well, I was referring to the built-in browser spellcheck and the X windows copy/paste.
Both of those probably rely more heavily on normal native widgets. Not saying it is impossible of course.
The X Windows copy/paste could be done using the invisible text approach Firefox uses in PDF.js. That would also help with accessibility somewhat.
But I'm not sure how they could integrate with browser spellcheck without basically going back to a less canvas-y approach. Invisible text areas surely would have layout problems aligning with the rendered font. But, I guess we'll see. I'm guessing Linux users are not a major factor in their decisions though. Firefox users probably even less so.
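For what it's worth, here is a minimal sketch of that kind of invisible text layer, assuming the canvas renderer already knows each line's text and position (the RenderedLine shape and buildTextLayer helper are my own names, not anything PDF.js or Docs actually ships):

```ts
// Sketch of a PDF.js-style invisible text layer (all names hypothetical).
// The canvas paints the pixels; a transparent text layer positioned on top
// lets native selection, clipboard, and translation extensions see real text.
// Assumes `container` wraps the canvas and has `position: relative`.
interface RenderedLine {
  text: string;
  x: number;          // CSS pixels, relative to the container
  y: number;
  fontSizePx: number;
}

function buildTextLayer(container: HTMLElement, lines: RenderedLine[]): void {
  const layer = document.createElement("div");
  layer.style.position = "absolute";
  layer.style.top = "0";
  layer.style.left = "0";
  layer.style.color = "transparent";   // glyphs invisible; selection highlight still shows
  layer.style.whiteSpace = "pre";

  for (const line of lines) {
    const span = document.createElement("span");
    span.textContent = line.text;
    span.style.position = "absolute";
    span.style.left = `${line.x}px`;
    span.style.top = `${line.y}px`;
    span.style.fontSize = `${line.fontSizePx}px`;
    layer.appendChild(span);
  }
  container.appendChild(layer);
}
```

PDF.js goes a step further and scales each span so the invisible glyphs roughly track the drawn ones; keeping those two layers aligned is the fiddly part.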
It's not slow on an old Intel i3 laptop or on a Motorola One phone. I keep seeing these comments about Flutter web with different examples, and they are never slow on my old devices. Also, you can make the text selectable if you want to.
Which is itself a commentary on the immaturity of the technology. I certainly wouldn't run it in production if I can't trust that all devices of a certain performance level can run it. Random performance issues on top-tier hardware is going to be a non-starter.
Just loaded it and it ran awfully. We all have anecdotal experiences, and that doesn't make yours any less valid than mine, but it does show that they don't prove much beyond what each of us personally experienced.
What do you mean by awfully? This makes me crazy: the gallery loaded instantly, every item I click loads instantly, and there's no problem with scrolling. There is a five-second loading delay on the phone when the app loads, but after that the items open just as fast as on the laptop.
The only thing I saw that didn't run smoothly on my weak phone is the Flutter plasma demo[0], but it runs without any tearing on the laptop.
I clicked the "Shrine" app on a 6-core MBP, it took about 3 seconds to render the main gallery page and I got an entire 4 frames rendered during that time.
There is an option for Flutter to compile for either the CanvasKit or the HTML renderer, and on mobile it might select the latter (because it's a smaller download). It might be that (not sure).
Wow, that’s truly a masterpiece in awful. Are they also rendering everything to canvas and then reinventing layouts, controls, etc. to get it so laggy?
Yep, I agree. Even though it's normal on mobile devices, it's weird behavior for the web. Given the way flutter is designed (sharable code base, eVeRyThInG iS a WiDgEt), I don't see an easy and obvious solution here.
And all the problems solved by HTML and CSS will be new again, there will be many new incompatible frameworks for handling layout and styling, a blinking cursor will again take 200x the CPU it should to implement. Ugh.
Let's not forget DRM. Now that framebuffer can be delivered via an encrypted EME module, thereby killing the open web permanently for the purposes of scraping.
HTML and CSS are fantastic when used for what they were designed for: document markup and display. I blame webapp developers for the scope creep and subsequent denigration of a perfectly reasonable set of standards.
I find it really hard to rationalise this thought with the thread...
- WebAssembly + Canvas rendering is bad because it turns the web into a bunch of opaque blobs and removes user freedoms and abilities
- The web should only be for documents, not applications
- So it follows that... all the applications which are currently "open" websites should be opaque native app executables where you can't have extensions or adblockers or anything?
To be clear, I agree with the first point. Websites moving to opaquely rendered canvases is terrible. They should remain "normal" DOM/HTML.
I'm not a web developer, so keep this in mind as I clarify.
Application development for browsers is, in a way, a stack of huge hacks on the top of huge hacks. Developers are using HTML and CSS for purposes they were never intended (e.g. a WYSIWYG word processor), and doing some really gnarly hacks to make it all work.
Those gnarly hacks required to make their applications work prejudice web application developers against HTML and CSS, since those technologies don't work worth a damn for their purposes.
But HTML and CSS work remarkably well for what they were designed for: Documents.
Applications have, historically, usually been distributed in a different format than documents. Perhaps they should resume that trend, targeting the browser as just another OS, and leave the document standards for documents.
My knee jerk response to the original article was, "this is making the web less free," but a bit of thought raises the question, "I can't interact this way with Microsoft Word, so why do I expect to interact this way with Google Docs?"
CSS has absolutely never been good at what it was designed for.
It is meant to display a document. It took TWENTY YEARS before it got support for showing text in columns, the most basic of basic things in how text is often displayed.
It is only now, after twenty-five years, getting a way to specify the aspect ratio of an element.
CSS was originally intended to consolidate document style information, so you didn't have to repeat "24 point bold, blue" for all your section titles. It's pretty good for that, but then we somehow decided that it should also handle the entirely separate job of page layout. Which led to years of standards people yelling at us to use CSS and not tables, and then plugging their ears when asked how to implement complex designs like vertically centered text.
HTML and CSS are great for making application GUIs. I used a bunch of frameworks like Qt and Java <some kind of verb> and WPF, and they were all worse than PHP, CSS and HTML.
HTML and CSS were never denigrated. There's not one superficial feature which doesn't make sense. Most of it is bare-bones. If I had to pick one feature that blows the scope out of proportion, it would be the `is` selector, but that's the only one I can think of.
And there's plenty of examples within the comments of this article where people - webapp developers - harp on how bad HTML and CSS are, because HTML and CSS weren't designed for webapps.
I find those tools much easier to get a decent GUI layout with. Getting the same layouts with HTML and CSS seems to involve a pile of hacks, because they were designed as document markup, not button and panel layout tools. (That said, a markup system for constraint-based layout would probably be better, and CSS does seem to be growing some of those features; it still just seems hard to get something that works as consistently as e.g. Qt.)
...and yet we still have these things, like Gtk, Tk, ImGUI, some parts of Qt, etc, and they work reasonably well. I just spent time building a new widget set for small LCD screens (~160x40) and built the set in as simple a way as I could, and this was the easiest method to do so. Sure, it's not great from a designer perspective, but it gets the job done -- more or less what HTML + CSS do now, but (frankly) in a far less verbose way.
We had layout managers by the early 90s, and Java was not the only choice. Delphi, wxPython, and Qt were/are pretty good. "Do not use pixel positioning or hardcoded colors" was right there in the docs.
I don't, we are going to have hundreds of different accessibility and IME implementations, all with different quirks and bugs and many websites won't care at all so even writing letters with accents will be a hassle (if possible at all). Or think about websites assuming your keyboard layout is US QWERTY. Or websites not properly handling subpixel rendering or retina displays (and looking upscaled and blurry). Those are all solved problems with HTML and CSS that will have to be solved again, this time not by 3 rendering engine dev teams but by each web dev for each website.
It’s not like the w3c provides us all with rendering engines. We still use browsers which do it for us. If these canvas techs spread, everyone will rely on canvas framework toolkits that do it for them. Still not great for user power
Input methods are managed by the OS, the browser delegates control. It can't do that over a simulated input control rendered on a canvas. Also what I mean is that all these are problems already solved by browsers that follow w3c recommendations, custom rendering solutions will have to reimplement everything without access to OS APIs that make many of those things possible in a unified way.
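A common workaround, sketched below with hypothetical names (I'm not claiming this is what any particular canvas editor actually does), is to keep a hidden, focused textarea around purely so the OS/browser IME still drives composition, and only feed the committed text into the canvas-rendered editor:

```ts
// Hedged sketch: a hidden, focused textarea lets the OS/browser IME handle
// composition; the canvas editor only ever consumes the committed text.
const hidden = document.createElement("textarea");
hidden.style.position = "fixed";
hidden.style.opacity = "0";
hidden.style.width = "1px";
hidden.style.height = "1px";
document.body.appendChild(hidden);
hidden.focus();

let composing = false;
hidden.addEventListener("compositionstart", () => { composing = true; });
hidden.addEventListener("compositionend", (e: CompositionEvent) => {
  composing = false;
  insertTextAtCaret(e.data);   // committed IME text
  hidden.value = "";
});
hidden.addEventListener("input", () => {
  if (!composing) {            // plain keystrokes outside an IME composition
    insertTextAtCaret(hidden.value);
    hidden.value = "";
  }
});

// Hypothetical editor hook: a real implementation would update its document
// model and repaint the line containing the caret on the canvas.
function insertTextAtCaret(text: string): void {
  console.log("insert", text);
}
```

Even then, the composition underline and candidate window won't line up with the canvas text unless the hidden element is repositioned under the caret, which is exactly the kind of per-site reimplementation being complained about here.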
What would you replace it with? Keeping in mind that web pages are supposed to be accessible to people in the sense that anyone with an interest to do so can put up a marked document for display because it's displaying material, not programming an interactive environment. If you take away the accessibility of HTML from people, then you're also taking away a significant portion of the "open" Internet and moving toward a more closed corporate-curated model due to the higher barriers to entry if you want to publish something.
I've watched a weird mix of "I don't like JavaScript", which usually actually translates to "I don't like what some people do with JavaScript", alongside some cheerleading for WebAssembly...
Now those obviously aren't entirely connected arguments, but whatever "I don't like what some people do with JavaScript" is supposed to mean, it has nothing to do with JavaScript. It's just what people do with it.
Here we go now, we've got rando blobs you've got no idea what they're doing... this is not better.
I am in complete agreement. When I've brought this up on HN in the past it's always been dismissed with "But accessibility!". A big company like Google can handle that just fine.
Presumably they'll eventually port Chrome to WASM. Then they can completely control the browsing experience.
To exert any control over our browsing experience we'll be single-stepping thru machine code. I had fun cracking Apple II and PC software back in the late 80s and early 90s but I'm not necessarily looking forward to doing that again.
> Presumably they'll eventually port Chrome to WASM. Then they can completely control the browsing experience.
...what? As opposed to shipping binary executables for an OS/arch combo, they'll ship a binary executable for WASM, and then a second binary executable for the OS?
If they want to take Chrome closed source and make it fully obfuscated, they can do that already, with or without WASM. Chrome is already closed-source, only Chromium is open -- and we know Chrome has secret sauce in it not from the repos.
I expect there will be a WASM-based browser embedded within some (most?) websites, eventually. The public-facing webserver will serve-up the WASM-based "inner browser" and nothing else. Content will only be accessible via the "inner browser". Determined attackers will be able to extract the keys used by the "inner browser", for sure, but the average person won't be able to.
It'll be packaged as a product, probably targeted at "traditional media" sites: Use this on your website and nobody can block your ads or bypass your paywall. You can use all your existing development tools, servers, etc, and it'll "Just Work".
It's minimal, not much beyond widevine and a set of hard coded API keys for things such as sync. The code that uses the API keys is all open source, only the keys themselves aren't.
It's a bad thing in terms of openness but a good thing in terms of performance and making web applications more like real desktop applications. A lot of DOM work is a nightmare, so I sort of welcome the Canvas-based approach.
We're just one more concession away from getting the performance we had 20 years ago with native applications.
Edit:
The path we're on just seems so obvious I kind of just want to skip ahead and get it over with.
- Developers and content publishers love the ergonomics and control of just shipping WASM binaries that paint to a canvas and it becomes the de facto standard.
- After 2-3 years of everyone's computer being used to surreptitiously mine crypto everywhere they go we re-learn the lessons of Java applets.
- In comes browsers that require signed WASM binaries for your protection.
Browsers still support arcane HTML features from decades ago (and thus can display most web sites from decades ago). Other than things that were already decidedly not part of web standards (like Flash), I'm aware of very few, if any, backwards-incompatible changes made by popular browsers. So I'm having trouble seeing this slippery slope you're describing.
Yes, the DOM is "just" HTML and CSS with a JS API. All of that is quite broken in various places: the APIs are usually rather bad, mutability is everywhere, and performance suffers; as soon as you e.g. switch the cursor from an arrow to a hand/finger, there can be a complete re-render.
Google especially is in a position to fix most of these things if they wanted to. They even have the best test data of any company. But they choose to side-step some of these problems by introducing another, completely new technology, and then a small team of people goes on to hack something usable with this technology as if they were a small startup.
The problem is, Google is basically a collection of teams with very different focus points. Some are paid to develop new technology no matter if they want it for a specific product. Other people are paid to "improve" an existing product, e.g. Google Docs and have to use what is available. Any kind of exchange of information between teams is a friction and blocks things from being improved.
At OrgPad, 10% of CPU time is the maximum we can use if we want a smooth experience. The rest is eaten by the browser and inefficiencies there. I don't think we are alone in stepping over bugs, idiotic APIs and implementation differences everywhere all the time. The fix is not to delegate most of the work to the web developer but for the browser and web standards people to step up and do a thorough job with existing stuff.
I'm one of those people who really want browsers to be viable options for building fully fledged applications, but I sincerely like the core concepts of HTML and CSS.
I think the biggest issue is that things kept being tacked onto the primitive specification to solve ever-evolving needs. The browsers themselves are written in languages that are meant to go fast - which is important for drawing elements, rebuilding trees, etc. - but not good for doing concerted async operations in a structured fashion. Just looking at those codebases makes you want to stab your eyes with a fork up to your brain.
In my, very valuable, opinion, a reassessment of what HTML is (ignoring the name, and treating it as the language of the browser instead of hypertext) and what a web page can be, along with a browser designed to be multi-tabbed, async by nature, supporting advanced non-tree-based layouts (maybe still trees, but trees that could be part of z-indexed layers? detached layers that allow interaction handlers separate from the visual elements?) in a language that has tools to model those behaviours, could probably re-use all the lessons from HTML, CSS and JS apps.
Of course, it would still be like building everything again - but I'm not sure wasm/canvas blobs is a better solution long-term.
If you've ever used a real UI framework, HTML/CSS is absolutely a nightmare. It is almost entirely missing any kind of useful controls, which is why there are so many myriad control libraries.
Edit: however, replacing HTML+CSS with a canvas is absolutely a step in the wrong direction. I'm advocating for a richer Web, not a stream of pixels controlled by Google.
Ah, but this is in the eye of the beholder. For example, Firefox has an option to allow one to deselect the option "Allow pages to choose their own fonts, instead of your selections above". I usually browse the web in aliased legible Verdana as a result. It also allows you to choose the colours of the background, links, etc.
Chromium browsers do not allow you this choice. There, the philosophy seems to be that the web is a designer-driven layout medium and that the viewer should not have accessibility choices.
Btw, about some of the sites working worse: if you're referring to Google, that's them just breaking things intentionally. You can sometimes work around it by specifying a different user agent, and then things are fast again until they change things again.
As a browser, Firefox is standards compliant. I agree there are many applications that use a web interface as the GUI and a bunch were coded for IE6 or IE6 with ActiveX and don't display well (or at all) in Chrome. It really depends, but the problem isn't that Firefox follows standards, but that the developers did not follow standards.
User preference and configurability of hyperTEXT display has been totally thrown under the bus, increasingly throughout the web's existence. I remember when it was straightforward to have your own text color and font preference, and web sites didn't try to override them. You could even configure a brick wall GIF background for sites if you wanted to. Now, these preferences are buried and easy for a developer to ignore using CSS and JavaScript. And as you say, it'll get even worse. The user agent (who, by the way, you're writing your web site for) now has very little say in how a web page is rendered.
Unpopular opinion on HN full of web devs, but I don't want my browser to be a "canvas" for some developer. I don't want it to be a "stable ABI" for some opaque binary app. I already have a platform for binary applications: It's called a desktop operating system. I wish more developers would go back to developing for that and leave the web alone. I want my web browser to display hyperTEXT documents, with links out to other hyperTEXT documents, and occasionally accept input from me via a FORM. And that's it. I understand that opinion makes me a minority nowadays, though. Maybe we should revitalize gopher or something--something to get back to the roots of fetching, displaying, and navigating information.
Ad blockers could still "scroll" through a website, run an AI model on it to discover the ad sections, and cut them out. Even a simple model that filters out visual animations and replaces them with a static image would be useful, as a large annoying component of online ads is the movement and blinking etc. that make it hard to focus on the content.
Dark mode etc can, too, be implemented to work in terms of framebuffers only.
But yes, unless there is a serious need, canvas rendering is a severe regression.
I've seen some rather atrocious techniques employed to bake ads into source but that can still be done without any performance impact. Making that work in WASM would require dynamic recompilation, doable perhaps for smaller apps but for behemoths like Gmail and Docs I couldn't see even Google getting that to scale. I think we'll still have browsers performing additional HTTP requests for that stuff, at least in the short to medium term. So traditional adblocking techniques will still be viable.
The difference from a physical newspaper is that on a computer we don't have to leave holes, we can replace them with beautiful artwork or our favorite cat pictures. That's still a hell of a lot better than ads.
I'm OK with ads in a newspaper. I can ignore them safely and blend them out. Many online ads nowadays blink or use other dark patterns to make ignoring them harder though. I hate that.
I stopped watching twitch after Amazon took over. They changed it, it’s so aggressive and full of little dark patterns and micro rewards now to try and hook people
The WASM code will eventually implement its own DNS analog (or use DoH with a pinned certificate). It'll all be completely opaque TLS to your Pi Hole.
Your Pi Hole is the same thing, functionally, as the tools a nation state hostile to human rights uses to filter the Internet. You'll get the same treatment that they do.
That's when you install Squid and have it terminate TLS. They won't pin the certs; too many corps do this already, not to mention certificate lifetimes are now in the realm of how long a browser stays open - weeks.
> I thought it was obvious[1][2] that the end game for those that want to control the web is sending opaque binary blobs of code that only use the browser for the canvas tag's framebuffer.
Google is also a search company, the product from which the majority of their ad revenue comes. Blobifying the web indiscriminately breaks search, so it is not in their interest to push this to reap some x% revenue loss from ad blockers.
Incidentally, or not, a traditional word processor itself is not an interesting search result and makes a perfect candidate for this type of rendering. There already exist compatibility layers with search for published documents. So absolutely nothing is lost for anyone, and I think it is an overgeneralization to take this as a sign of obvious doom to come.
Treating the web browser like a VNC terminal, a way to push pixels, is just a vile degradation of the web. The web is not applications. People view the web via a user agent, a tool that lets them navigate and view hypertext as they want.
Breaking away from hypertext, pushing images in people's faces: it's not the web. It's an assault, a great step backwards. It's an attack on the internet.
Absolutely, people are definitely starting to treat the web as a big canvas. Those people are doing great injustice and cruelty to one of the only pro-user, pro-agency information technologies ever created.
Unfortunately, all the people who work on the web disagree. They turn what could be a simple, text-based site into a full-blown "application" to justify their own jobs.
The web can do programmatic, interactive systems very well. In my opinion, it does it much much better than applications. And it should be doing it. (Alas, a huge huge huge amount of web applications don't espouse the virtues of the web, don't use url based routing, don't have good service worker caching, & countless other horrific faults. I continue to see that as not the web's fault but I think there's a lot of room for opinions on this, at least.)
But interactive web systems shouldn't regress to the hostile, anti-user, anti-extensibility stance of an application. It should continue to offer the upsides of being the web.
Google de-facto controls the browser but it doesn't control the OS, its motivation is absolutely transparent here: make the browser the OS. It even auto-updates transparently by default so Google can push its code almost live to its users.
As someone who always fought against the bloat of the web and against reimplementing everything on top of HTTP and JS, I feel a bit like an anti-atomic-weapon activist who, upon seeing the mushroom cloud in the distance, can utter a final "see, I told you so!" before being demapped by a G-shaped shockwave.
The open web was fun while it lasted, but ads are more important.
Man, this is a very dramatic interpretation of their intent. Docs doesn't even have ads. They actually do have issues rendering consistently across browsers (I ran into this just like 2 days ago where something that had been carefully laid out on Chrome looked different in Safari).
My bigger criticism of this is that I imagine it was a very large refactor at a time when docs has barely evolved for years and still remains far behind desktop counterparts in many ways (although still slowly taking over because of vastly superior online collaboration).
I never said they would never do this. I just said it was over dramatic and making big assumptions that this action by google was part of some nefarious plot to turn the internet back into corporate tv, especially given that there are very good reasons in this case to make this change without those assumptions.
> I thought it was obvious[1][2] that the end game for those that want to control the web is sending opaque binary blobs of code that only use the browser for the canvas tag's framebuffer
As opposed to a world where huge, unintelligible minified JavaScript blobs are sent to the browser?
This is the endgame of putting JS/programmability (rather than just markup) in browsers at all. HTML and CSS are great for formatting documents. But, in spite of years of jQueries, Angulars, Reacts, Vues, and 10M other frameworks being implemented, HTML+CSS+JS is frankly awful for developing applications. Business needs have made us push them to be okay for enough form entry to collect credit card numbers and do billing stuff, but even then every step forward comes with a few steps back.
An optimistic take is that this should mean we get better web apps and less JS garbage on web pages where we just want to read something and there’s no justification for having code execute when you visit the document. That’s the intent of WebAssembly and it makes sense.
The reality is that the NYTs of the world will continue to put JS garbage all over pages that have no business executing any code. But WebAssembly doesn’t really have any bearing on that. It offers the opportunity to make things better. The organizations who seek to profit off the web and make it worse for the rest of us will still do so, with or without WebAssembly.
It wasn't designed to; it grew bit by bit with new features gradually being tacked on, so the APIs are inconsistent (for example, NodeList, HTMLCollection, and Array all have different methods and naming conventions) and poorly designed.
I literally had the same thought when I first heard of WebAssembly. Ad blockers blocking your tracking JS/cookies? Why not pass it via a binary blob through WASM and track the users? I know WASM is a boon to performance-sensitive websites, but there are always players who would like to take advantage of the situation.
I don't blame them, considering how damn hard it is to center something with CSS. With a canvas you just tell it where to display it and to hell with CSS.
Yet another reason to use Pi-hole. The web browser is the wrong place to try and fight against this sort of thing. Secure your home network once instead of fighting an arms race on each device you have.
Pi-hole won't protect you here; they will send the entire page, ads and all, as an encrypted binary blob. Decrypting it is illegal, and since the data is coming from a single source, Pi-hole can't block it.
Pi-hole (and ad-blocking) days are numbered. Hopefully that day is further out than I dread.
It has to be decrypted at some point to be viewable, so if you have system administrator access to the OS and device your browser is running on, someone will figure out how to identify which pixels and which audio bits are ads and block them from making it to the frame buffer (and whatever the equivalent is for a sound card). Where there's a will there's a way. There has just not been a need for that yet.
It's just an arms race, and the makers of uBlock Origin will need to level up and figure out how to do some serious computer vision and heuristic anomaly detection.
When ad blocking on the web becomes impossible/nonviable, I will retreat from using it, as I have retreated from using most popular sites, unless absolutely necessary.
I already don't have a TV, don't have any news apps on my phone, don't have any newspaper subscription, and I find that all the important news find their way to me no problem. I will add large degree of WWW to it and be mostly fine.
It's already possible to have server-side rendering bake ads into web pages, but just about nobody does it. I think the percentage of people who know about Pi-hole vs the cost of server-side rendering will always make client-side rendering the optimal choice for the people running the websites. They can block ad-blocking browser extensions, and that will be enough for them.
That's because the ad networks require a level of validation of their views that they prefer to have control over with javascript snippets or pixels.
But it's technically a bit different, since an ad-blocker could still block those baked in advertisements while it's traditional web technology. If it's a binary blob rendered with canvas, that won't be true anymore, which I think was what the parent comment was getting at.
> If it's a binary blob rendered with canvas, that won't be true anymore.
The request for the ads will either happen server side or client side. If it happens client side, it can be blocked at the network level by the pihole regardless of whether a blob or js script is making the request. The request happening server side is unlikely per my previous comment.
Your previous comment doesn't explain a reason why it would be unlikely, just that it isn't. The reason it's currently unlikely is because ad-networks need to validate their views. Because you can ad-block on HTML alone, it isn't worth it to risk webmasters falsifying serverside views in order to try and get around ad-blockers because it wouldn't work. But add the ability to get around ad-blockers entirely and it becomes worth it to solve for server-side view validation.
This is the landscape as I see it:
HTML, ads loaded by user network: Can be blocked at network or in HTML.
Binary blob, ads loaded by user network: Can be blocked at network level
Serverside ads, HTML : Can be blocked by removing the ads in the HTML, and ad-networks don't like it because their views can't be validated and they could be scammed by webmasters.
Serverside ads, binary blob: Can not be blocked by network or by hiding things in the HTML, ad-networks and webmasters would want this because they can get around ad-blockers, but will need to solve view validation, the tradeoff becomes worth it.
> The reason it's currently unlikely is because ad-networks need to validate their views.
Ad-networks also want to validate views because they don't trust the hosting website. They don't need to trust the hosting website's server when they can receive a direct connection from the client.
However, there is an alternative that fits "serverside ads, binary blob" scenario: the ad-network hosts the website. Obviously the website owner wouldn't want to turn over control to the ad-network, but they don't get a choice when they depend on the ad-network for revenue. Even worse: this isn't a hypothetical concern. Google already tried to implement ad-network hosted websites with "AMP".
I should have chosen my words more carefully, but the blobs with ads will have some sort of DRM on them, which is technically illegal to decrypt and cut out.
> Removing DRM on something you own for personal use is perfectly legal.
Perhaps some places. In the US, it is perfectly legal to "violate" copyright restrictions for purposes like fair use. But if the data is protected by any kind of security, it is a felony to bypass that security (even for fully legal purposes).[1]
There’s a place for DNS based ad blocking like Pi-hole or NextDNS and a place for first party ad blocking and element blocking (and even script blocking) using uBlock Origin, NoScript, etc. They’re complementary in certain ways.
It would be easy to implement server-side rendering, but it carries significant cost in processing time and networking. If they can block all adblocking browser extensions by abusing WASM, the cost of server-side rendering will outweigh the 0.0001% of people that have a pihole set up.
A proxy can do very little to actually block ads compared to what a web browser can do. Web browser _is_ the right place, or at least it is the only place where you can actually do something intelligent to block ads. For example, if you are blocking ads at a proxy, it is trivial for site owners to detect it and then your filtering proxy can literally do nothing to compensate. A huge chunk of all the filters in adblocking lists are basically to defeat anti-adblocking functionality...
DNS over HTTPS and, eventually, certificate pinning in the WASM-based DoH client running inside your browser will render your Pi Hole useless. It'll all just be opaque TLS to your network gear. Network operators who legitimately need to control traffic on their networks are lumped in with human rights-violating nation states. That includes you on your network.
Can you substantiate any of that? DoH is a feature of the browser that can be configured or disabled. Firefox even disabled DoH automatically if it detects you are using a custom DNS server which blocks "potentially malicious content"[1].
I'm not talking about the browser doing the name resolution. I'm talking about the opaque binary running inside the browser doing name resolution (either by way of DoH with a pinned certificate, or a custom name resolution protocol running over TLS). You won't get a say in what that opaque binary does (w/o modifying it).
Is anybody doing this now? I doubt it. Will somebody do it? Yeah. Definitely.
When the "browser" is an embedded device (like Google Chromecast devices hard-coded to use 8.8.8.8 for DNS) you'll have no configurability there either.
Honestly if they did that I think I’d strongly consider paid alternatives to YouTube. Most of the channels I like are on curiosity stream/nebula. I’ve never subscribed because I don’t feel like I have a reason to. But that would definitely give me a reason.
This reminds me a lot of those websites that used to be implemented in Flash. It's just one giant opaque blob that gets downloaded and can do whatever it wants.
When this stuff inevitably starts being used by every news and shopping website I wonder how search engines will be able to index it.
Meta tags, and crawlers will use headless browsers or something similar, with OCR on top of that. Google already handles SPAs where the content isn't available until the JS is run. I wouldn't be surprised if they have already tested or use a modified Chromium for crawling, plus OCR if necessary.
If news websites can use this to bypass ad blockers then it might still be worthwhile, and I imagine a lot of traffic comes from social media sites these days.
Accessibility notes: “we generate a second DOM tree parallel to the DOM tree used as the RenderObject tree and translate the flags, actions, labels, and other semantic properties into ARIA.”
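As a rough illustration of what that quoted approach implies (the node shape and names below are mine, not Flutter's actual API), the renderer's semantics tree gets mirrored into an off-screen DOM tree whose ARIA attributes describe what the canvas is drawing, so assistive technology has something real to traverse:

```ts
// Sketch only (hypothetical types): mirror a semantics tree into a parallel
// DOM tree whose ARIA role/label attributes describe the canvas content.
interface SemanticsNode {
  role: string;                 // e.g. "textbox", "button"
  label: string;
  children: SemanticsNode[];
}

function toAriaTree(node: SemanticsNode): HTMLElement {
  const el = document.createElement("div");
  el.setAttribute("role", node.role);
  el.setAttribute("aria-label", node.label);
  for (const child of node.children) {
    el.appendChild(toAriaTree(child));
  }
  return el;
}
```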
If you've tried using Google Docs extensively for layout-heavy things, you've probably noticed a bunch of subtle things break a little bit, for example as pixel rounding errors accumulate down a page. This can be especially evident when using things like lots of table cells, borders, and zoom levels different from 100%. The blinking cursor can sometimes be almost a full line misaligned from the text being edited.
Also if you've tried using Docs across different platforms and browsers, there can be subtle differences in line heights, word-wrapping, and so on. Which might not seem like a big deal, until you try to write your term paper to be no more than 25 pages long, and then when you go to a shared computer to print it, it's now 26 pages because a bunch of lines are now wrapping that weren't before, and your professor won't accept it!
If canvas means Google now has total control over precise element and letter positioning and wrapping, this will be a BIG step forwards in truly being a consistent, cross-platform, WYSIWYG word processor, as opposed to the way it's often been "usually mostly right but not always exactly".
Websites are supposed to be responsive and flexible and not depend on exact font rendering. But word processors really do need reliable control over layout when producing serious documents like papers, resumes, etc.
It's like going back in time to Flash Player. There were many WYSIWYG Flash-based editors around 2010. The problem was that they required a browser plugin, but they worked much better than HTML4 at the time. For canvas rendering, no plugin is needed. There was even a feature to export SWF files to canvas.
There are some issues like accessibility, browser features like search or text highlighting, lazy loading etc. What HTML provides out of the box must be implemented from scratch.
> This is a huge amount of work, and authors are most strongly encouraged to avoid doing any of it by instead using the input element, the textarea element, or the contenteditable attribute.
When the spec warns you in advance that this is a high-risk approach...
You can render elements inside of the canvas tag, they simply won't be visible. This gives screen readers something to do, though it becomes trickier since they also have to bugger keyboard shortcuts to implement key navigation for sighted users.
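Concretely, that fallback-content trick looks something like this (a sketch only; the doc-canvas id and syncAccessibilityTree helper are made up):

```ts
// Sketch (element id and helper name are made up): children of a <canvas> are
// never painted, but screen readers can still read and focus them, so they can
// mirror whatever the canvas is currently drawing.
const canvas = document.getElementById("doc-canvas") as HTMLCanvasElement;

function syncAccessibilityTree(paragraphs: string[]): void {
  canvas.replaceChildren();            // rebuild the hidden fallback tree
  for (const text of paragraphs) {
    const p = document.createElement("p");
    p.textContent = text;              // exposed to assistive tech, never drawn
    canvas.appendChild(p);
  }
}
```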
If you change the /preview to /edit you can then go to File→"Make a copy", and you'll have your own editable version of this document. If you change that one to /preview again, you (unsurprisingly) get something that doesn't use canvas, but uses the DOM with the usual div/span soup. Here's an example I created: https://docs.google.com/document/d/1RnQonRlivBogXxSl1nTZSCfR...
These two /preview versions would be a good basis for comparison (until we have examples of editable documents that use canvas-based rendering).
Visually, comparing the two /preview documents by alternating rapidly back and forth between the two tabs shows some minute differences in kerning (on only some of the lines). It might be interesting to repeat this experiment on multiple devices and see whether and in what way the differences themselves differ across devices.
One difference I've been able to find is that using the browser's in-built find doesn't work. But all the common ways of triggering search (Ctrl-F or Cmd-F, etc) trigger the search function of Docs, and that seems to work identically in both versions. And anyway the browser's find cannot find things off-page before scrolling down the page, so it's not usable in general anyway.
This is one reason I think projects like Servo are potentially very beneficial even well before they can render every legacy quirk you find on the open web.
Instead of bringing a custom rendering engine to the web via WASM and canvas, it'd be better in so many ways to bring a subset of spec-compliant HTML rendering to embedded contexts. If your content renderer can run on Servo, and render the same in modern browsers, you can use HTML as the basis for cross-platform rendering.
At least I would hope that would be possible by now so we don't have to write renderers for every platform anymore.
It is surprising to me how blind Google is to the second-order effects of this. Suppose that this model of rendering pixels to canvas is a success. Suppose it is adopted by various content-serving systems like WordPress, JavaScript frameworks, etc. Now Google Search has nothing to index! The whole web just became completely unsearchable. Google Search, and all web search as a concept, becomes mostly useless.
You might think that Google assumes they'd be served special content to index, but I remember the shenanigans that people used to pull by serving the Google bot one set of content and actual page visitors other content, and how Google has to continually fight against this to preserve the relevancy of search results. If canvas rendering becomes normal, Google will have no way to fight against this anymore, and Google search results will be dominated by links which do not contain the promised content.
It’s possible that Google thinks that Google is big enough that nobody will opt out of being searchable by Google, but consider Facebook. Facebook content is basically impossible to find using Google, and everybody wants to be the new Facebook, right?
As I see it, the efficacy of Google Search rests on a few premises which were true in the 1990s:
1. Web content which is created without indexing in mind will be able to be indexed by Google.
2. If web content creators have indexing in mind, they want Google to index the content, and the path of least resistance will allow Google to do it.
3. If web content creators want to play funny games and serve one set of content to Google and other content to site visitors, this is difficult and requires playing a constant game of cat-and-mouse with Google.
If canvas rendering becomes the new thing, all three of these premises reverse and become false. How, then, could Google Search stay relevant? Why would people pay to advertise on Google if Search becomes useless? Especially when Facebook ads exist, and Facebook is where people spend more than 90% of their web time, anyway.
I highly doubt Google would embrace something that kills their business without having a plan.
Off the top of my head I can think of 3 scenarios:
1. Google is investing in tech that allows them to index canvas-rendered items (putting them far ahead of all other search engines if they are the only ones with this tech)
2. Google will put out an "index.txt" guideline ushering in a new way to create SEO content for the internet (there are problems here but, like with anything, they can be addressed in time)
3. Google might ditch indexing altogether and look into alternative ranking methods (curation groups? popular in your social sphere? heavier reliance on ads?)
Google’s crawler, or rather the component that processes pages, is much more advanced now. I’ve read it has no problem handling complex SPAs where only a “shell” page is initially returned and no information is available until JS runs. They probably have a modified chromium and maybe employ OCR to digest some content since there are now a gazillion ways to display a page and its content, aside from canvas.
This is absolutely expected, reasonable and the way to go for a rich GUI app such as Google Docs. Particularly since WASM is so widely adopted, as are Canvas, WebGL, etc. - and Rust compiles to WASM, not to forget. So it was about time that some major apps ditched all the DOM glory.
Wonder whether they're using the greatly overrated Angular to render the canvas; I guess not, but that's for Google insiders to say.
Current Google Docs already uses iframes to intercept keyboard/text events, the document you actually see isn't the thing you're interacting with, because contentEditable is nowhere near powerful enough for something like this.
All their decision to use canvas is doing is changing the render target.
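To illustrate the general pattern being described (this is a hedged sketch, not Docs' actual code): an off-screen editable element captures the text input events, while the visible document is only a render target repainted from the editor's own model.

```ts
// Hedged sketch of the pattern (hypothetical names throughout): an off-screen
// editable element captures text input; the visible document is just repainted
// from the editor's own model.
type Edit = { kind: "insert"; text: string } | { kind: "deleteBackward" };

const trap = document.createElement("div");
trap.contentEditable = "true";
trap.style.position = "fixed";
trap.style.left = "-10000px";          // focusable, but out of view
document.body.appendChild(trap);
trap.focus();

trap.addEventListener("beforeinput", (e: InputEvent) => {
  e.preventDefault();                  // keep the trap itself empty
  if (e.inputType === "insertText" && e.data) {
    applyEdit({ kind: "insert", text: e.data });
  } else if (e.inputType === "deleteContentBackward") {
    applyEdit({ kind: "deleteBackward" });
  }
});

function applyEdit(edit: Edit): void {
  // A real editor would update its document model and repaint here.
  console.log("edit", edit);
}
```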
This is already possible with tiny or invisible text, I don't think browsers are trying to protect from that. The protections that exist are to prevent sites from reading your clipboard or replacing it without you clicking on anything, other than that the site already has control over what goes in it.
If a worker circumvents these options being disabled, that's a far worse issue. While before you can say "oh I didn't know", afterwards you can just be fired.
Hold up - Google docs didn’t already use canvas? You’re telling me the existing docs apps is HTML and CSS? That’s a mighty impressive engineering achievement to have gotten it working as well as it does now.
For everyone saying "Google is pushing canvas usage to kill ad-blocker effectiveness", maybe, but most of Google's ad revenue already works around ad blockers. Most of their ad revenue comes not from their display network (ads that they serve on other ppl's property), but from showing ads on their own properties. Only ~13% of their revenue comes from ads displayed on properties they don't own: https://finshots.in/infographic/breaking-down-revenue-stream... And "only" ~25% of users use ad blockers, so ad blockers likely only hit Google's total revenue by ~3%.
Ad blockers mostly work by blocking known display-ad-serving JS scripts, that ppl include on their page. Ads within Google search, Gmail, YouTube (and big non-Google advertisers like Facebook, Twitter, Reddit), don't use this approach, the ads are much more built into the page (embedded in product JS bundles, or simply returned by the backend from Ajax calls, mingled in with non-ad data). These ads are really hard to block, regardless of whether the page is a traditional web document, or uses canvas. Blocking them basically involves parsing the page and looking for things you think are ads, which is incredibly difficult to make work reliably over time, so nobody really bothers, and that's true regardless of canvas vs. traditional web document.
That 3% you're referring to is 5.5 billion dollars, which easily justifies canvas usage. I don't personally see the connection between google docs and ads, but I think this analysis is misleading.
Fair enough, I’m sure they’d love to kill ad-blockers in general. But ... using canvas in a product that doesn’t show ads of any kind, nevermind display network ads, it is indeed a very tenuous argument that this is an anti-ad-blocker play.
Interesting. Looks like each page is a separate canvas, and the canvases are reused as you scroll the pages (just like how list views reuse cells when you scroll in a mobile app). The canvases seem to be used only for rendering text and tables though; images and graphic elements are still drawn as SVG DOM elements.
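A rough sketch of that recycling pattern (the page height and all names are my assumptions, not anything observed in Docs' code):

```ts
// Sketch of canvas recycling: keep a small pool of <canvas> elements and
// repaint them with whichever pages are visible, like list-view cell reuse.
const PAGE_HEIGHT = 1056;                      // assumed page height in CSS px

const pool: HTMLCanvasElement[] = [];

function onScroll(viewport: HTMLElement): void {
  const first = Math.floor(viewport.scrollTop / PAGE_HEIGHT);
  const last = Math.floor((viewport.scrollTop + viewport.clientHeight) / PAGE_HEIGHT);

  for (let page = first, i = 0; page <= last; page++, i++) {
    if (!pool[i]) {                            // grow the pool lazily
      const c = document.createElement("canvas");
      c.style.position = "absolute";
      viewport.appendChild(c);
      pool[i] = c;
    }
    const canvas = pool[i];                    // reuse an existing canvas
    canvas.style.top = `${page * PAGE_HEIGHT}px`;
    drawPage(canvas, page);                    // hypothetical page painter
  }
}

function drawPage(canvas: HTMLCanvasElement, page: number): void {
  // A real renderer would lay out and paint the page's text and tables here.
}
```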
So we're building binaries, to run them in WASM, that's running in JavaScript, which is running in browser, that is written in C++ .. is that what we're doing now?
Given that, for a text editor, find-and-replace is a feature that's not natively supported by contenteditable anyway, you would need to build out a custom in-app search feature regardless. But yeah, for preview-only mode, it's sad to see native features not work on a static document.
On multiple levels it seems nuts to use canvas for a stylized text editor.
There are layout consistency issues across browsers and OSs. The place I've seen this is page breaks changing between computers (or even zoom levels). Line length isn't controlled solely by the browser; IIRC both kerning (letter spacing) and word spacing are determined by the OS.
I implemented something like Docs myself, but with the goal of total consistency of line/page breaks. Knowing about these issues I used SVG. In SVG text you can override line length, but you still need to have a value in mind, a reference to be calculated independent of user environment details.
Regarding performance, one issue I can speak to is the cascading changes to line breaks. This can be pretty brutal when making a change to the beginning of a long document. To get acceptable performance I had to partially remove the application framework I had been using.
I liked my solution and it worked quite well in prototypes, but I don't know how it would have fared in the wild.
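For anyone curious, the SVG mechanism being referred to is presumably textLength/lengthAdjust; a minimal sketch (with a hypothetical renderLine helper) of pinning a pre-broken line to an exact width:

```ts
// Sketch (renderLine is a hypothetical helper): SVG's textLength/lengthAdjust
// force each pre-broken line to render at an exact width, so line and page
// breaks computed once stay identical across OS/font-rendering differences.
const SVG_NS = "http://www.w3.org/2000/svg";

function renderLine(svg: SVGSVGElement, text: string, y: number, widthPx: number): void {
  const el = document.createElementNS(SVG_NS, "text");
  el.setAttribute("x", "0");
  el.setAttribute("y", String(y));
  el.setAttribute("textLength", String(widthPx));       // pin the measured width
  el.setAttribute("lengthAdjust", "spacingAndGlyphs");   // stretch/squeeze to fit
  el.textContent = text;
  svg.appendChild(el);
}
```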
You can render the fonts and do the text shaping yourself. You don’t even have to use any of the Canvas API except putImageData, or WebGL. Also you can use your own curve drawing algorithms so you own all pixels of the application.
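A minimal sketch of that putImageData-only path, assuming you already have an RGBA buffer from your own shaper/rasterizer (the blitGlyphRun name is mine):

```ts
// Minimal sketch: if you shape and rasterize glyphs yourself (e.g. in WASM),
// the only Canvas API you strictly need is putImageData.
function blitGlyphRun(
  ctx: CanvasRenderingContext2D,
  pixels: Uint8ClampedArray,   // RGBA buffer from your own rasterizer, width * height * 4 bytes
  width: number,
  height: number,
  x: number,
  y: number
): void {
  const image = new ImageData(pixels, width, height);
  ctx.putImageData(image, x, y);
}
```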
A lot of discussion going on here, but I want to add two things in support of this change:
1. A browser- and OS-agnostic layout experience will have real effects. Just last week, a doc I wrote on my Mac (with Chrome) ended up with several pages with orphaned text, because the page/section breaks I inserted near the end of a page on purpose ended up breaking for Windows Chrome users due to slight variations in font rendering.
2. They have a sample document linked in that post. The text is still 100% selectable/copyable, and assistive technology is able to read and navigate it.
I've been playing around with pure-canvas web apps that are set up as follows:
Client -> Server: Raw event stream (keyboard/mouse/touch/resize events)
Server -> Client: Canvas draw batch stream
The server provides a small javascript shim that bootstraps a websocket & subscribes to all required event sources. It also subscribes to the server issued batch events and has logic to dispatch draw commands to the canvas element.
On the server, I use LMAX Disruptor to aggregate the client events and process them in micro batches (the size of which is determined dynamically based on backpressure). This results in an incredibly low-latency/low-jitter UI. Client events are passed around as readonly structs, so very little allocation or GC is involved throughout (.NET5/C# codebase). Processing is concluded with async/parallel dispatch of the appropriate client-side draw batches as they are ready to be issued. Not all client events result in a draw, and not all client events result in draws on the same client. There are some synchronous concerns between clients in my application, so having a single fast thread allows for lock-free processing of all events at the same time.
The only caveat I have encountered is the latency constraint. Going over ~100ms makes this sort of interface feel like crap. For LAN/localhost, this approach is effectively instantaneous.
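A minimal sketch of the client-side shim described above; the message shapes and endpoint are invented for illustration, the server side is whatever you like:

    // Hypothetical client shim: raw input events go up the socket, draw batches
    // come back down and are replayed onto the canvas.
    type DrawOp =
      | { kind: "rect"; x: number; y: number; w: number; h: number; fill: string }
      | { kind: "text"; x: number; y: number; text: string; font: string };

    const canvas = document.querySelector("canvas")!;
    const ctx = canvas.getContext("2d")!;
    const socket = new WebSocket("wss://example.invalid/ui"); // placeholder endpoint

    // Upstream: forward raw input events as they happen.
    for (const type of ["keydown", "keyup", "mousemove", "mousedown", "mouseup"] as const) {
      window.addEventListener(type, (e) => {
        if (socket.readyState !== WebSocket.OPEN) return;
        socket.send(JSON.stringify({
          type,
          key: (e as KeyboardEvent).key,   // undefined for mouse events, dropped by JSON
          x: (e as MouseEvent).clientX,
          y: (e as MouseEvent).clientY,
        }));
      });
    }

    // Downstream: replay draw batches issued by the server.
    socket.onmessage = (msg) => {
      const batch: DrawOp[] = JSON.parse(msg.data);
      for (const op of batch) {
        if (op.kind === "rect") { ctx.fillStyle = op.fill; ctx.fillRect(op.x, op.y, op.w, op.h); }
        else { ctx.font = op.font; ctx.fillText(op.text, op.x, op.y); }
      }
    };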
I’ve considered making something similar, but I’ve thought: what’s the advantage? Just more computation for the server that could be done client side. Why not just have the app on the client.
I’ve dubbed this the “cloud gaming” approach, lol.
I get why they're doing this, but this sucks for me. It's going to permanently and unfixably break browser extensions that work on the DOM, for these docs. In particular, I am learning Japanese and use Rikaikun/Rikaichamp to get kanji definitions/readings on hover.
I don't see this getting fixed in the extension, since it would involve having a completely separate brittle path just for Docs. I may end up having to switch to an alternative platform for document sharing, at least some of the time.
If they're going to make the leap of switching the render layer, I'm surprised the leap isn't to Flutter (which is supposedly stable/ready for production web apps?).
Flutter is canvas-based on the web, so this post could well mean a switch to Flutter. Hard to see how a custom non-Flutter canvas renderer for Google Docs makes sense.
Canvas will become de-facto, just like React did.
Once that happens, indexing and scraping the web won't be trivial anymore.
One would have to "opt-in" to using HTML/CSS to make sure that their page would be indexable.
Clearly also indicates that Google is willing to give up on its web-indexing capabilities in the long term, now that they have a stronghold on the Mobile app marketplace.
Most likely spells doom for websites, further making "apps" the norm.
Their crawlers are more advanced now. They already handle SPAs with # URLs and content that’s only fetched once the JS is executed, and likely use some OCR and more powered by AI magic.
Edit: could you please stop posting unsubstantive comments generally? Your account has been doing a lot of that, and we're hoping for a little higher quality of discussion here.
I was going to write at length about how bad a moderator you are, but I don't think you are capable of understanding it. I just feel sorry for you. You can be a better moderator; this is not your potential. Let's delete my account and relax.
> To deliver her message most effectively, the visual designer needs as much control as possible over what the viewer sees. But, by definition, the designer only has direct control over the tool. She is at the mercy of whatever platform implementation the recipient happens to supply. This implies that a good platform must be as simple and as general as possible.
> From a practical (and historical) standpoint, we can assume that no complex specification will be implemented exactly. This, in itself, is not a problem. However, multiple, decentralized implementations of a complex specification will be incorrect in different ways. A platform consisting of the union of all possible implementations is thus arbitrarily unreliable—the designer can have no assurance of what a recipient actually receives. For a platform to be reliable, it must either have a single implementation, or be so utterly simple that it can be implemented uniformly. If we assume a practical need for open, freely implementable standards, the only option is simplicity.
I've been working on a canvas-mostly render library. Some things like text input would require using the native component to some degree; e.g., the input element could be invisible while the canvas would render the text field in its place.
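A rough sketch of that invisible-input approach, assuming a single text field: the real <input> keeps focus (so IMEs, autocomplete, and mobile keyboards keep working) while the canvas paints whatever it contains.

    const canvas = document.querySelector("canvas")!;
    const ctx = canvas.getContext("2d")!;

    const hidden = document.createElement("input");
    hidden.style.position = "absolute";
    hidden.style.opacity = "0";          // keep it focusable but invisible
    hidden.style.pointerEvents = "none";
    document.body.appendChild(hidden);

    function paintTextField(): void {
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      ctx.strokeRect(20, 20, 300, 30);        // the fake text box
      ctx.font = "16px sans-serif";
      ctx.fillText(hidden.value, 26, 40);     // mirror the input's contents
    }

    canvas.addEventListener("mousedown", () => hidden.focus()); // clicking the fake box focuses the real input
    hidden.addEventListener("input", paintTextField);
    paintTextField();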
Honestly, why even bother with the DOM? Even for "article" sites; you can have pixel perfect control over your rendering, especially if you even implement your own anti-aliased bezier curve rendering algorithms. There's already a library that processes fonts (opentype.js) and returns the parameters of the bezier curves that need to be drawn to construct the glyphs.
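A rough sketch of that opentype.js idea; the font URL is a placeholder, and you could equally call path.draw(ctx) instead of replaying the bezier commands yourself (replaying them is what you'd do if you wanted your own curve and anti-aliasing code).

    import * as opentype from "opentype.js";

    opentype.load("/fonts/SomeFont.woff", (err, font) => {   // placeholder font path
      if (err || !font) { console.error(err); return; }
      const canvas = document.querySelector("canvas") as HTMLCanvasElement;
      const ctx = canvas.getContext("2d")!;
      const path = font.getPath("Hello, canvas", 20, 80, 48); // text, x, y, fontSize

      ctx.beginPath();
      for (const cmd of path.commands) {
        if (cmd.type === "M") ctx.moveTo(cmd.x, cmd.y);
        else if (cmd.type === "L") ctx.lineTo(cmd.x, cmd.y);
        else if (cmd.type === "C") ctx.bezierCurveTo(cmd.x1, cmd.y1, cmd.x2, cmd.y2, cmd.x, cmd.y);
        else if (cmd.type === "Q") ctx.quadraticCurveTo(cmd.x1, cmd.y1, cmd.x, cmd.y);
        else ctx.closePath(); // "Z"
      }
      ctx.fill();
    });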
You're fine still using HTML/CSS for a bunch of reasons (lighter static sites, already have developed codebases for each).
But, why not own the whole thing in TS/JS? For web applications, you'd then have pixel perfect consistency across browsers and mobile devices. There are two ways Flutter, for example, compiles to a web application[0]: either to a mix of canvas, CSS, SVG, and HTML, or just to pure WebGL Canvas using a WASM build of Skia (that's right - the engine that renders HTML/CSS in the first place).
So with WebAssembly and the Canvas, especially taking into account the upcoming WebGPU API which will unlock even better render performance, I'd rather have my whole application be defined in JS/TS and own all of the experience down to the pixel -- complete control. Obviously this is assuming convenient libraries (in WASM/JS) for rendering text/curves/etc. to canvas will exist.
Not a Googler, but I've done many projects in both Canvas and HTML/CSS/JS.
Canvas is actually really hard to get right. You basically are given these super basic tools and have to go do everything yourself. With HTML and CSS you're standing on the shoulders of giants. With Canvas you're drawing arcs and squares and lines.
That being said, canvas is the kind of thing if you DO get it right, it's awesome. It's just fast. And really portable: every platform supports a canvas of some kind, and the primitives tend to be really similar.
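To make the "arcs and squares and lines" point concrete: even a throwaway button means drawing the box, drawing the label, and writing your own hit test, all things the DOM gives you for free with <button>. A sketch:

    function drawButton(ctx: CanvasRenderingContext2D, x: number, y: number, w: number, h: number, label: string): void {
      ctx.fillStyle = "#eee";
      ctx.fillRect(x, y, w, h);
      ctx.strokeRect(x, y, w, h);
      ctx.fillStyle = "#000";
      ctx.font = "14px sans-serif";
      ctx.textBaseline = "middle";
      ctx.fillText(label, x + 8, y + h / 2);
    }

    function hitTest(x: number, y: number, px: number, py: number, w: number, h: number): boolean {
      // No :hover, no focus ring, no accessibility tree -- that's all on you.
      return px >= x && px <= x + w && py >= y && py <= y + h;
    }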
Am I the only one who doesn't really have any performance issues with Google Docs?
The initial loading speed is aggravating - but that is just down to the size of the codebase. Any rendering latency when I type is imperceptible to me.
Putting a few hundred words and a few images onto the screen is presumably very quick whatever technology one uses to render it.
I suspect that accuracy of rendering and consistency across platforms is the bigger concern here.
There are enough differences between how different browsers and operating systems render text via the DOM to make it very hard to get consistent results between them all. By using canvas they can get very close to pixel-perfect consistency. (Especially if they bring their own FreeType library or similar along.)
> You can already get nearly pixel-perfect consistency by rendering each character as a separate dom element at specific X-Y coordinates.
You could do that for every character, but the size of the DOM and updating it will kill your performance. You're better off just drawing the characters yourself into a canvas then.
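A minimal sketch of the canvas side of that trade-off, using measureText for per-character advances instead of one absolutely positioned DOM element per character; note this naive version loses kerning and ligatures across character boundaries.

    function drawLine(ctx: CanvasRenderingContext2D, text: string, x: number, y: number, font: string): number[] {
      ctx.font = font;
      const xPositions: number[] = []; // per-character offsets, handy later for hit testing / caret placement
      let cursor = x;
      for (const ch of text) {
        xPositions.push(cursor);
        ctx.fillText(ch, cursor, y);
        cursor += ctx.measureText(ch).width; // per-char widths ignore kerning across characters
      }
      return xPositions;
    }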
In my opinion they should wait a bit and build a solid C++/WebGPU on WebAssembly renderer to skip a few steps and avoid doing the same work a few years down the line.
The web was designed to display documents, not to host applications. It has been stretched very far, but the result is also extremely complex and kind of slow.
> ... Over the course of the next several months, we’ll be migrating the underlying technical implementation of Docs from the current HTML-based rendering approach to a canvas-based approach to improve performance and improve consistency in how content appears across different platforms.
This article talks about some of the tradeoffs between Canvas and DOM approaches:
The thing that seems least appealing about replacing DOM rendering with Canvas rendering is the lack of CSS. You may gain performance, but assuming even a fraction of the responsibilities CSS handles so well seems very daunting.
That's why I suspect this approach will do best where the content least resembles a read-only document and more resembles an interactive application or game, such as Google Docs.
I give the gdocs team credit for reaching out to developers of Chrome extensions that may be affected. My accessibility Chrome extension [1] will be broken by this change, and I reported this when asked by the gdocs team a couple months back.
I don't know if feedback from my company and others might sway this decision, or if it's set in stone (it's kind of weird to ask for feedback on something that has already been fully decided). I also don't know if they'll be providing more details regarding the timing of the rollout, which seems important since this will break third-party products.
They unfortunately already do. The state of contentEditable on the web is such that you need to if you want to have a non standard experience like google docs.
I think the trend of rendering web apps in canvas is about to explode, and it will ironically be promoted by one of the biggest defenders of HTML: Google. The reason, I believe, is that they no longer need to parse web apps via the DOM, because instead they are (or will be) using machine vision.
So I'm using Office/OneDrive online and the web versions of Word/Excel. The reason I don't use Google Docs is that I think the UI is inferior to Microsoft's solutions. And I'm pretty sure they are DOM-based (for the most part); in fact, some Excel operations even execute a server round trip. Google Docs is nice because of the add-on ecosystem that AFAIK doesn't exist with Office 365. But I don't like the editors, as they are inferior versions of Word/Excel...
Also, why is it still so hard to produce or edit PDF files online today, or fill PDF forms online? Neither Google nor Microsoft provides that function. And no, I don't want the PDF file to be converted into some Word-like file; I want the PDF to keep its layout/fonts...
Flutter web can use both DOM and Canvas, but the default is Canvas. I haven't to this day seen any disadvantages of using Canvas, except possibly speed on slow mobile devices.
I am also wondering about this. Having recently built a production app in Flutter Web used by millions of people, I am very eager to hear about other people's experiences with Flutter Web in real life applications.
I've been coding Flutter mobile apps for a couple of years, but have only toyed around with the web target. In my experience it seemed kinda off. Scrolling lists ignored my native scroll speed and janked, text manipulation felt laggy, etc. It reminded me of the good old horrible days when Flash-based websites were widespread. I tested it right after the 2.0 release, so perhaps they've fixed it since.
EDIT: Also IIRC the bundle size of a release bundle was like 3mb which was a big yikes.
Yeah, that was our primary risk - the actual main.dart.js file becoming too large. We host the JS asset on CloudFront and the web app from somewhere else, which makes it much better. As far as Flutter Web itself, I think it only became production-ready for the web in the last 3 months - literally! It was a bit of a risky choice, but it worked out well for us.
I'm a Flutter dev and last time I checked out the web target it resulted in massive bundle sizes and bad performance. This was a couple of months or so ago, but still after their "official" 2.0 release which took web from beta feature to stable.
EDIT: Removed hyperbole: performance was bad, but not unusable
I don't think that's true. Flutter is having a shitload of momentum right now. Just recently Canonical embraced Flutter as a first class option for building Wayland apps. Flutter is running on Google hardware such as smart speakers. Of course you also have the ever-looming Fuchsia OS that will have Flutter as the premier toolkit for apps. Etc. etc.
Except that Flutter is nowhere to be seen beyond AdWords, the new buggy Google Pay, and a couple of cases that Google sponsored to have something to show on stage during Flutter conferences.
And Flutter isn't even something that the whole company buys into, hence the counter-movement from the Android team and JetBrains with Compose, or the whole PWA and Project Fugu push from the Chrome team.
Fuchsia remains to be seen if it is going to be another Android or follow Brillo's footsteps.
Flutter is used in an increasing number of production apps. I'm getting frequent recruitment offers for Flutter roles. So regardless of any skepticism you might have for the longevity of the project, saying that it's only relevant for the Ad Words people is just not true. It's being adopted, and being embraced, way outside of the googleplex.
I'd be very curious to hear whether they looked at SVG. I assume it's less performant, but maybe for a word processor it would be about equal. It's easy to imagine canvas outperforming SVG for a spreadsheet app.
This is very bad for accessibility. Rendering things in canvas means there is no DOM, which means nothing for screen readers to read. To be fair, the old google docs editor was also very bad for a11y, but at least there was potential to improve it. This removes that option. They could have worked with the Chrome team to help improve standards for everyone, but decided to reimplement the rendering engine from scratch without regard for accessibility. I'm very disappointed that this wasn't flagged at a company the size of Google.
The example that Google shows in their post has accessibility features [0], although currently these need to be switched on with a keyboard shortcut (⌘+Option+Z).
Pay attention to `document.getElementById('docs-aria-speakable')`.
Yeah it's pretty bad to have accessibility be a mode though. Also, since it's completely non-standard, the usual screen reader navigation keys don't work as expected and users have to learn a completely custom interface. Better than nothing but it's not great.
Google Docs, from a user's point of view, is an incredible bit of software engineering. It has revolutionized my work flow and collaboration with partners. So whatever you engineers did to make it happen: KUDOS!
The only thing that worries me is Google's finickiness. I have a lot of data on Google drive and many important Docs. When I hear of Google just cancelling someone's account or abandoning some software I get a bit nervous. I keep backups of course, but it is unnecessary stress.
I'll be shocked if they manage to create a canvas which will render without off-by-1 pixel dimensions and blurry text when I'm using fractional DPI scaling or browser zoom (not Google Docs zoom). VSCode's canvas terminal failed, Chrome's PDF reader failed...
EDIT: Hmm, it actually works, at 125% DPI and various browser zoom levels, the text isn't blurry. Maybe that's linked to how the page doesn't change size when you resize your window.
This whole discussion reminds me of work we did in the early 80's trying to convert typesetting tapes into TeX/Metafont for display/printing. One of the things we tried was converting scanned images into rectangles which could then be typeset by TeX using 'rules' of various sizes. Another approach was to convert the scans into specialized fonts, either whole or in contiguous chunks that could be assembled.
The sample doc isn't editable, presumably to hide the fact that in addition to overriding standard scrolling it also doesn't support international text entry or, 50/50, standard and custom key shortcuts from the OS.
[edit: I was right - cmd-f doesn't use the proper find ui, cmd-e doesn't update the search pasteboard, and cmd-f doesn't track the system search pasteboard.]
I wonder how they will _measure_ the text. You can use the range API to get the client rects for individual letters of text nodes in DOM, and then you could render that into a canvas letter-by-letter. You'd get kerning etc handled by the browser. But canvas has no such API afaik. Maybe Google has implemented their own kerning?
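For reference, these are the measurement primitives that do exist today: Range.getBoundingClientRect on the DOM side, and ctx.measureText on the canvas side, which only returns aggregate metrics for a whole string rather than per-glyph rects. A common workaround is to measure successive prefixes, so kerning within the run is at least handled by the browser's own shaper. A sketch, with hypothetical helper names:

    // DOM side: a Range around a single character yields its on-screen rect.
    function charRectInDom(textNode: Text, index: number): DOMRect {
      const range = document.createRange();
      range.setStart(textNode, index);
      range.setEnd(textNode, index + 1);
      return range.getBoundingClientRect();
    }

    // Canvas side: measure the prefix up to the caret to get its x offset.
    function caretXOnCanvas(ctx: CanvasRenderingContext2D, text: string, index: number, font: string): number {
      ctx.font = font;
      return ctx.measureText(text.slice(0, index)).width;
    }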
It will be interesting to see if Google Docs undergoes a feature explosion and achieves parity with MS Word after the rendering engine enhancements.
I wonder what competitive pressures Google is responding to with this decision. My guess would be adding features that facilitate selling into businesses that have a significant dependency on MS Office.
I had to put in a lot of effort to write a usable slide rendering engine for my presentation app SlideMagic that was super precise, scales up and down without issues, and still enabled clicking things in the UI. If Google has a good canvas-based engine, hopefully they will open source it as a new standard.
As someone who spends an inordinate amount of time in Docs, some speedups would definitely be welcome... but if this is going to break the workflows I'm using now (most importantly Zotero integration) you can be damn sure I'll go back to Word.
It's sort of clear they intend to do a rewrite of Google Docs in C++ or some other systems language and compile it to WebAssembly because of the better tooling.
That's why they went with canvas instead of WebGL, since wasm+GL was never widely adopted or complete.
They have already been using that for spreadsheets where you can't copy things. You may think... "I'm a 1337 developer, I will use Developer Tools" and select elements and run JS. But it's canvas rendering, so nope.
The canvas 2D API[1] has support for rendering text, but it is ... limited. For instance, there's no inbuilt multiline support. Which means that if you're serious about rendering text to the canvas, then you have to reimplement a whole host of things yourself in Javascript.
Another worry is that browsers have an inconsistent approach to interpreting `ctx.font = "CSS font string"` - Safari, for instance, does not support the 'bold' keyword; the only way to get that browser to display bold text is to supply it with an already emboldened font.
However with a lot of work, it's possible to do some amazing things with text. Below, links to a couple of demos of text rendering using my own canvas library[2][3].
[1] I highlight the 2D engine because the Google Workspace team specifically linked to the W3C HTML Canvas 2D Context specs in their announcement. I understand that rendering text in the WebGL context is far more complicated. https://www.w3.org/TR/2dcontext/
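As one example of the "whole host of things" you end up reimplementing, here's the classic greedy word wrap that the lack of multiline support forces on you; deliberately naive, with no hyphenation, bidi, or complex-script handling.

    function wrapText(ctx: CanvasRenderingContext2D, text: string, maxWidth: number, font: string): string[] {
      ctx.font = font;
      const lines: string[] = [];
      let current = "";
      for (const word of text.split(/\s+/)) {
        const candidate = current ? current + " " + word : word;
        if (ctx.measureText(candidate).width <= maxWidth || !current) {
          current = candidate;     // fits, or a single word wider than the line (kept anyway)
        } else {
          lines.push(current);
          current = word;
        }
      }
      if (current) lines.push(current);
      return lines;
    }

    // Then draw line by line:
    // wrapText(ctx, paragraph, 600, "16px serif").forEach((line, i) => ctx.fillText(line, 20, 40 + i * 22));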
Huh, I thought it would already be based on canvas, but I'm not sure why. I believe I observed some behavior or looked into the developer console once and got the impression that they can't be using DOM.
My understanding of canvas is that it’s a 2D bitmap you can draw to. So how is text selection (and a keyboard up in mobile Safari) working here? Do they have hidden text elements behind the canvas or something?
Meaning they're creating text dynamically with the click event? I'd assume that'd be chicken-and-egg where the browser wouldn't respond to the click event if there's no text to select?
But that's not happening, as I'm able to select text on iOS Safari and have it give me the copy/paste options above it. If the text wasn't available, it would also completely break any kind of screen reader / accessibility feature so something has to be supporting this at a native-browser level.
I don't know about iOS, but on Firefox Desktop, the right-click menu is emulated by the canvas, and it uses the clipboard API to copy it.
> If the text wasn't available, it would also completely break any kind of screen reader / accessibility feature so something has to be supporting this at a native-browser level.
When using a screen reader, it tells you to press Ctrl+Alt+Z, which loads an iframe containing (only?!) the first paragraph of the text, so it is made available to screen readers independently.
I have no idea if this would work for editing though.
Invisible text inserted in the DOM with Javascript (forgive the French, it ignores browser settings and seems to use geolocation):
<div id="docs-aria-speakable" class="docs-offscreen-z-index" aria-live="assertive" role="region" aria-atomic="true" aria-hidden="false">Pour activer la compatibilité avec le lecteur d'écran, appuyez sur Ctrl+Alt+Z Pour connaître les raccourcis clavier, appuyez sur Ctrl+barre oblique</div>
Wonder what happens in a mobile device context, where you can't execute such a shortcut.
But that's interesting, especially the class="docs-offscreen-z-index". This to me suggests that they do have shadow elements underneath the canvas to support various functionality.
And this is not new, they already do that with the current dom renderer. One div for each wrapped line of text, divs for selection, cursor is a CSS animated div, etc
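A rough sketch of that offscreen-DOM pattern, with invented names (the docs-aria-speakable id above is Google's; nothing here is): keep a visually hidden live region in sync with whatever the canvas shows, so screen readers have something to announce.

    const live = document.createElement("div");
    live.id = "a11y-live-region";                 // illustrative id
    live.setAttribute("role", "region");
    live.setAttribute("aria-live", "assertive");
    live.setAttribute("aria-atomic", "true");
    // Standard "visually hidden but not display:none" styling:
    Object.assign(live.style, { position: "absolute", width: "1px", height: "1px", overflow: "hidden", clip: "rect(0 0 0 0)" });
    document.body.appendChild(live);

    function announce(text: string): void {
      live.textContent = text; // e.g. the paragraph the caret just moved into
    }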
Am I right to assume that parts of the interaction UX are now also rendered by canvas? I looked at the source, and even text selections etc. no longer have DOM tree elements.
I didn't see any mention of it in the article, but I wonder if this has some code-sharing with Flutter. IIRC, Flutter uses canvas when rendering in the browser.
...how long before someone strips down the browser and ships a runtime with no DOM, just canvas and WASM/V8? Very platform-independent, and perhaps fast too...
The DOM is just not the way to build complex UI. We need control over the lower-level draw loop and graphics; it's a pain to build a high-level system on top of another high-level system, and it leads to hacks and unintuitiveness like React hooks. Building a UI framework on canvas is way more fun and clean, and the only way (in the browser) to think about UI programming from first principles. I dream of a browser environment where DOM/HTML is an optional library.
HTML Canvas is really smooth, especially with all the graphics-acceleration work. AFAIK it's smart enough to skip work for elements that aren't in the viewport. I believe text rendering was an issue, but it's mostly usable now.
I have a vague memory of some people who installed the https://github.com/panicsteve/cloud-to-butt extension for fun, thinking it only modified text on the screen - but since it modified the DOM, the changes somehow ended up modifying the open Google Docs.
Despite the fears of locking down the web, this is really neat from a UI/UX perspective. I'm assuming they're using Flutter to do the rendering. Is anyone working on an open-source JS framework that also renders UI to canvas?
So Google is betting on Flutter's ethos of canvas rendering for everything. I really hope this doesn't catch on. If Figma, a design tool, doesn't need to use canvases, then neither does Google Docs.
This is incredibly destructive, an attack on the core technologies & interoperability of the DOM. It's highly concerning, highly scary that they would do this.
Slowly adding more features to the web, a system really not designed for those features, seems to have been the only way to get a really cross platform application ecosystem. Very weird for it to happen through this random technology, but obvious in retrospect.
The DOM is really, really fast; we have spent decades refining the process, and there is deep tooling to support it built right into the browser. What a bunch of nonsense.
Please don't call names in HN comments. Your post would be fine (well, a little unsubstantive and flamebaity, but nothing too serious) without the swipe at the end.
If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful. We're trying for a bit higher quality of discussion here, if possible.
Looking at the DOM in isolation, it's pretty quick. The primary reason web apps are slow is that the business logic is typically un-optimized.
I've encountered a web app that persisted its state by stringifying it and putting it into localStorage with every UI change. The resulting string was ~5MB and took 200-300ms each time (freezing the UI) on my tablet.
I've encountered an app that used an in-house built chart library using jquery and D3. For each chart rendered it created a 50MB array of x-axis labels that wasn't cleaned up when the chart was destroyed.
The reasons why these inefficiencies exist are another topic. But there's nothing stopping someone from creating a complex web application that's performant.
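As a sketch of that first anecdote and its usual mitigation (names and timings invented): don't serialize a multi-megabyte state object synchronously on every UI change; debounce the write so it happens once the user pauses.

    let pending: number | undefined;

    function persistStateDebounced(state: unknown, delayMs = 500): void {
      if (pending !== undefined) clearTimeout(pending);
      pending = window.setTimeout(() => {
        // Still a blocking stringify, but it now runs once per pause rather
        // than once per keystroke or click.
        localStorage.setItem("app-state", JSON.stringify(state));
        pending = undefined;
      }, delayMs);
    }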
> The primary reason web apps are slow is because the business logic is typically un-optimized.
This crops up a lot, shifting the blame from the web technologies to the developers who target those technologies, but where are the exceptions? What's the best example of a large, complex, efficient and responsive web-based application? It's easy to give examples of tragically inefficient and unresponsive web-based applications (Teams, Slack), but no one ever seems able to give a counterexample.
Pinboard and Hacker News are fast and efficient precisely because they make minimal 'application-like' use of the web platform.
Visual Studio Code is generally pretty responsive, but if I understand correctly there's little question that it uses vastly more computational resources than if it had been built using a 'conventional' (non-web-based) GUI toolkit.
Figma is one of my favorite examples of a nontrivial webapp that avoids feeling like a webapp a lot of the time, even when running in a browser instead of their Electron wrapper.
But their technical leadership contains some of the (arguably) most accomplished folks working in the Javascript world these days, they might be an outlier in this area.
This exchange is absolutely hilarious to me. We've got people complaining about Google doing something while praising Figma for not being so foolish or evil and extolling them as being exceptionally clever. Except oops, turns out Figma is doing the same thing that is believed to be foolish or evil.
It turns out HN is prone to the same sort of knee jerk reactions as most every other "social news" site (even if to a lesser degree). Who would've thought?!
> But there's nothing stopping someone from creating a complex web application that's performant.
Google--the folks who make V8--seem to disagree with you here. If they find canvas rendering to be more performant, then I'm going to believe them absent more information.
A little bit of both. The DOM enforces a specific model of presentation focused on a structural, hierarchical, declarative description with layout including an automated vertical bias. Great if I want to layout something "document-shaped," a colossally bad match to the problem domain if I just want to draw a pentagon and animate it around the screen.
The libraries exist to mitigate the huge problem-domain gap between the DOM model and other models.
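For the pentagon example, the canvas version really is just computing five vertices and redrawing every frame; a minimal sketch:

    const canvas = document.querySelector("canvas")!;
    const ctx = canvas.getContext("2d")!;

    function drawPentagon(cx: number, cy: number, radius: number, rotation: number): void {
      ctx.beginPath();
      for (let i = 0; i < 5; i++) {
        const angle = rotation + (i * 2 * Math.PI) / 5 - Math.PI / 2;
        const x = cx + radius * Math.cos(angle);
        const y = cy + radius * Math.sin(angle);
        if (i === 0) ctx.moveTo(x, y); else ctx.lineTo(x, y);
      }
      ctx.closePath();
      ctx.fill();
    }

    function frame(t: number): void {
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      const x = 100 + 80 * Math.sin(t / 1000); // drift back and forth
      drawPentagon(x, 100, 40, t / 1000);      // spin while drifting
      requestAnimationFrame(frame);
    }
    requestAnimationFrame(frame);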
Out of curiosity, I just tested a 100-comment thread on my phone with Firefox. Folding it was instantaneous, unfolding it took ~4 seconds. That makes me wonder what takes so long, given that the initial page load didn't take anywhere near that long.
Same here; just this comment's tree starts to have some perceptible latency on my OnePlus 8 Pro (which should be pretty powerful). In contrast, I have some Qt apps where unfolding trees with tens of thousands of nodes is instant.
Tried it with a sub-tree of ~240 comments just now. Less than a second on Firefox. Perfectly usable and acceptable in my opinion. To me there are probably hundreds of more annoying or slower things currently out there on the web that frustrate me.
"Less than a second" is definitely not an acceptable metric for any kind of UI latency. It should be at most a frame FFS, and that means 8 milliseconds on today's screens.
It’s a blanket statement because the time to render a standards-based user interface is very small. Once we bring in the latency of the internet (or, on the desktop side, a database or any other networked resource), all bets are off.
It makes sense for Google to make this change. Why should I care? It's not like anybody has cared about open document formats in this space anyway. This never was and never will be an open platform.
I find it frustrating because this is just another nail in the coffin for open documents.
The immense majority of my experience on the web is slow, and on the non-web it is fast; it's definitely not 50/50. I actually wouldn't be able to name a slow desktop app I use.
The DOM is abysmally slow for everything except one use case: displaying a simple text page with a couple of images.
The core of the web is designed to display that on a 1990s computer in a single rendering pass. Everything bolted on top adds layers of complexity and indirection resulting in a laughably slow and inefficient system.
The web can't even reliably animate an item in a list for laughing out loud.
React (not just React, but they "meme"'d it) sold everyone on the DOM being slow. What they actually meant was that if you add and remove 5,000 elements one at a time in a loop, it's slow. If you don't do that, it's plenty fast, and all of React's tricks to "make it fast" are just overhead bloating and slowing down your "app". (FWIW, I actually kind of like writing React, if I must do front-end JS, so I'm not hating on it just for hating's sake.)
> What they actually meant was that if you add & remove 5,000 elements one at a time in a loop, it's slow.
Ok, but it's not trivial to build complex web-based applications that don't do exactly that. This is why there are so many frameworks out there that take care of that problem for you. React isn't the only one. Angular, Vue, and Svelte also handle this for the developer, using various different approaches. It's not just a gimmick. If the web standards themselves were up to the task, there'd be far less need for frameworks.
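To make the "5,000 elements one at a time" point concrete, here is the difference between touching the live DOM per item and batching into a DocumentFragment, which is roughly the kind of batching (plus diffing) that the frameworks automate:

    function appendSlow(list: HTMLElement, items: string[]): void {
      for (const item of items) {
        const li = document.createElement("li");
        li.textContent = item;
        list.appendChild(li); // dirties the live tree each time; any interleaved layout read forces a reflow per item
      }
    }

    function appendFast(list: HTMLElement, items: string[]): void {
      const fragment = document.createDocumentFragment();
      for (const item of items) {
        const li = document.createElement("li");
        li.textContent = item;
        fragment.appendChild(li); // off-DOM, no layout work yet
      }
      list.appendChild(fragment); // one insertion into the live DOM
    }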
I would agree that companies have mostly settled on paying 20% as much for development, at the cost of 5-10x the memory footprint and 5-10x the input latency in the finished program. I think the web tends to get hit the hardest by that because if a company is targeting the web for an "app" in the first place, they've already chosen cost savings over performance and UX, so they'll be prone to make even more trade-offs for the same reason within the web platform.
> I think the web tends to get hit the hardest by that because if a company is targeting the web for an "app" in the first place, they've already chosen cost savings over performance and UX
Or it could be they're choosing a ubiquitous, open, cross platform distribution model for their app?
I disagree, I prefer to Google something and have it instantly in my browser instead of having to find and install yet another app, then uninstall it when I'm done with it. Especially for stuff I only use once/rarely.
Also, I might be using the latest iPhone, an old Android, Windows or Linux, or hell maybe my TV browser, and it's most likely going to work. The app wouldn't exist on at least half those platforms.
There certainly are issues with webapps, but done well it can be a really good UX.
You're right that the web scores extremely well on portability and on having zero installation hassle. The points about efficiency still stand though. To put that another way: the web's upsides are real, and so are its downsides.