Thank you, bzbarsky. Please help us (me) understand the situation a bit better if you have the time.
1. Could you please describe a bit more about the browser-CDM interaction and what is implemented there?
2. Why do you think the browser-CDM interaction was left unspecified? Wouldn't a standard be beneficial to all parties, even CDM developers (no need to back-and-forth with browser developers: just follow the standard)?
3. For a browser to support a CDM, is a developer required to write CDM-specific browser code? That is, if CDM APIs are not standardized, then does the browser need to be modified to accommodate each API? Maybe this is obvious but I can't believe this is the state of things.
4. I, and I believe many others, have been under the impression from the beginning that EME was intended to globally constrain CDM behaviour. What you've described in this thread is entirely different. EME is just an API for CDM-script interaction and nothing more. Meanwhile, these blobs are integrated into the browser and the extent to which they're constrained is up to the browser developers. Unlike an NPAPI plugin, there is no standard for what they're allowed to do or know.
It occurs to me now that a standard defining browser-CDM interaction would never come from the W3C as it is simply outside their scope (i.e. Web standards, not browser standards). CDMs can choose where to run today because there wasn't enough interest (or coordination) in establishing a standard browser-agnostic environment for them to run in. Now the CDMs are here, entrenching themselves, and the time to establish this environment is long gone. Is this an accurate representation?
> Could you please describe a bit more about the browser-CDM interaction and what is implemented there?
I don't really know what this interaction looks like in non-Firefox browsers. Last I checked, the CDMs Chrome ships didn't work with Chromium, but I don't know whether that's still true, and I don't know whether the browser-side bits involved are implemented at all in Chromium or just in Chrome. Likewise, I don't know whether the CDM interaction bits in Safari are in the public WebKit repo or not. IE's source is not available, of course.
In the case of Firefox, https://hacks.mozilla.org/2014/05/reconciling-mozillas-missi... really does cover most of the details. We put together an API that made sense on our (Firefox) end internally. We then worked with some CDM vendors to integrate their products, by building shims to convert the API their CDMs exposed to the API we wanted to be using internally. That's probably all I can say on the subject.
https://hsivonen.fi/eme/ has a reasonably in-depth discussion of the way these bits fit together from someone who was much more intimately involved in this than I was.
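To make the shim idea concrete, here is a rough sketch of what "convert the API their CDMs exposed to the API we wanted to be using internally" could look like in shape. Every name in it (`BrowserCdmApi`, `VendorCdm`, `VendorCdmShim`, `open`, `process`) is invented for illustration and the XOR "decryption" is a toy; it is not Firefox's real internal API or any vendor's real interface.

```python
from typing import Protocol


class BrowserCdmApi(Protocol):
    """Hypothetical internal API the browser wants to call (an assumption)."""

    def create_session(self, init_data: bytes) -> str: ...
    def decrypt(self, session_id: str, sample: bytes) -> bytes: ...


class VendorCdm:
    """Stand-in for a vendor CDM with its own, differently shaped API (invented)."""

    def __init__(self) -> None:
        self._keys: dict[int, int] = {}

    def open(self, blob: bytes) -> int:
        handle = len(self._keys) + 1
        # Toy "key" derived from the init data; a real CDM would obtain a license.
        self._keys[handle] = sum(blob) % 256
        return handle

    def process(self, handle: int, data: bytes) -> bytes:
        key = self._keys[handle]
        # XOR stands in for real decryption.
        return bytes(b ^ key for b in data)


class VendorCdmShim:
    """Adapter that exposes VendorCdm through the browser's internal API."""

    def __init__(self, cdm: VendorCdm) -> None:
        self._cdm = cdm
        self._handles: dict[str, int] = {}

    def create_session(self, init_data: bytes) -> str:
        # Translate the vendor's integer handle into the browser's session id.
        handle = self._cdm.open(init_data)
        session_id = f"session-{handle}"
        self._handles[session_id] = handle
        return session_id

    def decrypt(self, session_id: str, sample: bytes) -> bytes:
        return self._cdm.process(self._handles[session_id], sample)
```

The point of the pattern is that the browser only ever codes against its own interface, and each vendor gets its own shim. That per-vendor shim is exactly the CDM-specific browser code asked about in question 3.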
> Why do you think the browser-CDM interaction was left unspecified?
Because the people writing the spec pushed back pretty explicitly on doing so, claiming that this would take too much time and overconstrain things too much in terms of both CDM and browser implementations.
> Wouldn't a standard be beneficial to all parties, even CDM developers (no need to back-and-forth with browser developers)
The CDM developers I'm aware of are Google, Apple, Microsoft, and Adobe. Three of these are also browser developers, who are shipping their own CDM in their own browser. Two of those three, along with Netflix, happened to be the spec editors.
There was literally zero incentive for them to standardize the browser/CDM interaction, and some incentives to NOT do so. So they didn't.
> For a browser to support a CDM, is a developer required to write CDM-specific browser code?
Yes. Not just that, but for actual CDMs on the market the developer is also required to work with the CDM vendor to accept that particular browser as a trusted enough party.
This is because CDMs are supposed to prevent the decoded data from being captured, so they must either handle their own on-screen display or do so via an intermediary they trust. See also the "What does this mean for downstream users of the Firefox code base?" section of https://hacks.mozilla.org/2014/05/reconciling-mozillas-missi... and note that in the setup described there the CDM basically bakes in some sort of signature of the actual browser _binary_ that it's willing to work with. So just compiling the same, or worse yet slightly modified, source is not enough to get something that works with the same CDM.
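As a toy illustration of why a rebuilt binary fails such a check: the actual mechanism real CDMs use is not described here, so the SHA-256 allowlist below is purely an assumption, and `ToyCdm`, `fingerprint`, and `will_work_with` are made-up names.

```python
import hashlib


def fingerprint(binary: bytes) -> str:
    """Hash the bytes of a binary (an assumed stand-in for a real signature)."""
    return hashlib.sha256(binary).hexdigest()


class ToyCdm:
    """Hypothetical CDM that only cooperates with host binaries it recognizes."""

    def __init__(self, trusted_fingerprints: set[str]) -> None:
        self._trusted = trusted_fingerprints

    def will_work_with(self, host_binary: bytes) -> bool:
        return fingerprint(host_binary) in self._trusted


# Placeholder byte strings standing in for two builds of the same source.
official_build = b"browser build, official"
rebuilt = b"browser build, official "  # differs by a single byte

cdm = ToyCdm({fingerprint(official_build)})
print(cdm.will_work_with(official_build))
print(cdm.will_work_with(rebuilt))
```

Any check keyed to the exact bytes of the binary has this property: a one-byte difference yields a different fingerprint, so a downstream build is rejected unless the CDM vendor explicitly adds it to the trusted set.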
> Maybe this is obvious but I can't believe this is the state of things.
It's totally the state of things.
> I, and I believe many others, have been under the impression from the beginning that EME was intended to globally constrain CDM behaviour.
EME describes a set of things that a CDM must effectively support. This means that a browser can demand that a CDM run in a sandbox that limits its interactions with the outside world to whatever is needed to support the EME APIs. This is the approach Firefox is taking with its CDMs.
Of course the CDM vendor can tell the browser vendor to go take a hike with its sandboxing demands and simply refuse to run in such a sandbox. Then the browser vendor can either back down or not ship that particular CDM.
There was a lot of talk about how EME opened the _possibility_ of CDMs that were more constrained than NPAPI plugins are (because the NPAPI includes all sorts of stuff, whereas a CDM could be built with a much smaller and more sandboxable API). And some people (the Netflix ones in particular, iirc) sure made it sound like this possibility would be a definite reality. And to some extent they were right: the CDMs in Firefox are certainly a lot more sandboxed than NPAPI plugins! But that's because Firefox decided to make it so, and EME somewhat enabled it to make that decision, and the CDM vendors involved agreed to play along.
> (i.e. Web standards, not browser standards)
I'm not sure the distinction is that meaningful.
That said, the W3C can, when it wants to, work with other standards bodies on joint things. Examples include WebSocket (API defined by W3C, wire protocol defined by IETF), WebRTC (similar), JavaScript (API and integration points defined by W3C, language defined by ECMA), and probably other things I'm forgetting. If people had really cared about standardizing the browser/CDM interaction and had really decided that the W3C was the wrong venue for it (which is not obvious), another venue could have been found.
> because there wasn't enough interest (or coordination) in establishing a standard browser-agnostic environment for them to run in.
Correct. The only interest expressed in such a thing was from Mozilla and Opera, as I recall. Oddly enough, those were the only major browser vendors that were not also CDM vendors. What a coincidence!
> Is this an accurate representation?
I think the time to establish such an environment is not any more gone than it used to be, because nothing much has changed. Apple, Google, and Microsoft are still both browser vendors and CDM vendors, and still not interested in standardizing CDM stuff. Mozilla could create a "standard" on its own, but it would be rather meaningless in practice. And the problem of CDMs wanting to authenticate exactly who they're talking to on the binary level would remain.