Ah, I didn't realize that always happened. I thought it was only if you did something that might have OS specific rendering characteristics (text-draws, etc).
Unfortunately canvas (rgb'ish) can't overlay as efficiently as <video> (yuv'ish), so there is some power cost relative to the lowest power video overlays.
It really only matters in long form content where nothing else on the page is changing though.
Basically after detecting silence for 30 seconds or so it switches from a sink backed by the OS audio device to a null sink.
Note: since this uses a different clock than the audio device, we have received some reports that, when the context is finally used, there can be some distortion at specific tones. The workaround is for sites to use the suspend/resume API mentioned in the article.
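For anyone who wants to try it, the workaround is just the standard AudioContext suspend/resume calls. A minimal sketch (the function names and the 0.2 s beep are only illustrative):

  const ctx = new AudioContext();

  // Suspend once you expect a long stretch of silence (e.g. the UI goes idle),
  // so the context isn't ticking against the wrong clock in the background.
  async function goIdle(): Promise<void> {
    await ctx.suspend();
  }

  // Resume right before producing sound again; this re-attaches the context
  // to the real audio device and its clock.
  async function playBeep(): Promise<void> {
    if (ctx.state === "suspended") {
      await ctx.resume();
    }
    const osc = ctx.createOscillator();
    osc.connect(ctx.destination);
    osc.start();
    osc.stop(ctx.currentTime + 0.2);
  }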
It's unfortunately Chromium-only for now, and I wanted to keep code simple. I've got a PoC lying around with VideoFrame and whatnot, but I thought this would be better for a post.
I still remembered my id from all those years ago, 1569200. I was excited to read others were logging in with their old numbers, so I tried the password I thought I had used, but no luck.
Very cool! I think it's missing some entries though. I'm pretty sure we've had at least one in third_party/ffmpeg. Those fixes often land upstream first which might make tracking difficult.
Even if you have crash reporting disabled there should be a .dmp generated somewhere in the user profile directory. Manually uploading that to a bug at https://crbug.com/new would allow a Chrome developer to debug it.
If you can't share the dump for similar reasons to why you have crash reporting disabled, you can build minidump_stackwalk from Chromium and use it to generate an unsymbolized stack trace that you can post to the bug. A Chrome developer can then symbolize it.
Thanks for the nice write-up! I work on the WebCodecs team at Chrome. I'm glad to hear it's mostly working for you. If you (or anyone else) have specific requests for new knobs regarding "We may need more encoding options, like non-reference frames or SVC", please file issues at https://github.com/w3c/webcodecs/issues
And I have some issues with the copyTo method of VideoFrame: on mobile (Pixel 7 Pro) it is unreliable and outputs an all-zero Uint8Array beyond 20 frames, to the point that I am forced to render each frame to an OffscreenCanvas instead. Also, the many frame output formats around RGBA/R8, with reduced range (16-235) or full range (0-255), make it hard to use in my convoluted way.
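For context, the two read-back paths look roughly like this (just a sketch of the pattern, not my exact code):

  // Direct path: copy the frame's pixel data into a typed array.
  async function readPixels(frame: VideoFrame): Promise<Uint8Array> {
    const data = new Uint8Array(frame.allocationSize());
    await frame.copyTo(data);
    return data; // layout depends on frame.format (e.g. I420, NV12, RGBA)
  }

  // Fallback path: draw the frame and read RGBA back from a 2D context.
  function readViaCanvas(frame: VideoFrame): ImageData {
    const canvas = new OffscreenCanvas(frame.displayWidth, frame.displayHeight);
    const ctx = canvas.getContext("2d")!;
    ctx.drawImage(frame, 0, 0);
    return ctx.getImageData(0, 0, canvas.width, canvas.height);
  }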
Please file an issue at https://crbug.com/new with the details and we can take a look. Are you rendering frames in order?
Android may have some quirks due to legacy MediaCodec restrictions around how we more commonly need frames for video elements: frames only work in sequential order, since they must be released to an output texture to be accessed (and releasing invalidates prior frames, to speed things up on very old MediaCodecs).
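In practice that means consuming outputs in arrival order and closing each frame promptly; a rough sketch of that pattern (the canvas setup is just illustrative):

  const canvas = new OffscreenCanvas(1280, 720);
  const canvasCtx = canvas.getContext("2d")!;

  const decoder = new VideoDecoder({
    // Render (or copy out) each frame and close it right away so the
    // underlying MediaCodec buffers can be recycled.
    output: (frame: VideoFrame) => {
      canvasCtx.drawImage(frame, 0, 0);
      frame.close();
    },
    error: (e: DOMException) => console.error(e),
  });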
* maybe possible already, but it’s not immediately clear how to change the bitrate of the encoder dynamically when doing VBR/CBR (seems like you can only do it with per-frame quantization params which isn’t very friendly)
* being able to specify the reference frame to use for encoding p frames
* being able to generate slices efficiently / display them easily. For example, Oculus Link encodes 1/n of the video in parallel encoders and decodes similarly. This way your encoding time only contributes 1/n frame encode/decode worth of latency because the rest is amortized with tx+decode of other slices. I suspect the biggest requirement here is to be able to cheaply and easily get N VideoFrames OR be able to cheaply split a VideoFrame into horizontal or vertical slices.
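On the last point, the cheapest way I can see to express slices with today's API would be cropping with visibleRect and feeding each strip to its own encoder. A rough sketch (whether this is actually zero-copy, and whether arbitrary crop rectangles are accepted for a given pixel format, is implementation-dependent):

  // Split a frame into n horizontal strips via visibleRect cropping.
  function splitIntoStrips(frame: VideoFrame, n: number): VideoFrame[] {
    // Keep the strip height even, since 4:2:0 formats may require alignment.
    const stripHeight = Math.floor(frame.codedHeight / n / 2) * 2;
    const strips: VideoFrame[] = [];
    for (let i = 0; i < n; i++) {
      strips.push(new VideoFrame(frame, {
        visibleRect: {
          x: 0,
          y: i * stripHeight,
          width: frame.codedWidth,
          height: stripHeight,
        },
      }));
    }
    return strips;
  }

  // Each strip would then go to its own VideoEncoder:
  //   encoders[i].encode(strips[i]); strips[i].close();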
* Does splitting frames in WebGPU/WebGL work for the use case here? I'm not sure we could do anything internally (we're at the mercy of hardware decode implementations) without implementing such a shader.
> what kind of scheme are you thinking beyond per frame QP
Ideally I'd like to be able to set the CBR / VBR bitrate instead of some vague QP parameter that I manually have to profile to figure out how it corresponds to a bitrate for a given encoder. Of course, maybe encoders don't actually support this? I can't recall. It's been a while.
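For reference, the declarative side looks like this today; changing the target mid-stream presumably means a full reconfigure (the codec string and numbers are illustrative, and whether reconfigure preserves encoder state is implementation-dependent):

  const encoder = new VideoEncoder({
    output: (chunk, metadata) => { /* mux / send */ },
    error: (e) => console.error(e),
  });

  const config: VideoEncoderConfig = {
    codec: "avc1.42001f",     // illustrative
    width: 1280,
    height: 720,
    framerate: 30,
    bitrate: 2_000_000,       // 2 Mbps target
    bitrateMode: "constant",  // or "variable"
  };
  encoder.configure(config);

  // Later, the only obvious way to "change the bitrate dynamically":
  encoder.configure({ ...config, bitrate: 4_000_000 });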
> Does splitting frames in WebGPU/WebGL work for the use case here? I'm not sure we could do anything internally (we're at the mercy of hardware decode implementations) without implementing such a shader.
I don't think you need a shader. We did it at Oculus Link with existing HW encoders and it worked fine (at least for AMD and NVidia - not 100% sure about Intel's capabilities). It did require some bitmunging to muck with the NVidia H264 bitstream to make the parallel QCOM decoders happy with slices coming from a single encoder session* but it wasn't that significant a problem.
For video streaming, supporting a standard for webcams to deliver slices with timestamped information about the rolling shutter (+ maybe IMU for mobile use cases) would help create a market for premium low-latency webcams. You'd need to figure out how to implement just-in-time rolling shutter corrections on the display side to mitigate the downsides of rolling shutter, but the extra IMU information would be very useful (many mobile camera display packages support this functionality). VR displays often have rolling shutter, so a rolling-shutter webcam + display together would really make it possible to do "just in time" corrections for where pixels end up to adjust for latency. I'm not sure how much you'd get out of that, but my hunch is that if you work out all the details you should be able to shave off nearly a frame of latency glass to glass.
Speaking of adjustments, extracting motion vectors from the video is also useful, at least for VR, so that you can give the compositor the relevant information to apply last-minute corrections for that "locked to your motion" feeling (counteracts motion sickness).
On a related note, with HW GPU encoders, it would be nice to have the webcam frame sent from the webcam directly to the GPU instead of round-tripping into a CPU buffer that you then either transport to the GPU or encode on the CPU - this should save a few ms of latency. Think NVidia's Direct standards but extended so that the GPU can grab the frame from the webcam, encode & maybe even send it out over Ethernet directly (the Ethernet part would be particularly valuable for tech like Stadia / GeForce now). I know the HW standards for that don't actually exist yet, but it might be interesting to explore with NVidia, AMD, and Intel what HW acceleration of that data path might look like.
* NVidia's encoder supports slices directly and has an artificial limit on the number of encoder sessions on consumer drivers (they raised it in the past few years but IIRC it's still anemic). That, however, means that the generated slices have some incorrect parameters in the bitstream if you want to decode them independently, so you have to muck with the bitstream in a trivial way so that the decoders see independent, valid H264 bitstreams they can decode. On AMD you don't have a limit on the number of encoder sessions.
Ah I see what you mean. It'd probably be hard for us to standardize this in a way that worked across platforms which likely precludes us from doing anything quickly here. The stuff easiest to standardize for WebCodecs is stuff that's already standardized as part of the relevant codec spec (e.g, AVC, AV1, etc) and well supported on a significant range of hardware.
> ... instead of round-tripping into a CPU buffer
We're working on optimizing this in 2024; we do avoid CPU buffers in some cases, but not as many as we could.
> It'd probably be hard for us to standardize this in a way that worked across platforms which likely precludes us from doing anything quickly here. The stuff easiest to standardize for WebCodecs is stuff that's already standardized as part of the relevant codec spec (e.g, AVC, AV1, etc) and well supported on a significant range of hardware.
As I said, Oculus Link worked with off-the-shelf encoders. Only the NVidia one needed some special work, and even that's no longer needed since they raised the number of encoder sessions (and the amount of work was really trivial - just adjusting some header information in the H.264 framing). I think all you really need is the ability to either slice a VideoFrame into strips at zero cost and have the user feed them into separate encoders, OR to request sliced encoding and implement it under the hood however works (either multiple encoder sessions or the NVidia slice API if using NVENC). You can even make support for sliced encoding optional and implement it just for the backends where it's doable.
I'm currently working with WebCodecs to get (the long awaited) frame-by-frame seeking and reverse playback working in the browser. And it even seems to work, though the VideoDecoder queuing logic gives me some grief here. Any tips on figuring out how many chunks have to be queued for a specific VideoFrame to pop out?
An aside: to work with video/container files, be sure to check out the libav.js project, which can be used to demux streams (WebCodecs doesn't do this) and even as a polyfill decoder for browsers without WebCodecs support!
Thanks. I appreciate that making an API that can be implemented with the wide variety of decoding implementations is not an easy task.
But to be specific, this is a bit problematic with I-frame-only videos too, and with optimizeForLatency enabled (that does make the queue shorter). I can of course .flush() to get the frames out, but this is too slow for smooth playback.
I think I could just keep pushing chunks until I see the frame I want come out, but that would have to be done in an async "busy loop", which feels a bit nasty. But I think this is also what the "official" examples do.
Something like "enqueue" event (similarly to dequeue) that more chunks after last .decode() are needed to saturate the decoder would allow for a clean implementation. Don't know if this is possible with all backends though.
Often Chrome doesn't know when more frames are needed either, so it's not something we could add an API for unfortunately.
Yes, just feeding inputs 1 by 1 for each dequeue event until you get the number of outputs you want in your steady state is the best way. It minimizes memory usage. I'll see about updating the MDN documentation to state this better.
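A rough sketch of that pattern for the seek case (assuming "chunks" is an array of demuxed EncodedVideoChunks starting at a keyframe and "config" is a matching VideoDecoderConfig, e.g. from your demuxer):

  function decodeUpTo(
    chunks: EncodedVideoChunk[],
    config: VideoDecoderConfig,
    targetTimestamp: number,
  ): Promise<VideoFrame> {
    return new Promise((resolve, reject) => {
      let next = 0;
      let done = false;
      const decoder = new VideoDecoder({
        output: (frame) => {
          if (!done && frame.timestamp >= targetTimestamp) {
            done = true;
            resolve(frame); // caller closes this frame when finished with it
          } else {
            frame.close();  // not the frame we want; release it immediately
          }
        },
        error: reject,
      });
      decoder.configure(config);
      // Feed exactly one chunk at a time; the "dequeue" event tells us when
      // the decoder has consumed it and is ready for the next one.
      const feedOne = () => {
        if (done) return;
        if (next < chunks.length) decoder.decode(chunks[next++]);
        else decoder.flush().catch(reject); // drain whatever is left
      };
      decoder.addEventListener("dequeue", feedOne);
      feedOne();
    });
  }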
Wow, great to see some work in this space. I've been wanting to do reverse playback, frame-accurate seek, and step-by-step forward and back rendering in the browser for esports game analysis. The regular video tag gets you some of the way there, but navigating frame by frame will sometimes jump an extra frame. Likewise, trying to stop at an exact point will often be 1 or 2 frames off where you should be. Firefox is much worse: when pausing at a time you could be +-12 frames from where you should be.
I must find some time to dig into this, thanks for sharing it.
I have it working with WebCodecs, but currently only for i-frame-only videos, and all the decoded frames are read into memory. Not impossible to lift these restrictions, but the current WebCodecs API will likely make it a bit brittle (and/or janky). For my current case this is not a big problem, so I haven't fought with it too much.
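The brute-force version of that approach looks roughly like the sketch below: decode everything up front, keep the VideoFrames in an array, and step or reverse by index. Memory grows with clip length, so this only really works for short clips ("chunks" and "config" are assumed to come from a demuxer):

  async function decodeAll(
    chunks: EncodedVideoChunk[],
    config: VideoDecoderConfig,
  ): Promise<VideoFrame[]> {
    const frames: VideoFrame[] = [];
    const decoder = new VideoDecoder({
      output: (frame) => frames.push(frame),
      error: (e) => console.error(e),
    });
    decoder.configure(config);
    for (const chunk of chunks) decoder.decode(chunk);
    await decoder.flush(); // resolves once every frame has been emitted
    return frames;         // draw frames[i] with drawImage(); close() when done
  }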
Figuring out libav.js demuxing may be a bit of a challenge, even though the API is quite nice as traditional AV APIs go. I'll put out my small wrapper for these in a few days.
Edit: to be clear, I don't have anything to do with libav.js other than happening to find it and using it to scratch my itch. Most demuxing examples for WebCodecs use mp4box.js, which really makes one a bit uncomfortably intimate with the guts of the MP4 format.
It's just a lot of work to get everything right. It's kind of working, but I removed synchronization because the signaling between the WebWorker and AudioWorklet got too convoluted. It all makes sense; I just wish there was an easier way to emit audio.
While you're here, how difficult would it be to implement echo cancellation? The current demo is uni-directional but we'll need to make it bi-directional for conferencing.