If that app ran in a 640x480 mode, memory accesses would be just as fast as the ...

Retr0id · on Nov 29, 2023

SDL is simply not optimised for this use-case, but that doesn't mean it can't be done. If you're writing single-threaded code (which I assume BBC Basic is!), you can cut out the locking and directly poke a 640x480 pixel array in memory (3 or 4 bytes per pixel, depending on the format you choose). This part is extremely fast, as fast as your memory subsystem can go.

Then, convert that into a texture and send it to the GPU once per frame. This is the only added overhead relative to the old ways. If you pick the right format for your in-memory buffer (probably ARGB-8888, perhaps with a certain row stride) then that conversion is a no-op. The "correct" format depends on your hardware (which is why this isn't a typical workflow), but even a non-trivial pixel format conversion is fast at 640x480.
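To make the "poke an array directly" part concrete, here is a minimal sketch in plain C of a CPU-side ARGB-8888 buffer with a row stride. The `Framebuffer` type and the `fb_*` function names are hypothetical, made up for illustration; the SDL side is only described in a comment, on the assumption that the buffer gets uploaded once per frame with something like `SDL_UpdateTexture` on a streaming `SDL_PIXELFORMAT_ARGB8888` texture.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical CPU-side framebuffer: one 32-bit ARGB-8888 value per pixel. */
typedef struct {
    int width, height;
    int pitch;        /* bytes per row; may be larger than width * 4 */
    uint8_t *pixels;  /* height * pitch bytes, zero-initialised */
} Framebuffer;

/* Returns 1 on success, 0 if the allocation failed. */
static int fb_init(Framebuffer *fb, int width, int height, int pitch)
{
    assert(width > 0 && height > 0);
    assert(pitch >= width * 4 && pitch % 4 == 0);
    fb->width = width;
    fb->height = height;
    fb->pitch = pitch;
    fb->pixels = calloc((size_t)height, (size_t)pitch);
    return fb->pixels != NULL;
}

static void fb_free(Framebuffer *fb)
{
    free(fb->pixels);
    fb->pixels = NULL;
}

/* Start of row y; rows are `pitch` bytes apart, not `width * 4`. */
static uint32_t *fb_row(const Framebuffer *fb, int y)
{
    return (uint32_t *)(fb->pixels + (size_t)y * (size_t)fb->pitch);
}

/* No locking, no API call: plotting a pixel is a single store. */
static void fb_put_pixel(Framebuffer *fb, int x, int y, uint32_t argb)
{
    assert(x >= 0 && x < fb->width && y >= 0 && y < fb->height);
    fb_row(fb, y)[x] = argb;
}

/* Reading a pixel back is a single load from ordinary memory. */
static uint32_t fb_get_pixel(const Framebuffer *fb, int x, int y)
{
    assert(x >= 0 && x < fb->width && y >= 0 && y < fb->height);
    return fb_row(fb, y)[x];
}

/* Opaque grey "xor texture"; t shifts the pattern sideways per frame. */
static void fb_fill_xor(Framebuffer *fb, int t)
{
    for (int y = 0; y < fb->height; y++) {
        uint32_t *row = fb_row(fb, y);
        for (int x = 0; x < fb->width; x++) {
            uint32_t v = (uint32_t)((x + t) ^ y) & 0xFFu;
            row[x] = 0xFF000000u | (v << 16) | (v << 8) | v;
        }
    }
    /* Assumed upload step, once per frame (not part of this sketch):
     * SDL_UpdateTexture(texture, NULL, fb->pixels, fb->pitch); */
}
```

Because reads go to the CPU-side buffer rather than to the GPU, a GetPixel-style lookup costs the same as a plot: one memory access.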

If you wanted to send a 3840x2160 texture to your GPU at 60Hz, that requires "only" about 2 GB/s of bandwidth (at 4 bytes per pixel), which I think you'll find in most modern systems. This is pretty inefficient, which is why we don't do it, but it can be done.
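The arithmetic behind that figure, as a small C sketch. The function name `upload_bytes_per_second` is hypothetical, and the 4 bytes per pixel is my assumption (it matches ARGB-8888 and is what makes the number come out at roughly 2 GB/s).

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per second needed to re-upload a full frame on every refresh. */
static uint64_t upload_bytes_per_second(uint32_t width, uint32_t height,
                                        uint32_t bytes_per_pixel,
                                        uint32_t refresh_hz)
{
    assert(bytes_per_pixel > 0 && refresh_hz > 0);
    /* Widen before multiplying so large frames cannot overflow 32 bits. */
    return (uint64_t)width * height * bytes_per_pixel * refresh_hz;
}
```

`upload_bytes_per_second(3840, 2160, 4, 60)` comes to 1,990,656,000 bytes per second, i.e. just under 2 GB/s, while `upload_bytes_per_second(640, 480, 4, 60)` is only 73,728,000, about 74 MB/s.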

Edit: I think this Stack Exchange answer gives a good summary of how to actually do this in SDL2, but I haven't tried it: https://gamedev.stackexchange.com/a/157608

The linked docs are also worth a read: https://wiki.libsdl.org/SDL2/MigrationGuide#if-your-game-jus...

Retr0id · on Nov 29, 2023

I made a quick moving xor texture demo; it can hit 240fps at 4K on my machine.

https://gist.github.com/DavidBuchanan314/2f68ff42df047a448be...

vkaku · on Nov 29, 2023

Thank you for following up on this. We should send this code to the BBC SDL author and ask them to look at improving their own GetPixel code.

Every bit (pun intended) helps.