To me, the appeal is that game environments can now be way more dynamic because they're not limited by prebaked lighting. The Finals does this, but it doesn't require ray tracing, and it's pretty easy to tell when ray tracing is enabled: https://youtu.be/MxkRJ_7sg8Y
Because letting you enable raytracing means the game has to support non-raytracing too, which limits how the game's design can take advantage of raytracing being realtime.
The only exception to this I've seen is The Finals: https://youtu.be/MxkRJ_7sg8Y . It's made by ex-Battlefield devs, and the dynamic environments they shipped two years ago are on a whole other level, even compared to Battlefield 6.
Not just loading times: I expect more games to do more aggressive dynamic asset streaming, too. Hopefully we'll get fewer 'squeeze through this gap in the wall while we hide the loading of the next area of the map' moments in games.
Technically the PS4 supported 2.5" SATA or USB SSDs, but yeah, the PS5 is the first generation that requires an SSD, and you can't run PS5 games off USB anymore.
The issue is cutting your inverter off from the grid without also cutting your outlets off from the grid. One way I've seen to do this is to put clamp meters on your grid connection and configure your inverter in a "zero export" mode that scales its output to your current usage. Not sure about the legality of that; I'm sure it depends on your locality.
The other is to put your outlets on a subpanel with the inverter controlling the feed to that subpanel, maybe with a second lockout connection to the main panel so you can do maintenance on the inverter without losing power entirely.
> behemoth 100B+ parameter models, but to run those I would need to invest much more into this hobby than I'm willing to do.
Have you tried the newer MoE models with llama.cpp's recent '--n-cpu-moe' option to offload the MoE expert layers to the CPU? I can run gpt-oss-120b (5.1B active parameters) on my 4080 and get a usable ~20 tk/s. I had to upgrade my system RAM, but that's an easier upgrade. https://github.com/ggml-org/llama.cpp/discussions/15396 has a bit on getting that running.
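For reference, here's roughly what that looks like on the command line (just a sketch; the model filename, --n-cpu-moe count, and context size are placeholders you'd tune for your own VRAM):

    # -ngl 99 keeps all layers on the GPU, then --n-cpu-moe N pushes the expert
    # weights of the first N MoE layers back to system RAM; raise N until it fits.
    ./llama-server -m ./gpt-oss-120b.gguf -ngl 99 --n-cpu-moe 28 -c 16384

The idea is that only the sparse expert weights end up in system RAM while the rest stays on the GPU, which is why the speed stays usable.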
I use Ollama, which offloads to the CPU automatically, IIRC. IME the performance drops dramatically when that happens, and it hogs the CPU, making the system unresponsive for other tasks, so I try to avoid it.
I don't believe that's the same thing. That sounds like the generic offloading Ollama will do for any model that's too big to fit in VRAM, while this feature is specific to MoE models. https://github.com/ollama/ollama/issues/11772 is the feature request for something similar in Ollama.
One comment in that thread mentions getting almost 30 tk/s from gpt-oss-120b on a 3090 with llama.cpp, compared to 8 tk/s with Ollama.
This feature is limited to MoE models, but those seem to be gaining traction with gpt-oss, GLM-4.5, and Qwen3.
I've been turning off my home server, even though it's a modern PC rather than old server hardware, because it idles at 100W, which is too much. I put a Ryzen 7900X in it.
Not sure if it's not properly reaching lower power states, or if it's the 10 HDDs spinning, or even the GPU. But I also don't really have anything important running on it, so I can just turn it off.
> The other difference with AMD's AI Max is that it's using a 256-bit bus compared to LPCAMM2's 128-bit bus.
Right, you'd put in two of them.
Half your data lines run to each module, and you can put them both tight against the socket, so no routing issues there.
If there's a control line that would need to be shared across both modules, and it can't be shared in a fast way, or there's some weird pin arrangement that causes problems... oh look I'm back to blaming the CPU.
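Back-of-envelope on the bandwidth side, assuming LPDDR5X-8000 on both (the exact speed grade is my assumption), two 128-bit modules land in the same place as the 256-bit bus:

    # one 128-bit LPCAMM2 module:  8000 MT/s * 128 bits / 8 = 128 GB/s
    # two modules (256 bits wide): 8000 MT/s * 256 bits / 8 = 256 GB/s
    echo "$((8000 * 256 / 8 / 1000)) GB/s"   # prints "256 GB/s"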
Is a PiKVM considered a display? I've got one attached to my home server. Together with the dedicated graphics card, it probably uses more power than a typical server motherboard's built-in KVM, but it's still cheaper and more accessible for home servers.
But that's a game design change that takes longer.