It is especially useful for grayscale content, as finding the optimal dithering matrix from the available palette is a straightforward exact operation, and the result can be placed in a LUT for real-time rendering.
In my opinion it looks much better than Bayer or random dithering, especially on gradients.
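Here's a minimal sketch of the LUT idea, assuming an ordered-dither-style setup; the Bayer matrix below is just a stand-in for whatever optimized matrix you actually compute, and the point is only that the per-pixel work collapses to one table lookup.

```python
# Precompute, for every 8-bit gray level and every cell of an NxN threshold
# matrix, which palette entry to output; rendering is then pure lookups.
# (Placeholder matrix and palette; not the "optimal" matrix from the comment.)
N = 4
BAYER4 = [[ 0,  8,  2, 10],
          [12,  4, 14,  6],
          [ 3, 11,  1,  9],
          [15,  7, 13,  5]]
PALETTE = [0, 85, 170, 255]            # assumed available gray levels

def build_lut():
    lut = [[[0] * N for _ in range(N)] for _ in range(256)]
    for g in range(256):
        for y in range(N):
            for x in range(N):
                t = (BAYER4[y][x] + 0.5) / (N * N)   # threshold in [0,1)
                lo = max(p for p in PALETTE if p <= g)
                hi = min(p for p in PALETTE if p >= g)
                frac = 0.0 if hi == lo else (g - lo) / (hi - lo)
                lut[g][y][x] = hi if frac > t else lo
    return lut

LUT = build_lut()

def dither_pixel(gray, x, y):
    # real-time path: one table lookup per pixel
    return LUT[gray][y % N][x % N]
```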
Do you have any examples, other than matrix multiplication, where systolic arrays are suitable? As far as I am aware, other problems require different systolic architectures, so I am curious whether you are talking about a general-purpose architecture.
I'm talking about a Turing-complete, general-purpose architecture.[1] One with some seemingly stupid choices that turn out to work well: a Cartesian grid of look-up tables (LUTs) with latches, clocked in alternating phases. This slows things down, but it makes everything completely deterministic and very easy to reason about. There are no race conditions to worry about. Like an Excel spreadsheet, it's very easy to see all of the dependencies, and circular references aren't possible in the traditional sense, because of that delay.
I'm stuck in analysis paralysis... or I'd have more than an emulator[2] to show you. In theory, you can take any expression that can be broken down into a directed graph of binary logical expressions and compile it into the "program" for the BitGrid. Because the grid is homogeneous, you can shift, rotate, or flip it to move the I/O around. The aforementioned dependency tracing makes it possible to prove that the functionality conforms to the desired logical expression graph.
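A toy sketch of the idea, if you picture each cell as four 16-entry LUTs fed by its four neighbours, with the two checkerboard phases latching on alternate half-cycles (this is a stripped-down model, not the emulator at [2]):

```python
# Each cell reads its 4 neighbours' facing outputs, feeds them through four
# 16-entry LUTs (one per output direction), and latches the results.
# Cells on one checkerboard colour update per half-cycle, so a signal can only
# advance one cell per half-cycle and nothing can race.
# (Toroidal wrap-around just keeps the sketch short.)
WIDTH, HEIGHT = 8, 8
N, E, S, W = 0, 1, 2, 3               # direction indices

# luts[y][x][d]: 16-bit truth table for output direction d of cell (x, y)
luts = [[[0] * 4 for _ in range(WIDTH)] for _ in range(HEIGHT)]
# out[y][x][d]: latched output bit of cell (x, y) toward direction d
out = [[[0] * 4 for _ in range(WIDTH)] for _ in range(HEIGHT)]

def cell_inputs(x, y):
    """Pack the four neighbour outputs facing cell (x, y) into a 4-bit index."""
    n = out[(y - 1) % HEIGHT][x][S]   # cell above, its south-facing output
    e = out[y][(x + 1) % WIDTH][W]
    s = out[(y + 1) % HEIGHT][x][N]
    w = out[y][(x - 1) % WIDTH][E]
    return (n << 3) | (e << 2) | (s << 1) | w

def half_cycle(phase):
    """Latch new outputs for cells whose checkerboard colour matches `phase`.
    Their neighbours are all on the other phase, so nothing they read is
    changing underneath them."""
    for y in range(HEIGHT):
        for x in range(WIDTH):
            if (x + y) % 2 != phase:
                continue
            idx = cell_inputs(x, y)
            for d in range(4):
                out[y][x][d] = (luts[y][x][d] >> idx) & 1

def tick():
    half_cycle(0)
    half_cycle(1)
```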
I need a kick in the pants to get the rest of this thing figured out. As near as I can tell, the only problem is my own writer's block.
The implementation is what I was thinking of. I've also heard claims that VP9 is inherently slower to encode than H.264, but no idea if that's accurate. AVC/H.264 has very broad hardware support. For example, the 2019 MBP I'm using right now can't do hardware-accelerated VP9 encoding, but even 2011-ish MBPs can do H.264 acceleration in both directions. Intel's support matrix: https://en.wikipedia.org/wiki/Intel_Quick_Sync_Video
AV1 looks like it's getting broader support, but it's still new. Zoom's release notes mention they'll use AV1 if the participants support it, and I don't see a similar note about VP8/9.
It's only in newer chips, but it's broader than VP8/9 support ever was. Intel's and Nvidia's newest chips support hardware encoding and decoding of AV1, while Nvidia never supported VP8/9 encoding at all.
This is a common misconception, but it's not the case. For example, look at the Voodoo 1, 2, and 3, which also used fixed-point numbers internally but did not suffer from this problem.
The real issue is that the PS1 has no subpixel precision. In other words, it rounds a triangle's coordinates to the nearest integer.
The likely reason they did this is that it lets you completely avoid any division and multiplication hardware: with integer start and end coordinates, line rasterization can be done entirely with additions and comparisons.
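For illustration, this is the classic Bresenham-style inner loop (the general technique, not necessarily the PS1's exact hardware): with integer endpoints, each pixel costs a couple of additions and comparisons.

```python
def raster_line(x0, y0, x1, y1):
    """Integer-only line rasterization (Bresenham)."""
    pixels = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy                      # running error term
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err                   # doubling is just a shift/add
        if e2 >= dy:                   # step in x
            err += dy
            x0 += sx
        if e2 <= dx:                   # step in y
            err += dx
            y0 += sy
    return pixels
```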
Didn't the PS1 also lack perspective-corrected texture mapping? That would definitely make textures wobbly. AFAIK they compensated for it simply by subdividing the geometry as finely as possible (which wasn't very finely, really).
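A quick sketch of why affine (screen-space) interpolation wobbles: the correct version interpolates u/w and 1/w and divides per pixel, while the affine one just lerps u directly. The numbers below are made up, only to show how far apart the two land mid-span.

```python
def affine_u(u0, u1, t):
    # what hardware without perspective correction does
    return u0 + (u1 - u0) * t

def perspective_u(u0, w0, u1, w1, t):
    # interpolate u/w and 1/w, then divide per pixel
    u_over_w = (u0 / w0) + ((u1 / w1) - (u0 / w0)) * t
    one_over_w = (1 / w0) + ((1 / w1) - (1 / w0)) * t
    return u_over_w / one_over_w

# Edge spanning from near (w=1) to far (w=4); halfway across the screen:
print(affine_u(0.0, 1.0, 0.5))                 # 0.5
print(perspective_u(0.0, 1.0, 1.0, 4.0, 0.5))  # 0.2
```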
The folks who made Crash Bandicoot were pretty clever. They figured out that the PlayStation could render untextured, shaded triangles a lot faster than textured triangles, so they "textured" the main character with pixel-scale geometry. This in turn saved them enough memory to use a higher-resolution frame buffer mode.
A few years ago a friend and I made something similar to Universe Sandbox, though only with the gravitational simulation part: https://github.com/fayalalebrun/Astraria
Surprisingly enough, the JAR still runs without issue, something which probably would not be the case for Linux binaries, though maybe for Windows ones.
Even though the newest nvidia drivers have started to support GBM, the Wayland compatibility story is still not great. In my experience on Wayland, several OpenGL programs refuse to work, and Vulkan does not work at all. This is with driver 545.
Ok long story, I'll try to keep it short.
New motherboard: the ASUS W680-ACE IPMI. It comes with an extra IPMI card.
nvidia-drm.modeset=1 is required for Wayland to render at anything other than 1920×1080.
I have a 4k monitor.
I insert the IPMI card and disable the ASPEED GPU on the card via a jumper.
The card is still active, but the BMC isn't working, despite correct cabling and all.
The ASPEED card being active clashes with the Nvidia 4070 card.
I have to disable modeset.
Effectively I can't do Wayland with the IPMI card plugged in.
Bonus fun: I ordered the motherboard via Amazon DE, which imported it from Amazon US, because even with import taxes it's over 150€ cheaper than buying directly in DE.
I contact ASUS support (which was hard enough; I can't talk to ASUS US support, only DE) and waste about a month doing pointless things, going past Amazon's instant return period.
They tell me to contact ASUS RMA.
I can only get in touch with ASUS DE RMA.
They tell me to talk to Amazon because I don't have an invoice; Amazon refuses to give me one because it's an imported product.
I contact Amazon 2 times.
Both times the support people promise me the heavens, and I have to tell them: "I can't send the motherboard back, I'm using it with about 2400€ worth of components on it, and I need the motherboard for work."
They say: no problem, we'll send you a replacement in 3 days.
Two months have now passed, and no replacement is in sight.
AMD uses GBM properly; Nvidia's support is half-baked, and they previously used EGLStreams instead because they did not like GBM. The reality is that every compositor just uses GBM.
I probably don't have nearly as much experience as you do, but I have used VHDL, Verilog, and modern HDLs like Chisel and SpinalHDL. I think the main advantage of a modern HDL is having the full power of a traditional programming language when it comes to generating hardware. This especially helps when making deeply parameterizable and reusable hardware in a fraction of the lines compared to SystemVerilog, something that is sometimes impossible to do in plain Verilog.
From a first impression, your language doesn't look all that different from SystemVerilog. Does it have any features that make parameterization easier than in SystemVerilog? Can I, for example, easily generate hardware using higher-order functions and other functional programming features like those available in Rust and Scala?
Powerful generative capabilities aren't always as useful as you might think. There are two major issues:
1. Verification - You can verify one particular part of the configuration space, but verifying the full generic component is something else entirely. As far as I'm aware, no new HDL seriously tries to address this point.
2. Implementation - If you're generating something sufficiently advanced, you likely want a different micro-architecture for different configurations to reach the optimal design (in terms of power, timing, and area). As an example, take a CPU: single-, dual-, and triple-issue cores need to be designed in very different ways. You could aim to build something that generates all of these, wrapped up as a nice CPU module with an 'IssueWidth' parameter, but that's going to be harder than just writing separate 1-, 2-, and 3-issue-width CPUs.
Certainly, for more mechanical things like interconnects, interrupt controllers, pin multiplexers, etc., it can work well. However, building those things in SystemVerilog is often done with separate generator programs anyway, and overall it doesn't consume much of the total project engineering time; it's just tedious work.
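For what it's worth, a toy sketch of that generator-program pattern (names like gen_pinmux are made up for illustration): a few lines of Python stamping out a parameterised SystemVerilog pin mux.

```python
def gen_pinmux(name: str, n_inputs: int, width: int) -> str:
    """Emit a simple one-hot-selectable pin mux as SystemVerilog text."""
    sel_bits = max(1, (n_inputs - 1).bit_length())
    ports = ",\n".join(
        f"  input  logic [{width-1}:0] in{i}" for i in range(n_inputs)
    )
    cases = "\n".join(
        f"      {sel_bits}'d{i}: out = in{i};" for i in range(n_inputs)
    )
    return f"""module {name} (
{ports},
  input  logic [{sel_bits-1}:0] sel,
  output logic [{width-1}:0] out
);
  always_comb begin
    out = '0;
    case (sel)
{cases}
      default: out = '0;
    endcase
  end
endmodule
"""

print(gen_pinmux("pinmux4", n_inputs=4, width=8))
```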
It does seem that a lot of new HDLs focus on eradicating the annoyances and tedium you get developing with SystemVerilog, but increase the difficulties in points 1 and 2, which are the actual hard bits that take up the bulk of the time.
It's early days for Veryl, but it's taking a different direction from most (i.e. just building a saner SystemVerilog), so I shall be watching with interest!
I agree with both issues.
In my Chisel work, I experienced the whole design breaking after adding a trait.
When I wanted to change a timing path related to a register, the sophisticated Chisel-style descriptions couldn't be used.
I think SystemVerilog has sufficient functionality for ASIC development.
So I think more efficient development can be achieved with modern development tools like a real-time semantic checker through a language server, a build tool that handles dependencies, and so on.
I plan to introduce generics to enable modules/interfaces/packages as type parameters.
But I'm not aiming for Chisel-like programmability.
I tried to use Chisel in a large codebase to judge whether it could be used as a SystemVerilog alternative at my company.
I found many problems that make it difficult to apply to an ASIC development flow.
I think differences in semantics from SystemVerilog cause some of these problems.
So I aim for Veryl to have almost the same semantics as SystemVerilog.
I think this makes it easier to interoperate with SystemVerilog codebases too.