> Most likely most game devs used someone else's RSP code
Yeah, almost all RSP code used was written by SGI/Nintendo and shipped as-is in the SDK. For a long time, emulators simply high-level emulated the RSP code based on a hash of the RSP binary, and accepted that a few games like Rogue Squadron simply wouldn't run.
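For the curious, here's roughly what that hash-and-dispatch technique looks like; everything below (names, hash values) is a hypothetical sketch, not any particular emulator's code:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sketch of hash-based microcode detection: hash the
 * microcode image, look it up in a table of known SDK microcodes, and
 * run a matching high-level implementation. Anything not in the table
 * (custom microcode) simply had no HLE path. */
typedef void (*hle_task_fn)(const void *dmem);

static void hle_fast3d(const void *dmem) { (void)dmem; /* stub */ }
static void hle_f3dex2(const void *dmem) { (void)dmem; /* stub */ }

static const struct { uint32_t hash; hle_task_fn run; } known_ucodes[] = {
    { 0xDEADBEEF, hle_fast3d },  /* placeholder hash values */
    { 0xCAFEF00D, hle_f3dex2 },
};

/* Simple FNV-1a hash over the microcode image. */
static uint32_t ucode_hash(const uint8_t *imem, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) { h ^= imem[i]; h *= 16777619u; }
    return h;
}

static hle_task_fn find_hle(const uint8_t *imem, size_t len) {
    uint32_t h = ucode_hash(imem, len);
    for (size_t i = 0; i < sizeof known_ucodes / sizeof known_ucodes[0]; i++)
        if (known_ucodes[i].hash == h)
            return known_ucodes[i].run;
    return NULL;  /* unknown microcode, e.g. Rogue Squadron: no HLE */
}
```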
There's a fun conspiracy theory that Nintendo intentionally under-optimized the default RSP binary they provided, so that later in the console's life they could release an optimized version and games would appear to keep getting better. This is likely entirely rubbish, of course.
Yeah, probably rubbish. What's more likely is that it started as a fairly straightforward port of the code for SGI's RealityEngine 3D accelerators, which were focused a bit more on correct offline rendering than on real-time performance.
Not sure about your story, but there was a pretty bad bug in Mario 64, their flagship game, that degraded performance. If I remember correctly, the fix was a one-line change.
A one-line change, from a developer who consistently analyzed the decompiled code over the course of years, with 2020s tooling.
The original devs were in a time crunch in the 90s, with the limited experience anyone had with 3D games at that point. It's possible that they were aware of the bug but said "eh, it's playable" and launched it.
I think you raise the right point. I remember reading about how the team was burned out.
> What was the atmosphere like during the final days of coding?
> I think there was a lot of panicking going on. But it was still very organised, there was lots of people working very hard. I think it was quite laid back at the very end, not many bugs, gameplay was sorted out on time. One of the programmers had quite a hard time of it – two of them decided not to make games anymore because of Mario 64. Not because they didn’t enjoy it, but because they’d burnt themselves out.
You're probably thinking of the fact that the game was compiled without optimizations. IIRC Giles Goddard, who worked on the game, said it was likely intentional, in order to minimize any risks for the launch title. Later versions of the game were compiled with optimizations on.
On a related note, Kaze has done a deep dive into optimizing the Super Mario 64 source code for his ROM hacks. Here’s one of his more popular videos on the subject: https://youtu.be/t_rzYnXEQlE
Basically a very, very early programmable GPU, except you were writing actual firmware for it rather than the kind of high-level shader functions that would come a decade-plus later.
I always heard it referred to as "microcode", and there were different RSP microcode packages floating around, shipped by Nintendo/SGI, or programmed from scratch and included in the ROM.
Yeah, it was called that, but the term microcode was a bit of a stretch. It's a relatively straightforward 32-bit MIPS core with an 8x16-bit SIMD vector unit strapped to the side as MIPS coprocessor 2. One cute thing: since it can only access 4KiB of data, and load/store offsets in MIPS machine code are 16 bits (signed), you can build any pointer in the load/store instruction itself, simply offsetting from the zero register. That's in contrast to most MIPS code, where you generally need to build the pointer in a register first (or at least burn a register to be the global pointer).
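Rough C analogue of the trick (the address and names are purely illustrative): when the whole data space fits inside the load/store offset field, a constant address needs no pointer register at all.

```c
#include <stdint.h>

/* On the RSP, all of DMEM (0x000-0xFFF) fits in the signed 16-bit
 * offset field of a MIPS load/store, so a compiler can turn this into
 * a single instruction like
 *     lw $v0, 0x120($zero)
 * with no lui/addiu address setup beforehand. */
#define DMEM_U32(off) (*(volatile uint32_t *)(uintptr_t)(off))

uint32_t read_task_flags(void) {
    return DMEM_U32(0x120);  /* the entire pointer lives in the offset */
}
```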
Just to point out: you can still do this on any processor that has a zero register (otherwise you'd have to burn a register anyway, and might as well make it a real global pointer). The technique isn't made impossible by larger RAM; it just can't reach anything beyond the offset field's range (1). If that is enough for your "hot" data, you can still use it.
(1) Which is 2KiB upward on RISC-V, because immediate offsets are 12 bits signed and the upper end of the address space is usually not mapped to RAM; 4KiB is possible on RISC-V if you control the memory map. MIPS's 16-bit signed offsets reach a roomier 32KiB.
The real problem is that a lot of processors don't even map usable RAM at address zero. For instance, a lot of simpler RISC-V chips copy the basic Cortex-M memory map, which places ROM low in memory.
Even on systems with MMUs, it's very common that the kernel won't let you map RAM at address zero, for several reasons (catching NULL-pointer dereferences among them). And the sizes being talked about here are on the order of a single page anyway.
Rare had many teams working siloed and independently. The team that made Diddy Kong Racing wrote a custom microcode based heavily on the official Fast3D, with several differences. That microcode was updated and used in the team's next N64 games, Jet Force Gemini and, I believe, the cancelled Dinosaur Planet, which was later released on GameCube as Star Fox Adventures. I'm reasonably certain it was used in Mickey's Speedway USA as well, but I'm not positive.
It’s kind of a shame that the standard library’s scheduler didn’t let you enqueue your own microcode tasks for execution. I think a lot of games could have rolled their own custom SIMD logic and made some unique stuff.
This is one of the things (along with repair) that finally motivated me to learn electronics. You could probably build your own Everdrive 64 for less than $20! (Plus hundreds of dollars for an oscilloscope and other equipment, but you can keep that for the next project.)
# If you have more than 2MB flash, you need to change the flash size by adding -DFLASH_SIZE_MB={one of 2,4,8,16} here.
2MB might be a minimum, as the standard bootcode needs 1MB + 4KB. See my boot_stub or modified 6102 bootcode for ways to bypass the 1MB checksum: https://hcs64.com/n64info.html
I think it's actually a limitation of the Raspberry Pi Pico only having 2MB of flash, while some other RP2040 boards support up to 16MB. Reading SD cards with the PIO might be a bit much, so the project README mentions a 'next-gen' version with added RAM.
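For a sense of what a flag like -DFLASH_SIZE_MB ends up gating at build time, here's a hypothetical sketch (not the project's actual code; names invented):

```c
#include <stdint.h>

/* Hypothetical compile-time flash budget, driven by -DFLASH_SIZE_MB.
 * Defaults to the stock Pico's 2MB. */
#ifndef FLASH_SIZE_MB
#define FLASH_SIZE_MB 2
#endif

#define FLASH_SIZE_BYTES ((uint32_t)(FLASH_SIZE_MB) * 1024u * 1024u)

/* Per the comment above: the standard bootcode checksums 1MB + 4KB of
 * ROM, so storage below that needs a boot stub to bypass the checksum. */
#define BOOT_CHECKSUM_SPAN ((1u * 1024u * 1024u) + 4096u)

_Static_assert(FLASH_SIZE_BYTES >= BOOT_CHECKSUM_SPAN,
               "ROM storage smaller than the stock bootcode checksum span");
```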
There are knockoffs for a lot less; the author here is even using one. But then you aren't supporting the inventor of the thing, just a factory that's churning them out somewhere.
Nobody asked, but I'll list the differences I've noticed between the real Everdrive and the knockoffs: games that require a cartridge battery won't work correctly on the clones, the SD card slot is a lot cheaper on the clones, and the clones don't have a USB port (which I think is used for live debugging during game development?).
I assume there's a fair amount of work needed to develop these carts, and they probably don't ever sell in large quantities. Yes, it's expensive, but I can't imagine anyone getting rich manufacturing these. They are, however, instrumental for hardware and software preservation, so it's important for the community to support them.
It's interesting to note that the Everdrives are still actively manufactured in Ukraine, despite the immense difficulties caused by the war.
Yup. Plus Krikzz has to contend with Chinese clones copying their hardware designs and selling for significantly less. I have a few Everdrives for various systems and I've been happy with them. Some of them are pretty reasonably priced IMO like the GB Everdrives which are like the same price as a new game.
Now I wonder if there are any AOT compilers for Java besides OP's (though it's of limited utility as it doesn't implement garbage collection) and GraalVM.
There have been plenty, but they've fallen by the wayside for various reasons:
- GCJ (IIRC it only ever supported pre-1.5/1.6 Java, so never a version with generics; not sure if they ever implemented JNI, as they relied on their own native interface, so libraries with native bindings had to be manually ported, IIRC)
- Excelsior JET was a strong option on desktops for a long time, up until 2018. Its main selling point was resistance to decompilation, but I'm not sure if they ran afoul of Oracle licensing or just couldn't keep up with the accelerated pace of JDK releases in later years.
(The below were options to various degrees for iOS developers)
- Avian VM (https://readytalk.github.io/avian/) is open source and the site is still up, but it never really saw uptake or proper debug tooling IIRC; it seems inactive by now.
- RoboVM was another strong option, with solid support for IDE debuggers etc., since it was used by gamedevs and the original libgdx author was involved in it. Sadly they were sold to Xamarin shortly before MS bought Xamarin, and RoboVM was promptly shut down since MS only had interest in Xamarin for their C# iOS/Android toolkits.
- RoboVM forks: luckily the RoboVM core was liberally licensed, so forks were possible for those working on mobile games with iOS ports, even if the tooling wasn't as slick as the official RoboVM project. (No idea if any of the open-source variants have caught up; it was a bit chaotic initially, with many forks.)
- Intel had (has?) an AOT compiler for Java that was an option for libgdx developers for a while, but RoboVM, being more "native", had more eyes on it, and I have no idea if Intel ever really had a business case for its Java efforts.
(Funnily enough, I was actually writing an AOT compiler in late uni days for a thesis on game GCs, hoping to maybe commercialize it; then Oracle bought Sun and I wrote a JS AOT prototype instead. Hearing of Oracle vs Google it felt sane, but Oracle did showcase RoboVM later on, so maybe it was silly.)
I also wonder how Java Grinder compares with GCJ. Looks like it targets a lot of 8-bit micros, whereas GCJ is mostly targeting 32-bit+ C machines (via GCC backends)? (To oversimplify a lot.)
There's also a PS2 version, really cool stuff to see.
I don't know what schools are like nowadays, but when I was in school I was taught Java, and it feels like giving people ways to use that language in odd places will only bring more good and fun projects into the world.
What a wonderful project to read about. It’s too easy to get into a mindset where you auto-filter your ideas based on whether or not it’ll make money, or directly help you reach some career goal (especially in the HN bubble). This is fun. This is art. I wish I took more time to do “pointless” (in the least derogatory sense of the word) projects like this one. Bravo.
I think, for myself at least, doing ‘useless’ or passion/pet projects is a way to ‘keep up your chops’ as a developer and keep your inspiration going.
I do SEGA Saturn homebrew, mostly just experiments really (I’ve never published any of my Saturn projects so far, though I intend to for one of them), and I only do it to learn a fairly obscure and complex processor and co-processor architecture, for… the love of it?
I don’t care if I make money off it; I don’t really need to. I don’t care if people see or use it, because that’s not the purpose of the exercise. I guess that does seem almost strange, but the curiosity and learning is what drives me.
I work a 9-5 as a mobile developer and it’s honestly usually not that challenging or growth-inducing. Not that I’m complaining, I love my job and my company, but in order to stay passionate and ‘into’ programming I find doing these weird passion projects really helps.
Hope that makes sense!
EDIT: I’m also a music producer, and one exercise I like to do in a similar ‘keep up my chops’ way is to re-produce a near carbon copy of a song whose audio engineering interests me, e.g. to learn the particular techniques being used. I’ll obviously never be able to use or sell the copies, but I can re-apply the techniques I learned elsewhere.
> re-produce a near carbon copy of a song whose audio engineering interests me, e.g. to learn the particular techniques being used
Have you considered combining your enjoyment of audio production with your programming skills to develop audio plugins? It's also something that could be monetised, although the VST market seems pretty saturated. Start as a side project and see where it leads!
I've been meaning to go into SNES or Sega Genesis homebrew for a while now but can't seem to find the time. Good to hear that others are doing something fun though. :)
I just rewatched that clip. Did the writer transition from technobabble to relationship dialogue? Because the scene has this tension, and "RISC is good" can also be heard as "risk is good".
RISC changed everything just like CISC did. It was a product of changing times & tradeoffs.
Both RISC and CISC are effectively marginalized as labels, though. The current trend for everything beyond embedded is somewhere in the middle, with some pretty pointless nitpicking about whether something is or isn't RISC, devolving the entire movement down to just "but it doesn't do memory indexing!" or whatever.
It turns out that what makes a fast processor is one that uses both RISC and CISC ideas, and that's what nearly everything does these days, because speed is all that matters.
In the olden days we thought that massively complex instructions on chips with huge microcode would be just the thing. Simplify your code by calling a single machine code instruction that does a complete matrix manipulation, or solves a polynomial, kind of thing.
It turned out that making CPUs that worked that way was expensive and complicated, and if they didn't quite work you were stuck with it. Then, it further turned out that actually it was easier to be Good At Software than it was to be Good At Hardware. If you wrote a clever assembler, and a clever compiler, and some clever libraries, you still didn't need to care about solving horrible maths, you just called a function to do it.
And then we came around to the idea that it was faster to use small, simple instructions and cache a lot of them, and to keep as many of the values we were juggling in registers, so you didn't waste time on memory cycles. So CPUs got simpler instruction architectures, and all the opcode space previously used for selecting ever more baroque instructions could be used to select from an ever-growing number of registers.
Ultimately CPUs will be only able to do a lsl, mov, add, and xor, in a single conditional instruction, with 131072 64-bit registers to choose from, in a single clock cycle at 10GHz or so. People will still say "well yeah but do you really need that add?"
IIRC it's not such a meaningful distinction anymore. "CISC" x86 uses micro-operations internally. "RISC" ARM has several different instruction encodings (ARM, Thumb, Thumb-2, A64). And increasing numbers of people are working in high-level languages anyway.
From that page, which collects a number of Usenet posts by John Mashey:
> The RISC characteristics:
> a) Are aimed at more performance from current compiler technology (i.e., enough registers).
> OR
> b) Are aimed at fast pipelining
> - in a virtual-memory environment
> - with the ability to still survive exceptions
> - without inextricably increasing the number of gate delays (notice that I say gate delays, NOT just how many gates).
Point b is where RISC chips really pulled away from CISC in terms of architectural design, especially chips like the MIPS, which Mashey worked on: the MIPS had a number of points where it exposed the tricks it used to pipeline more aggressively, even at the expense of making compilers somewhat harder to write and/or making human assembly-language programmers think a bit harder. Meanwhile, the lack of complicated addressing modes (post-increment, scale-and-offset, etc.), the lack of register-memory ALU opcodes, and the total lack of memory-memory operations are still very common features of RISC design.
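A toy example of the addressing-mode point (the instruction sequences described in the comments are approximate, typical compiler output):

```c
#include <stddef.h>
#include <stdint.h>

/* Walking a buffer with pointer post-increment exercises exactly the
 * addressing modes that RISC designs dropped. */
uint32_t sum_words(const uint32_t *p, size_t n) {
    uint32_t acc = 0;
    while (n--) {
        /* A CISC like the 68000 (move.l (a0)+,d0) or classic ARM
         * (ldr r3, [r0], #4) folds load and increment into one
         * instruction; MIPS emits a plain lw plus a separate addiu,
         * keeping each pipeline stage simple. */
        acc += *p++;
    }
    return acc;
}
```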
That seems like RISC changing everything. How much did it change things? CISC has been transformed from a true competitor to RISC into a sort of abstraction layer on top of it.
That's not really true. x86 chips are not secretly RISC inside: the micro-ops often correspond to individual instructions, it has complex instructions like 'lea' that are quite efficient, and you can't really abstract away things like variable-length instruction decoding.
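For a concrete example of the 'lea' point (the assembly in the comment is typical compiler output, not guaranteed):

```c
#include <stdint.h>

/* On x86-64 a compiler typically folds all of this arithmetic into a
 * single instruction,
 *     lea eax, [rdi + rsi*4 + 3]
 * reusing the address-generation hardware for plain math -- the sort of
 * CISC-ism that doesn't decompose into anything RISC-like. */
int32_t scale_index(int32_t a, int32_t b) {
    return a + b * 4 + 3;
}
```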