"The DMA registers [...] run the full CPU 3.6Mhz speed. [...] I put a 32 byte function that would draw a scanline of polygon data in there."
The SNES supports SlowROM and FastROM but, interestingly, its internal memory is "SlowRAM". This seems like it would significantly bottleneck the system's performance. I guess "FastRAM" was just too expensive.
Another World uses SlowROM, so almost everything in the system is slow. It's a clever trick to use the DMA registers as a tiny amount of "FastRAM" and run code from there.
Something similar has been done on the Game Boy Advance - it's faster to run code out of Video RAM than directly from ROM or (the majority of) normal RAM.
There’s a FastROM conversion of Super Ghouls 'n Ghosts. The original was plagued with slowdown, while the conversion runs very well. This leads me to think that the game was originally designed with FastROM in mind, but then they took manufacturing cost into account.
How much the cost difference actually was, I don’t know. Burger Becky mentioned begging her boss to use FastROM when Interplay made RPM Racing, but she was also denied.
The trade offs of 1991!
Edit: of course Nintendo wasn’t as cost constrained. It was never an even playing field for 3rd parties.
Moving to the much later GameCube, it supported a locked cache mechanism that interoperated with DMA. That way you can keep your code in half the cache and stream the data through the other half via DMA. Code is guaranteed to run at max CPU speed and is never evicted. Data access is also fast when you orchestrate your DMA to work nicely with the cache.
Instead of fancy locked cache or other cache tricks, the PS1 and PS2 also had scratchpads that ran at full speed. 1K for PS1 / 16KB for PS2. Very useful.
What's annoying about that memory space is it's fragmented: $4300-437f is the 128-byte region for DMA registers, but $43xc-f aren't usable (well, $43xf mirrors $43xb for whatever reason. $43xc-$43xe are open bus.)
So basically every 12 bytes, you get 4 bytes that are no longer usable before the next 12 bytes.
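To make the fragmentation concrete, here is a rough Python sketch of that window. It assumes the layout is exactly as described above (8 channels, 16 bytes each, with only $43x0-$43xb treated as usable); the function names are made up for illustration.

```python
# Sketch of the DMA register window as described in the comment above.
# Assumption: each of the 8 channels offers 12 usable bytes ($43x0-$43xB),
# followed by 4 unusable ones ($43xC-$43xE open bus, $43xF mirrors $43xB).
DMA_BASE = 0x4300
CHANNELS = 8
CHANNEL_STRIDE = 16
USABLE_PER_CHANNEL = 12


def usable_runs():
    """Return inclusive (start, end) address pairs for each contiguous usable run."""
    runs = []
    for ch in range(CHANNELS):
        start = DMA_BASE + ch * CHANNEL_STRIDE
        runs.append((start, start + USABLE_PER_CHANNEL - 1))
    return runs


def runs_needed(code_size):
    """Minimum number of 12-byte runs a routine of code_size bytes must span,
    ignoring any bytes spent on getting from one run to the next."""
    return -(-code_size // USABLE_PER_CHANNEL)  # ceiling division


if __name__ == "__main__":
    for start, end in usable_runs():
        print(f"${start:04X}-${end:04X}")
    total = CHANNELS * USABLE_PER_CHANNEL
    print(total, "usable bytes;", runs_needed(32), "runs for a 32-byte routine")
```

So under these assumptions a 32-byte routine has to cross at least two of the 4-byte gaps, which is why the jump overhead matters.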
I haven't actually looked at this game's code, but it's certainly clever if the author found a way to avoid having to perform unconditional jumps in there that would sacrifice most of the gains in performance.
I'm not sure the Ricoh 5A22 has large enough immediates for it, but there is an easy way to avoid a jump over a small amount of dead memory (example in x86):
430B 3D -- -- -- -- cmp eax dead32 # only affects flags
4310 xx dowhatever
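The address arithmetic behind that listing can be checked with a few lines of Python. This is only a sketch of the x86 example as written (a 1-byte opcode at $430B followed by a 4-byte immediate); the helper names are hypothetical, and it says nothing about whether an equivalent instruction exists on the 5A22.

```python
# Sketch: which addresses does an instruction's operand occupy, and where
# does the next instruction start? Assumes a 1-byte opcode, as in the
# x86 "cmp eax, imm32" (opcode 0x3D) listing above.

def operand_addresses(opcode_addr, operand_len):
    """Addresses occupied by the operand bytes that follow the opcode."""
    first = opcode_addr + 1
    return list(range(first, first + operand_len))


def next_instruction(opcode_addr, operand_len):
    """Address of the instruction that follows."""
    return opcode_addr + 1 + operand_len


# The 4 unusable bytes after the first 12-byte run, per the thread.
DEAD_BYTES = [0x430C, 0x430D, 0x430E, 0x430F]

if __name__ == "__main__":
    covered = operand_addresses(0x430B, 4)
    print([hex(a) for a in covered])
    print(covered == DEAD_BYTES)            # the immediate sits on the dead bytes
    print(hex(next_instruction(0x430B, 4)))  # execution resumes at 0x4310
```

The point is that the CPU reads the dead bytes as an operand it does not care about, so no jump is needed to get to $4310.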
Okay, a) that's very clever, but b) mvn is really quite slow. DMA would be faster (presuming the data is on two separate buses, you can't perform RAM -> RAM DMAs.) Barring that, a manually unrolled loop in a slow memory area would definitely beat out an mvn in a fast memory area.
I guess it's easy to judge this 25 years later with all we know now. That was a very cool idea to have implemented back then! Putting the mvn there would definitely be a boost compared to having the mvn be in a slow ROM area (6 master clock cycles per byte transferred.)
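For anyone who wants to see where a number like that could come from, here is a back-of-the-envelope sketch in Python. The timings are assumptions taken from commonly cited SNES documentation, not from the article: 6 master clocks per access in fast regions (including the $43xx registers), 8 in slow regions (WRAM, SlowROM), 6 per internal CPU cycle, and mvn costing 7 CPU cycles per byte (3 instruction fetches, 1 read, 1 write, 2 internal).

```python
# Back-of-the-envelope mvn cost per byte, in master clocks.
# All constants are assumptions from commonly cited SNES timing notes.
FAST = 6      # one access to a fast region (e.g. $43xx registers, FastROM)
SLOW = 8      # one access to a slow region (WRAM, SlowROM)
INTERNAL = 6  # one internal CPU cycle with no bus access


def mvn_clocks_per_byte(code_access, src_access=SLOW, dst_access=SLOW):
    """mvn takes 7 CPU cycles per byte moved: 3 instruction-byte fetches
    (from wherever the code lives), 1 source read, 1 destination write,
    and 2 internal cycles."""
    return 3 * code_access + src_access + dst_access + 2 * INTERNAL


if __name__ == "__main__":
    from_slow_rom = mvn_clocks_per_byte(SLOW)
    from_dma_regs = mvn_clocks_per_byte(FAST)
    print("mvn in SlowROM:      ", from_slow_rom, "master clocks/byte")
    print("mvn in DMA registers:", from_dma_regs, "master clocks/byte")
    print("difference:          ", from_slow_rom - from_dma_regs)
```

With these assumed numbers, only the three instruction fetches get faster when the mvn lives in the DMA registers, since the data being copied still sits in slow memory. The difference works out to 6 master clocks per byte, which seems to line up with the figure quoted above.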
Interesting to note it was one 32 byte function in those registers netting a 10% speed increase. Reminds me of demo scene tricks. That function was for sure hand written assembly/machine code. It is something of a lost (and not needed by most people) art these days.
I feel like the nineties demoscene equivalent is taking over the machine (x86), putting some code and important variables into cache lines that never get evicted, and running all of your data through the other cache lines by mapping that data sparsely in memory.
Loved this... I laughed at the section where the developer goes all the way to scavenging an FX chip from Star Fox, soldering it to their game, and writing the appropriate code for it... just for the boss to dismiss it in 5 seconds because of the cost. It is 2020 and I have experienced the same time and time again. I am sure the developer at least got a thrill from the tech challenge at the time.
I'd recommend anyone watch Becky's video¹ on the development; she basically had to redo it three times! Her boss kept demanding more with less hardware. She's active on Twitter too, and I'm sure you'll see her discussing this article there²
Holy shit, she had to copy code into the DMA control registers to get it to blit fast enough. Which meant she had to fit the blit routine into 32 bytes. That's fucking hard core.
I did not realize there was such a gap in resolution between the Amiga version and the SNES one. When you compare them at the end of the article, it is almost night and day!
> The SNES's CPU name sounds esoteric but the Ricoh 5A22 is in fact a 6502 on steroids.
Correct me if I’m wrong, but isn’t it more a 65C816 than a 6502? Granted, the 65C816 was a better 6502, but IIRC, it’s closer to the former than the 6502.
The 65C816 had full backwards compatibility with the 6502 and starts in emulation mode after a reset, making it essentially identical to the 6502 until switched to native mode. Even in native mode it retains a lot of backwards compatibility with the 6502.
It has the same 24-bit address space as the early IBM 360 mainframes, the 80286 and a handful of other systems. It's much less cramped than the 16-bit environment of the 6502, the PDP-11, etc.
It's easy to underestimate the 6502 because other chips (the 68k series) are better on paper. However, interrupt handling is fast and DMA is efficient, so it is a good match for the family computer role.
Apple thought that the Apple ][ would be rapidly out of date so it rushed ahead with the failed Apple ///, the Lisa, Macintosh, etc. They did not realize the ][ was going to last as long as it did, otherwise they would have prioritized the Apple IIGS -- rather than the Mac which had two near-death experiences before the company became the leader in mobile.
You might want to read about the Genesis/MegaDrive version for additional context http://fabiensanglard.net/another_world_polygons_Genesis/ind... The two systems are broadly similar, especially wrt. their graphics support so the author seems to skip some things when discussing the SNES.
Back in the day, games and ports were developed solo or by tiny teams. Now we have dozens to hundreds of developers, and the chicanery is in management, cohorts of people, politics and all. Boring times in comparison.
Only AAA games have dozens to hundreds of people involved; independent dev (gaming or otherwise) is still quite similar to what it was back in the day.
-join Nvidia's "the way it's meant to be played!" "marketing program", receive a ton of $, and in return tessellate the SHIT out of everything, go deeper than pixel count, and don't forget to tessellate the ocean under the level https://techreport.com/review/21404/crysis-2-tessellation-to... ~30% drop on AMD hardware!
Note that "The Polygons of Another World" is an ongoing series, so some submissions are to the index of the series, while others (like this one) are to the latest installment at the time, which tends to change the focus of each discussion.
I'd really appreciate a comparison of how the sound was done. Although the SNES could not match the resolution of the Amiga, for my money OoTW sounds a lot better on the SNES.
From a hardware overview perspective, the Amiga's sound chip was essentially 4 "dumb" 8-bit PCM channels, so audio effects and synth stuff were fundamentally CPU driven, and I'm guessing most of the CPU horsepower was going into the 3D rendering, without much left over for sound.
The SNES has a completely separate audio subsystem, with 8 16-bit PCM channels and its own CPU and RAM purely dedicated to sound generation.
Yea, I used the YouTube transcript because I don't speak Japanese. Or English much either. If you find out what she meant, please tell me and I will fix the article.
I checked the video (https://youtu.be/tiq0OL8rzso?t=925 at 15:25): She says "日本語を話しますか?", pronounced "Nihongo wo hanashimasuka?" meaning "Do you speak Japanese?", which she also translates for the audience in English immediately afterward.
Unsure about the 3DO port, probably not. I am juggling pet projects right now. If I ever look at another version, I would go for the GBA first since the author sent me the source code and it will be easier to look at.
Maybe someone else will do the 3DO analysis? With an emulator it is not hard to look at how it works.
You will laugh, but I played this game on my PC when I was a kid and only finished it years later, when I bought an N-Gage (I really miss that phone).