“In order to keep using your 80386SX, you’ll need to pay $500/year plus 10% of your power bill to keep it on. Your Intel account has been migrated to a Broadcom account, except we lost your password, so you’ll have to reset it via email. Please sign the new EULA which says we can enter your property and look for evidence in a manner that would make an FBI lawyer in a FISA court blush.”
"The Willamette and Northwood cores contain a 20-stage instruction pipeline. This is a significant increase in the number of stages compared to the Pentium III, which had only 10 stages in its pipeline. The Prescott core increased the length of the pipeline to 31 stages."
Many of the old tricks no longer work the same way, because the decoder now breaks instructions down into micro-ops. You may end up with hand-written "RISC-style" code that is worse than what Intel's or AMD's microcode produces, and the CPU can optimize the CISC instructions itself if you let it. Lower instruction-cache pressure can still be valuable, though.
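As a rough sketch of what I mean (x86-64, GCC-style inline asm; the function names are just illustrative): a single microcoded rep movsb against a hand-rolled byte loop, where the "dumb" CISC version is often at least as fast on recent cores:

    #include <stddef.h>

    /* Single CISC instruction; on recent Intel/AMD cores the microcode
       (ERMSB/FSRM) often matches or beats a hand-unrolled copy loop. */
    static void copy_rep_movsb(void *dst, const void *src, size_t n)
    {
        __asm__ volatile("rep movsb"
                         : "+D"(dst), "+S"(src), "+c"(n)
                         :
                         : "memory");
    }

    /* Hand-written "RISC-style" byte loop: more instructions, more
       i-cache pressure, and usually no faster on an out-of-order core. */
    static void copy_byte_loop(void *dst, const void *src, size_t n)
    {
        unsigned char *d = dst;
        const unsigned char *s = src;
        while (n--)
            *d++ = *s++;
    }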
Speculative execution and branch prediction have also gotten vastly faster since then.
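Which is why a lot of the classic branch-avoidance bit tricks no longer buy much. A small sketch (plain C; the bithack relies on arithmetic right shift of a negative int, which is implementation-defined but what GCC/Clang do):

    /* Classic branch-avoidance trick: abs() via a sign mask. */
    static int abs_bithack(int x)
    {
        int mask = x >> 31;        /* all ones if negative, else zero */
        return (x + mask) ^ mask;
    }

    /* The obvious version: with today's predictors (and compilers, which
       usually emit cmov or the same mask sequence) it's just as fast. */
    static int abs_plain(int x)
    {
        return x < 0 ? -x : x;
    }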
Compilers themselves have gotten much better since then as well, so you can often get away with just intrinsics instead of hand-written assembly.
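For example, a minimal SSE sketch (the function name and the alignment/size assumptions are mine): the intrinsics pin down the vector instructions while the compiler still handles register allocation and scheduling:

    #include <stddef.h>
    #include <immintrin.h>

    /* Add two float arrays four lanes at a time. Assumes n is a multiple
       of 4 and the pointers are 16-byte aligned. */
    static void add_f32(float *dst, const float *a, const float *b, size_t n)
    {
        for (size_t i = 0; i < n; i += 4) {
            __m128 va = _mm_load_ps(a + i);
            __m128 vb = _mm_load_ps(b + i);
            _mm_store_ps(dst + i, _mm_add_ps(va, vb));
        }
    }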
Yes, I know. This was around 2016, and even at that time you needed to build a custom image to fit OpenWRT in 4MB of flash (I could fit LuCI and a few packages, but I tinkered a lot with OpenWRT settings, and with every change there was about a 1 in 4 chance of blowing past the 4MB limit, which you would only discover after the build finished).
32MB is more reasonable, but even then I remember it causing issues with the bufferbloat scripts at the time, which is one of the reasons I went with the Gargoyle firmware instead of vanilla OpenWRT.
In English: Thank you!