Wow. And I thought modern NetBSD on a 15 MHz m68030 with a 16 bit memory bus and 10 megabytes of RAM is slow. This is crazy!
It illustrates a point I've explained to many people over the years: once computers started coming with persistent storage, open address spaces and MMUs towards the late '80s and early '90s, we basically arrived at modern computing. An Amiga 3000 or i80486 computer can run the same things as a modern computer. Sure, we have ways to run things orders of magnitude faster, and sure, we now have things that didn't exist then (like GPUs that can run code), but there's no functional difference between those machines and new ones.
I love that Dmitry shows how loosely "functional" can be defined :)
I don’t know if this was a thing in America, but in USSR in the 70s and 80s it was very popular to play chess-by-correspondence. You would literally snail-mail letters back-and-forth with your move. Games would last months or years. It added an extra challenge to chess because by the time you got a response, you might have forgotten what strategy you had had.
This project is basically Linux-by-correspondence. The challenge is here too. By the time the command produces an output, you might have forgotten why you ran it.
There was something called Agora (https://en.wikipedia.org/wiki/Agora_(web_browser)) which was sort of an email/http proxy. You could browse the web via email by sending "GET", "POST" etc. commands. It was my first exposure to the web.
There was also a system where a person could email a request for a webpage to a certain email address and the web page would be returned by email. It existed in the late 1980s, I think, or maybe the early 1990s?
M-xing https sites in Emacs, altering the format of the text queried, is a means of inferential consumption of news. Parallel strains of information and thought, inhabiting the same site, may be recollected if the medium is altered.
I installed Windows 95 on an Amiga 3000 with a 25 MHz m68030 via floppy to see if DMF formatted disks would work and to play around. By the time it finished, I had forgotten what I wanted to try out.
Reorganizing the combined correspondence in an Excel spreadsheet through an arbitrary numbering system years later, and experiencing the Spinozan "pang of bad conscience."
Did you write the PIC12F1840 HEX file in 4004 machine code for your PIC-based one-chip sound player?
15 MHz m68030 with a 16 bit memory bus and 10 megabytes of RAM -- A Mac LC II, by any chance? :)
> towards the late '80s and early '90s
By the late 1960s, really. It would probably be possible to port Linux to the IBM Model 67 [1]; it might even be easy since GCC can already target the instruction set. The MMU is sufficient. Maybe a tight squeeze with the max of 2 MB of fast core. It would be in a similar ballpark to that 68030 machine, a bit slower.
Full virtualization, and hardware enforced memory and IO boundaries, were invented early on, too. It took a while for such features to trickle down to mini- and then micro- computers. And then much longer for popular software to take advantage.
I have fond memories of the System/360 M67 (and its successors, starting with a System/370 M168) I used from 1970 to the early 1980s. It ran the Michigan Terminal System, and we had all the modern conveniences (terminal operation in the IBM world was klunky, but PDP-10s at the same time did that right). And of course Unix dates from exactly that period.
The fact that Linux runs well on a modern zSeries demonstrates that, even with all the historical baggage this 60+-year-old architecture has, it carries with it many of the architectural innovations that current OSes and languages need.
Oh man, I had so much fun on MTS once that ridiculous fake PC-ish command line was removed. Wrote all kinds of wacky stuff using the command line macros.
Wonderful look into our history, and you're undoubtedly correct about being able to target that system.
My example was about hardware that was affordable to mere mortals (although it's getting to be more expensive to buy that same hardware now than it was when it was new), but the idea is the same :)
That's basically the concept of Turing Completeness. Any Turing complete system can run anything. It may be very slow, but it will run. ChatGPT could run on a 4004, all you need is time.
I've always interpreted the definition of storage as arbitrarily large, not specifically infinite. The universe, after all, is finite. The "well, acshually" arguments aren't interesting, because they're 100% abstract.
It is defined as arbitrarily large but not infinite. That's not because of physical concerns, but because some of the theorems don't work if the memory is actually infinite.
You're comparing an a priori concept with an a posteriori one. It's like claiming the number five doesn't "acshually" exist. Like yea, it's a concept; concepts don't exist.
A universe isn't a Turing machine because it can't run all the programs that can run on a Turing machine. This isn't exactly controversial.
What's the difference between arbitrarily large and infinite? Would you say the number of possible Turing-computable functions is merely arbitrarily large and not actually infinite?
When you're talking about something like neural networks on a 4004, the "well ackshually" argument does become very much relevant. The limitations of that kind of platform are hard enough that they do not approximate a Turing machine with respect to modern software.
LLaMA takes a lot more MIPS and a lot more RAM than Linux. Linux is more complicated, but computers were running Linux 30 years ago. In this case, quantity has a quality all of its own.
Ignoring things like AVX512 you're looking at about 100x more compute to do something serious with LLaMA.
However! If you just want to demo it working, then you could generate 4 tokens using TinyLLaMA 1.1B which takes 25,164,386,466 cycles. That's about the same cost as booting Linux. So you could do TinyLLaMA if you can do Linux.
Note also that the 4004 lacks a floating-point unit of any kind - not just a vector unit. I think people make 8-bit integer quantizations of LLMs, though, which would be the fastest versions to run on a 4004.
A lot of quants just upcast to floats. Some of them work on integer multiplication using pmaddubsw. But oof, it looks like the i4004 doesn't even have that.
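For the curious, here's a minimal scalar sketch (my own illustration, not code from the article) of the inner loop that 8-bit quantization and pmaddubsw are about; pmaddubsw does the unsigned-by-signed byte multiplies and pairwise adds in one instruction, whereas a 4004 would have to synthesize even the multiply from 4-bit adds and shifts:

```c
#include <stdint.h>
#include <stddef.h>

/* Dot product of an unsigned-8-bit activation vector with a signed-8-bit
   weight vector, accumulated in 32 bits. Quantized LLM inference is
   mostly this loop, which is what pmaddubsw and friends accelerate. */
static int32_t dot_q8(const uint8_t *act, const int8_t *wgt, size_t n)
{
    int32_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc += (int32_t)act[i] * (int32_t)wgt[i];
    return acc;
}
```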
What? A Turing machine can literally be built. The only problem is supplying the infinite tape, or splicing on more when it runs to the end of a finite tape.
Machines that address storage using fixed width pointers (and have no other kind of storage) cannot be Turing machines.
You don't want your telephone exchange to halt. That's why we have the engineering marvel that is Erlang. "Engineering" being the key word. The Halting Problem isn't a real problem. Just pull the power cord.
You want every call handled in finite time. It's not that there is an unending computation in a telephone exchange, but a series of finite tasks. You can call it corecursion, coinduction or something else co-.
A computer with external connectivity to a tape machine capable of reading and writing symbols and moving along the tape would qualify as a Turing machine, provided someone feeds it enough tape for any problem that's thrown at it.
It was funny when Sun proudly and unilaterally proclaimed that Sun put the "dot" into "dot com", leaving it wide open for Microsoft to slyly counter that oh yeah, well Microsoft put the "COM" into "dot com" -- i.e. ActiveX, IE, MSJVM, IIS, OLE, Visual Basic, Excel, Word, etc!
And then IBM mocked "When they put the dot into dot-com, they forgot how they were going to connect the dots," after sassily rolling out Eclipse just to cast a dark shadow on Java. Badoom psssh!
Dmitry talks about compiling the kernel in years. While I haven't done that, I have built NetBSD-vax and NetBSD-mac68k natively on a VAXstation 4000/60 and on a Mac Quadra 610. It's on the scale of multiple months instead of years, but it's enough to give me a feel for it.
At Hackaday Supercon in 2022, the badge for attendees (https://hackaday.com/2022/10/12/the-2022-supercon-badge-is-a...) implemented a fictional 4-bit CPU along with a control panel for directly entering instructions and running and stepping through code. I had a huge amount of fun implementing a space-shooter video game on it, as the panel included a bit-by-bit view of one of its pages of memory. Comparing its Voya4 architecture with the 4004 was fascinating. Some similar tradeoffs, but the Voya4 has the benefit of 50 years of CPU instruction set exposure.
Alas, dmitrygr's method wouldn't work on the badge, as all the memory is internal to the PIC24 that implements the CPU emulator.
Wow, this was not a cheap project! Thanks, eBay collectors.
Also probably the only time I'd have gone for an LCD over a VFD. If you're running a multi-year long compile, it'll probably be burned in to hell by the end.
It's not original research. A Master's Thesis at most, or only a Bachelor's Degree Final Project. Regardless, doing work of this caliber for a blog (and geek fame) is amazing.
Sorry, bad reading comprehension on my part! I see honorary PhDs often used to express political support, instead of honest admiration of achievements.
Hm, perhaps you're right. I just meant that this is worth appraisal since it is so academic-level detailed, and I can imagine a few universities where this could be a PhD thesis. I guess some parts of the whole write-up could also rate as original research.
> Still loads the kernel faster than a virtual ISO on a server's shitty IPMI over the internet
This gave me flashbacks of trying to boot Dell M1000e blade servers from an NFS-hosted ISO running on a Raspberry Pi, which was painfully slow to boot and run.
This was a very interesting read. I have read a bit about the 4004 before so I knew it was strange. But the level of obscurity is mind-blowing. Now I just got the urge to see how well I would be able to make a CPU with the same transistor count. It's not that much fewer than a 6502. 8 bit would make it so much easier to program.
For the video, i wanted a laptop with a real serial port (no usb). This one fit the bill and was $20 on eBay. Windows 2000 is the prettiest windows IMHO, so that’s what I installed for the demo video.
This is so awesome. I hope I can expand my knowledge enough to understand most of this project; right now it's way past my limited CS proficiency.
Though my highlight (which I could completely comprehend) is "Section 14.b & 14.c - Getting the data..." All it took was 400K files (~275 photos/day after 4 years). We have so much raw processing power, storage & network, yet the most-used (probably) media-sync apps crashed or synced slowly, AirDrop failed, and there was no 'Select-All' UI feature. Crazy times we live/will live in... :)
That's kind of insulting honestly. Getting Linux to run on an i4004 is bona fide engineering. More real than engineering that we're paid to do most times. Looking at the list of Ig Nobel Prize winners it sounds like The Onion but not funny.
Many things which receive an Ig Nobel (but not all) are bona fide <thing> that just happen to be funny or have a particularly strong aspect (not necessarily in their entirety) of triviality. If I did a project like this that got enough attention and generated enough amusement to deserve an Ig Nobel, I'd be honored rather than offended to be on the same list as projects that weren't all genuinely representative works of the field.
Running a modern kernel on a 4004 just doesn't sound too far from running it on an incandescent light bulb or on a potato. No doubt sorcerer level engineering.
Running Linux on a computer? I could understand your comment if the display was something very non-conventional like a series of lightbulbs instead of digital output.
It's a cool project don't get me wrong, but Nobel Peace Prize? What is this thread? I would consider something like a strandbeest to be more award-worthy - but it too is just a neat invention, not contributing to world peace or anything near it.
There's the Turing award, which is an equivalent prize for computing. Could add an acknowledgement for strange and unusual applications of computer science.
It's not really the addressing modes, but the instruction format. Immediate values are not stored contiguously in certain RISC-V instructions.
On all MIPS instructions, the bits for an immediate add, load constant, branch, etc. are always stored in order.
On RISC-V, the bits are (sometimes) jumbled around. For example, on an unconditional branch, the bits for the destination offset are stored in the order of bit 19, bits 9-0, bit 10, bits 18-11. In hardware, reordering that is free: you just run your wires the right way to decode it. In software, you have to do a ton of bit manipulation to fix it up.
The reason RISC-V does that is to simplify the hardware design.
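To make the difference concrete, here's a rough sketch (my own, not code from the article; the helper names are made up for illustration) of decoding a RISC-V JAL jump offset in software versus a MIPS J-type target:

```c
#include <stdint.h>

/* RISC-V JAL stores its offset scattered as imm[20|10:1|11|19:12] in
   instruction bits 31..12, so a software decoder has to reshuffle it. */
static int32_t riscv_jal_offset(uint32_t insn)
{
    uint32_t imm = ((insn >> 31) & 0x001) << 20 |   /* imm[20]    */
                   ((insn >> 21) & 0x3FF) << 1  |   /* imm[10:1]  */
                   ((insn >> 20) & 0x001) << 11 |   /* imm[11]    */
                   ((insn >> 12) & 0x0FF) << 12;    /* imm[19:12] */
    return (int32_t)(imm << 11) >> 11;              /* sign-extend 21 bits */
}

/* MIPS J/JAL: the 26-bit target index is one contiguous field
   (combined with the upper bits of the current PC region). */
static uint32_t mips_j_target(uint32_t insn)
{
    return (insn & 0x03FFFFFFu) << 2;
}
```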
Well, lack of REG+REG and REG+SHIFTED_REG addressing modes handicaps it significantly. And no, it will not get magically fused by magic fusing fairies in your CPU.
Not sure about GCC, but on clang it is trivial to enable it. And if you really want to (assuming you have the hardware) you could compile exactly the same code with and without it and compare how much exactly it is handicapped if those instructions are not there.
Plus, on RISC-V multiplication/division (about which you've complained) is optional, and there is also a variant of RISC-V with only 16 registers instead of 32 (also very simple to enable on recent versions of clang, although Linux would probably need some modifications to be able to run on that).
So I'm not entirely convinced that RISC-V would be worse here.
You’re replying to the author of the post explaining why he would have to do more in software to emulate RISC-V than MIPS and that would be more effort and run slower, and you’re telling him “that’s extremely subjective”?
I wonder if there is a simple chip, similar to the 4004, 8080, or Z80, that can run at modern high frequencies — 4–5 GHz, or even higher due to its simplicity? Not much of a practical use, but it could be fun for emulation projects. 100x slower with emulation - still fast enough for retro platforms.
Jack Ganssle’s The Embedded Muse covers tools and techniques in embedded development. In one article, he reported that 4-bit MCU’s were still competitive in some sectors:
I suspected we'd see either 4- or 8-bit return for neural networks. IBM did try the Kilocore processor, which was a massively parallel, 8-bit system. Multicore 4- and 8-bitters are also the kind of thing people could build on cheaper nodes, like Skywater's 130nm.
Very impressive work, but most of the work was only necessary because the Intel 4004 was not really the first microprocessor; that was just BS propaganda used by Intel to push back the date of the launch of the first microprocessor by one year, to 1971.
The first true (civilian) microprocessor was Intel 8008, in 1972.
Intel 8008 was a monolithic implementation, i.e. in a single PMOS integrated circuit, of the processor of Datapoint 2200, therefore it deserves the name "microprocessor".
The processor of Datapoint 2200 had an ugly architecture, but there is no doubt that it was a general-purpose CPU and traces of its ISA remain present in the latest Intel and AMD CPUs.
On the other hand, the set of chips that included Intel 4004 was not intended for the implementation of a general-purpose computer, but it was intended just for the implementation of a classic desktop calculator, not even a programmable desktop calculator.
This is the reason for the many quirks of Intel 4004, e.g. the lack of instructions for the logic operations, and many others that have increased the amount of work required for implementing a MIPS emulator suitable for running Linux.
Even though the Intel 4004 was intended for a restricted application, after Intel offered to sell it to anyone, many succeeded in using it in various creative ways to implement microcontrollers for the automation of diverse industrial processes, saving some money or some space over a TTL implementation.
In the early days of the electronics industry it was very normal to find ways to use integrated circuits for purposes very different from those for which the circuits had been designed. Such applications do not make the Intel 4004 a true microcontroller or microprocessor. Very soon many other companies, and later also Intel, began to produce true microcontrollers designed for this purpose, either 4-bit or 8-bit MCUs, and then the Intel 4004 was no longer used for new designs.
I'm glad to see the Datapoint 2200 is getting attention, but by reasonable definitions of "microprocessor", the Intel 4004 was first, the Texas Instruments TMX 1795 was second, and the Intel 8008 was third. It seems like you're ruling out the 4004 on the basis of "intent" since it was designed for a calculator. But my view is that the 4004 is a programmable, general-purpose CPU-on-a-chip, so it's a microprocessor. Much as I'd like to rule out the 4004 as the first microprocessor, I don't see any justifiable grounds to do this.
Intel's real innovation—the thing that made the microprocessor important—was creating the microprocessor as a product category. Selling a low-cost general-purpose processor chip to anyone who wanted it is what created the modern computer industry. By this perspective, too, the 4004 was the first microprocessor, creating the category.
Your argument is that because the 4004 was built to power a calculator that disqualifies it as a microprocessor? Independent of the actual nature of the 4004 itself and its potential applications beyond its first intended use? Can’t see how that makes sense at all.
Your statement about Intel 'pushing back' the date to 1971 also makes little sense given Intel advertised [1] the 4004 as a CPU in Electronic News in Nov 1971.
Because of its purpose, the Intel 4004 did not have many features that had been recognized as necessary ever since the first automatic computers, for example logic operations, the lack of which was mentioned in the parent article.
Therefore I do not believe that it is possible to consider Intel 4004 as a general-purpose processor. It had only the features strictly necessary for the implementation of the Busicom calculator.
The idea to sell 4004 for other uses has appeared long after the design was finished, when Busicom did not want to pay for the chipset as much as Intel desired, so Intel decided to try to sell the chipset to other customers too, and then they thought to advertise it as a "CPU".
Moreover, it is debatable whether Intel 4004 can be considered as a monolithic processor, because 4004 was not really usable without the rest of the chipset, which provided some of the functions that are normally considered to belong into a processor.
The Intel 4004 4-bit "CPU" implemented fewer functions than the 4-bit TTL ALU 74181, which was general-purpose and which was the main alternative at that time for implementing a CPU. But it had the advantage of including many registers, because MOS registers were much cheaper in die area than TTL registers, and these included registers were the reason why a CPU implemented with the Intel 4004 chipset had a lower integrated-circuit count than the equivalent implementation with MSI TTL ICs.
Intel's advertisement of the 4004 as a CPU was an advertisement of the same kind as Tesla having "Full Self-Driving".
> Because of its purpose, the Intel 4004 did not have many features that had been recognized as necessary ever since the first automatic computers, for example logic operations, the lack of which was mentioned in the parent article.
So you're defining a set of instructions without which a device can't be considered a general-purpose computer, even if the missing instructions can be recreated in software with instructions that do exist.
Sorry, I disagree with this completely, as does every definition I've ever seen.
No, logic operations aren't "recognized as necessary". For instance, the IBM 1401—the most popular computer of the early 1960s—did not have logic operations. (This was very annoying when I implemented Bitcoin mining on it.)
The reason that the 4004 is considered a CPU and the 74181 is not a CPU is that the 4004 contains the control logic, while the 74181 is only the ALU.
Of course, "microprocessor" and "CPU" are social constructs, not objective definitions. (For instance, bit-slice processors like the Am2901 were considered microprocessors in Russia.) So you can craft your own definition if you want to declare a particular processor first. cough MP944 cough
No kidding about unusual uses of ICs. Not related to microprocessors, but I have an old analog triple conversion HF receiver (Eddystone EC958/3 for what it's worth) that uses a TTL IC in an analog circuit! I'd have to look at the schematic again, I think it's a multi-stage counter, but basically what it uses it for is to generate a comb shaped spectrum, one "spike" of which can then be picked up by an analog circuit and locked to, to generate precisely spaced tuning steps for the high stability tuning.
I'd figure the earliest thing anybody has run Linux on before this would be a 386. Although I suppose with this MIPS emulator ported to some other proto-processors it could go older, but just getting the hardware to do that would be a challenge.
IMO, just because a microprocessor has a quirky instruction set that doesn't include some standard instructions, or was made for a specific purpose in mind, doesn't make it not a microprocessor.
I didn't know the guy but he clearly knows what he's doing, it's unbelievably entertaining to read the details of achieving an impossible task with the most underpowered tool possible.
I mean, it's fun and interesting bullshit that cheats a lot. I'm sure that you could emulate a MIPS using a one-bit processor like the MC14500[0] with enough supporting hardware, real or virtual. Looking forward to it, Dmitry.
At some point you will just need to offload the actual "processing" part to some nice old chap named Dave who has himself an abacus, and every now and then you send him a letter and he moves some stones and sends a letter back with the result.
I think you have invented the most cursed form of virtualized cloud computing where the memory is connected by a very slow network.
Bit of course you can solve this by sending letters speculatively, and having Dave keep copies of his letters back to you so he reference them for recently set values, that way you can at least keep Dave busy while the letters are in flight.
The funny thing is that the 4004 was used in emulation from pretty much its beginning. It was used to make a calculator, the Busicom 141-PF. But the calculator couldn't be programmed in 4004 code directly. The 4004 ran a virtual machine, and the calculator was written in the virtual instruction set for this machine.
A chip like the 4004 could be used in some simple embedded applications using its instruction set directly. But that calculator was the original application it was developed for. Already that was too complicated, not unlike, say, booting Linux.
I think the implication is that there is probably a much more reasonable way to do what I was trying to accomplish, and I didn’t do that. That is almost certainly true. This was my first time trying to film such an extreme time lapse so it is quite possible that I wasn’t aware of some tools designed for it
In the TXR Lisp virtual machine there is a t0 register, which is always nil, and that symbol can be used as an alias for it in the assembly. This was consciously inspired by the MIPS $zero.
A write to t0 is not simply ignored. It will generate an exception. Another MIPS-like thing is that t1 is an assembler temporary.
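For comparison, a rough sketch (my own illustration, not from TXR or the article) of how MIPS's $zero is conventionally handled in a software emulator: reads of register 0 always yield 0, and writes to it are silently dropped rather than trapping.

```c
#include <stdint.h>

static uint32_t regs[32];  /* MIPS general-purpose registers */

static uint32_t reg_read(unsigned r)
{
    return (r == 0) ? 0 : regs[r];   /* $zero always reads as 0 */
}

static void reg_write(unsigned r, uint32_t v)
{
    if (r != 0)                      /* writes to $zero are ignored */
        regs[r] = v;
}
```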
The CADR had a convention for the A and M memory (essentially what we would call registers today) for the Lisp Machine microcode where the first 32 locations had special meaning. Location 2 was for constant 0, and 3 for constant -1.
Very impressive project. Recording it with a phone seems complicated though. Why not a Raspberry Pi with camera module? Or a decent webcam and any computer?
not that I found. And it makes sense since without MHz-fast 4002 and 4289, it is useless. Plus, swinging a bus from -15V to 0V at MHz speeds will take quite some drive current.
Windows ran on a similar MIPS machine (Microsoft Jazz). The issue is emulating SCSI. I think I'd need a lot more ROM space to do that. SCSI is messy and hard.
The alternative is to find the Windows MIPS DDK and build a paravirtualized disk driver for it like I did for Linux. That would make it more doable.
Ha. Jazz is a dim, distant memory, but methinks its concept is ~20 years later than the 4004. Just thought about emulating/writing SCSI on the 4004; might be easier to climb Everest.
Perhaps we need a competition, assembly programmers only need apply. :-)
> But for the one I'll have hanging in my office, I have loftier goals. With swap enabled, the kernel sources can actually be built right on-device. It will take some number of years. The partition where the kernel lives is /dev/pvd2 and is mounted under /boot. The device can build its own kernel from source, copy it to /boot/vmlinux, and reboot into it. If power is interrupted, thanks to ext4, it will reboot, recover the filesystem damage from the journal, and restart the compilation process. That is my plan, at least.
One, it reminds me of that "world's longest song" or somesuch thing, where they play a note every 10 years.
The other is just a picture of someone asleep at their desk, a pile of calendars with days checked off tossed to the side, random unwashed mugs and such, all dimly lit by a desk lamp, as `$ make linux` finally returns to a new, unassuming `$` prompt. Like Neo in the Matrix.
Yes. I have an emulator of this board (it is in the downloads too) which is much faster than the real thing. It shows how much realtime is needed to get to the current state. Doing a build in it will answer the question unequivocally.