
SIMD works great for doing the same thing to multiple pieces of data, but it doesn't do the scaling up that I described.

I'm no chip engineer, so maybe what I'm envisioning isn't possible. In essence, instead of making 4x 64-bit cores, you make 128x 2-bit cores plus some logic on the die that selects groups of cores to build a processor of the required size, executes some instructions with that processor, and then disassembles it back into a pool of resources.

So SIMD might be able to calculate two 16-bit sums on a 32-bit processor in one cycle, but the hypothetical CPU I'm describing would be able to calculate a single 128-bit sum and eight 16-bit sums at the same time, in one cycle (128 + 8x16 = 256 bits, i.e. the whole pool of 128 2-bit cores).
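To make the idea a bit more concrete, here's a rough software sketch in Python of what I mean. It is only a model, not a claim about how real hardware would do it: the names `pooled_add`, `SLICE_BITS` and `POOL_SLICES` are made up for illustration, and the "one cycle" part is not modeled at all. Each job borrows as many 2-bit slices as its width needs, the carry ripples through the slices inside a group, and the carry chain is cut at every group boundary.

```python
SLICE_BITS = 2     # width of each tiny core (hypothetical, from the comment above)
POOL_SLICES = 128  # 128 x 2-bit cores = 256 bits of adder in total


def pooled_add(jobs):
    """Add several independent pairs using one shared pool of 2-bit slices.

    jobs is a list of (width_bits, a, b) tuples.
    Returns a list with each sum taken modulo 2**width_bits.
    """
    # Check that the requested groups fit into the pool.
    needed = 0
    for width, _, _ in jobs:
        if width <= 0 or width % SLICE_BITS:
            raise ValueError("width must be a positive multiple of the slice size")
        needed += width // SLICE_BITS
    if needed > POOL_SLICES:
        raise ValueError("not enough slices in the pool")

    mask = (1 << SLICE_BITS) - 1
    results = []
    for width, a, b in jobs:
        carry = 0  # the carry chain starts fresh at each group boundary
        total = 0
        for i in range(width // SLICE_BITS):
            shift = i * SLICE_BITS
            # One 2-bit slice: add its two inputs plus the incoming carry.
            s = ((a >> shift) & mask) + ((b >> shift) & mask) + carry
            total |= (s & mask) << shift
            carry = s >> SLICE_BITS
        results.append(total)  # the final carry out is dropped, so sums wrap
    return results


# One 128-bit sum and eight 16-bit sums use exactly 64 + 8*8 = 128 slices.
jobs = [(128, 2**127 + 5, 2**127 + 7)] + [(16, 65535, 1)] * 8
sums = pooled_add(jobs)
```

In software the groups are of course processed one after another; the point of the hypothetical chip would be that all of them happen in parallel, and that the grouping can change from one instruction to the next.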




What you're describing is basically a modern FPGA[1]. You can wire it up however you want at runtime, and it can contain dedicated blocks such as hardware multipliers and fast local memory to accelerate certain workloads.

[1]: https://en.wikipedia.org/wiki/Field-programmable_gate_array



