
You are correct. Just a nit: it's usually referred to as "instructions per cycle" or IPC, rather than "operations per cycle."


Operations per cycle also matters. Circa P4, an instruction took around 40-50 cycles to complete. Yeah, it was +5GHz, though.

Core architecture brought it down to around 8 cycles to complete an op. Clock speeds dropped, but more shit got done at lower clock speeds.


That is still IPC, which refers to instructions per clock as an average throughput and not the instruction latency.

> Circa P4, an instruction took around 40-50 cycles to complete

Different instructions can have wildly different latencies. Even then, an instruction taking 50 cycles sounds like double-precision division or an 80-bit floating-point operation. Most operations on the P4 had a latency of 1-7 cycles, but the P4's high clocks made memory latency and branch mispredictions a bigger issue.

Lower instruction latency might have been part of the overall pipeline shortening that made the Core architecture fast, but that framing is an oversimplification, and the numbers here don't apply to the vast majority of common instructions. Caches, deep out-of-order buffers, prefetching and branch prediction all play a part.
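
To put the throughput vs. latency distinction in concrete terms, here is a toy model in Python. It is only a sketch: the function names, latencies and issue widths are made up for illustration and are not measurements of the P4 or Core. It assumes an idealized pipeline with no cache misses and no branch mispredictions.

```python
def independent_cycles(n, latency, issue_width):
    """Cycles to finish n independent instructions on an idealized pipeline.

    Hypothetical model: issue_width instructions start every cycle and
    each one completes `latency` cycles after it starts.
    """
    if n == 0:
        return 0
    groups = -(-n // issue_width)  # ceiling division: number of issue groups
    # The first group finishes after `latency` cycles; every later group
    # finishes one cycle after the group before it.
    return latency + groups - 1


def dependent_cycles(n, latency):
    """Cycles for a chain where each instruction needs the previous result."""
    # Nothing can overlap, so latency is paid in full for every instruction.
    return n * latency


def ipc(n, cycles):
    """Average instructions per cycle (throughput, not latency)."""
    return n / cycles


n = 1000
# Long latency, but independent work still overlaps in the pipeline.
deep = independent_cycles(n, latency=20, issue_width=1)
# Shorter latency and a wider machine.
wide = independent_cycles(n, latency=8, issue_width=4)
# A serial dependency chain, where latency really does bound throughput.
chain = dependent_cycles(n, latency=20)

print(f"deep pipeline:    {deep} cycles, IPC {ipc(n, deep):.2f}")
print(f"wide pipeline:    {wide} cycles, IPC {ipc(n, wide):.2f}")
print(f"dependency chain: {chain} cycles, IPC {ipc(n, chain):.2f}")
```

In this model a 20-cycle latency still gives an IPC close to 1 when the instructions are independent, and latency only dominates when every instruction waits on the one before it. Real cores sit somewhere in between, which is where the caches, out-of-order buffers and branch prediction come in.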


Late reply, but TIL and thanks.

All this time I thought everyone was going on about interprocess communication improvements :)



