>> Such 100 random IOs will cost 0.5 sec on 1 iron platter HDD for example.
>That's an incredibly, incredibly, iffy and mostly wrong statement which depends on arguably a corner case which doesn't often reflect reality (factors include which DBMS, row ordering, table size, cache size, block size, page size, RAM size, Hard Disk seek time, HDD throughput.
you're welcome to specify iron platter HDDs which would do 100 random IOs in significant different time than 0.5 sec.
>Even a table of 1 million rows with ~30B per row could easily be read into memory in about 300ms (100MB read time + ~5ms seek time, or ~= 5+(1e6*rowsize/ (100e3)) )
>Query Optimizers do exactly this.
Not always. If it has enough info, it may take a full table scan path in such a case. Still it will be about the same time - 300ms vs. 0.5 sec. i mentioned.
Of course it is for cold queries. You forgot to mention, i guess, that full table scan you propose may by-pass DB cache (default depends on DB, modern tendency is by-pass by default), and thus second query will take the same time, while table blocks brought in by random IO would frequently be stored in DB cache thus making second query run somewhat faster - depends on how much data was brought in.
>That's an incredibly, incredibly, iffy and mostly wrong statement which depends on arguably a corner case which doesn't often reflect reality (factors include which DBMS, row ordering, table size, cache size, block size, page size, RAM size, Hard Disk seek time, HDD throughput.
you're welcome to specify iron platter HDDs which would do 100 random IOs in significant different time than 0.5 sec.
>Even a table of 1 million rows with ~30B per row could easily be read into memory in about 300ms (100MB read time + ~5ms seek time, or ~= 5+(1e6*rowsize/ (100e3)) )
>Query Optimizers do exactly this.
Not always. If it has enough info, it may take a full table scan path in such a case. Still it will be about the same time - 300ms vs. 0.5 sec. i mentioned.
Of course it is for cold queries. You forgot to mention, i guess, that full table scan you propose may by-pass DB cache (default depends on DB, modern tendency is by-pass by default), and thus second query will take the same time, while table blocks brought in by random IO would frequently be stored in DB cache thus making second query run somewhat faster - depends on how much data was brought in.