Hacker News new | past | comments | ask | show | jobs | submit login

> Well, if the table is in InnoDB format it will result in one disk-seek, because the data is stored next to the primary key and can be deleted in one operation. If the table is MyISAM it will result in at least two seeks, because indexes and data are stored in different files. A hard drive can only do one seek at a time, so this detail can make the difference between 1X or 2X transactions per second.

Nonsense.

Innodb has clustered primary keys, which means that the row data is attached to the leaf nodes of the primary key index as the author correctly states. However, the leaf nodes and the non-leaf index nodes are actually stored in different segments of the table space! While in the same (giant) file, it is unlikely that they would ever be in contiguous space on disk enough to be read in a single random IO operation.

But it's more complicated than that: if any of the index pages or data pages have been read recently they will probably still be in the buffer pool, which means that they will require no disk operations.

But that's just the seek operation to find the row. The write operation is a different story yet.

What innodb will do is modify the row by marking it with the transaction id in which it was deleted. It will keep the row in place so readers with an older transaction id will still see it until all those transactions are complete. The change in the row and the row page will be written to the copies of the affected pages in memory only. Eventually the data pages and any affected index pages will be flushed to disk, potentially grouped with other changes to the same pages. IO operations occur on the level of reading and overwriting whole pages only, if not more.

Concurrently it will record in the log buffer every change it makes to the pages in memory. This won't get written to disk right away either, it will flush the log buffer to disk once per second in the default configuration.

So there are many more potential disk operations required of innodb than myisam. Generally innodb is preferably because it is vastly more reliable, and because it can handle concurrent read/writes to the same data -- MyISAM basically can't. MyISAM will in fact generally be FASTER for any single operation than InnoDB, because it simply does less.




I think you're right. I'll try to come up with a simpler example and update the piece. Thanks!


Corrected, though I'm not completely happy with it:

http://carlos.bueno.org/2010/11/full-stack.html




Consider applying for YC's Spring batch! Applications are open till Feb 11.

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: