
I am not sure why ranges are used. They seem redundant, because the beginning of a value's validity is identical to the end of the validity of its previous value. A column with an invalidation timestamp seems sufficient to me. The rows in which this column is null are the currently valid ones.


There might not be a previous value.

If you insert a record and then later delete it, you need to store a range or two timestamps to know when the insertion and the deletion happened.


The way I deal with this is my history tables have 3 fixed columns:

    * OperationType: An ENUM of "INSERT", "UPDATE", "DELETE"
    * OperationTimestamp: TIMESTAMPTZ
    * OperationId: BIGSERIAL (needed to handle cases where a row is modified multiple times in a transaction)

This structure is super intuitive to understand, and very efficient since the History table is append-only. But if you are trying to use it to reconstruct the state of items at a specific time in the past, it's a bit awkward, and a RANGE column would be easier to query.
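To illustrate why reconstruction is a bit awkward with this layout, here is a minimal Python sketch of an "as of" lookup over an append-only history. The `HistoryRow` class, the `item_id` and `value` fields, and the `state_as_of` helper are hypothetical names made up for the example; only the three operation columns come from the comment above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class HistoryRow:
    operation_id: int               # OperationId: tie-breaker within one transaction
    operation_type: str             # OperationType: "INSERT", "UPDATE" or "DELETE"
    operation_timestamp: datetime   # OperationTimestamp
    item_id: int                    # hypothetical key of the tracked row
    value: Optional[str]            # hypothetical payload; None for DELETE


def state_as_of(history: List[HistoryRow], item_id: int, moment: datetime) -> Optional[str]:
    """Return the item's value at `moment`, or None if it did not exist then."""
    candidates = [
        row for row in history
        if row.item_id == item_id and row.operation_timestamp <= moment
    ]
    if not candidates:
        return None  # not inserted yet
    # The latest operation wins; operation_id breaks ties on equal timestamps.
    latest = max(candidates, key=lambda r: (r.operation_timestamp, r.operation_id))
    return None if latest.operation_type == "DELETE" else latest.value


def ts(hour: int) -> datetime:
    return datetime(2024, 1, 1, hour, tzinfo=timezone.utc)


history = [
    HistoryRow(1, "INSERT", ts(9), 42, "draft"),
    HistoryRow(2, "UPDATE", ts(11), 42, "published"),
    HistoryRow(3, "UPDATE", ts(11), 42, "published-v2"),  # same transaction, same timestamp
    HistoryRow(4, "DELETE", ts(15), 42, None),
]
```

Every lookup has to find the most recent operation at or before the requested moment, which is the part a range column would turn into a simple containment test.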


Ah, I did not think of that!


Foreign keys / references: you query one history table at timestamp X, then join the referenced history table with a "where start <= X and end > X" to get the foreign key's data as of timestamp X.
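A rough sketch of that kind of join, using Python's built-in sqlite3 so it runs anywhere. The table and column names (`customer_history`, `order_history`, `valid_from`, `valid_to`) are invented for the example, validity is assumed to be half-open [valid_from, valid_to), and a NULL `valid_to` is assumed to mean "still current". A database with native range types could express the same filter as a containment test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_history (
    customer_id INTEGER, name TEXT, valid_from TEXT, valid_to TEXT
);
CREATE TABLE order_history (
    order_id INTEGER, customer_id INTEGER, status TEXT, valid_from TEXT, valid_to TEXT
);
INSERT INTO customer_history VALUES
    (1, 'Acme',      '2024-01-01', '2024-03-01'),
    (1, 'Acme Corp', '2024-03-01', NULL);
INSERT INTO order_history VALUES
    (10, 1, 'open',    '2024-02-01', '2024-04-01'),
    (10, 1, 'shipped', '2024-04-01', NULL);
""")

# Both sides of the join are filtered to the version valid at the same instant :x.
AS_OF_JOIN = """
SELECT o.order_id, o.status, c.name
FROM order_history AS o
JOIN customer_history AS c ON c.customer_id = o.customer_id
WHERE o.valid_from <= :x AND (o.valid_to IS NULL OR o.valid_to > :x)
  AND c.valid_from <= :x AND (c.valid_to IS NULL OR c.valid_to > :x)
"""


def orders_as_of(x: str):
    """Return (order_id, status, customer name) rows as they were at ISO date x."""
    return conn.execute(AS_OF_JOIN, {"x": x}).fetchall()
```

ISO-8601 date strings compare correctly as text, which is why plain TEXT columns are enough for this sketch.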


With a single timestamp, how would you write queries such as "what was the state of the record at this specific moment in time" or "what changes were made to the record between start_time and end_time"? TFA also uses an index on the range column to ensure that you can't enter overlapping entries in the history.
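For what it's worth, here is a small Python sketch of the two things a range buys you: a "changes between" query and a no-overlap guarantee. This is not how TFA does it; `Version`, `overlaps`, `add_version` and `changes_between` are hypothetical names, and ranges are assumed to be half-open [valid_from, valid_to) with None meaning "still current".

```python
from datetime import datetime
from typing import List, NamedTuple, Optional


class Version(NamedTuple):
    value: str
    valid_from: datetime
    valid_to: Optional[datetime]  # None = open-ended (still current)


def overlaps(a: Version, b: Version) -> bool:
    """Half-open ranges overlap iff each one starts before the other ends."""
    a_end = a.valid_to or datetime.max
    b_end = b.valid_to or datetime.max
    return a.valid_from < b_end and b.valid_from < a_end


def add_version(versions: List[Version], new: Version) -> None:
    """Append `new`, rejecting it if it overlaps any stored range."""
    if any(overlaps(v, new) for v in versions):
        raise ValueError("overlapping validity range")
    versions.append(new)


def changes_between(versions: List[Version], start: datetime, end: datetime) -> List[Version]:
    """Versions that became valid within [start, end)."""
    return [v for v in versions if start <= v.valid_from < end]


versions: List[Version] = []
add_version(versions, Version("a", datetime(2024, 1, 1), datetime(2024, 2, 1)))
add_version(versions, Version("b", datetime(2024, 2, 1), None))
```

In a database the overlap check would live in a constraint rather than application code, so concurrent writers cannot slip past it.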



