Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

I'll probably eventually hate myself for even bringing this up but I can't help but notice the similarity between this ID structure and Version 1 (aka timestamp) UUIDs. While I wouldn't go as far as recommending that you fully adopt that form, it might be worth considering if you could make these IDs compatible with UUIDs by defining a canonical transform. The critical differences are:

- UUIDs use a different epoch (15 Oct 1582 vs 1 Jan 1970) - UUIDs count 100 ns blocks instead of ms - UUIDs include a 6 byte "node id" - UUIDs allow only up to 15 bits of "sequence"

I think that last one is the biggest deal, since as currently specced redis allows 64 bits of sequence, which is obviously much bigger than 15. The options I see are either up the time resolution used by redis, encode some of redis's sequence bits into the UUID's time bits, or just live with it as a limitation--in practice 2^15 is a lot of messages to get in a single millisecond (though in cases of clocks jumping back might not be too much).

You'd also need to come up with some thing for the node id, perhaps the first 6 bytes of a cluster node ID or similar.



Thinking about this some more I realized you could also encode sequence into the low order bits of the timestamp, and rereading the RFC showed it actually makes this recommendation[1]. There are 10000 100-nanosecond periods per millisecond which gives about 13 more bits. Between that and the 13-15 bits available in the clock sequence you've got 26+ bits of sequence or ~67MM values per millisecond.

Since 64 bits is overkill for milliseconds (45 bits covers the next 1000 years or so) I was thinking you could put 2 bytes of the node id in the high order bytes there (perhaps could call this the "clock id"?) and the remaining 4 bytes of the node id could go in the high order bytes of the sequence, which would still leave 32 bits for actual sequence values (but we should only use 26 or so). This means we'd get a translation roughly as follows (numbering bytes and bits from high to low significance):

   Redis                                   Version 1 UUID
      Timestamp
        Byte 0-1  "clock id"               Bytes 4&5 of node id
        Byte 2-7  millis since 1 Jan 1970  * 10000 => ~45 high order timestamp bits
      Sequence
        Byte 0-3  "node id"                Bytes 0-3 of node id
        Byte 4-7
           6 bits wasted space             ignored
          26 bits actual sequence value
            13 high order bits             => clock sequence
            13 low order bits              => low order timestamp bits
Another implication of this scheme is that if redis has access to a clock that offers higher than millisecond resolution it could store everything more precise than millisecond into the sequence portion of the id.

On a side note it seems that the clock sequence in the UUID is intended to be reset to a random value at start up and every time a clock jump is detected rather than just incremented. Redis could do something similar by incrementing some of the 13 high-order bits of the sequence every time a clock jump is detected (and/or if the 13 low-order bits overflow)

[1] https://tools.ietf.org/html/rfc4122#section-4.2.1.2




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: