
Standard SQL:

    select key, value, asof
    from (
      select *, row_number() over (partition by key order by asof desc) as rn
      from <table>
    ) t
    where rn = 1

PostgreSQL:

    select distinct on (key)
    key, value, asof
    from <table>
    order by key, asof desc
The DISTINCT ON (key) clause keeps only the first row for each distinct key, and the ORDER BY key, asof desc makes sure that first row is the latest value.

Disclaimer: I've used both methods, but haven't tested the perf against each other or the self-join.
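
For anyone who wants to try this locally, here is a minimal sketch that runs the window-function query next to one common form of the self-join and checks that they agree. It uses Python's built-in sqlite3 module rather than PostgreSQL, so DISTINCT ON itself is not covered. The table name `readings` and the sample rows are made up for illustration, and it assumes a SQLite build with window-function support (3.25 or newer).

```python
import sqlite3

# Hypothetical table and sample rows, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("create table readings (key text, value integer, asof text)")
conn.executemany(
    "insert into readings values (?, ?, ?)",
    [
        ("a", 1, "2021-01-01"),
        ("a", 2, "2021-01-03"),
        ("b", 5, "2021-01-02"),
        ("b", 4, "2021-01-01"),
        ("c", 9, "2021-01-05"),
    ],
)

# Window-function form: number the rows per key, newest first, keep row 1.
WINDOW_SQL = """
    select key, value, asof
    from (
      select *, row_number() over (partition by key order by asof desc) as rn
      from readings
    ) t
    where rn = 1
    order by key
"""

# Self-join form: keep a row only if no newer row exists for the same key.
SELF_JOIN_SQL = """
    select r.key, r.value, r.asof
    from readings r
    left join readings newer
      on newer.key = r.key and newer.asof > r.asof
    where newer.key is null
    order by r.key
"""

latest_window = conn.execute(WINDOW_SQL).fetchall()
latest_self_join = conn.execute(SELF_JOIN_SQL).fetchall()
print(latest_window)
print(latest_self_join == latest_window)
```

One caveat: the two forms only agree when asof is unique within each key. With ties, row_number() picks one of the tied rows arbitrarily, while this self-join returns all of them.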



TimescaleDB adds a custom scan that speeds up DISTINCT ON out of the box. Just install the extension; no extra work is needed.

https://blog.timescale.com/blog/how-we-made-distinct-queries...

I tried it and it really is that fast.


Fork adding it to PostgreSQL, if anyone's interested: https://github.com/jesperpedersen/postgres/tree/indexskipsca...

This is cool, thanks! It's a bit of a limitation that it only works on a single DISTINCT ON column; hopefully they'll be able to extend it to more in the future. And I hope it makes it in for everyone who's on stock PostgreSQL!


Yeah, there is indeed that limitation of a single distinct column. But it works for 80% of cases and does not require any changes, so it's good. Looks like the earliest a proper implementation could hit mainline is 1.5 years away.



