This sounds about right. The query is using count and not sum, which I believe ca...

tmostak · on July 30, 2020

Hi @nikita, good to reconnect.

When you say an array and not a hash table, do you just mean a simple perfect hash table indexed by the offset of the dictionary id? We use this fairly extensively for inputs with a bounded domain (i.e. dictionary-encoded strings, moderately-sized integer ranges, even binned values, numeric or timestamp), but we call it perfect hashing. I assume we're talking about the same thing, but wanted to clarify.
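To make the terminology concrete, here is a minimal sketch in Python of a count(*) group by over dictionary-encoded keys, where the "hash table" is just a dense array indexed by the id's offset from the minimum id. The function name and the `min_id`/`max_id` parameters are hypothetical and only for illustration; this is not code from any actual engine.

```python
from typing import List


def count_group_by_perfect_hash(dict_ids: List[int], min_id: int, max_id: int) -> List[int]:
    """COUNT(*) ... GROUP BY key, for keys in the bounded range [min_id, max_id].

    Assumption: every id in dict_ids lies within [min_id, max_id], so the
    offset (id - min_id) maps each key to its own slot with no collisions.
    """
    # One slot per possible key: the array itself is the perfect hash table.
    counts = [0] * (max_id - min_id + 1)
    for d in dict_ids:
        counts[d - min_id] += 1  # no probing, no key comparison
    return counts


# Six rows whose dictionary ids fall in the range 3..5.
ids = [3, 5, 3, 4, 5, 5]
counts = count_group_by_perfect_hash(ids, min_id=3, max_id=5)
# counts[0] is the count for id 3, counts[1] for id 4, counts[2] for id 5.
```

The trade-off is memory: the array needs one slot per possible key whether or not that key occurs, which is why this only fits inputs of bounded, moderately-sized domain.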

nikita · on July 30, 2020

Yes, that’s it.

I’m still of the opinion that it’s important to demonstrate performance on more complex queries with joins, subqueries, subselects, and clustered data movements. A count(*) with group by query is a very, very simple case.