Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

In most relational database management systems (RDBMS) views are not persisted/materialized or indexed separately.

Some RDBMS have materialized views that update concurrently and meet atomicity, consistency, isolation, and durability (ACID) requirements, but others do not and a separate command is needed to update views, in those cases you lose many of the benefits, but the view is fine for say, analytical queries in which a little lag is fine.

I am personally a fan of temporal tables, a technique in which rows are versioned, and you can query for the state of the database "at" a point in time. Here support is even less uniform than with materialized views, but if you have control over your database access layer or object relational mapping, you can brew your own. This gives you history if you need it (rare, but incredibly helpful) and fine grained control over indexing.

I recommend the latter approach for anyone looking to implement event sourcing and auditing. Including in each table a column for the source or cause of a change for example gives you an audit log for almost free.



>> In most relational database management systems (RDBMS) views are not persisted/materialized or indexed separately.

Right, which is why they're great. You can materialize a view, but then you have to manually refresh. And they are actually incredibly performant.

How they work is they are basically treated as a query clause. For example you query "Select * from my_view where key = 1" and it intelligently executes "Select * from base_table where key = 1 and record = latest" (That is to say, it doesn't stupidly requery the whole base table each time, nor is there any risk of inconsistency)

The query optimizer is incredibly smart and honestly pretty sure this can scale beyond the needs of any startup.


I know how views work in several RDBMS, and I don't think that description lines up with what I understand.

> That is to say, it doesn't stupidly requery the whole base table each time, nor is there any risk of inconsistency

I mentioned materialized views, but then you mention the base_table. I think this statement is inaccurate whether it's a materialized view or not:

1. With materialized views as implemented by PostgreSQL and MS SQL Server, it doesn't query the base_table, it queries an artificially generated table. In the case of PostgreSQL, that table will be out of date (inconsistent) the moment there is a change to base_table. A manual refresh is necessary and expensive every time the view becomes inconsistent.

2. With non-materialized views, you are incorrect with at least PostgreSQL and MS SQL Server, the view does not "cache" its filters or joins or clauses. The view just functions as a query modification, and it does in fact query the "base table each time".

I think what you need to satisfy your requirements is actually a filtered or partial index, of the form:

    CREATE INDEX latest_idx 
        ON base_table (some_base_table_keys, ...)
     WHERE record = latest;
That index will be consistent and will keep the query optimizer from, as you say, stupidly requerying the whole base table, as long as the criteria holds and the keys your searching for are a prefix of those keys.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: