CMU 15-721: In-Memory Databases / Advanced Database Systems [video]

lichtenberger · on Nov 30, 2019

Great lecture so far. I haven't had the time to watch the whole thing, but one thing I want to mention is that there are techniques to improve buffer manager performance, for instance as described here by Goetz Graefe: http://www.vldb.org/pvldb/vol8/p37-graefe.pdf

I've implemented an even simpler solution for my open-source data store (https://sirix.io): each page stores a number of references, which are themselves lightweight pointer objects (in Java). Each one simply holds an in-memory reference to the page as well as a pointer to the location to fetch it from on disk or a flash drive. If the buffer manager uses these objects as keys, then on eviction we can simply null the reference to the in-memory page instance.
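A minimal Java sketch of that idea, for illustration only. It is not Sirix's actual code: the names `Page`, `PageReference`, and `BufferManager`, the loader callback, and the LRU policy are all assumptions made for this example. It also ignores thread safety and dirty-page write-back.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongFunction;

/** Hypothetical page type; a real page would also carry its payload. */
final class Page {
    final long diskOffset;
    Page(long diskOffset) { this.diskOffset = diskOffset; }
}

/** Lightweight pointer object: an optional in-memory page plus its on-disk location. */
final class PageReference {
    private Page inMemory;          // null while the page is not resident
    private final long diskOffset;  // where to fetch the page from

    PageReference(long diskOffset) { this.diskOffset = diskOffset; }
    long diskOffset() { return diskOffset; }
    Page inMemory() { return inMemory; }
    void setInMemory(Page page) { this.inMemory = page; }
}

/** Buffer manager keyed by the reference objects themselves (identity-based keys). */
final class BufferManager {
    private final LongFunction<Page> loader;  // assumed callback that reads a page from storage
    private final LinkedHashMap<PageReference, Boolean> resident;

    BufferManager(int capacity, LongFunction<Page> loader) {
        this.loader = loader;
        // Access-ordered LinkedHashMap gives a simple LRU; other policies would work too.
        this.resident = new LinkedHashMap<PageReference, Boolean>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<PageReference, Boolean> eldest) {
                if (size() > capacity) {
                    // Eviction: null the in-memory reference held by the key.
                    eldest.getKey().setInMemory(null);
                    return true;
                }
                return false;
            }
        };
    }

    Page get(PageReference ref) {
        Page page = ref.inMemory();
        if (page == null) {
            // Miss: fetch from the stored location and swizzle it into the reference.
            page = loader.apply(ref.diskOffset());
            ref.setInMemory(page);
        }
        resident.put(ref, Boolean.TRUE);  // records the access; may evict the eldest entry
        return page;
    }
}
```

A hit is just a field read on the reference, with no hash lookup by page ID needed to find the page itself; the map in this sketch exists only to track eviction order.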

dowakin · on Nov 30, 2019

Also check out other courses from Prof. Andy Pavlo https://www.youtube.com/channel/UCHnBsf2rH-K7pn09rb3qvkA/pla...

calpaterson · on Nov 30, 2019

I find it hard to reconcile the incredible generosity of making this material available for free on the internet with the cringeworthiness of Andy Pavlo's style. I love the material but the "6th form humour" is really off-putting and he doesn't need it.

See https://www.youtube.com/watch?v=m72mt4VN9ik&t=781 for an example of what I mean

balfirevic · on Nov 30, 2019

> he doesn't need it.

Nobody needs anything. You didn't need to write your comment. And neither did I have to write this one. That's a terrible criterion by which to judge anything, if it even means anything sensible at all.

rgoulter · on Nov 30, 2019

I'm fine with "it's down to taste".

"This course video doesn't need juvenile jokes" is an expression of taste. And some people enjoy the jokes.

rumanator · on Dec 1, 2019

I have to agree with the parent poster: the skits do take away from the content. I'm grateful the author was so generous to share his knowledge with the world, but the skits are distracting and off-putting.

thelastbender12 · on Nov 30, 2019

Ha! It really comes down to personal preference. A distinctive style of presenting material is what really makes a lecture engaging to me. It also works as a selection filter, given the abundance of content online.

ahmedalsudani · on Nov 30, 2019

Personally, I enjoy the material on databases, but if I'm being honest, the main reason I watch the first lecture is to find out what sort of trouble Andy Pavlo has gotten into.

ripley12 · on Nov 30, 2019

Andy’s gags are a tiny, tiny fraction of the material. Like, ~5min out of 10+ hours of lectures in a given class. They’re really easy to skip over if you don’t like them.

har777 · on Nov 30, 2019

I disagree. I thought it was hilarious :P

james_s_tayler · on Nov 30, 2019

I love his style. He makes it fun.

manigandham · on Nov 30, 2019

Check out the CMU database group for all the other content and the multiple courses: https://db.cs.cmu.edu/

Here's their Youtube channel: https://www.youtube.com/channel/UCHnBsf2rH-K7pn09rb3qvkA/fea...

jules · on Nov 30, 2019

I've watched some of these and the material and teacher are awesome. Two questions come to mind:

1. A ton of effort seems to be spent on making things run in parallel, but that introduces quite a bit of overhead too, so how well does a sequential baseline actually perform? By sequential baseline I mean a single thread that just executes all incoming transactions one by one in sequence.

2. This course seems to spend a lot of time on techniques that the teacher says you shouldn't use anyway. For instance, there is an entire lecture on skip lists and Bw-trees, and at the end the teacher mentions that these are terrible. This is interesting from a historical perspective, but it takes a lot of time, and I also lose track of which things you should and shouldn't do. It'd be interesting to have a compressed course that spends less time on such material, perhaps by annotating the video so that viewers can skip those sections.
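To make the "sequential baseline" in question 1 concrete, here is a rough Java sketch, under assumptions: the class name `SerialEngine`, the key-value `store`, and the choice of a single-threaded executor are all hypothetical and not taken from the course. One worker thread drains a FIFO queue and runs each transaction to completion, so the data itself needs no locks or latches.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Function;

/** Hypothetical single-threaded engine: every transaction runs on one worker, in arrival order. */
final class SerialEngine implements AutoCloseable {
    // Touched only by the worker thread, so a plain HashMap is sufficient.
    private final Map<String, Long> store = new HashMap<>();

    // One daemon worker thread backed by an unbounded FIFO queue.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "txn-worker");
        t.setDaemon(true);
        return t;
    });

    /** Enqueue a transaction; it sees the whole store and runs without interleaving. */
    <T> CompletableFuture<T> submit(Function<Map<String, Long>, T> txn) {
        return CompletableFuture.supplyAsync(() -> txn.apply(store), worker);
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

For example, `engine.submit(s -> s.merge("counter", 1L, Long::sum))` increments a counter with no synchronization on the map. The sketch says nothing about how fast such a design is; it only shows what the baseline would look like, and it leaves out durability and any I/O stalls inside a transaction.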

twoodfin · on Dec 1, 2019

> A ton of effort seems to be spent on making things run in parallel, but that introduces quite a bit of overhead too, so how well does a sequential baseline actually perform? By sequential baseline I mean a single thread that just executes all incoming transactions one by one in sequence.

You should check out the H-Store research project[1] and its commercial successor VoltDB. They’re basically a study in how much you can win with a federation of single-threaded database systems.

[1] https://hstore.cs.brown.edu/papers/hstore-endofera.pdf

rohansuri · on Dec 2, 2019

Do you have the link to the lecture on skip lists and Bw-trees?

Thanks

jules · on Dec 2, 2019

Lecture 7.

cnbscience · on Nov 30, 2019

Amazing material as always! Feeling proud to be an alumnus.

arkj · on Nov 30, 2019

That prof got sprayed on by a lady asking for 200 dollars (lecture 1, 13:00). Does anybody know the background?

thepete2 · on Nov 30, 2019

That has to be a joke, maybe meant to keep your attention. He has made cuts so why else would he have kept it in?

dominotw · on Nov 30, 2019

I think it's this, from the top comment: https://www.youtube.com/watch?v=m72mt4VN9ik&t=781