Definitions are super important. Getting agreement on them is really hard, and it is easy to get stuck in a loop.
In my last job, we spent months talking about "What is a Sale?" Is it when someone signs a contract, when they pay, when they move in, etc.?
Then you add your sale metric to a table, and as that table gets used in different places the sales numbers still don't match, people don't remember what a sale is, and the conversation starts up again.
Why is Narrator different?
In Narrator, you don't define what a sale is; you break up your data into customer actions.
So you get "Signed Contract", "Paid Invoice", and "Moved In", and as people ask different questions we can always clearly see which action they are referring to.
The first step toward alignment is guaranteeing consistency and transparency.
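To make that concrete, here is a minimal sketch of the idea in Python. The `Activity` record, its field names, and the sample rows are all made up for illustration; this is not Narrator's actual schema or API, just one way to picture "a sale" being replaced by named customer actions.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Activity:
    """One customer action (hypothetical shape, for illustration only)."""
    customer: str
    activity: str  # e.g. "Signed Contract", "Paid Invoice", "Moved In"
    ts: datetime


# Made-up sample data: nothing here is labelled "sale".
stream = [
    Activity("alice", "Signed Contract", datetime(2023, 1, 5)),
    Activity("alice", "Paid Invoice", datetime(2023, 1, 9)),
    Activity("alice", "Moved In", datetime(2023, 2, 1)),
    Activity("bob", "Signed Contract", datetime(2023, 1, 20)),
]


def customers_who_did(stream, name):
    """Count distinct customers who performed the named activity."""
    return len({a.customer for a in stream if a.activity == name})


signed = customers_who_did(stream, "Signed Contract")
paid = customers_who_did(stream, "Paid Invoice")
moved_in = customers_who_did(stream, "Moved In")
```

Whoever asks "how many sales did we have?" has to pick one of these names, so the number they get always says which action it counts.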
How do you deal with states?
You are right, that does make it really hard. We see customers leveraging the incremental nature of the activity stream: each time the stream updates, they diff it against updated_at to pull out the changes. (The stream updates every 15 minutes, so yes, you will lose any changes that happen in between.) This doesn't solve the problem, but it gets you much closer. Then, once you do add proper timestamps, you can merge the historical data already in the activity stream with the cleaned data from your new tables. None of the users of that activity are affected.
It's not perfect, but it lets us have data that is as accurate as possible.
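A rough sketch of that diffing approach, under my own assumptions: the source table only stores the current `status` plus an `updated_at` column, and we keep the previous snapshot around between runs. The function name, row shape, and sample values are hypothetical and are not how Narrator implements it.

```python
from datetime import datetime


def diff_snapshot(previous, current, last_run):
    """Emit a state-change event for each row touched since last_run
    whose status differs from the previous snapshot.

    previous / current: dict of id -> {"status": str, "updated_at": datetime}
    """
    changes = []
    for row_id, row in current.items():
        if row["updated_at"] <= last_run:
            continue  # not touched since the last run
        old = previous.get(row_id)
        old_status = old["status"] if old else None
        if old_status != row["status"]:
            changes.append({
                "id": row_id,
                "from": old_status,
                "to": row["status"],
                "ts": row["updated_at"],
            })
    return changes


last_run = datetime(2023, 1, 1, 10, 0)
previous = {
    1: {"status": "pending", "updated_at": datetime(2023, 1, 1, 9, 50)},
    3: {"status": "pending", "updated_at": datetime(2023, 1, 1, 9, 0)},
}
current = {
    1: {"status": "shipped", "updated_at": datetime(2023, 1, 1, 10, 12)},
    2: {"status": "pending", "updated_at": datetime(2023, 1, 1, 10, 5)},
    3: {"status": "pending", "updated_at": datetime(2023, 1, 1, 9, 0)},
}

changes = diff_snapshot(previous, current, last_run)
```

If row 1 went from "pending" to "packed" to "shipped" between two runs, only "pending" to "shipped" is captured here, which is exactly the loss mentioned above.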
What about the market?
Yes, I agree the market is a challenge, since the world has only been using star schemas. We hope that the standardization, speed, and reusability of Narrator are so compelling that people slowly switch.
How do we get numbers to match?
Consistency and transparency. Everyone who uses the "Completed Order" activity uses the same revenue, so it is consistent!
Then you can add the "Shipped Order" activity, which has the shipping amount.
By having clear, CONSISTENT definitions and clear activities, you end up in a world where your numbers match. The only way for numbers not to match is if someone is deliberately making the data not match, which, thanks to dataset, is always transparent.
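As a small illustration (the rows, field names, and amounts are invented, and this is plain Python rather than anything Narrator-specific), two people asking different questions still read revenue from the same "Completed Order" activity, and shipping lives on its own activity:

```python
# Hypothetical activity rows; each activity carries its own amount field.
stream = [
    {"customer": "alice", "activity": "Completed Order", "revenue": 120.0},
    {"customer": "bob", "activity": "Completed Order", "revenue": 80.0},
    {"customer": "alice", "activity": "Shipped Order", "shipping_amount": 10.0},
]


def total(stream, activity, field):
    """Sum one field over every row of the named activity."""
    return sum(row[field] for row in stream if row["activity"] == activity)


# Finance and marketing both go through the same activity and field.
finance_revenue = total(stream, "Completed Order", "revenue")
marketing_revenue = total(stream, "Completed Order", "revenue")

# Shipping is a separate, explicitly named activity.
shipping = total(stream, "Shipped Order", "shipping_amount")
```

If someone wants revenue including shipping, they have to add the "Shipped Order" amount explicitly, so the difference between their number and yours is visible rather than hidden in a table definition.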
Definitions are a social problem, but technology, limitations, and consistency can help a lot.