"So rather than spending time constructing complicated queries with a bunch of joins, create SQL Views instead."
No thanks. While I can write manual queries to join... and even make SQL Views... try selling it to the non-developers in the company. The entire point of a BI tool is to eliminate myself as a bottleneck.
Perhaps things have changed by now, I don't know.
...
And to get it out of the way: the same applies for Airbnb's Superset. It's cool and has a lot of different chart types, but 1) really hard to install and 2) data exploration is still hard.
The point with 'insights' is that you can explore the data.
(long URL instead of a shortlink as those are cleared when I reset the demo database)
With no coding time I got a view showing orders by date by country. Now if I click on the rightmost field ("count id"), I can drill deeper and see the orders by the day/country right there. I can use it to investigate anomalies in data.
Right now you must click the data in the column, but very soon it'll be possible to just click it on the graph as well.
This is the niche insights fills and why Metabase and Superset weren't good enough.
For what it's worth, you can do drill-through exploration with Metabase fair well. You can click on the graph and zoom-in, see underlying records, as well as re-pivot the result cell today
We also allow you to group by and filter by "joined" columns across foreign keys. We just never put the word "join" in front of end users. As far as users go, they just know that a transaction has a user and that they can filter by the user's country.
Regarding views vs joins, imo the entire point of doing SQL views (i.e. doing joins or transforms to create wide denormalized tables) is so that end users get a clean, explorable dataset they can understand. If you are dependent on doing 3-way joins in the BI tool then only people that know how to put together 3-way joins will use it.
What we've seen this over and over in companies that use Looker and move to us is that because only analysts or engineers can write LookML, every small modification goes through an engineer or analyst. With Metabase, for the most part, once you spend a bit of time cleaning up the data they see, people just figure things out on their own.
(blatant self promotion from a Metabase team member)
Serious question even though as I read and reread it, I am aware it does sound like snark...
How is this a sustainable product/company? (AKA How do you make money?)
Is love to dig in to what looks like an awesome product. But to justify the time investment, it would be helpful to know how it will be around long term.
I strongly feel that if a domain language is not significantly more complex than the domain itself, then the person who is supposed to have domain knowledge should be able to learn the language without much difficulty. I've seen a lot of attempts to build 'simple' analytical systems without SQL, and it works well for simple queries -
but 'business people', when they're smart and motivated, soon start asking more and more specific and detailed questions, with additional conditions and joins - and it soon turns out that SQL is, in fact, an appropriate instrument for complexity that they need.
Your examples seem pretty simple. What about a question like this: how did our 5-week cohort (users that at the time of the event were registered between 4 and 5 weeks) ARPU vary over time and over character class they chose at registration? Is there a statistically significant corellation between participating in any of the 20 custom events we've done in the last half a year and user retention (which must be calculated relative to each users' cohort)?
Of course, a lot of business analysts are moving away from SQL to R, but I doubt that they're after simplicity.
Indeed my examples are simple. However this was the problem: there was no tool around to easily answer those simple questions. Now there is... at least one that fits my arbitrary definitions of what such a tool should do.
Hopefully having a tool to ask these simple queries will give rise to smarter questions, which will either prompt the development of custom dashboards or improvements in the tool itself.
I agree in theory that "business people", if and when they start asking these questions, can and should learn SQL. However my experience seems to point the other way. They'll ask the developer to do more and more reports, eventually hiring someone for the "data analyst" role.
> eventually hiring someone for the "data analyst" role
Oh, I was presuming that a 'business person' and 'data analyst' is the same person. However, usually 'data analyst's job isn't just to write SQL - he's a person who understands the product, marketing, dives into data and researches it, and comes back with actionable insights for the whole business.
And if you're looking for easier scenarios, Excel and it's pivot tables are usually quite enough.
Hey, this is cool. Thanks for making it available for free to the world.
Note apart, I feel like knowing SQL is something that every developer should be comfortable with. The amount of magic that an ORM does, will tend to zero the more SQL you know. It's weird that we as developers don't know how to query our main data repository, don't you think so?
Once again, thanks for making this available, I feel like this would be useful for a lot of people.
A few suggestions for useful features:
Time Series functionality would be handy (Exponentially weighted moving average, centered moving average etc).
Aggregations across sliding intervals (by shift / by day/week/month/ Finical year etc). Being able to quickly answer questions like What is the production for week to date, current shift etc. How about previous week/shift etc.
One other comment I will give is I find SQL to be useful as a protection against product lock-in. It's essentially the lowest common denominator language for dealing with data extraction/transformation etc. Our business has been burnt before by having our reporting/BI written in a tool specific language it made migrating away from the product very difficult. Ever since I've very much been of the mindset to do as much using SQL as possible.
The plan is to add a view called "generated SQL", where you can see and copy/export the raw query that went into generating the data. This way you can always validate the data and get some "insight" into how it's fetched.
Wanted to share about my BI startup here for those who're interested too: holistics.io - we're a small, self-funded BI startup in Asia (Singapore). Started 2 years ago and have customers around the globe, from small startups to big unicorns.
We're not free or open-source, but a lot more affordable than other competitors. Our pricing starts with $50/mo for 5 users. We have a few customers who moved to us from open-source alternatives or other commercial competitors.
We support PostgreSQL, MySQL, SQL Server, Redshift, Greenplum, BigQuery, Presto, etc.
If you try it out let us know you're from HN and your first 2 months are on us.
I'd like to build something in this area that used NLU (https://en.wikipedia.org/wiki/Natural_language_understanding) to let the user ask an actual question, after they have marked up their database schema. That would be a fun product to build.
I haven't tried, but if it doesn't, it shouldn't be hard to add.
So far I have only tried with PostgreSQL (full support) and SQLite (some date functions break), however any SQL dialect should work fine... except maybe when it comes to the date functions, which everyone has implemented differently.
RedShift is very close to PostgreSQL (it even uses PostgreSQL drivers to connect to their clusters from programming languages or data IDEs).
A sneakier difference is query latency. RedShift is scalable for queries that scan through very large amounts of data - but there is overhead for each query. One complicated JOIN can be faster than 50 small queries that ORMs and web apps like to make.
Supports MySQL, PostgreSQL, MongoDB, Microsoft SQL Server, AWS Redshift, Google BigQuery, Druid, H2, SQLite, Oracle, Crate, Google Analytics, Vertica.