I think performance and cost are the big motivators here.
I work on a web framework for building data apps like reports and dashboards, and we use DuckDB’s WASM runtime for interactions (e.g. when a user changes a date range filter). It’s really fast, and you don’t spend Snowflake credits.
One of the founders of Evidence here. Thanks for the kind words, Mike - that means a lot coming from you.
I think that distinction is right -- we are focused on making a framework that is easy to use with a data analyst skill set, which generally means as little JavaScript as possible.
As an example, the way you program client-side interactions in Evidence is by templating SQL, which we run in DuckDB WebAssembly, rather than by writing JavaScript.
Evidence is also open source, for anyone who's interested.
This looks very interesting to me. I'm building a BI reporting tool at my company at the moment, but browsing the docs I felt that what was missing was a clear overview of the architecture.
e.g. you say above that Evidence takes templated SQL and runs it in DuckDB WASM
I guess I am wondering where and when the queries are happening
If I set up a Snowflake data source, is it doing a build-time import (like in the new Observable, from this thread) into DuckDB? Or is DuckDB connecting to the sources via extensions?
Where does the data live?
My question is really just "how does it work?" and the "What is Evidence? > How does Evidence work?" section on the docs homepage doesn't really answer that at all, it's just a list of things that it does.
That’s good feedback on the docs. The tool has evolved pretty dramatically from where it started and we should revisit those diagrams.
Evidence is a static site generator.
Queries against your sources happen at build time, and the results are saved as Parquet files.
Queries against the built-in DuckDB WebAssembly instance happen at runtime.
Sources (Snowflake, Postgres, CSV files, etc.) run at build time.
Pages in Evidence are defined as markdown files. You write markdown, components, and code fences.
SQL code fences in pages run in the built-in DuckDB WASM instance, which can query across the results from all of your sources. These queries run in the client. We call this feature Universal SQL, and it’s quite new.
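To make that concrete, here's a rough sketch of what a page might look like (the table, column, and component prop names here are illustrative, not copied from a real project, and the exact syntax may differ between Evidence versions):

````markdown
---
title: Monthly revenue
---

```sql monthly_revenue
-- Runs in the browser's DuckDB WASM instance, against the
-- Parquet files produced from your sources at build time.
select
  date_trunc('month', order_date) as month,
  sum(amount) as revenue
from orders
group by 1
order by 1
```

<LineChart data={monthly_revenue} x=month y=revenue />
````

The named SQL fence (`monthly_revenue`) becomes a query result that components further down the page can reference.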
How fast exactly is DuckDB-Wasm for filtering for interactive coordinated views? Could the inputs be a brush selector range (x0, x1) from a time-series chart and then when you brush it the other components would re-render within… milliseconds? There used to be a cool JS library for this called crossfilter. Not sure if this could be a replacement for it?
I'm quite excited about this, and would also love to have it distributed as an NPM package.
I work on an OSS web framework for reporting / decision support applications (https://github.com/evidence-dev/evidence), and we use DuckDB WASM as our query engine. Several folks have asked for PRQL support, and this looks like it could be a pretty seamless way to add it.
I work on an OSS reporting and analytics tool (https://github.com/evidence-dev/evidence), and the amount of time and effort that goes into a really good “print to PDF”, and how valuable people find it, has been one of the more surprising parts of the project to me.
Out of interest, what exactly do people use Grafana for? Is it always monitoring infrastructure and systems, or is it more general purpose than that? What is special about it?
I previously ran an analytics team, and I work on an open source BI tool (https://github.com/evidence-dev/evidence), but I have never actually used Grafana or come across it when talking to other "business analytics" folks. Everyone in my world is just using Tableau, Looker, or Jupyter notebooks.
Easy to configure, good-looking dashboards with a lot of different integrations.
Meaning pretty much any team with basic know-how can get a monitoring dashboard going, or several for different resources and cases. Its main focus is monitoring.
This is the right mindset for sure. Most of the time the initial question is very loosely defined, but actually having these conversations with the people who "want data", and helping them structure their thinking is also a hugely rewarding part of working in data and analytics, and will help you advance in your career.
It can be easy to have a cynical view of what people are asking for, but in my experience there is often real value you can uncover.
One thing which helped me a lot is having a decent understanding of accounting and finance. A fun, and fairly quick, way to develop that is by taking a course on financial modelling (in Excel). Modelling a business in a spreadsheet is a lot of fun, and it helps you build good intuition about the underlying "physics" of how a business makes money.
I helped build the analytics group at a PE fund, and this really fits with my experience.
Good decision support is where most of the value is, and it’s about building things that draw conclusions, not just throwing the data over the fence with 50 filters and expecting the end consumer to do the actual analysis.
I now work on an open source, code-based BI tool called Evidence, which incorporates a lot of these ideas, and might be of interest to people in this thread.
Agree with both of you, and would add that knowing who is using the system, and what they need to get out of it is really the key to making them shine.
Too many systems have too much data for too many customer categories and end up being useless to everybody.
If you are interested in this, but would prefer to define reports with a markup language (and SQL), I work on an open source code-based BI tool called Evidence, which might be of interest to you.
It's effectively a static site generator aimed at building automated reports and analysis.
Evidence looks incredible! I would start using it right away except I don't see it having any sort of date range picker / filter capability. Is the concept of being able to let our end users filter / drill into graphs with customized queries at odds with how Evidence works? Or are these things just not built yet?
Big fan of Evidence, really elegant design. There is also space for both declarative and imperative approaches when it comes to dashboarding, reporting, etc.
I really, really like your idea with Evidence. I long for a mode in Metabase that’s like a “notebook mode”, where the main focus is narrative and it’s ornamented with viz, rather than the other way around.
I want to be able to publish this notebook when I’m done and then be able to hand that around, the same concept that you’ve built Evidence around. I think that’s a very good idea, so thanks.
The main thing keeping me from switching is that Metabase’s query builder and visualizations are too good for 95% of my work. It’s hard to picture going back to writing _all_ the SQL.
I hear that. We're making a lot of progress on reducing the amount of SQL you need to write, keeping it DRY, etc. Making the dev experience really buttery and high-leverage is definitely a priority.
Here are a few of the things we're working on in that regard:
1. Making our components issue their own queries, so that you don't need to write full SQL expressions -- just fragments -- when you're defining the chart you want.
2. Improving intellisense -- right now you get "slash commands" and snippets in our VS Code extension to invoke components (which are really sweet), but we're aiming to get to a full intellisense type of experience.
3. Supporting external metrics layers where it makes sense. We've got some users using Cube, and we're interested in Malloy, the dbt semantic layer, those types of things.
One of our team members is very keen on building something he calls "Evidence Studio", a sort of WYSIWYG you could invoke during development for generating basic queries, setting up charts, etc., that syncs the components into their text representation. That'd be further off though :)
Platforms like this are always pretty funny. It's basically a gateway drug to their cloud platform (which isn't free) that they hope you use, but then they keep it open source so that they don't have to pay salaries to W-2 employees. Smart thinkin'!
For anyone who is interested in our cloud service, it's an easy way to put your project online, keep it up to date with your data, and place it behind access control.
For many organizations, hosting Evidence in their own infrastructure is easy enough, but if you don't want to manage that, we are happy to manage it for you.
It is not free (that's how we pay the salaries), but pricing is available here: