We were stuck on 1.x for a long time. Downsampling seemed to be eternally broken (or rather, never performant enough) regardless of version, so we wrote our own downsampler that does it on ingestion (in riemann.io).
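Conceptually, ingestion-time downsampling looks something like the sketch below. This is not our actual Riemann config (that lives in Clojure); it's just a hypothetical Python illustration of the idea, with made-up names like `Downsampler` and `write_point`:

```python
# Sketch of downsampling at ingestion time: instead of storing every raw point,
# buffer points per series into fixed windows and emit one aggregated point per
# closed window. The write_point() sink stands in for the actual TSDB writer.
from collections import defaultdict

WINDOW = 60  # seconds per downsampling bucket


class Downsampler:
    def __init__(self, write_point):
        self.write_point = write_point    # downstream sink (e.g. the TSDB writer)
        self.buckets = defaultdict(list)  # (series_key, bucket_start) -> raw values

    def ingest(self, series_key, timestamp, value):
        bucket_start = int(timestamp) - int(timestamp) % WINDOW
        self.buckets[(series_key, bucket_start)].append(value)

    def flush(self, now):
        # Emit the mean for every bucket that has fully closed.
        for (series_key, bucket_start), values in list(self.buckets.items()):
            if bucket_start + WINDOW <= now:
                self.write_point(series_key, bucket_start, sum(values) / len(values))
                del self.buckets[(series_key, bucket_start)]


if __name__ == "__main__":
    ds = Downsampler(write_point=lambda k, t, v: print(k, t, v))
    for ts in range(0, 180, 10):
        ds.ingest("cpu.user host=web-1", ts, ts % 7)
    ds.flush(now=180)
```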
And as the world seems to have converged on Prometheus/Prometheus-compatible interfaces, we will probably eventually migrate to VictoriaMetrics or something else "talking Prometheus".
InfluxQL was shit. Flux looks far more complex for the 90%+ of things we use PromQL for now, so that's another disadvantage. I'm sure it's cool for data science, but all we need is to turn some counters into rates and do some basic math or stats on them.
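For a sense of what "basic math on rates" means in practice, here's a hypothetical example (metric names, endpoint, and the error-ratio query are made up for illustration, not one of our real dashboards); any Prometheus-compatible /api/v1/query endpoint would answer it:

```python
# Roughly the level of PromQL we need day to day: rate() over counters plus
# simple arithmetic, here a 5xx error ratio per instance.
import requests

PROMQL = """
sum by (instance) (rate(http_requests_total{status=~"5.."}[5m]))
  / sum by (instance) (rate(http_requests_total[5m]))
"""

# Works against anything "talking Prometheus" (Prometheus, VictoriaMetrics, Mimir).
resp = requests.get("http://localhost:9090/api/v1/query", params={"query": PROMQL})
for result in resp.json()["data"]["result"]:
    print(result["metric"].get("instance"), result["value"][1])
```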
> Can they handle millions of metrics again? What would bring us back?
We had one instance with ~25 million distinct series eating around 26 GB of RAM. I'd suggest looking into VictoriaMetrics. Mimir is a bit more complicated to run and seems to require far more hardware for similar performance, but it has the distinction (whether that's an advantage or not, eh...) of using an object store instead of plain old disk, which makes HA a bit easier.