Looks nice. Since it imports numpy, it could use more of numpy's operations to collapse the validation functions and nested for loops into single vectorized expressions. That should result in shorter code, though readability will probably depend on the reader's experience with array programming.
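Purely for illustration (the original validation code isn't shown here, so both the function and the check below are hypothetical), this is the kind of collapse I mean:

import numpy as np

# Hypothetical nested-loop validation...
def all_in_range_loops(values, lo, hi):
    for row in values:
        for v in row:
            if not (lo <= v <= hi):
                return False
    return True

# ...which numpy reduces to a single vectorized expression
def all_in_range_np(values, lo, hi):
    arr = np.asarray(values)
    return bool(((arr >= lo) & (arr <= hi)).all())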
We likely overestimate AI's short-term impact, and there might even be a financial bubble about to pop. But I also think we underestimate the long-term impact. We're building absolutely amazing capabilities faster than many would have thought possible only a few years ago - I especially think applications to science and engineering will be huge and transformative.
This is in part why we built Darts. Now I think we can say the situation is quite different: Darts offers many of the things offered by the R forecast package, and then some (for instance the ability to train ML models on large datasets made of multiple, potentially high-dimensional series).
I would say that compared to Greykite, Darts really attempts to unify a wide variety of forecasting models under a common, simple and user-friendly API. There are many differences, but for instance, AFAIK there's no deep learning model in Greykite (it focuses on two algorithms: their built-in algorithm and Prophet), whereas Darts tries to lower the barrier for using deep learning models for forecasting. Crucially for ML-based models, that also means being able to train on multiple (possibly thousands or more) potentially multi-dimensional time series.
In some cases Darts is wrapping around existing models (like Prophet, or statsmodels-based models for instance); in other cases we wrote our own implementations, so it's really a mix.
I was trying to use Darts earlier for some multivariate data, struggled to figure out how to apply it, and eventually just gave up and switched to writing my own code.
Is there a good "how to" multivariate data example? Or is it just turning every column in my pandas dataframe into a series to pass into the covariates array?
And rather than just bother you, is there a discord/forum to ask questions on darts?
> Or is it just turning every column in my pandas dataframe into a series to pass into the covariates array?
Basically, if you have a multivariate series represented as a pandas DataFrame with several columns, the way to go is to create your TimeSeries by calling TimeSeries.from_dataframe(my_df). That will return a multivariate TimeSeries.
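A minimal sketch (the column names and frequency are made up for illustration; the from_dataframe time_col/value_cols arguments are as in recent Darts versions):

import numpy as np
import pandas as pd
from darts import TimeSeries

df = pd.DataFrame({
    "time": pd.date_range("2020-01-01", periods=100, freq="D"),
    "sales": np.random.rand(100),
    "visits": np.random.rand(100),
})

# Each value column becomes one component of a single multivariate TimeSeries
series = TimeSeries.from_dataframe(df, time_col="time", value_cols=["sales", "visits"])
print(series.width)  # 2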
We don't yet have a discord channel, but I'm planning to open a Slack channel sometime soon. If you have other questions feel free to drop me an email: julien@unit8.co
Hi! I'm one of the core developers (and the creator) of the library. Thanks for all the comments. I just wanted to highlight a couple of things that we think are quite cool about Darts:
* It makes using all sorts of forecasting models (from ARIMA to deep learning) easy, using fit() and predict(), similar to scikit-learn (see the first sketch after this list).
* It's easy to fit deep learning and other ML-based models on multiple time series, potentially on big datasets too, and the time series can be multivariate (see the second sketch after this list).
* Darts is not only wrapping existing models. We also have our own implementations, for instance of TCN (Temporal Convolutional Networks), or adaptations of N-BEATS (which we extended to handle multivariate series), DeepAR and others.
* Darts makes it very easy to include past and/or future covariates as inputs for the predictions.
* Some models offer probabilistic forecasts; sometimes with the possibility to configure your favourite likelihood function (e.g. Gaussian for continuous values or Poisson for discrete values).
* Everything uses the "TimeSeries" class, which makes the API consistent across tools and models, and makes it harder to make mistakes. For instance, it's easy to feed the output of one model into another model, and all models can be backtested the same way.
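To illustrate the first point, here is a minimal fit()/predict() sketch using the AirPassengers dataset shipped with Darts (exact imports/API may vary slightly between versions):

from darts.datasets import AirPassengersDataset
from darts.models import ExponentialSmoothing

series = AirPassengersDataset().load()
train, val = series[:-36], series[-36:]

model = ExponentialSmoothing()
model.fit(train)
forecast = model.predict(len(val))  # a 36-step forecast, itself a TimeSeries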
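And for the second point, a sketch of training one global model jointly on a list of series (the series names are placeholders; parameter names follow recent Darts versions):

from darts.models import NBEATSModel

# One model trained jointly on several TimeSeries (each may be multivariate)
model = NBEATSModel(input_chunk_length=24, output_chunk_length=12)
model.fit([series_1, series_2, series_3])

# When trained on multiple series, tell predict() which one to forecast
pred = model.predict(n=12, series=series_1)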
I love to see more time series models becoming available in an easy-to-use format. There's always been such a gap between what is possible and what is convenient to use, much more so than with other kinds of models.
This was also one of the areas where R always had better options than Python, but that seems to be gradually changing as well.
Darts looks very thorough and user-friendly, it makes me really want to work on a forecasting project!
It might be very helpful to readers/users if you could add a section to your documentation comparing Darts to Tslearn [0] (edit, and Sktime [1]), which already has a lot of time series models with the Scikit-learn style interface.
It would also be helpful to have some kind of writeup that explains the TimeSeries data structure and why you use that, instead of just a Series/DataFrame.
Finally - you really shouldn't say "non-Facebook alternative", because your Prophet implementation is literally a wrapper around Facebook's Prophet library. If anything, I suggest moving the Prophet, Torch, and Pmdarima dependencies to setuptools "extras", so you don't force the users to depend on those projects.
Thanks for the feedback, I absolutely agree about the need for easy-to-use tools for dealing with time series. This is exactly the motivation that prompted us to work on Darts initially.
I like your suggestions of adding a comparison to the few other libraries out there, as well as explaining the need for having our own TimeSeries data structure. We should try to do that sometime soon.
Concerning dependencies, we already have some of them as extras. "pip install darts" will install everything, whereas "pip install u8darts" will install only the core (without Prophet and pmdarima), and "pip install u8darts[torch]" the core plus the PyTorch models.
Do you have any plans to implement some sort of model averaging or stacking? I believe it would bring great benefits to this landscape to have a working implementation of hierarchical stacking across various backends wrapped in a Python library.
from darts.models import NaiveEnsembleModel

model = NaiveEnsembleModel([model1, model2, ...])
model.fit(my_series)
prediction = model.predict(n)  # n = forecast horizon
This will return an average prediction. Have a look at RegressionEnsembleModel for an ensemble model that uses a regression model to learn how to combine the individual forecasts.
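A rough sketch of the latter (the constituent models and the training window length are arbitrary; parameter names follow recent Darts versions):

from darts.models import ExponentialSmoothing, NaiveSeasonal, RegressionEnsembleModel

ensemble = RegressionEnsembleModel(
    forecasting_models=[NaiveSeasonal(K=12), ExponentialSmoothing()],
    regression_train_n_points=24,  # how many points are used to fit the combining regression
)
ensemble.fit(my_series)
prediction = ensemble.predict(n)  # n = forecast horizon, as above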
At the moment Darts doesn't have hierarchical reconciliation methods (if that's what you meant), but it's on the backlog :)
Could have been made shorter at the price of readability.