What does HackerNews think of darts?
A Python library for easy manipulation and forecasting of time series.
The problem with time series forecasting tools in general is that they make a lot of assumptions about the shape of your data, and you'll find yourself spending a lot of time figuring out how to reshape it. For example, they expect your data to arrive at very regular intervals. This is fine if it's, say, the data from a weather station. It doesn't work well in clinical settings (imagine a patient admitted to the ER -- there is a burst of data, followed by no data).
That said, there's some interesting stuff out there that I've been experimenting with that seems to be more tolerant of irregular time series and can be quite useful. If you're interested in exchanging ideas, drop me a line (email in my profile).
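To make the regular-interval assumption concrete: in practice it usually means bucketing raw observations onto a fixed grid before any model sees them. Below is a minimal sketch in plain Python. The `regularize` helper and the ER-style readings are made up for illustration and are not taken from any library.

```python
from datetime import datetime, timedelta


def regularize(observations, start, step, n_steps):
    """Average irregular (timestamp, value) pairs into fixed-width buckets.

    Buckets that receive no observations are returned as None, so the
    gaps stay visible instead of being silently filled.
    """
    sums = [0.0] * n_steps
    counts = [0] * n_steps
    for ts, value in observations:
        idx = (ts - start) // step  # timedelta // timedelta gives an int
        if 0 <= idx < n_steps:
            sums[idx] += value
            counts[idx] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Hypothetical heart-rate readings: a burst on admission, then silence.
start = datetime(2023, 1, 1, 8, 0)
readings = [
    (datetime(2023, 1, 1, 8, 1), 110.0),
    (datetime(2023, 1, 1, 8, 3), 114.0),
    (datetime(2023, 1, 1, 8, 4), 112.0),
    (datetime(2023, 1, 1, 8, 47), 90.0),
]
grid = regularize(readings, start, timedelta(minutes=15), n_steps=4)
print(grid)  # [112.0, None, None, 90.0]
```

Half of the buckets come back empty, and deciding what to do with them (drop, interpolate, carry forward) is exactly the kind of data reshaping the comment describes.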
* Statistical models (ETS, (V)ARIMA(X), etc)
* ML models (sklearn models, LGBM, etc)
* Many recent deep learning models (N-BEATS, TFT, etc)
* Seamlessly works on multi-dimensional series
* Models can be trained on multiple series
* Several models support taking in external data (covariates), known either in the past only, or also in the future
* Many models offer rich support for probabilistic forecasts
* Model evaluation is easy: Darts has many metrics, offers backtesting, etc
* Deep learning scales to large datasets, using GPUs, TPUs, etc
* You can do reconciliation of forecasts at different hierarchical levels
* There's even now an explainability module for some of the models - showing you what matters for computing the forecasts
* (coming soon): an anomaly detection module :)
* (also, it even includes FB Prophet if you really want to use it)
Warning: I'm probably biased because I'm the creator of Darts.
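For readers unfamiliar with the backtesting mentioned in the list above, here is a rough sketch of the idea in plain Python: forecast from successive origins and score each forecast against what actually happened. This is not the Darts API (Darts ships its own utilities for this). The function names, the seasonal-naive forecaster, and the toy series are all assumptions made for illustration.

```python
def seasonal_naive(history, horizon, season):
    """Forecast by repeating the last full season of `history`."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def mape(actual, predicted):
    """Mean absolute percentage error in percent (actuals must be non-zero)."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def backtest(series, start, horizon, season):
    """Expanding-window backtest.

    For each origin from `start` onward, forecast `horizon` steps using
    only the data before the origin, then score against the actuals.
    """
    errors = []
    for origin in range(start, len(series) - horizon + 1):
        forecast = seasonal_naive(series[:origin], horizon, season)
        errors.append(mape(series[origin:origin + horizon], forecast))
    return errors


# Made-up series with a period of 4 and a slight upward drift.
series = [10.0, 20.0, 30.0, 20.0,
          11.0, 21.0, 31.0, 21.0,
          12.0, 22.0, 32.0, 22.0]
errors = backtest(series, start=8, horizon=2, season=4)
print(len(errors), "forecast origins, mean MAPE:", sum(errors) / len(errors))
```

A naive forecaster like this also makes a useful baseline: a more elaborate model that cannot beat it in a backtest is probably not worth deploying.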
I would generally prefer R for this kind of stuff as the experts generally write the code, but Darts seems OK and is well-tested, at the very least (haven't had a chance to use it in anger yet).
Why should I use this over Darts[1], or just Statsmodels[2] if I need lower-level access and diagnostics? Both of these are far more established.
I dislike that Facebook Prophet was chosen as a benchmark; it's not a difficult benchmark to beat for the majority of time series use cases. It signifies to me that this project might be targeting cargo cult data science. Prophet is not particularly good at non-daily time series and non-seasonal time series. The paper itself admits this[3]. Moreover, it's just a generalized additive model that incorporates holidays.
I don't intend to sound demeaning here, really. But I'm trying to understand what the point is. This doesn't look like someone's weekend project, but we already have plenty of established projects which tackle this effectively.
There are three major markets for time series work:
1. You're an analyst with a lot of domain knowledge who needs to analyze daily, seasonal data, but you don't have a strong statistical or engineering background. This person should probably just choose Prophet (again, the developers of Prophet explicitly acknowledge that it's designed for scalable, good-enough models built by non-stats people, not for the best model given the data).
2. You're a data scientist with a good statistical background and you need to produce forecasts. You can afford to dig into what the model is doing and select a model based on a series of diagnostics and knowledge about the data itself. This person should probably choose a more complete suite, like Darts. The important thing here is developing good models quickly while being able to do more than just press a button.
3. You're a data scientist (or statistician) with a very strong statistical background who needs to produce the best model you can for answering a specific question. This person is probably going to use R, Stan, Statsmodels or PyMC to come up with something bespoke. They may or may not need to systematize it, but they don't need to produce quantity over quality.
How does this thing improve the state of the art for any of these markets?
--
[1]: https://github.com/unit8co/darts
[1]: https://github.com/unit8co/darts/
[2]: https://medium.com/unit8-machine-learning-publication/traini...