Run TimeGPT distributedly on top of Dask

Dask is an open source parallel computing library for Python. In this guide, we will explain how to use TimeGPT on top of Dask.
Note: You can install fugue with pip. If executing on a distributed Dask cluster, ensure that the nixtla library is installed across all the workers.
You can load your data as a pandas DataFrame. In this tutorial, we will use a dataset that contains hourly electricity prices from different markets.
| | unique_id | ds | y |
---|---|---|---|
0 | BE | 2016-10-22 00:00:00 | 70.00 |
1 | BE | 2016-10-22 01:00:00 | 37.10 |
2 | BE | 2016-10-22 02:00:00 | 37.10 |
3 | BE | 2016-10-22 03:00:00 | 44.75 |
4 | BE | 2016-10-22 04:00:00 | 37.10 |
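As a minimal sketch, the rows shown in the table above can be rebuilt by hand as a pandas DataFrame. The variable name `df` is an assumption used for illustration, and only the five displayed rows are included; in practice you would read the full dataset from a file.

```python
import pandas as pd

# The five rows displayed in the table above; the real dataset is much larger.
df = pd.DataFrame(
    {
        "unique_id": ["BE"] * 5,
        "ds": [
            "2016-10-22 00:00:00",
            "2016-10-22 01:00:00",
            "2016-10-22 02:00:00",
            "2016-10-22 03:00:00",
            "2016-10-22 04:00:00",
        ],
        "y": [70.00, 37.10, 37.10, 44.75, 37.10],
    }
)

print(df.head())
```

The long format matters here: one row per series (`unique_id`) and timestamp (`ds`), with the target value in `y`.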
Import Dask and convert the pandas DataFrame to a Dask DataFrame.
| | unique_id | ds | y |
---|---|---|---|
npartitions=2 | |||
0 | string | string | float64 |
4200 | … | … | … |
8399 | … | … | … |
Using TimeGPT on top of Dask is almost identical to the non-distributed case. The only difference is that you need to use a Dask DataFrame, which we already defined in the previous step.

First, instantiate the NixtlaClient class.
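One way to do this is sketched below. The helper name `make_client` is hypothetical, and reading the key from an environment variable called `NIXTLA_API_KEY` is an assumption about how you store your credentials; passing the key as a literal string works as well.

```python
import os


def make_client():
    """Build a NixtlaClient from an API key stored in an environment variable."""
    # Imported inside the function so the module loads even where nixtla
    # is not installed (for example, while editing this guide).
    from nixtla import NixtlaClient

    # Assumption: the key lives in the NIXTLA_API_KEY environment variable.
    return NixtlaClient(api_key=os.environ["NIXTLA_API_KEY"])
```

Calling `nixtla_client = make_client()` then gives you the object used in the rest of this guide.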
👍 Use an Azure AI endpoint: To use an Azure AI endpoint, set the base_url argument:
nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")
Then use any method from the NixtlaClient class, such as forecast or cross_validation.
| | unique_id | ds | TimeGPT |
---|---|---|---|
0 | BE | 2016-12-31 00:00:00 | 45.190453 |
1 | BE | 2016-12-31 01:00:00 | 43.244446 |
2 | BE | 2016-12-31 02:00:00 | 41.958389 |
3 | BE | 2016-12-31 03:00:00 | 39.796486 |
4 | BE | 2016-12-31 04:00:00 | 39.204533 |
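A forecast call on a Dask DataFrame might look like the sketch below. The wrapper name `forecast_on_dask` is hypothetical, and the horizon `h=12` is an illustrative value, not one taken from the output above.

```python
def forecast_on_dask(nixtla_client, dask_df, h=12):
    """Forecast h steps ahead for every series in a Dask DataFrame."""
    # Same call as in the non-distributed case; only the input type changes.
    # h=12 is an illustrative default, choose the horizon you need.
    fcst_df = nixtla_client.forecast(df=dask_df, h=h)
    return fcst_df
```

With a Dask input, the result is expected to be a distributed DataFrame as well, so something like `forecast_on_dask(nixtla_client, dask_df).compute().head()` would be needed to display rows such as those in the table above.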
📘 Available models in Azure AI: If you are using an Azure AI endpoint, please be sure to set model="azureai":
nixtla_client.forecast(..., model="azureai")
For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon. By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.
| | unique_id | ds | cutoff | TimeGPT |
---|---|---|---|---|
0 | BE | 2016-12-30 04:00:00 | 2016-12-30 03:00:00 | 39.375439 |
1 | BE | 2016-12-30 05:00:00 | 2016-12-30 03:00:00 | 40.039215 |
2 | BE | 2016-12-30 06:00:00 | 2016-12-30 03:00:00 | 43.455849 |
3 | BE | 2016-12-30 07:00:00 | 2016-12-30 03:00:00 | 47.716408 |
4 | BE | 2016-12-30 08:00:00 | 2016-12-30 03:00:00 | 50.31665 |
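Cross-validation follows the same pattern. In this sketch the wrapper name `cross_validate_on_dask` is hypothetical, and the values for `h`, `n_windows`, and `step_size` are illustrative assumptions rather than the settings that produced the table above.

```python
def cross_validate_on_dask(nixtla_client, dask_df, h=12, n_windows=5, step_size=2):
    """Run rolling-window cross-validation on a Dask DataFrame."""
    # Each window forecasts h steps after its cutoff; consecutive cutoffs
    # are step_size observations apart. All three values are illustrative.
    cv_df = nixtla_client.cross_validation(
        df=dask_df,
        h=h,
        n_windows=n_windows,
        step_size=step_size,
    )
    return cv_df
```

The output has one row per series, timestamp, and cutoff, which is why the table above carries a `cutoff` column next to the `TimeGPT` predictions.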
You can also use exogenous variables with TimeGPT on top of Dask. To do this, please refer to the Exogenous Variables tutorial. Just keep in mind that you need to use a Dask DataFrame instead of a pandas DataFrame.