Run TimeGPT distributedly on top of Dask
This tutorial explains how to run `TimeGPT` on top of Dask.
You can install `fugue` with `pip`.

Note: If executing on a distributed Dask cluster, ensure that the `nixtla` library is installed across all the workers.
Start by loading your data into a pandas DataFrame. In this tutorial, we will use a dataset that contains hourly electricity prices from different markets.
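As a minimal sketch of the expected input layout, the snippet below builds a small pandas DataFrame by hand from the five sample rows shown in this tutorial. In practice you would load the full dataset instead, for example with `pd.read_csv` on your own file. The column names `unique_id`, `ds`, and `y` come from the sample output; keeping `ds` as plain strings is an assumption based on the dtypes shown later for the Dask DataFrame.

```python
import pandas as pd

# Five rows copied from the sample output in this tutorial. A real run would
# load the complete dataset (for example with pd.read_csv) instead.
df = pd.DataFrame(
    {
        "unique_id": ["BE", "BE", "BE", "BE", "BE"],  # series identifier
        "ds": [  # timestamps, kept as strings as they would be after a plain CSV read
            "2016-10-22 00:00:00",
            "2016-10-22 01:00:00",
            "2016-10-22 02:00:00",
            "2016-10-22 03:00:00",
            "2016-10-22 04:00:00",
        ],
        "y": [70.00, 37.10, 37.10, 44.75, 37.10],  # hourly electricity price
    }
)

print(df.head())
```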
| | unique_id | ds | y |
|---|---|---|---|
| 0 | BE | 2016-10-22 00:00:00 | 70.00 |
| 1 | BE | 2016-10-22 01:00:00 | 37.10 |
| 2 | BE | 2016-10-22 02:00:00 | 37.10 |
| 3 | BE | 2016-10-22 03:00:00 | 44.75 |
| 4 | BE | 2016-10-22 04:00:00 | 37.10 |
Next, convert the pandas DataFrame to a Dask DataFrame.
| | unique_id | ds | y |
|---|---|---|---|
| npartitions=2 | | | |
| 0 | string | string | float64 |
| 4200 | … | … | … |
| 8399 | … | … | … |
Using `TimeGPT` on top of Dask is almost identical to the non-distributed case. The only difference is that you need to use a Dask DataFrame, which we already defined in the previous step.
First, instantiate the `NixtlaClient` class.
👍 Use an Azure AI endpoint

To use an Azure AI endpoint, set the `base_url` argument: `nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")`

Then use any method from the `NixtlaClient` class, such as `forecast` or `cross_validation`.
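The two steps above (instantiate the client, then call a method on the Dask DataFrame) can be sketched as a small helper. This is a sketch under stated assumptions, not the tutorial's original code: the function name `forecast_with_timegpt`, the `horizon` parameter with its default of 12, and reading the key from a `NIXTLA_API_KEY` environment variable are all choices made for illustration. The `nixtla` import sits inside the function so the module can be loaded even where the library is not installed.

```python
import os


def forecast_with_timegpt(ddf, horizon=12):
    """Forecast every series in a Dask DataFrame and return a pandas DataFrame.

    `ddf` is expected to have `unique_id`, `ds`, and `y` columns.
    `horizon` is the number of future steps per series (illustrative default).
    """
    # Imported lazily; requires the `nixtla` package to be installed.
    from nixtla import NixtlaClient

    # Assumption: the API key is stored in the NIXTLA_API_KEY environment variable.
    nixtla_client = NixtlaClient(api_key=os.environ["NIXTLA_API_KEY"])

    # With a Dask DataFrame as input, the forecast is computed in a distributed
    # fashion and the result is expected to be a Dask DataFrame as well.
    fcst_ddf = nixtla_client.forecast(df=ddf, h=horizon)

    # Materialize the distributed result as a pandas DataFrame.
    return fcst_ddf.compute()
```

Calling `forecast_with_timegpt(ddf).head()` should give a table with `unique_id`, `ds`, and `TimeGPT` columns, given a valid API key and network access.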
| | unique_id | ds | TimeGPT |
|---|---|---|---|
| 0 | BE | 2016-12-31 00:00:00 | 45.190453 |
| 1 | BE | 2016-12-31 01:00:00 | 43.244446 |
| 2 | BE | 2016-12-31 02:00:00 | 41.958389 |
| 3 | BE | 2016-12-31 03:00:00 | 39.796486 |
| 4 | BE | 2016-12-31 04:00:00 | 39.204533 |
📘 Available models in Azure AI

If you are using an Azure AI endpoint, please be sure to set `model="azureai"`: `nixtla_client.forecast(..., model="azureai")`

For the public API, we support two models: `timegpt-1` and `timegpt-1-long-horizon`. By default, `timegpt-1` is used. Please see this tutorial on how and when to use `timegpt-1-long-horizon`.
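Cross-validation follows the same pattern as forecasting. The helper below is a hedged sketch: the function name `cross_validate_with_timegpt`, the default values for `horizon`, `n_windows`, and `step_size`, and the `NIXTLA_API_KEY` environment variable are assumptions for illustration rather than settings taken from this tutorial.

```python
import os


def cross_validate_with_timegpt(ddf, horizon=12, n_windows=5, step_size=2):
    """Run TimeGPT cross-validation on a Dask DataFrame; return a pandas DataFrame.

    `n_windows` is the number of evaluation windows and `step_size` the gap
    between consecutive windows (both illustrative defaults).
    """
    # Imported lazily; requires the `nixtla` package to be installed.
    from nixtla import NixtlaClient

    # Assumption: the API key is stored in the NIXTLA_API_KEY environment variable.
    nixtla_client = NixtlaClient(api_key=os.environ["NIXTLA_API_KEY"])

    # Each window produces forecasts made as of a `cutoff` timestamp.
    cv_ddf = nixtla_client.cross_validation(
        df=ddf,
        h=horizon,
        n_windows=n_windows,
        step_size=step_size,
    )

    # Materialize the distributed result as a pandas DataFrame.
    return cv_ddf.compute()
```

The result is expected to include a `cutoff` column alongside `unique_id`, `ds`, and `TimeGPT`, marking the last timestamp used for each window.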
| | unique_id | ds | cutoff | TimeGPT |
|---|---|---|---|---|
| 0 | BE | 2016-12-30 04:00:00 | 2016-12-30 03:00:00 | 39.375439 |
| 1 | BE | 2016-12-30 05:00:00 | 2016-12-30 03:00:00 | 40.039215 |
| 2 | BE | 2016-12-30 06:00:00 | 2016-12-30 03:00:00 | 43.455849 |
| 3 | BE | 2016-12-30 07:00:00 | 2016-12-30 03:00:00 | 47.716408 |
| 4 | BE | 2016-12-30 08:00:00 | 2016-12-30 03:00:00 | 50.31665 |
You can also use exogenous variables with `TimeGPT` on top of Dask. To do this, please refer to the Exogenous Variables tutorial. Just keep in mind that you need to use a Dask DataFrame instead of a pandas DataFrame.