Ray is an open-source, unified compute framework for scaling Python workloads. In this guide, we will explain how to use TimeGPT on top of Ray.

Outline:

  1. Installation

  2. Load Your Data

  3. Initialize Ray

  4. Use TimeGPT on Ray

  5. Shutdown Ray

1. Installation

Install Ray through Fugue. Fugue provides an easy-to-use interface for distributed computing that lets users execute Python code on top of several distributed computing frameworks, including Ray.

Note

You can install fugue with the ray extra using pip:

pip install fugue[ray]

If executing on a distributed Ray cluster, ensure that the nixtla library is installed across all the workers.
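If you are connecting to an existing remote cluster, one option (a minimal sketch, assuming a Ray version that supports runtime environments) is to declare the packages in a runtime_env so they are installed on every worker:

import ray

# Sketch: install the required packages on every worker through a Ray runtime
# environment. The address and package list are illustrative; adjust them to
# your cluster and pin versions as needed.
ray.init(
    address="auto",  # connect to an existing cluster
    runtime_env={"pip": ["nixtla", "fugue[ray]"]},
)

Alternatively, install the packages directly on the worker images, in which case no runtime_env is needed. In this tutorial we start a small local cluster instead (see step 3).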

2. Load Your Data

You can load your data as a pandas DataFrame. In this tutorial, we will use a dataset that contains hourly electricity prices from different markets.

import pandas as pd

# Hourly electricity prices from several markets, in long format with the
# columns unique_id, ds, and y.
df = pd.read_csv(
    'https://raw.githubusercontent.com/Nixtla/transfer-learning-time-series/main/datasets/electricity-short.csv',
    parse_dates=['ds'],
)
df.head()
  unique_id                  ds      y
0        BE 2016-10-22 00:00:00  70.00
1        BE 2016-10-22 01:00:00  37.10
2        BE 2016-10-22 02:00:00  37.10
3        BE 2016-10-22 03:00:00  44.75
4        BE 2016-10-22 04:00:00  37.10
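TimeGPT expects this long format, with one row per series and timestamp. If your own data uses different column names, a minimal sketch of adapting it (the store, timestamp, and sales names below are hypothetical) is to rename the columns to unique_id, ds, and y:

import pandas as pd

# Sketch: rename hypothetical columns to the unique_id / ds / y layout used above.
my_df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "timestamp": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"]),
    "sales": [10.0, 12.0, 7.0, 9.0],
})
my_df = my_df.rename(columns={"store": "unique_id", "timestamp": "ds", "sales": "y"})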

3. Initialize Ray

Initialize Ray and convert the pandas DataFrame to a Ray DataFrame.

import ray
from ray.cluster_utils import Cluster

# Start a small local Ray cluster with a two-CPU head node and connect to it.
ray_cluster = Cluster(
    initialize_head=True,
    head_node_args={"num_cpus": 2}
)
ray.init(address=ray_cluster.address, ignore_reinit_error=True)

# Convert the pandas DataFrame into a Ray Dataset.
ray_df = ray.data.from_pandas(df)
ray_df
MaterializedDataset(
   num_blocks=1,
   num_rows=6720,
   schema={unique_id: object, ds: datetime64[ns], y: float64}
)
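The dataset above fits into a single block. On a larger cluster you may want to split it into several blocks so that downstream work can be spread across workers; a minimal sketch using Ray Data's repartition (the block count is illustrative):

# Sketch: split the dataset into several blocks so work can run in parallel
# across workers. Choose the block count based on your cluster size.
ray_df = ray_df.repartition(4)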

4. Use TimeGPT on Ray

Using TimeGPT on top of Ray is almost identical to the non-distributed case. The only difference is that you need to use a Ray DataFrame.

First, instantiate the NixtlaClient class.

from nixtla import NixtlaClient
nixtla_client = NixtlaClient(
    # defaults to os.environ.get("NIXTLA_API_KEY")
    api_key = 'my_api_key_provided_by_nixtla'
)
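Before running anything on the cluster, you can check that the key was picked up correctly. A quick sanity check, assuming a client version that exposes validate_api_key:

# Sketch: returns True if the API key is valid and the endpoint is reachable.
nixtla_client.validate_api_key()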

👍 Use an Azure AI endpoint

To use an Azure AI endpoint, set the base_url argument:

nixtla_client = NixtlaClient(base_url="your azure ai endpoint", api_key="your api_key")

Then use any method from the NixtlaClient class such as forecast or cross_validation.

fcst_df = nixtla_client.forecast(ray_df, h=12)
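As in the non-distributed case, you can also request prediction intervals with the level argument; a minimal sketch (the interval levels are illustrative):

# Sketch: 80% and 95% prediction intervals alongside the point forecasts.
fcst_df = nixtla_client.forecast(ray_df, h=12, level=[80, 95])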

📘 Available models in Azure AI

If you are using an Azure AI endpoint, please be sure to set model="azureai":

nixtla_client.forecast(..., model="azureai")

For the public API, we support two models: timegpt-1 and timegpt-1-long-horizon.

By default, timegpt-1 is used. Please see this tutorial on how and when to use timegpt-1-long-horizon.
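For example, a minimal sketch of forecasting a longer horizon with the long-horizon model (the 48-step horizon is illustrative):

# Sketch: select the long-horizon model for forecasts that extend well beyond
# the lengths the default model is tuned for.
fcst_long_df = nixtla_client.forecast(ray_df, h=48, model='timegpt-1-long-horizon')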

To visualize the results, use the to_pandas method to convert the Ray output into a pandas DataFrame.

fcst_df.to_pandas().tail()
   unique_id                  ds    TimeGPT
55        NP 2018-12-24 07:00:00  55.387066
56        NP 2018-12-24 08:00:00  56.115517
57        NP 2018-12-24 09:00:00  56.090714
58        NP 2018-12-24 10:00:00  55.813717
59        NP 2018-12-24 11:00:00  55.528519
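Once the forecasts are in pandas, you can also plot them against the history. A minimal sketch, assuming the client's plot helper and a local plotting backend such as matplotlib are available:

# Sketch: plot the original series together with the TimeGPT forecasts.
nixtla_client.plot(df, fcst_df.to_pandas())

You can also evaluate TimeGPT on historical windows with cross_validation, again passing the Ray DataFrame directly: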
cv_df = nixtla_client.cross_validation(ray_df, h=12, freq='H', n_windows=5, step_size=2)
cv_df.to_pandas().tail()
    unique_id                  ds              cutoff    TimeGPT
295        NP 2018-12-23 19:00:00 2018-12-23 11:00:00  53.632019
296        NP 2018-12-23 20:00:00 2018-12-23 11:00:00  52.512775
297        NP 2018-12-23 21:00:00 2018-12-23 11:00:00  51.894035
298        NP 2018-12-23 22:00:00 2018-12-23 11:00:00  51.065720
299        NP 2018-12-23 23:00:00 2018-12-23 11:00:00  50.325920

You can also use exogenous variables with TimeGPT on top of Ray. To do this, please refer to the Exogenous Variables tutorial. Just keep in mind that you need to use a Ray DataFrame instead of a pandas DataFrame.
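As a rough sketch of what that looks like, assuming your training Ray DataFrame already contains the exogenous columns and that forecast accepts their future values through X_df as in the non-distributed case (the temperature column below is hypothetical):

# Sketch only: the training data must contain the same exogenous columns, and
# the values, dates, and column name here are purely illustrative.
future_ex_vars_df = ray.data.from_pandas(
    pd.DataFrame({
        "unique_id": ["BE"] * 12,
        "ds": pd.date_range("2016-12-31", periods=12, freq="h"),
        "temperature": [20.0] * 12,
    })
)
fcst_ex_df = nixtla_client.forecast(ray_df, h=12, X_df=future_ex_vars_df)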

5. Shutdown Ray

When you are done, shut down the Ray session.

ray.shutdown()