Core
NeuralForecast contains two main components: PyTorch implementations of deep learning predictive models, and parallelization and distributed computation utilities. The first component comprises low-level PyTorch model estimator classes like models.NBEATS and models.RNN. The second component is a high-level core.NeuralForecast wrapper class that operates on sets of time series data stored in pandas DataFrames.
NeuralForecast
NeuralForecast (models:List[Any], freq:Union[str,int], local_scaler_type:Optional[str]=None)
The core.NeuralForecast class allows you to efficiently fit multiple NeuralForecast models for large sets of time series. It operates on a pandas DataFrame df that identifies series and datestamps with the unique_id and ds columns. The y column denotes the target time series variable.
Parameter | Type | Default | Details |
---|---|---|---|
models | List | | Instantiated neuralforecast.models; see collection here. |
freq | Union | | Frequency of the data. Must be a valid pandas or polars offset alias, or an integer. |
local_scaler_type | Optional | None | Scaler to apply per series to all features before fitting, which is inverted after predicting. Can be ‘standard’, ‘robust’, ‘robust-iqr’, ‘minmax’ or ‘boxcox’. |
Returns | NeuralForecast | | Returns instantiated NeuralForecast class. |
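As a rough illustration of the df layout described above, the sketch below builds a small long-format pandas DataFrame with two series. The series names and values are made up for the example; only the unique_id, ds and y column names come from this page.

```python
import pandas as pd

# Two hypothetical monthly series stacked in long format:
# one row per (series, timestamp) pair.
dates = pd.date_range("2020-01-01", periods=3, freq="MS")
df = pd.DataFrame(
    {
        "unique_id": ["series_a"] * 3 + ["series_b"] * 3,  # series identifier
        "ds": list(dates) * 2,                             # datestamp
        "y": [10.0, 12.0, 11.0, 200.0, 210.0, 205.0],      # target variable
    }
)

# Each series keeps its own history under a shared set of columns.
rows_per_series = df.groupby("unique_id").size().to_dict()
print(df)
print(rows_per_series)
```

Exogenous variables, when used, would be added as extra columns next to y.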
NeuralForecast.fit
NeuralForecast.fit (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,neuralforecast.compat.SparkDataFrame,Sequence[str],NoneType]=None, static_df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,neuralforecast.compat.SparkDataFrame,NoneType]=None, val_size:Optional[int]=0, sort_df:bool=True, use_init_models:bool=False, verbose:bool=False, id_col:str='unique_id', time_col:str='ds', target_col:str='y', distributed_config:Optional[neuralforecast.common._base_model.DistributedConfig]=None)
*Fit the core.NeuralForecast.
Fit models to a large set of time series from DataFrame df and store fitted models for later inspection.*
Parameter | Type | Default | Details |
---|---|---|---|
df | Union | None | DataFrame with columns [unique_id, ds, y] and exogenous variables. If None, a previously stored dataset is required. |
static_df | Union | None | DataFrame with columns [unique_id] and static exogenous variables. |
val_size | Optional | 0 | Size of validation set. |
sort_df | bool | True | Sort df before fitting. |
use_init_models | bool | False | Use the initial models passed when the NeuralForecast object was instantiated. |
verbose | bool | False | Print processing steps. |
id_col | str | unique_id | Column that identifies each series. |
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
target_col | str | y | Column that contains the target. |
distributed_config | Optional | None | Configuration to use for DDP training. Currently only spark is supported. |
Returns | NeuralForecast | | Returns NeuralForecast class with fitted models. |
NeuralForecast.predict
NeuralForecast.predict (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,neuralforecast.compat.SparkDataFrame,NoneType]=None, static_df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,neuralforecast.compat.SparkDataFrame,NoneType]=None, futr_df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,neuralforecast.compat.SparkDataFrame,NoneType]=None, sort_df:bool=True, verbose:bool=False, engine=None, **data_kwargs)
*Predict with core.NeuralForecast.
Use stored fitted models to predict a large set of time series from DataFrame df.*
Parameter | Type | Default | Details |
---|---|---|---|
df | Union | None | DataFrame with columns [unique_id, ds, y] and exogenous variables. If a DataFrame is passed, it is used to generate forecasts. |
static_df | Union | None | DataFrame with columns [unique_id] and static exogenous variables. |
futr_df | Union | None | DataFrame with [unique_id, ds] columns and df’s future exogenous variables. |
sort_df | bool | True | Sort df before fitting. |
verbose | bool | False | Print processing steps. |
engine | NoneType | None | Distributed engine for inference. Only used if df is a spark dataframe or if fit was called on a spark dataframe. |
data_kwargs | kwargs | | Extra arguments to be passed to the dataset within each model. |
Returns | pandas or polars DataFrame | | DataFrame with insample models columns for point predictions and probabilistic predictions for all fitted models. |
NeuralForecast.cross_validation
NeuralForecast.cross_validation (df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,NoneType]=None, static_df:Union[pandas.core.frame.DataFrame,polars.dataframe.frame.DataFrame,NoneType]=None, n_windows:int=1, step_size:int=1, val_size:Optional[int]=0, test_size:Optional[int]=None, sort_df:bool=True, use_init_models:bool=False, verbose:bool=False, refit:Union[bool,int]=False, id_col:str='unique_id', time_col:str='ds', target_col:str='y', **data_kwargs)
*Temporal Cross-Validation with core.NeuralForecast.
core.NeuralForecast’s cross-validation efficiently fits a list of NeuralForecast models through multiple windows, in either a chained or rolled manner.*
Parameter | Type | Default | Details |
---|---|---|---|
df | Union | None | DataFrame with columns [unique_id, ds, y] and exogenous variables. If None, a previously stored dataset is required. |
static_df | Union | None | DataFrame with columns [unique_id] and static exogenous variables. |
n_windows | int | 1 | Number of windows used for cross validation. |
step_size | int | 1 | Step size between each window. |
val_size | Optional | 0 | Length of validation size. If passed, set n_windows=None. |
test_size | Optional | None | Length of test size. If passed, set n_windows=None. |
sort_df | bool | True | Sort df before fitting. |
use_init_models | bool | False | Use the initial models passed when the object was instantiated. |
verbose | bool | False | Print processing steps. |
refit | Union | False | Retrain model for each cross validation window. If False, the models are trained at the beginning and then used to predict each window. If a positive int, the models are retrained every refit windows. |
id_col | str | unique_id | Column that identifies each series. |
time_col | str | ds | Column that identifies each timestep; its values can be timestamps or integers. |
target_col | str | y | Column that contains the target. |
data_kwargs | kwargs | | Extra arguments to be passed to the dataset within each model. |
Returns | Union | | DataFrame with insample models columns for point predictions and probabilistic predictions for all fitted models. |
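The interplay between n_windows, step_size and test_size can be sketched with plain arithmetic. This assumes the usual rolling-window layout, where every window forecasts h steps and consecutive windows are shifted by step_size; the helper names below are hypothetical and the library's internals may differ.

```python
def cv_test_size(h: int, n_windows: int, step_size: int) -> int:
    """Trailing observations covered by all windows (assumed relationship)."""
    return h + step_size * (n_windows - 1)


def cv_cutoff_offsets(h: int, n_windows: int, step_size: int) -> list[int]:
    """For each window, how many observations before the series end its cutoff sits."""
    test_size = cv_test_size(h, n_windows, step_size)
    return [test_size - i * step_size for i in range(n_windows)]


print(cv_test_size(h=12, n_windows=3, step_size=2))       # 16
print(cv_cutoff_offsets(h=12, n_windows=3, step_size=2))  # [16, 14, 12]
```

Under this assumption, the last window always forecasts the final h observations, and a larger step_size spreads the windows further back in time.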
NeuralForecast.predict_insample
NeuralForecast.predict_insample (step_size:int=1)
*Predict insample with core.NeuralForecast.
core.NeuralForecast’s predict_insample uses stored fitted models to predict historic values of a time series from the stored dataframe.*
Parameter | Type | Default | Details |
---|---|---|---|
step_size | int | 1 | Step size between each window. |
Returns | pandas.DataFrame | | DataFrame with insample predictions for all fitted models. |
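The tests on this page check the number of insample cutoffs per series against the expression (series_length - test_size - h + step_size) // step_size. The sketch below only evaluates that expression for the same step_size and test_size pairs; the function name and the series length of 132 are hypothetical.

```python
def n_insample_cutoffs(series_length: int, h: int, step_size: int, test_size: int = 0) -> int:
    """Expected number of cutoffs, following the expression used in the tests."""
    return (series_length - test_size - h + step_size) // step_size


series_length = 132  # hypothetical number of observations in one series
h = 12
for step_size, test_size in [(7, 0), (9, 0), (7, 5), (9, 5)]:
    n_cutoffs = n_insample_cutoffs(series_length, h, step_size, test_size)
    print(f"{step_size=}, {test_size=}, {n_cutoffs=}")
```

A larger step_size or test_size leaves fewer windows, so fewer insample forecasts are returned.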
NeuralForecast.save
NeuralForecast.save (path:str, model_index:Optional[List]=None, save_dataset:bool=True, overwrite:bool=False)
*Save NeuralForecast core class.
core.NeuralForecast’s method to save the current status of models, dataset, and configuration. Note that by default the models do not save training checkpoints, in order to save disk space. To get them, change the individual model's **trainer_kwargs to include enable_checkpointing=True.*
Parameter | Type | Default | Details |
---|---|---|---|
path | str | | Directory to save current status. |
model_index | Optional | None | List to specify which models from list of self.models to save. |
save_dataset | bool | True | Whether to save dataset or not. |
overwrite | bool | False | Whether to overwrite files or not. |
NeuralForecast.load
NeuralForecast.load (path, verbose=False, **kwargs)
*Load NeuralForecast.
core.NeuralForecast’s method to load checkpoint from path.*
Parameter | Type | Default | Details |
---|---|---|---|
path | str | | Directory with stored artifacts. |
verbose | bool | False | |
kwargs | | | Additional keyword arguments to be passed to the function load_from_checkpoint. |
Returns | NeuralForecast | | Instantiated NeuralForecast class. |
import os

os.environ['NIXTLA_ID_AS_COL'] = '1'

import numpy as np
import pandas as pd
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

# AirPassengersPanel_train is assumed to be a training DataFrame prepared beforehand,
# with columns unique_id, ds, y and the series 'Airline1' and 'Airline2'.

# Test predict_insample step_size
h = 12
train_end = AirPassengersPanel_train['ds'].max()
sizes = AirPassengersPanel_train['unique_id'].value_counts().to_numpy()
for step_size, test_size in [(7, 0), (9, 0), (7, 5), (9, 5)]:
    models = [NHITS(h=h, input_size=12, max_steps=1)]
    nf = NeuralForecast(models=models, freq='M')
    nf.fit(AirPassengersPanel_train)
    # Note: call set_test_size() only after nf.fit(); fit would otherwise reset test_size to 0
    nf.models[0].set_test_size(test_size)
    forecasts = nf.predict_insample(step_size=step_size)
    last_cutoff = train_end - test_size * pd.offsets.MonthEnd() - h * pd.offsets.MonthEnd()
    n_expected_cutoffs = (sizes[0] - test_size - nf.h + step_size) // step_size
    # compare cutoff values
    expected_cutoffs = np.flip(np.array([last_cutoff - step_size * i * pd.offsets.MonthEnd() for i in range(n_expected_cutoffs)]))
    actual_cutoffs = np.array([pd.Timestamp(x) for x in forecasts[forecasts['unique_id']==nf.uids[1]]['cutoff'].unique()])
    np.testing.assert_array_equal(expected_cutoffs, actual_cutoffs, err_msg=f"{step_size=},{expected_cutoffs=},{actual_cutoffs=}")
    # check forecast-points count per series
    cutoffs_by_series = forecasts.groupby(['unique_id', 'cutoff']).size().unstack('unique_id')
    pd.testing.assert_series_equal(cutoffs_by_series['Airline1'], cutoffs_by_series['Airline2'], check_names=False)
from ray import tune

# Optuna search space: receives an Optuna trial and returns a model configuration
def config_optuna(trial):
    return {"input_size": trial.suggest_categorical('input_size', [12, 24]),
            "hist_exog_list": trial.suggest_categorical('hist_exog_list', [['trend'], ['y_[lag12]'], ['trend', 'y_[lag12]']]),
            "futr_exog_list": ['trend'],
            "max_steps": 10,
            "val_check_steps": 5}

# Equivalent search space expressed with Ray Tune
config_ray = {'input_size': tune.choice([12, 24]),
              'hist_exog_list': tune.choice([['trend'], ['y_[lag12]'], ['trend', 'y_[lag12]']]),
              'futr_exog_list': ['trend'],
              'max_steps': 10,
              'val_check_steps': 5}
import numpy as np
import pandas as pd
import polars
from polars.testing import assert_frame_equal
from neuralforecast import NeuralForecast
from neuralforecast.models import NHITS

# AirPassengers_pl is assumed to be a polars training DataFrame prepared beforehand,
# with columns uid, time, target and the series 'Airline1' and 'Airline2'.

# Test predict_insample step_size (polars)
h = 12
train_end = AirPassengers_pl['time'].max()
sizes = AirPassengers_pl['uid'].value_counts().to_numpy()
for step_size, test_size in [(7, 0), (9, 0), (7, 5), (9, 5)]:
    models = [NHITS(h=h, input_size=12, max_steps=1)]
    nf = NeuralForecast(models=models, freq='1mo')
    nf.fit(
        AirPassengers_pl,
        id_col='uid',
        time_col='time',
        target_col='target',
    )
    # Note: call set_test_size() only after nf.fit(); fit would otherwise reset test_size to 0
    nf.models[0].set_test_size(test_size)
    forecasts = nf.predict_insample(step_size=step_size)
    n_expected_cutoffs = (sizes[0][1] - test_size - nf.h + step_size) // step_size
    # compare cutoff values
    last_cutoff = train_end - test_size * pd.offsets.MonthEnd() - h * pd.offsets.MonthEnd()
    expected_cutoffs = np.flip(np.array([last_cutoff - step_size * i * pd.offsets.MonthEnd() for i in range(n_expected_cutoffs)]))
    pl_cutoffs = forecasts.filter(polars.col('uid') == nf.uids[1]).select('cutoff').unique(maintain_order=True)
    actual_cutoffs = np.array([pd.Timestamp(x['cutoff']) for x in pl_cutoffs.rows(named=True)])
    np.testing.assert_array_equal(expected_cutoffs, actual_cutoffs, err_msg=f"{step_size=},{expected_cutoffs=},{actual_cutoffs=}")
    # check forecast-points count per series
    cutoffs_by_series = forecasts.group_by(['uid', 'cutoff']).count()
    assert_frame_equal(
        cutoffs_by_series.filter(polars.col('uid') == "Airline1").select(['cutoff', 'count']),
        cutoffs_by_series.filter(polars.col('uid') == "Airline2").select(['cutoff', 'count']),
        check_row_order=False,
    )