Torch Time Series Dataset
TimeSeriesLoader
Bases: DataLoader
TimeSeriesLoader is a small modification of PyTorch's DataLoader.
Combines a dataset and a sampler, and provides an iterable over the given dataset.
The torch.utils.data.DataLoader class supports both map-style and
iterable-style datasets with single- or multi-process loading, customizing
loading order and optional automatic batching (collation) and memory pinning.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | | Dataset to load data from. | required |
| batch_size | int | How many samples per batch to load. Defaults to 1. | 1 |
| shuffle | bool | Set to True to have the data reshuffled at every epoch. Defaults to False. | False |
| sampler | Sampler or Iterable | Defines the strategy to draw samples from the dataset. | None |
| drop_last | bool | Set to True to drop the last incomplete batch. Defaults to False. | False |
| **kwargs | | Additional keyword arguments for DataLoader. | |
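
The interaction between batch_size and drop_last can be checked with a little arithmetic. The helper below is a minimal sketch in plain Python, not part of the library: `num_batches` is a hypothetical name, and it assumes a map-style dataset with a known length and no custom sampler.

```python
import math


def num_batches(n_samples: int, batch_size: int = 1, drop_last: bool = False) -> int:
    """Number of batches yielded per epoch (hypothetical helper for illustration)."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    if drop_last:
        # The trailing incomplete batch is discarded.
        return n_samples // batch_size
    # The trailing incomplete batch is kept as a smaller batch.
    return math.ceil(n_samples / batch_size)


print(num_batches(10, batch_size=4))                  # 3
print(num_batches(10, batch_size=4, drop_last=True))  # 2
```

With 10 samples and a batch size of 4, keeping the last batch gives batches of 4, 4, and 2 samples; dropping it gives only the two full batches.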
BaseTimeSeriesDataset
Bases: Dataset
Base class for time series datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| temporal_cols | | Column names for temporal features. | required |
| max_size | int | Maximum size of time series. | required |
| min_size | int | Minimum size of time series. | required |
| y_idx | int | Index of target variable. | required |
| static | Optional | Static features array. | None |
| static_cols | Optional | Column names for static features. | None |
LocalFilesTimeSeriesDataset
Bases: BaseTimeSeriesDataset
Time series dataset that loads data from local files.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| files_ds | List[str] | List of file paths. | required |
| temporal_cols | | Column names for temporal features. | required |
| id_col | str | Name of ID column. | required |
| time_col | str | Name of time column. | required |
| target_col | str | Name of target column. | required |
| last_times | | Last time for each time series. | required |
| indices | | Series indices. | required |
| max_size | int | Maximum size of time series. | required |
| min_size | int | Minimum size of time series. | required |
| y_idx | int | Index of target variable. | required |
| static | Optional | Static features array. | None |
| static_cols | Optional | Column names for static features. | None |
LocalFilesTimeSeriesDataset.from_data_directories
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| directories | | List of directory paths. | required |
| static_df | Optional | Static features DataFrame. | None |
| exogs | List | List of exogenous variable names. Defaults to []. | [] |
| id_col | str | Name of ID column. Defaults to "unique_id". | 'unique_id' |
| time_col | str | Name of time column. Defaults to "ds". | 'ds' |
| target_col | str | Name of target column. Defaults to "y". | 'y' |

Returns:
| Name | Type | Description |
|---|---|---|
| LocalFilesTimeSeriesDataset | | Dataset created from directories. |
TimeSeriesDataset
Bases: BaseTimeSeriesDataset
Time series dataset implementation.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| temporal | | Temporal data array. | required |
| temporal_cols | | Column names for temporal features. | required |
| indptr | | Index pointers for time series grouping. | required |
| y_idx | int | Index of target variable. | required |
| static | Optional | Static features array. | None |
| static_cols | Optional | Column names for static features. | None |
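
To make the roles of temporal, indptr, and y_idx concrete, here is a minimal plain-Python sketch. It assumes that indptr follows the usual compressed-row convention, where series i occupies rows indptr[i] up to (but not including) indptr[i + 1] of the temporal array. The function name `split_by_indptr` is hypothetical and not part of the library.

```python
from typing import List

Row = List[float]


def split_by_indptr(temporal: List[Row], indptr: List[int]) -> List[List[Row]]:
    """Split stacked rows into one block per series (illustrative helper)."""
    if indptr[0] != 0 or indptr[-1] != len(temporal):
        raise ValueError("indptr must start at 0 and end at len(temporal)")
    # Consecutive pointer pairs delimit each series.
    return [temporal[start:end] for start, end in zip(indptr[:-1], indptr[1:])]


# Two series stacked row-wise: 3 observations, then 2 observations.
# Each row holds one temporal feature, the target "y".
temporal = [[1.0], [2.0], [3.0], [10.0], [20.0]]
temporal_cols = ["y"]
indptr = [0, 3, 5]
y_idx = temporal_cols.index("y")  # position of the target column

series = split_by_indptr(temporal, indptr)
targets = [[row[y_idx] for row in block] for block in series]
print(targets)  # [[1.0, 2.0, 3.0], [10.0, 20.0]]
```

Under this convention, a dataset with n series needs an indptr of length n + 1.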
TimeSeriesDataset.append
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| futr_dataset | TimeSeriesDataset | Future dataset to append. | required |

Returns:
| Name | Type | Description |
|---|---|---|
| TimeSeriesDataset | TimeSeriesDataset | Copy of dataset with future observations appended. |

Raises:
| Type | Description |
|---|---|
| ValueError | If datasets have different number of groups. |
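
The behaviour described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's implementation: `append_future` is a hypothetical function, values are stored as flat lists, and each series i occupies positions indptr[i] up to indptr[i + 1].

```python
from typing import List, Tuple


def append_future(
    temporal: List[float],
    indptr: List[int],
    futr_temporal: List[float],
    futr_indptr: List[int],
) -> Tuple[List[float], List[int]]:
    """Return new (temporal, indptr) with future rows appended per series."""
    if len(indptr) != len(futr_indptr):
        raise ValueError("Cannot append datasets with a different number of groups.")
    new_temporal: List[float] = []
    new_indptr = [0]
    for i in range(len(indptr) - 1):
        # History of series i, followed by its future observations.
        new_temporal.extend(temporal[indptr[i]:indptr[i + 1]])
        new_temporal.extend(futr_temporal[futr_indptr[i]:futr_indptr[i + 1]])
        new_indptr.append(len(new_temporal))
    return new_temporal, new_indptr


values, pointers = append_future(
    [1.0, 2.0, 3.0, 10.0, 20.0], [0, 3, 5],
    [4.0, 5.0, 30.0, 40.0], [0, 2, 4],
)
print(values)    # [1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 20.0, 30.0, 40.0]
print(pointers)  # [0, 5, 9]
```

The inputs are left unmodified, which mirrors the documented behaviour of returning a copy rather than changing the dataset in place.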
TimeSeriesDataset.trim_dataset
Returns:
| Name | Type | Description |
|---|---|---|
| TimeSeriesDataset | | Trimmed dataset. |

Raises:
| Type | Description |
|---|---|
| Exception | If trim size exceeds minimum series length. |
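
The parameters of trim_dataset are not listed here, so the following is only a sketch of the idea. It assumes two hypothetical arguments, `left_trim` and `right_trim`, giving the number of observations to remove from the start and end of every series, and it uses flat lists with CSR-style index pointers.

```python
from typing import List, Tuple


def trim(
    temporal: List[float],
    indptr: List[int],
    left_trim: int = 0,   # assumed name: observations dropped from the start
    right_trim: int = 0,  # assumed name: observations dropped from the end
) -> Tuple[List[float], List[int]]:
    """Trim every series on both sides (illustrative helper)."""
    lengths = [end - start for start, end in zip(indptr[:-1], indptr[1:])]
    if left_trim + right_trim > min(lengths):
        raise Exception("Trim size exceeds the length of the shortest series.")
    new_temporal: List[float] = []
    new_indptr = [0]
    for start, end in zip(indptr[:-1], indptr[1:]):
        # Keep only the interior of each series.
        new_temporal.extend(temporal[start + left_trim:end - right_trim])
        new_indptr.append(len(new_temporal))
    return new_temporal, new_indptr


trimmed, trimmed_ptr = trim(
    [1.0, 2.0, 3.0, 4.0, 10.0, 20.0, 30.0], [0, 4, 7], left_trim=1, right_trim=1
)
print(trimmed)      # [2.0, 3.0, 20.0]
print(trimmed_ptr)  # [0, 2, 3]
```

The check against the shortest series guarantees that no series is trimmed past its own length.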
TimeSeriesDataModule
Bases: LightningDataModule
PyTorch Lightning data module for time series datasets.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset | BaseTimeSeriesDataset | Time series dataset. | required |
| batch_size | int | Batch size for training. Defaults to 32. | 32 |
| valid_batch_size | int | Batch size for validation. Defaults to 1024. | 1024 |
| drop_last | bool | Whether to drop the last incomplete batch. Defaults to False. | False |
| shuffle_train | bool | Whether to shuffle training data. Defaults to True. | True |
| **dataloaders_kwargs | | Additional keyword arguments for data loaders. | |
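
One plausible way these parameters could map onto the training and validation loaders is sketched below. This is an assumption for illustration, not the module's confirmed behaviour: `loader_kwargs` is a hypothetical helper, and in particular the choice to never shuffle the validation loader is assumed rather than taken from the table above.

```python
from typing import Any, Dict


def loader_kwargs(
    stage: str,
    batch_size: int = 32,
    valid_batch_size: int = 1024,
    drop_last: bool = False,
    shuffle_train: bool = True,
    **dataloaders_kwargs: Any,
) -> Dict[str, Any]:
    """Build keyword arguments for a loader at a given stage (illustrative)."""
    if stage == "train":
        stage_kwargs = {"batch_size": batch_size, "shuffle": shuffle_train}
    elif stage == "val":
        # Assumption: validation data is read in order, in larger batches.
        stage_kwargs = {"batch_size": valid_batch_size, "shuffle": False}
    else:
        raise ValueError(f"Unknown stage: {stage!r}")
    # Extra keyword arguments are forwarded unchanged to both loaders.
    return {**dataloaders_kwargs, **stage_kwargs, "drop_last": drop_last}


print(loader_kwargs("train"))
print(loader_kwargs("val", num_workers=2))
```

The larger default for valid_batch_size is consistent with validation needing no gradient storage, so bigger batches usually fit in memory.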

