source

TimeSeriesLoader

 TimeSeriesLoader (dataset, **kwargs)

TimeSeriesLoader DataLoader.

A small modification of PyTorch's DataLoader. Combines a dataset and a sampler, and provides an iterable over the given dataset.

The class torch.utils.data.DataLoader supports both map-style and iterable-style datasets with single- or multi-process loading, customizable loading order, and optional automatic batching (collation) and memory pinning.

Parameters:
batch_size (int, optional): how many samples per batch to load (default: 1).
shuffle (bool, optional): set to True to have the data reshuffled at every epoch (default: False).
sampler (Sampler or Iterable, optional): defines the strategy to draw samples from the dataset. Can be any Iterable with __len__ implemented. If specified, shuffle must not be specified.
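A minimal usage sketch. The import path (neuralforecast.tsdataset), the pass-through of batch_size/shuffle to the underlying PyTorch DataLoader, and the synthetic dataset layout (series concatenated row-wise with CSR-style indptr offsets) are assumptions based on the signatures shown on this page, not guarantees of the docstring above.

import numpy as np
import pandas as pd
import torch

from neuralforecast.tsdataset import TimeSeriesDataset, TimeSeriesLoader  # assumed import path

# Two toy series (lengths 4 and 3) concatenated row-wise; `indptr` is assumed
# to hold CSR-style offsets marking where each series starts and ends.
dataset = TimeSeriesDataset(
    temporal=torch.rand(7, 2),
    temporal_cols=pd.Index(["y", "x_1"]),
    indptr=np.array([0, 4, 7]),
    max_size=4,  # length of the longest series
    min_size=3,  # length of the shortest series
    y_idx=0,     # position of the target column within temporal_cols
)

# Standard DataLoader keyword arguments are assumed to be forwarded via **kwargs.
loader = TimeSeriesLoader(dataset, batch_size=2, shuffle=True)
for batch in loader:
    print(type(batch))  # batches come from the loader's own collation of samples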


source

BaseTimeSeriesDataset

 BaseTimeSeriesDataset (temporal_cols, max_size:int, min_size:int,
                        y_idx:int, static=None, static_cols=None,
                        sorted=False)

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many torch.utils.data.Sampler implementations and the default options of torch.utils.data.DataLoader. Subclasses could also optionally implement __getitems__ to speed up batched loading of samples; this method accepts a list of sample indices for a batch and returns the corresponding list of samples.

Note: torch.utils.data.DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
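The contract above is the standard PyTorch map-style Dataset contract, so a subclass only needs __getitem__ and, for most samplers, __len__. A minimal, library-agnostic sketch (the class below is purely illustrative and not part of this module):

import torch
from torch.utils.data import Dataset


class ToyMapStyleDataset(Dataset):
    """Illustrative map-style dataset: integer keys map to one series each."""

    def __init__(self, n_series: int = 3, length: int = 8):
        # one row per series; samples are fetched by integer key
        self.data = torch.arange(n_series * length, dtype=torch.float32).reshape(n_series, length)

    def __len__(self) -> int:
        # used by samplers and by DataLoader's default batching options
        return self.data.shape[0]

    def __getitem__(self, idx: int) -> torch.Tensor:
        # fetch the data sample for a given key
        return self.data[idx]


ds = ToyMapStyleDataset()
print(len(ds), ds[0].shape)  # 3 torch.Size([8])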


source

LocalFilesTimeSeriesDataset

 LocalFilesTimeSeriesDataset (files_ds:List[str], temporal_cols,
                              id_col:str, time_col:str, target_col:str,
                              last_times, indices, max_size:int,
                              min_size:int, y_idx:int, static=None,
                              static_cols=None, sorted=False)

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many torch.utils.data.Sampler implementations and the default options of torch.utils.data.DataLoader. Subclasses could also optionally implement __getitems__ to speed up batched loading of samples; this method accepts a list of sample indices for a batch and returns the corresponding list of samples.

Note: torch.utils.data.DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
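The signature suggests a variant that reads each series from its own local file instead of keeping everything in memory. A hypothetical construction sketch: the one-file-per-series reading of files_ds and the meanings of last_times (last timestamp per series) and indices (series identifiers) are guesses from the parameter names, and the paths below are placeholders.

import pandas as pd

from neuralforecast.tsdataset import LocalFilesTimeSeriesDataset  # assumed import path

local_ds = LocalFilesTimeSeriesDataset(
    files_ds=["series_0.parquet", "series_1.parquet"],        # placeholder paths, one per series (assumed)
    temporal_cols=pd.Index(["y"]),
    id_col="unique_id",
    time_col="ds",
    target_col="y",
    last_times=pd.to_datetime(["2020-12-01", "2021-06-01"]),  # assumed: last timestamp of each series
    indices=pd.Index(["series_0", "series_1"]),               # assumed: series identifiers
    max_size=48,   # length of the longest series
    min_size=24,   # length of the shortest series
    y_idx=0,       # position of the target column within temporal_cols
)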


source

TimeSeriesDataset

 TimeSeriesDataset (temporal, temporal_cols, indptr, max_size:int,
                    min_size:int, y_idx:int, static=None,
                    static_cols=None, sorted=False)

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should overwrite __getitem__, supporting fetching a data sample for a given key. Subclasses could also optionally overwrite __len__, which is expected to return the size of the dataset by many torch.utils.data.Sampler implementations and the default options of torch.utils.data.DataLoader. Subclasses could also optionally implement __getitems__ to speed up batched loading of samples; this method accepts a list of sample indices for a batch and returns the corresponding list of samples.

Note: torch.utils.data.DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
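A construction sketch inferred from the constructor parameters: all series appear to be concatenated into a single temporal array, indptr holds CSR-style offsets delimiting each series, and y_idx locates the target column. These layout details, along with the import path, are assumptions rather than documented behaviour.

import numpy as np
import pandas as pd
import torch

from neuralforecast.tsdataset import TimeSeriesDataset  # assumed import path

# Three series of lengths 12, 10 and 8, concatenated row-wise (assumed layout).
lengths = [12, 10, 8]
temporal = torch.rand(sum(lengths), 2)     # columns: target "y" plus one exogenous feature
indptr = np.append(0, np.cumsum(lengths))  # CSR-style offsets: [0, 12, 22, 30]

ts_dataset = TimeSeriesDataset(
    temporal=temporal,
    temporal_cols=pd.Index(["y", "x_1"]),
    indptr=indptr,
    max_size=max(lengths),
    min_size=min(lengths),
    y_idx=0,
)

print(len(ts_dataset))  # number of series
sample = ts_dataset[0]  # one series, fetched by integer key as in the map-style contract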


source

TimeSeriesDataModule

 TimeSeriesDataModule (dataset:__main__.BaseTimeSeriesDataset,
                       batch_size=32, valid_batch_size=1024,
                       num_workers=0, drop_last=False, shuffle_train=True)

A DataModule standardizes the training, val, test splits, data preparation and transforms. The main advantage is consistent data splits, data preparation and transforms across models.

Example::

import lightning.pytorch as L
import torch
import torch.utils.data as data
from pytorch_lightning.demos.boring_classes import RandomDataset

class MyDataModule(L.LightningDataModule):
    def prepare_data(self):
        # download, IO, etc. Useful with shared filesystems
        # only called on 1 GPU/TPU in distributed
        ...

    def setup(self, stage):
        # make assignments here (val/train/test split)
        # called on every process in DDP
        dataset = RandomDataset(1, 100)
        self.train, self.val, self.test = data.random_split(
            dataset, [80, 10, 10], generator=torch.Generator().manual_seed(42)
        )

    def train_dataloader(self):
        return data.DataLoader(self.train)

    def val_dataloader(self):
        return data.DataLoader(self.val)

    def test_dataloader(self):
        return data.DataLoader(self.test)

    def on_exception(self, exception):
        # clean up state after the trainer faced an exception
        ...

    def teardown(self):
        # clean up state after the trainer stops, delete files...
        # called on every process in DDP
        ...
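A wiring sketch for this module, reusing the ts_dataset built in the TimeSeriesDataset sketch above. The train_dataloader/val_dataloader calls are the standard LightningDataModule hooks and are assumed to be implemented here; they are not shown in the docstring above.

from neuralforecast.tsdataset import TimeSeriesDataModule  # assumed import path

datamodule = TimeSeriesDataModule(
    dataset=ts_dataset,     # any BaseTimeSeriesDataset instance
    batch_size=16,          # training batch size
    valid_batch_size=1024,  # larger batches are fine for validation (no gradients)
    num_workers=0,
    drop_last=False,
    shuffle_train=True,
)

# Assumed standard LightningDataModule hooks:
train_loader = datamodule.train_dataloader()
val_loader = datamodule.val_dataloader()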
# To test correct future_df wrangling of the `update_df` method,
# we check that we can recover the AirPassengers dataset either from the full
# dataframe or by splitting it into parts and initializing from them.