Time Series

A time series is a sequence of data points indexed by time, used to model phenomena that change over time, such as stock prices, temperature, or product sales. A time series can generally be thought of as comprising the following components (illustrated in the sketch after this list):

  • Trend: The consistent, long-term direction of the data, whether upward or downward, reflecting the persistent overall movement of the series over time.

  • Seasonality: A repeating pattern that recurs over a known, fixed period, such as a daily, weekly, or yearly cycle.

  • Remainder: The residuals or random noise left in the data after the trend and seasonal effects have been accounted for.
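
As a concrete sketch of these three components, the snippet below decomposes a synthetic monthly series into trend, seasonality, and remainder using statsmodels' `seasonal_decompose` (the data and parameters are illustrative, not from the original text):

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.seasonal import seasonal_decompose

# Synthetic monthly series: upward trend + yearly seasonality + noise
rng = np.random.default_rng(0)
index = pd.date_range("2020-01-01", periods=48, freq="MS")
trend = np.linspace(100, 150, 48)
seasonality = 10 * np.sin(2 * np.pi * np.arange(48) / 12)
noise = rng.normal(0, 2, 48)
series = pd.Series(trend + seasonality + noise, index=index)

# Additive decomposition: series = trend + seasonal + remainder
result = seasonal_decompose(series, model="additive", period=12)
print(result.trend.dropna().head())    # long-term direction
print(result.seasonal.head())          # repeating yearly cycle
print(result.resid.dropna().head())    # remainder / noise
```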

Forecasting

Forecasting is the process of predicting future values of a time series based on historical data. It plays a crucial role in decision-making across fields such as finance, healthcare, retail, and economics.

Forecasting can use a variety of approaches, from classical statistical methods to newer techniques such as machine learning, deep learning, and foundation models. These models can be further classified as univariate or multivariate, depending on the number of variables used to make predictions, and as local or global: local models estimate parameters independently for each series, while global models estimate them jointly across multiple series.

Forecasts themselves can be presented as point forecasts, which predict a single future value per time step, or as probabilistic forecasts, which provide a full probability distribution of future values and thereby a measure of uncertainty.
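
As a minimal, self-contained illustration of the difference, the sketch below treats hypothetical samples from a predictive distribution as a probabilistic forecast and derives a point forecast and prediction intervals from them (all numbers are made up):

```python
import numpy as np

rng = np.random.default_rng(42)
# 1,000 hypothetical sample paths for a single future time step
samples = rng.normal(loc=120.0, scale=5.0, size=1000)

point_forecast = samples.mean()                     # single value
interval_80 = np.quantile(samples, [0.10, 0.90])    # 80% interval
interval_95 = np.quantile(samples, [0.025, 0.975])  # 95% interval

print(f"Point forecast: {point_forecast:.1f}")
print(f"80% interval: [{interval_80[0]:.1f}, {interval_80[1]:.1f}]")
print(f"95% interval: [{interval_95[0]:.1f}, {interval_95[1]:.1f}]")
```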

Foundation Model

A foundation model is a large, pre-trained model that can be adapted to a wide range of tasks, including time series forecasting. Originally developed for domains such as natural language processing and computer vision, foundation models are now increasingly applied to sequential data like time series. These models are typically trained on extensive datasets, capturing complex patterns and dependencies, and can be fine-tuned for specific tasks.

TimeGPT

Developed by Nixtla, TimeGPT is the first foundation model for time series forecasting. TimeGPT was trained on billions of observations from publicly available datasets across multiple domains and can produce accurate forecasts for new time series without additional training, using only historical values as inputs. The model ‘reads’ time series data similarly to how humans read a sentence—sequentially from left to right. It looks at windows of past data, which we can think of as ‘tokens’, and predicts what comes next. This prediction is based on patterns the model identifies in past data and extrapolates into the future.
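
A minimal sketch of this zero-shot usage with the `nixtla` Python client is shown below (assuming the `NixtlaClient.forecast` interface and a placeholder API key; the data is illustrative):

```python
import pandas as pd
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

df = pd.DataFrame({
    "ds": pd.date_range("2023-01-01", periods=36, freq="MS"),
    "y": range(36),  # replace with real historical values
})

# No training step: the pre-trained model forecasts the next 12 points
# from the historical values alone
fcst = client.forecast(df=df, h=12, time_col="ds", target_col="y")
```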

Tokens

TimeGPT processes time series data in chunks. Each data point in a series can be thought of as a ‘token’, akin to how individual words or characters are treated in natural language processing (NLP).

Fine-tuning

Fine-tuning is a process used in machine learning where a pre-trained model like TimeGPT undergoes additional training to adapt it to a specific dataset. Initially, TimeGPT can operate in a zero-shot manner, meaning it can generate forecasts as-is. While this zero-shot approach provides a solid baseline, the performance of TimeGPT can often be improved through fine-tuning. During this process, the model is trained further on the specific dataset, starting from its pre-trained parameters, and the updated model is then used to produce the forecasts.
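
A minimal sketch, assuming the `nixtla` client exposes fine-tuning through a `finetune_steps` argument and reusing `df` from the zero-shot example above:

```python
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

# Zero-shot baseline: forecasts from the pre-trained weights as-is
zero_shot = client.forecast(df=df, h=12)

# Fine-tuned: run 10 extra training steps on `df`, starting from the
# pre-trained parameters, before forecasting
fine_tuned = client.forecast(df=df, h=12, finetune_steps=10)
```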

Learn how to fine-tune TimeGPT

Historical Forecasts

Historical forecasts, also known as in-sample forecasts, are the predictions made for the historical data. These forecasts are commonly used to evaluate the performance of forecasting models by comparing the predicted values against the actual values.
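
A minimal sketch, assuming the `nixtla` client's `add_history` flag returns in-sample predictions alongside the future forecast (`df` as in the zero-shot example above):

```python
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

# add_history=True also returns predictions over the historical period
fcst = client.forecast(df=df, h=12, add_history=True)

# In-sample rows can be compared against the actual values in `df`
historical = fcst[fcst["ds"] <= df["ds"].max()]
```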

Learn how to make historical forecasts with TimeGPT

Anomaly Detection

Anomaly detection refers to the process of identifying unusual observations that deviate significantly from the expected behavior of the data. Anomalies, also known as outliers, can be caused by a variety of factors, such as errors in the data collection process, sudden changes in the underlying patterns of the data, or unexpected events. Anomalies pose challenges for many forecasting models, as they may distort trends, seasonal patterns, or estimates of autocorrelation, and can therefore significantly reduce forecast accuracy. This makes it crucial to identify them reliably.

Anomaly detection has many applications across different industries, including detecting fraud in financial transactions, monitoring the performance of online services, or identifying unusual patterns in energy usage.
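
A minimal sketch, assuming the `nixtla` client's `detect_anomalies` method and an `anomaly` flag column in its output (`df` as in the zero-shot example above):

```python
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

# Observations outside the 99% prediction interval are flagged
anomalies_df = client.detect_anomalies(df=df, level=99)

# The flag column name 'anomaly' is assumed here
print(anomalies_df[anomalies_df["anomaly"] == 1])
```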

Learn how to detect anomalies with TimeGPT

Time Series Cross Validation

Time series cross-validation is a method for evaluating how a model would have performed on historical data. It works by defining a sliding window across past observations and predicting the period following it. It differs from standard cross-validation by maintaining the chronological order of the data instead of randomly splitting it.

This method allows for a more accurate estimation of a forecasting model’s predictive capabilities by considering multiple sequential periods. When only one window is used, this method resembles a standard train-test split, with the last set of observations serving as the test data and all preceding data as the training set.
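
A minimal sketch, assuming the `nixtla` client's `cross_validation` method (`df` as in the zero-shot example above; the window parameters are illustrative):

```python
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

cv_df = client.cross_validation(
    df=df,
    h=12,          # forecast horizon per window
    n_windows=3,   # number of sequential evaluation windows
    step_size=12,  # offset between consecutive windows
)

# Each row pairs a prediction with the actual value 'y' and a 'cutoff'
# marking the end of that window's training data
print(cv_df.head())
```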

Learn how to perform cross-validation with TimeGPT

Exogenous Variables

Exogenous variables are external factors that can influence the behavior of a time series but are not directly affected by it. For example, in retail sales forecasting, exogenous variables could include holidays, promotions, or prices; in electricity load forecasting, weather data is a common example. By incorporating these variables into the forecasting model, it is possible to capture the relationships between the target series and external factors, leading to more accurate predictions.
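
A minimal sketch, assuming the `nixtla` client accepts future values of the exogenous variables through an `X_df` argument (the column names and data are hypothetical):

```python
from nixtla import NixtlaClient

client = NixtlaClient(api_key="YOUR_API_KEY")  # placeholder key

# df: historical data with 'ds', 'y', and exogenous columns such as
# 'holiday' and 'promotion' (hypothetical names)
# future_exog: 'ds' plus the same exogenous columns for the 12
# forecast dates, which must be known or assumed in advance
fcst = client.forecast(df=df, h=12, X_df=future_exog)
```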

Learn how to include exogenous variables in TimeGPT