Glossary
These are some key concepts related to time series forecasting, designed to help you better understand and leverage the capabilities of TimeGPT.
Time Series
A time series is a sequence of data points indexed by time, used to model phenomena that change over time, such as stock prices, temperature, or product sales. A time series can generally be thought of as comprising the following components:
- Trend: The consistent, long-term direction of the data, whether upward or downward. It reflects the persistent, overall movement in the series over time.
- Seasonality: A repeated cycle around a known and fixed period.
- Remainder: The residuals or random noise left in the data after the trend and seasonal effects have been accounted for.
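The three components can be illustrated with a classical additive decomposition. The sketch below is only one textbook way to separate them, assuming the series is the sum of its components and the seasonal period is known; the function name `decompose_additive` and the toy data are made up for illustration.

```python
from statistics import mean

def decompose_additive(y, period):
    """Split y into trend, seasonal, and remainder lists (additive model).

    The trend is a centered moving average; positions where the window
    does not fit are left as None.
    """
    n = len(y)
    half = period // 2
    trend = [None] * n
    for t in range(half, n - half):
        window = y[t - half : t + half + 1]
        if period % 2 == 0:
            # Even period: give the two end points half weight each.
            trend[t] = (0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]) / period
        else:
            trend[t] = mean(window)

    # Average the detrended values at each position in the cycle.
    detrended = [(t, y[t] - trend[t]) for t in range(n) if trend[t] is not None]
    raw = [mean(v for t, v in detrended if t % period == k) for k in range(period)]
    offset = mean(raw)  # center the seasonal effects around zero
    seasonal = [raw[t % period] - offset for t in range(n)]

    # Whatever is left after removing trend and seasonality.
    remainder = [
        None if trend[t] is None else y[t] - trend[t] - seasonal[t] for t in range(n)
    ]
    return trend, seasonal, remainder

# Toy series: a linear trend plus a repeating pattern of period 4, no noise.
pattern = [2.0, -1.0, -3.0, 2.0]
y = [10.0 + 0.5 * t + pattern[t % 4] for t in range(24)]
trend, seasonal, remainder = decompose_additive(y, period=4)
```

Because the toy series contains no noise, the remainder is essentially zero wherever the trend is defined. With real data, the remainder holds whatever the other two components do not explain.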
Forecasting
Forecasting is the process of predicting the future values of a time series based on historical data. It plays a crucial role in the decision-making process across various fields such as finance, healthcare, retail, and economics, among others.
Forecasting can use a variety of approaches, from classical statistical methods to newer techniques such as machine learning, deep learning, and foundation models. These models can be further classified as univariate or multivariate, depending on the number of variables used to make the predictions, and as local or global: local models estimate parameters independently for each series, while global models estimate parameters jointly across multiple series.
Forecasts themselves can be presented as point forecasts, which predict a single future value, or as probabilistic forecasts, which provide a full probability distribution of future values and hence a measure of uncertainty.
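The difference between a point forecast and a probabilistic forecast can be seen in a small sketch. It assumes a naive model (the next value equals the last one) and uses the historical one-step changes as a rough stand-in for a forecast distribution; the data values are illustrative only, and real models estimate uncertainty in more principled ways.

```python
from statistics import quantiles

history = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]

# Point forecast: a single number. Here, the naive forecast repeats the last value.
point_forecast = history[-1]

# Probabilistic forecast: a distribution over the next value. As a stand-in,
# add each historical one-step change to the last value to get an empirical sample.
changes = [b - a for a, b in zip(history, history[1:])]
scenarios = [history[-1] + c for c in changes]

# Summarize the distribution with deciles: cut[0] is the 10% quantile,
# cut[4] is the median, and cut[8] is the 90% quantile.
cut = quantiles(scenarios, n=10)
lower, median, upper = cut[0], cut[4], cut[8]
print(f"point={point_forecast}, 80% interval=({lower:.1f}, {upper:.1f})")
```

The interval between the 10% and 90% quantiles is what conveys the uncertainty that a single point forecast hides.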
Foundation Model
A foundation model is a large, pre-trained model that can be adapted to a wide range of tasks, including time series forecasting. Originally developed for domains such as natural language processing and computer vision, foundation models are now increasingly applied to sequential data like time series. These models are typically trained on extensive datasets, capturing complex patterns and dependencies, and can then be fine-tuned for specific tasks.
TimeGPT
Developed by Nixtla, TimeGPT is the first foundation model for time series forecasting. TimeGPT was trained on billions of observations from publicly available datasets across multiple domains and can produce accurate forecasts for new time series without additional training, using only historical values as inputs. The model ‘reads’ time series data similarly to how humans read a sentence, sequentially from left to right. It looks at windows of past data, which we can think of as ‘tokens’, and predicts what comes next. This prediction is based on patterns the model identifies in past data and extrapolates into the future.
Tokens
TimeGPT processes time series data in chunks. Each data point in a series can be thought of as a ‘token’, akin to how individual words or characters are treated in natural language processing (NLP).
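To make the idea of reading a series in chunks concrete, the sketch below slides a fixed-size window over a series from left to right and pairs each window with the value that follows it. This is only an analogy for the description above, not TimeGPT's actual input pipeline; `make_windows` and the window size are hypothetical.

```python
def make_windows(series, window_size):
    """Return (window, next_value) pairs, scanning the series left to right."""
    pairs = []
    for start in range(len(series) - window_size):
        window = series[start : start + window_size]
        next_value = series[start + window_size]
        pairs.append((window, next_value))
    return pairs

series = [3, 5, 4, 6, 8, 7]
pairs = make_windows(series, window_size=3)
for window, next_value in pairs:
    print(window, "->", next_value)
```

Each pair is one "read the past, predict what comes next" example of the kind a sequence model learns from.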
Fine-tuning
Fine-tuning is a process used in machine learning where a pre-trained model like TimeGPT undergoes additional training to adapt it for a specific dataset. Initially, TimeGPT can operate in a zero-shot manner, meaning it can generate forecasts as-is. While this zero-shot approach provides a solid baseline, the performance of TimeGPT can often be improved through fine-tuning. During this process, the TimeGPT model undergoes additional training using the specific dataset, starting from the pre-trained parameters. The updated model then produces the forecasts.
Learn how to fine-tune TimeGPT
Historical Forecasts
Historical forecasts, also known as in-sample forecasts, are the predictions made for the historical data. These forecasts are commonly used to evaluate the performance of forecasting models by comparing the predicted values against the actual values.
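As a minimal sketch of this evaluation, the example below compares in-sample forecasts with the actual values using the mean absolute error (MAE). The forecasts come from a naive model chosen only for illustration, and the numbers are made up.

```python
actuals = [20.0, 22.0, 21.0, 25.0, 24.0, 27.0]

# Hypothetical in-sample forecasts: the naive model predicts each value as
# the previous one, so the first observation has no forecast.
fitted = [None] + actuals[:-1]

# Compare predictions with actuals wherever a forecast exists.
errors = [abs(a - f) for a, f in zip(actuals, fitted) if f is not None]
mae = sum(errors) / len(errors)
print(f"in-sample MAE: {mae:.2f}")
```

A low in-sample error shows that the model fits the history well, though it does not by itself guarantee accurate forecasts of unseen data.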
Learn how to make historical forecasts with TimeGPT
Anomaly Detection
Anomaly detection refers to the process of identifying unusual observations that deviate significantly from the expected behavior of the data. Anomalies, also known as outliers, can be caused by a variety of factors, such as errors in the data collection process, sudden changes in the underlying patterns of the data, or unexpected events. These anomalies can pose challenges for many forecasting models, as they may distort trends, seasonal patterns, or estimates of autocorrelation. Consequently, anomalies can significantly impact the accuracy of forecasts. Therefore, it is crucial to be able to identify them accurately.
Anomaly detection has many applications across different industries, including detecting fraud in financial transactions, monitoring the performance of online services, or identifying unusual patterns in energy usage.
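One simple way to flag unusual observations is a robust z-score based on the median and the median absolute deviation (MAD). The sketch below is a generic statistical rule, not the method TimeGPT uses; the function name, the threshold of 3.5, and the data are assumptions for illustration, and the rule ignores trend and seasonality.

```python
from statistics import median

def detect_anomalies(values, threshold=3.5):
    """Return the indices whose modified z-score exceeds the threshold."""
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # no spread to measure deviations against
    # The factor 0.6745 rescales the MAD so the score is comparable to a
    # standard z-score when the data are roughly normal.
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]

# Hypothetical hourly energy usage with one obvious spike.
usage = [50, 52, 49, 51, 50, 120, 48, 51, 50, 49]
anomalies = detect_anomalies(usage)
print(anomalies)
```

Using the median instead of the mean keeps the spike itself from inflating the baseline it is compared against.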
Learn how to detect anomalies with TimeGPT
Time Series Cross-Validation
Time series cross-validation is a method for evaluating how a model would have performed on historical data. It works by defining a sliding window across past observations and predicting the period following it. It differs from standard cross-validation by maintaining the chronological order of the data instead of randomly splitting it.
This method allows for a more accurate estimation of a forecasting model’s predictive capabilities by considering multiple sequential periods. When only one window is used, this method resembles a standard train-test split, with the last set of observations serving as the test data and all preceding data as the training set.
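The windowing scheme can be sketched as a generator of chronological train and test index sets. The sketch assumes each training set contains all observations before its test window; `rolling_origin_splits` and its parameters are hypothetical names, not part of any library.

```python
def rolling_origin_splits(n_obs, horizon, n_windows, step=None):
    """Yield (train_indices, test_indices) pairs in chronological order.

    Each test block holds `horizon` observations and immediately follows its
    training block; the last test block ends at the final observation.
    """
    step = horizon if step is None else step
    for w in range(n_windows):
        test_end = n_obs - (n_windows - 1 - w) * step
        test_start = test_end - horizon
        if test_start <= 0:
            raise ValueError("not enough observations for the requested windows")
        yield list(range(test_start)), list(range(test_start, test_end))

splits = list(rolling_origin_splits(n_obs=10, horizon=2, n_windows=3))
for train, test in splits:
    print(f"train: 0..{train[-1]}  test: {test}")
```

With `n_windows=1`, the function returns a single split whose test set is the last `horizon` observations, which is the train-test split described above.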
Learn how to perform cross-validation with TimeGPT
Exogenous Variables
Exogenous variables are external factors that can influence the behavior of a time series but are not directly affected by it. For example, exogenous variables could include holidays, promotions, and prices in retail sales forecasting, or weather data in electricity load forecasting. By incorporating these variables into the forecasting model, it is possible to capture the relationships between the target series and external factors, leading to more accurate predictions.
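As a rough illustration, the sketch below fits a one-variable least-squares line that relates sales to a promotion flag and then uses the planned promotion schedule to forecast. The data, the `fit_line` helper, and the choice of a linear model are all assumptions made for the example; they do not describe how TimeGPT handles exogenous variables.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope

# Hypothetical daily sales and a promotion flag (1 = a promotion ran that day).
promo = [0, 0, 1, 0, 1, 0, 0, 1]
sales = [100, 102, 130, 98, 131, 101, 99, 129]

intercept, slope = fit_line(promo, sales)

# The promotion schedule for the forecast horizon must be known in advance.
future_promo = [0, 1]
forecast = [intercept + slope * p for p in future_promo]
print(forecast)
```

The example also shows a practical constraint: to forecast with an exogenous variable, its future values must be known or forecast separately.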