## Autocorrelation plots

```python theme={null}
fig, axs = plt.subplots(nrows=1, ncols=2)

# Autocorrelation plot
plot_acf(df["y"], lags=20, ax=axs[0], color="fuchsia")
axs[0].set_title("Autocorrelation")

# Partial autocorrelation plot
plot_pacf(df["y"], lags=20, ax=axs[1], color="lime")
axs[1].set_title("Partial Autocorrelation")

plt.show()
```
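For intuition about what the left-hand panel displays, the following is a minimal sketch of the sample autocorrelation computed by hand. The helper name `sample_acf` and the toy series are assumptions made for illustration; this is not the code path of `plot_acf`, which also draws confidence bands.

```python
def sample_acf(values, max_lag):
    """Sample autocorrelation for lags 0..max_lag (hypothetical helper)."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    # Sum of squared deviations, used to normalise every lag.
    total = sum(c * c for c in centered)
    acf = []
    for lag in range(max_lag + 1):
        # Sum of products between the series and its copy shifted by `lag`.
        shifted = sum(centered[t] * centered[t + lag] for t in range(n - lag))
        acf.append(shifted / total)
    return acf


# Toy series with a pure upward trend (illustrative data, not the tutorial's).
toy_series = [float(t) for t in range(1, 21)]
acf_values = sample_acf(toy_series, max_lag=5)

for lag, value in enumerate(acf_values):
    print(lag, round(value, 3))
```

On a trending series like this one, the autocorrelation starts at 1 for lag 0 and decays slowly as the lag grows, which is the shape to expect in the plot for data with a strong trend.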

### Decomposition of the time series

How do we decompose a time series, and why?

To forecast new values, we need to understand the patterns that past values follow over time. When a forecast moves in the wrong direction, the cause is often a pattern in the data that was not accounted for. A time series is commonly described by four components, and variation in those components changes the pattern of the series:

* **Level:** The baseline value of the series, that is, its average over time.
* **Trend:** The component that causes a long-term increasing or decreasing pattern.
* **Seasonality:** A pattern that repeats over a short, fixed period and causes short-term increases or decreases.
* **Residual/Noise:** The random variation in the series.

Combining these components over time produces the time series. Most time series consist of level and noise; trend and seasonality are optional. When trend or seasonality is present, it affects the forecast values, because the pattern of the forecast may differ from the pattern observed in the past.

The components can be combined in two ways:

* Additive
* Multiplicative

### Additive time series

If the components are added together to form the series, the series is called an additive time series. Visually, a series looks additive when its increasing or decreasing pattern stays similar in size throughout the series. An additive time series can be written as:

$y(t) = \text{Level} + \text{Trend} + \text{Seasonality} + \text{Noise}$

### Multiplicative time series

If the components are multiplied together to form the series, the series is called a multiplicative time series. Visually, a series that grows or declines exponentially over time can be treated as multiplicative. A multiplicative time series can be written as:

$y(t) = \text{Level} \times \text{Trend} \times \text{Seasonality} \times \text{Noise}$

```python theme={null}
from statsmodels.tsa.seasonal import seasonal_decompose

# Additive decomposition; period=1 because the data is yearly
a = seasonal_decompose(df["y"], model="add", period=1)
a.plot();
```
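To make the difference between the two formulas concrete, here is a small sketch that builds one synthetic series of each kind. All numbers (level, slope, growth rate, season length) are illustrative assumptions, and the noise term is left out so that the result is deterministic. It also shows that taking logarithms turns a multiplicative series into an additive one.

```python
import math

n_periods = 24
season_length = 4  # assumed season length, for illustration only

level = 10.0
seasonal = [math.sin(2 * math.pi * t / season_length) for t in range(n_periods)]

# Additive: components are summed, so the seasonal swing keeps a constant size.
trend = [0.5 * t for t in range(n_periods)]
additive = [level + trend[t] + seasonal[t] for t in range(n_periods)]

# Multiplicative: components are multiplied, so the swing scales with the level.
growth = [1.05 ** t for t in range(n_periods)]
seasonal_factor = [1.0 + 0.2 * s for s in seasonal]
multiplicative = [level * growth[t] * seasonal_factor[t] for t in range(n_periods)]

# The log of a product is the sum of the logs of its factors.
log_series = [math.log(v) for v in multiplicative]
log_sum = [
    math.log(level) + math.log(growth[t]) + math.log(seasonal_factor[t])
    for t in range(n_periods)
]
max_gap = max(abs(a - b) for a, b in zip(log_series, log_sum))
print("largest difference after the log transform:", max_gap)
```

This is why a log transform is a common first step when a series appears multiplicative: after the transform, additive tools can be applied to it.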

Breaking down a time series into its components helps us identify the behavior of the series we are analyzing. It also helps us decide which types of models we can apply. For our Life expectancy data set, the series shows an increasing trend over the years and no seasonality.

From the previous graph and the definition of each component, we can get an idea of which model to apply:

* We have trend
* There is no seasonality

## Split the data into training and testing

Let’s divide our data into two sets:

1. Data to train our model.
2. Data to test our model.

For the test data we will use the last 6 years to test and evaluate the performance of our model.

```python theme={null}
train = df[df.ds<='2013-01-01']
test = df[df.ds>'2013-01-01']
```

```python theme={null}
train.shape, test.shape
```

```text theme={null}
((54, 3), (6, 3))
```

```python theme={null}
sns.lineplot(train, x="ds", y="y", label="Train")
sns.lineplot(test, x="ds", y="y", label="Test")
plt.show()
```

## Implementation of AutoETS with StatsForecast

```python theme={null}
from statsforecast import StatsForecast
from statsforecast.models import AutoETS
```

### Instantiate Model

```python theme={null}
# "AZN": additive error (A), automatically selected trend (Z), no seasonality (N)
sf = StatsForecast(models=[AutoETS(model="AZN")], freq='YS')
```

### Fit the Model

```python theme={null}
sf.fit(df=train)
```

```text theme={null}
StatsForecast(models=[AutoETS])
```

### Model Prediction

```python theme={null}
y_hat = sf.predict(h=6)
y_hat
```

|   | unique\_id | ds         | AutoETS   |
| - | ---------- | ---------- | --------- |
| 0 | 1          | 2014-01-01 | 82.952553 |
| 1 | 1          | 2015-01-01 | 83.146150 |
| 2 | 1          | 2016-01-01 | 83.339747 |
| 3 | 1          | 2017-01-01 | 83.533344 |
| 4 | 1          | 2018-01-01 | 83.726940 |
| 5 | 1          | 2019-01-01 | 83.920537 |

```python theme={null}
sf.plot(train, y_hat)
```

Let’s add prediction intervals to our forecast. The `level` argument lists the interval levels, in percent, to compute.

```python theme={null}
y_hat = sf.predict(h=6, level=[80,90,95])
y_hat
```

|   | unique\_id | ds         | AutoETS   | AutoETS-lo-95 | AutoETS-lo-90 | AutoETS-lo-80 | AutoETS-hi-80 | AutoETS-hi-90 | AutoETS-hi-95 |
| - | ---------- | ---------- | --------- | ------------- | ------------- | ------------- | ------------- | ------------- | ------------- |
| 0 | 1          | 2014-01-01 | 82.952553 | 82.500416     | 82.573107     | 82.656916     | 83.248190     | 83.331999     | 83.404691     |
| 1 | 1          | 2015-01-01 | 83.146150 | 82.693437     | 82.766221     | 82.850137     | 83.442163     | 83.526078     | 83.598863     |
| 2 | 1          | 2016-01-01 | 83.339747 | 82.884744     | 82.957897     | 83.042237     | 83.637257     | 83.721597     | 83.794749     |
| 3 | 1          | 2017-01-01 | 83.533344 | 83.073235     | 83.147208     | 83.232495     | 83.834192     | 83.919479     | 83.993452     |
| 4 | 1          | 2018-01-01 | 83.726940 | 83.257894     | 83.333304     | 83.420247     | 84.033634     | 84.120577     | 84.195987     |
| 5 | 1          | 2019-01-01 | 83.920537 | 83.437859     | 83.515461     | 83.604931     | 84.236144     | 84.325614     | 84.403216     |

```python theme={null}
sf.plot(train, y_hat, level=[95])
```
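As a rough sanity check on how the interval columns in the table above relate to each other, the sketch below assumes the forecast errors are Gaussian, so that each bound is the point forecast plus or minus a standard-normal quantile times one standard deviation. That assumption is ours, used only for illustration; it is not a statement about how AutoETS computes its intervals internally. The helper name `z_score` is hypothetical.

```python
from statistics import NormalDist

# Values copied from the first row (2014-01-01) of the table above.
point_forecast = 82.952553
hi_95 = 83.404691


def z_score(level):
    """Two-sided standard-normal quantile for a level given in percent."""
    return NormalDist().inv_cdf(0.5 + level / 200)


# Assumption: Gaussian errors, so half-width = z * sigma.
implied_sigma = (hi_95 - point_forecast) / z_score(95)

# Rebuild the 80% and 90% bounds from the 95% bound alone.
bounds = {}
for level in (80, 90):
    half_width = z_score(level) * implied_sigma
    bounds[level] = (point_forecast - half_width, point_forecast + half_width)
    print(level, round(bounds[level][0], 6), round(bounds[level][1], 6))
```

If the Gaussian assumption is a fair description, the printed bounds should agree with the `AutoETS-lo-80`/`AutoETS-hi-80` and `AutoETS-lo-90`/`AutoETS-hi-90` columns up to rounding.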

### Forecast method

Memory-efficient Exponential Smoothing predictions.

This method avoids the memory burden of storing fitted model objects. It is analogous to `fit_predict`, but without storing information. It assumes you know the forecast horizon in advance.

```python theme={null}
y_hat = sf.forecast(df=train, h=6, fitted=True)
y_hat
```

|   | unique\_id | ds         | AutoETS   |
| - | ---------- | ---------- | --------- |
| 0 | 1          | 2014-01-01 | 82.952553 |
| 1 | 1          | 2015-01-01 | 83.146150 |
| 2 | 1          | 2016-01-01 | 83.339747 |
| 3 | 1          | 2017-01-01 | 83.533344 |
| 4 | 1          | 2018-01-01 | 83.726940 |
| 5 | 1          | 2019-01-01 | 83.920537 |

### In sample predictions

Access the fitted Exponential Smoothing in-sample predictions. They are available because `fitted=True` was passed to `forecast`.

```python theme={null}
sf.forecast_fitted_values()
```

|     | unique\_id | ds         | y         | AutoETS   |
| --- | ---------- | ---------- | --------- | --------- |
| 0   | 1          | 1960-01-01 | 69.123902 | 69.005305 |
| 1   | 1          | 1961-01-01 | 69.760244 | 69.237346 |
| 2   | 1          | 1962-01-01 | 69.149756 | 69.495763 |
| ... | ...        | ...        | ...       | ...       |
| 51  | 1          | 2011-01-01 | 82.187805 | 82.348633 |
| 52  | 1          | 2012-01-01 | 82.239024 | 82.561938 |
| 53  | 1          | 2013-01-01 | 82.690244 | 82.758963 |

## Model Evaluation

Now we evaluate the model against the test set using the predictions. We use several metrics to measure accuracy: MAE, MAPE, MASE, RMSE, and SMAPE.

```python theme={null}
from functools import partial

import utilsforecast.losses as ufl
from utilsforecast.evaluation import evaluate
```

```python theme={null}
evaluate(
    y_hat.merge(test),  # join the forecasts with the actual test values
    metrics=[ufl.mae, ufl.mape, partial(ufl.mase, seasonality=1), ufl.rmse, ufl.smape],
    train_df=train,  # MASE needs the training data for scaling
)
```

|   | unique\_id | metric | AutoETS  |
| - | ---------- | ------ | -------- |
| 0 | 1          | mae    | 0.421060 |
| 1 | 1          | mape   | 0.005073 |
| 2 | 1          | mase   | 1.340056 |
| 3 | 1          | rmse   | 0.483558 |
| 4 | 1          | smape  | 0.002528 |

## References

1. [Nixtla AutoETS API](../../src/core/models.html#autoets)
2. [Rob J. Hyndman and George Athanasopoulos (2018). “Forecasting Principles and Practice (3rd ed)”](https://otexts.com/fpp3/tscv.html).