**What is ARIMA?**

- AR (AutoRegressive): The model uses the dependent relationship between an observation and some number of lagged observations.
- I (Integrated): It involves differencing the raw observations to make the time series stationary, a key step for the AR and MA components to work effectively.
- MA (Moving Average): The model incorporates the dependency between an observation and a residual error from a moving average model applied to lagged observations.
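The three components can be illustrated with base R alone. The following is a minimal sketch on simulated data; the names shocks, random_walk, ar1_series, and ma1_series are illustrative assumptions and are not part of the electricity example later in this article.

```r
set.seed(42)  # For reproducibility

# I (Integrated): a random walk is non-stationary because every value
# accumulates all past shocks. One round of differencing (d = 1) undoes that.
shocks <- rnorm(200)
random_walk <- cumsum(shocks)
differenced <- diff(random_walk)

# Differencing recovers the original shocks (all but the first one)
print(all.equal(differenced, shocks[-1]))

# AR: simulate an AR(1) process, where each value depends on the previous value
ar1_series <- arima.sim(model = list(ar = 0.7), n = 200)

# MA: simulate an MA(1) process, where each value depends on the previous shock
ma1_series <- arima.sim(model = list(ma = 0.5), n = 200)

# Fit an ARIMA(1,0,0) to the AR(1) series and inspect the estimated coefficient
ar1_fit <- arima(ar1_series, order = c(1, 0, 0))
print(coef(ar1_fit))
```

The estimated ar1 coefficient should land in the neighborhood of the simulated value of 0.7, although the exact figure depends on the seed and the series length.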

**Why ARIMA?**

**Implementing ARIMA in R**

if (!require("forecast", quietly = TRUE)) {

install.packages("forecast")

}

# Load the 'forecast' library after ensuring it's installed

library(forecast)

# Assume we have a time series 'electricity_demand' in kWh

# This is our mock time series data: random noise around 1500 kWh, with no built-in trend or seasonality

set.seed(123) # For reproducibility

electricity_demand <- ts(rnorm(60, mean = 1500, sd = 300), frequency = 12, start = c(2019, 1))

# First, we plot the data to check for any obvious trends or seasonality

plot(electricity_demand, main = "Monthly Electricity Demand", xlab = "Time", ylab = "Demand (kWh)")

# Next, we use the 'auto.arima' function to automatically select an ARIMA model; by default it minimizes AICc (the small-sample corrected AIC)

fit <- auto.arima(electricity_demand)

# Summarize the fit to understand the chosen model

summary(fit)

# Forecast the next 12 months (1 year) of electricity demand

future_demand <- forecast(fit, h = 12)

# Plot the forecasted demand

plot(future_demand, main = "ARIMA Model Forecast", xlab = "Time", ylab = "Demand (kWh)")

# Print the forecasted values

print(future_demand$mean)

**Model Identification**

Series: The first line of the summary output names the time series the model was fitted to, here electricity_demand.

ARIMA(0,0,0)(0,0,1)[12] with non-zero mean: This is the model that auto.arima selected. The first set of parameters, (0,0,0), describes the non-seasonal part (p,d,q): no autoregressive terms (AR), no differencing (I), and no moving average terms (MA). The second set, (0,0,1)[12], describes the seasonal part (P,D,Q) and its period: one seasonal moving average term and a seasonal period of 12, which corresponds to monthly data with an annual cycle. "With non-zero mean" means that the model includes a constant mean term.
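To make the notation concrete, the same specification can be written out by hand. This is a minimal sketch, assuming the mock data from the example above; it uses arima() from base R's stats package rather than the forecast package, and the name manual_fit is illustrative. Estimates may differ slightly from those in the auto.arima summary.

```r
# Recreate the mock series used in this article
set.seed(123)
electricity_demand <- ts(rnorm(60, mean = 1500, sd = 300),
                         frequency = 12, start = c(2019, 1))

# ARIMA(0,0,0)(0,0,1)[12] with non-zero mean, specified explicitly:
#   order    = c(p, d, q) for the non-seasonal part
#   seasonal = c(P, D, Q) plus the seasonal period
manual_fit <- arima(
  electricity_demand,
  order = c(0, 0, 0),
  seasonal = list(order = c(0, 0, 1), period = 12),
  include.mean = TRUE
)

print(manual_fit)
print(names(coef(manual_fit)))
```

Base R labels the mean term "intercept", whereas the forecast package's summary labels it "mean"; both refer to the same constant.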

- sma1: The coefficient for the seasonal moving average term is -0.3059, with a standard error of 0.1746. Because the seasonal period is 12, this term links the current month's value to the model's residual error from the same month one year earlier, not from the previous month. The estimate is less than two standard errors from zero, so the evidence for this term is weak.
- mean: The mean of the series is estimated to be 1515.4874, with a standard error of 25.7335. This value represents the average monthly electricity demand around which the seasonal fluctuations occur.
- sigma^2: This represents the variance of the residuals from the model and is estimated to be 70026, which corresponds to a residual standard deviation of about 265 kWh. When comparing models fitted to the same data, a lower value indicates that the fitted values are closer to the actual values.
- log likelihood: The value of -419.41 is the log of the model's likelihood. It is only meaningful in comparison with other models fitted to the same data, where higher values indicate a better fit.
- AIC (Akaike Information Criterion): 844.82 is a measure of the relative quality of the statistical model for a given set of data. It deals with the trade-off between the goodness of fit and the simplicity of the model. Lower AIC values suggest a better model.
- AICc (Corrected Akaike Information Criterion): 845.24 is a version of AIC adjusted for small sample sizes. It is the criterion that auto.arima minimizes by default.
- BIC (Bayesian Information Criterion): 851.1 is another criterion for model selection, with lower values indicating a better model, taking into account the complexity of the model.
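- ME (Mean Error): 1.310345 is the average of the training-set errors, computed as actual minus fitted. The small positive value means that the fitted values are, on average, slightly below the actual values.
- RMSE (Root Mean Squared Error): 260.1771 indicates the typical magnitude of the model's in-sample error, in kWh, with large errors weighted more heavily.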
- MAE (Mean Absolute Error): 209.0784 is the average magnitude of the errors in the predictions, without considering their direction.
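- MPE (Mean Percentage Error): -3.025129 indicates that, in percentage terms, the fitted values are on average about 3% above the actual values, since errors are computed as actual minus fitted. The sign can differ from that of ME because percentage errors give more weight to months with low demand.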
- MAPE (Mean Absolute Percentage Error): 14.37263% is the average absolute percent error per forecast, which gives an idea of the error magnitude in percentage terms.
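- MASE (Mean Absolute Scaled Error): 0.5804085 is the mean absolute error scaled by the in-sample mean absolute error of a naïve forecast (a seasonal naïve forecast for seasonal data such as this). Values below one, as here, indicate that the model outperforms that benchmark; values greater than one indicate that it performs worse.
- ACF1 (Autocorrelation of residuals at lag 1): -0.1024486 is the correlation between each residual and the one immediately before it. A value this close to zero suggests little autocorrelation remains at lag 1, although a test across several lags, such as the Ljung-Box test run by checkresiduals(fit) in the forecast package, is a more complete check.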
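The information criteria and error measures above follow simple formulas, which can be checked by hand. The first part of this sketch recomputes AIC, AICc, and BIC from the reported log likelihood, assuming 60 observations and three estimated parameters (sma1, the mean, and sigma^2). The second part uses two made-up data points (actual and fitted_values are hypothetical, not taken from the model) to show how ME can be positive while MPE is negative.

```r
# Values reported in the summary above
log_lik <- -419.41
n_obs <- 60      # 60 monthly observations
n_params <- 3    # sma1, mean, and sigma^2

aic <- -2 * log_lik + 2 * n_params
aicc <- aic + (2 * n_params * (n_params + 1)) / (n_obs - n_params - 1)
bic <- -2 * log_lik + n_params * log(n_obs)
print(round(c(AIC = aic, AICc = aicc, BIC = bic), 2))
# AIC 844.82, AICc 845.25, BIC 851.10

# Hypothetical two-point example of the error measures (errors = actual - fitted)
actual <- c(100, 400)
fitted_values <- c(120, 370)
errors <- actual - fitted_values          # -20, 30
pct_errors <- 100 * errors / actual       # -20%, 7.5%

me <- mean(errors)                        # 5: fitted values are too low on average
rmse <- sqrt(mean(errors^2))
mae <- mean(abs(errors))                  # 25
mpe <- mean(pct_errors)                   # -6.25: too high in percentage terms
mape <- mean(abs(pct_errors))             # 13.75
print(c(ME = me, RMSE = rmse, MAE = mae, MPE = mpe, MAPE = mape))
```

The recomputed AICc differs from the reported 845.24 in the last digit only because the log likelihood is rounded to two decimals in the summary.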

**When to Use ARIMA?**

**Challenges and Considerations**

**Conclusion**

For graduate students looking to apply time-series forecasting in their research, ARIMA models offer a robust framework. The power of ARIMA lies in its ability to transform non-stationary historical data into a stationary series that can reveal insights and forecast future trends. Whether it’s predicting stock prices, weather patterns, or energy demand, ARIMA provides a window into the future, grounded in the rigor of statistical analysis.

BridgeText can help you with all of your statistics needs.