Description
Part 3: Visualizing Time Series Data
By visualizing time series data, we can detect patterns, judge how reliable they are, and spot potential problems such as outliers and missing values.
For this activity, in 250-500 words, answer the following:
Plot the time series data as a line chart using the plot() function. Use the date column as the index when plotting the data. Show the results.
Zoom into a particular range of time: pick a two-month range from your dataset and plot it as a line chart. Show the results and explain the difference between this step and step 1.
Add linear or polynomial trend lines to your time series dataset: plot a trend line using the regplot function from the seaborn library. Show the results and interpret the trend line.
Suppress Seasonality: aggregate your data using the mean function at the yearly level to remove seasonality from your dataset. Plot the data and interpret the graph.
Lag Scatter Plot: plot a scatter plot to test the correlation between lag values. Import the lag_plot function from the pandas.plotting module. Then, show and interpret the graph.
Autocorrelation Plots: plot correlations with all possible lag values in your time series dataset. Import the autocorrelation_plot function from the pandas.plotting module. Show and interpret the graph. Explain how an autocorrelation function (ACF) and partial autocorrelation function (PACF) can be useful in forecasting.
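As a rough starting point, the six steps above could be sketched in Python as follows. This is only an illustration under stated assumptions: the synthetic series from make_demo_series, the helper names zoom, yearly_mean, and plot_all, and the March to April 2021 window are all hypothetical and stand in for your own dataset and choices. The sketch assumes pandas, NumPy, matplotlib, and seaborn are installed.

```python
import numpy as np
import pandas as pd


def make_demo_series(seed=0):
    """Hypothetical daily series (trend + yearly cycle + noise), for illustration only."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2020-01-01", "2022-12-31", freq="D")
    t = np.arange(len(dates))
    values = 0.01 * t + 2.0 * np.sin(2 * np.pi * t / 365.25) + rng.normal(0, 0.5, len(t))
    return pd.Series(values, index=dates, name="value")


def zoom(series, start, end):
    """Return the part of a date-indexed series between two dates (both inclusive)."""
    return series.loc[start:end]


def yearly_mean(series):
    """Average within each calendar year, which smooths out within-year seasonality."""
    return series.groupby(series.index.year).mean()


def plot_all(series):
    """Draw the six plots on one figure; plotting libraries are imported only here."""
    import matplotlib.pyplot as plt
    import seaborn as sns
    from pandas.plotting import autocorrelation_plot, lag_plot

    fig, axes = plt.subplots(3, 2, figsize=(12, 10))
    # Step 1: full series, with the date index on the x-axis.
    series.plot(ax=axes[0, 0], title="Full series")
    # Step 2: a two-month window (an arbitrary choice for this demo).
    zoom(series, "2021-03-01", "2021-04-30").plot(ax=axes[0, 1], title="Two-month window")
    # Step 3: regplot needs a numeric x, so regress on the observation number, not the date.
    sns.regplot(x=np.arange(len(series)), y=series.to_numpy(), order=1,
                scatter_kws={"s": 2}, line_kws={"color": "red"}, ax=axes[1, 0])
    axes[1, 0].set_title("Linear trend")
    # Step 4: yearly means.
    yearly_mean(series).plot(ax=axes[1, 1], marker="o", title="Yearly mean")
    # Steps 5 and 6: lag scatter plot and autocorrelation plot.
    lag_plot(series, lag=1, ax=axes[2, 0])
    autocorrelation_plot(series, ax=axes[2, 1])
    fig.tight_layout()
    return fig
```

Calling plot_all(make_demo_series()) returns a matplotlib figure with one panel per step. Setting order=2 in regplot would fit a quadratic rather than a linear trend.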
Part 4: Time Series Forecasting Using ARIMA Modeling
For this activity, in 500-750 words, answer the following:
Construct a time plot of the data and inspect the graph for any anomalies. This time plot should suggest whether any differencing is needed. Explain.
Use the autocorrelation and partial autocorrelation plots to identify and select the preliminary values of the autoregression (AR) order, p, the order of differencing, d, and the moving average order, q. Explain your findings.
Fit and train the ARIMA model based on your selected p, d, q values.
Evaluate your model statistically using its t-test, p-values, R-squared, adjusted R-squared, MAE, and MAPE. Interpret the results.
Forecast the next five periods and report the results.
Parameter Tuning: fit another ARIMA model with different value(s) for the p, d, q parameters. Compare both models and interpret the findings. Which one is better in terms of white noise, variation, unusual patterns, trends, seasonality, etc.? Why?
Evaluate each model using walk-forward validation. Explain.
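Walk-forward validation, together with the MAE and MAPE metrics mentioned above, could be sketched roughly as follows. The function names walk_forward, mae, mape, naive_forecaster, and make_arima_forecaster are hypothetical helpers introduced for illustration, not part of any library. The ARIMA wrapper assumes statsmodels is installed and imports it only when called, so the rest of the sketch runs without it.

```python
def walk_forward(values, forecaster, n_test):
    """Walk-forward validation: forecast one step, reveal the true value, repeat."""
    if not 0 < n_test < len(values):
        raise ValueError("n_test must be between 1 and len(values) - 1")
    history = list(values[:-n_test])
    actual = list(values[-n_test:])
    predictions = []
    for observed in actual:
        # The forecaster only ever sees data available up to this point.
        predictions.append(float(forecaster(history)))
        history.append(observed)  # the true value becomes available for the next step
    return predictions, actual


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent; undefined if any actual value is 0."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def naive_forecaster(history):
    """Baseline for comparison: repeat the last observed value."""
    return history[-1]


def make_arima_forecaster(order):
    """Build a one-step forecaster that refits an ARIMA(p, d, q) at every step."""
    def forecaster(history):
        # Lazy import: statsmodels is only required if this forecaster is used.
        from statsmodels.tsa.arima.model import ARIMA
        fitted = ARIMA(history, order=order).fit()
        return fitted.forecast(steps=1)[0]
    return forecaster
```

For example, walk_forward(values, make_arima_forecaster((1, 1, 1)), n_test=12) would score one candidate model, and repeating the call with a different order gives a like-for-like comparison; the order (1, 1, 1) is only a placeholder here. On a fitted statsmodels ARIMA result, summary() prints the coefficient table with test statistics and p-values, and forecast(steps=5) returns the next five periods.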