Quakes (Univariate)
Introduction
In this example we will predict the number of earthquakes per year with a magnitude of six or higher. The data for the analysis were collected from the USGS and aggregated by year from 1950 to 2020.
Descriptive Analysis
First, let's use a few utilities from the Forecast package to get a first impression of the data:
Numerical Summary
using Forecast
qk = quakes()
sqk = summarize(qk)
┌──────────┬──────┬───────┬────────┬─────────┬───────┬───────┬───────────┐
│ Variable │ Min  │ 1Q    │ Median │ Mean    │ 3Q    │ Max   │ H0 Normal │
├──────────┼──────┼───────┼────────┼─────────┼───────┼───────┼───────────┤
│ quakes   │ 86.0 │ 123.0 │ 141.0  │ 141.014 │ 156.5 │ 207.0 │ 0.9551    │
└──────────┴──────┴───────┴────────┴─────────┴───────┴───────┴───────────┘
┌──────────┬─────────┬──────────┬───────────┬────────────┐
│ Variable │ Mean    │ Variance │ Skewness  │ Kurtosis   │
├──────────┼─────────┼──────────┼───────────┼────────────┤
│ quakes   │ 141.014 │ 758.7    │ 0.0880805 │ 0.00499622 │
└──────────┴─────────┴──────────┴───────────┴────────────┘
┌──────────┬───────┐
│ Variable │ Int64 │
├──────────┼───────┤
│ quakes   │ 71    │
└──────────┴───────┘
The data appear to follow a Normal distribution (the H0 Normal value of 0.9551 gives no reason to reject normality), and the plot shows no strong indication of seasonal patterns.
Autoregressive Behavior
plot(pacf(qk.quakes))
The partial autocorrelation function shows a significant correlation with the number of earthquakes in the previous year. If we were looking for seasonality we could check periods of 11 or 15 years, since they show nearly significant correlations, but since these are most likely spurious (...or are they?) we will ignore them in this analysis.
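As a rough guide to what "significant" means here, sample partial autocorrelations of white noise are commonly judged against a band of about ±1.96/√n at the 5% level. The sketch below applies that rule to the 71 yearly observations reported in the summary. The helper name `significance_band` is made up for this illustration, and it is an assumption that the band drawn by the package's plot is computed in this way.

```julia
# Approximate 95% band for sample partial autocorrelations under a
# white-noise null hypothesis: ±z / sqrt(n), with z = 1.96.
# `significance_band` is a hypothetical helper, not part of the Forecast package.
significance_band(n; z = 1.96) = z / sqrt(n)

band = significance_band(71)   # 71 yearly observations, as in the summary table
println("lags with |pacf| above ", round(band, digits = 3), " stand out at about the 5% level")
```

With only 71 observations the band is fairly wide, which is one reason why lags that sit just below it, such as 11 and 15, are hard to tell apart from noise.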
Fitting an AR Model
ar_qk = ar(qk)
Multivariate Autoregressive Model

    ar(X, order=1, constant=true)

Residuals Summary
┌──────────┬──────────┬──────────┬───────────┬─────────────┬────────┬─────────┬───────────┐
│ Variable │ Min      │ 1Q       │ Median    │ Mean        │ 3Q     │ Max     │ H0 Normal │
├──────────┼──────────┼──────────┼───────────┼─────────────┼────────┼─────────┼───────────┤
│ quakes   │ -43.9199 │ -17.1699 │ 0.0410644 │ 4.28356e-14 │ 13.642 │ 76.1713 │ 0.0860809 │
└──────────┴──────────┴──────────┴───────────┴─────────────┴────────┴─────────┴───────────┘
┌──────────┬─────────────┬──────────┬──────────┬──────────┐
│ Variable │ Mean        │ Variance │ Skewness │ Kurtosis │
├──────────┼─────────────┼──────────┼──────────┼──────────┤
│ quakes   │ 4.28356e-14 │ 575.97   │ 0.616127 │ 0.40403  │
└──────────┴─────────────┴──────────┴──────────┴──────────┘

Coefficients
Φ0
┌            ┐
│ 69.954 *** │
└            ┘
Φ1
┌           ┐
│ 0.503 *** │
└           ┘
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘^’ 0.1 ‘ ’ 1 and ‘+’ if fixed

Σ2 Variance/Covariance Matrix
┌         ┐
│ 567.742 │
└         ┘

Information Criteria
┌───────┬─────────┬─────────┬─────────┐
│ AIC   │ AICC    │ BIC     │ H&Q     │
├───────┼─────────┼─────────┼─────────┤
│ 6.398 │ 6.43122 │ 463.171 │ 6.42335 │
└───────┴─────────┴─────────┴─────────┘

Statistics
┌───────────┬─────────────────┬──────────┬──────────┐
│ Variable  │ Fisher's p-test │ R2       │ R2adj    │
├───────────┼─────────────────┼──────────┼──────────┤
│ quakes    │ 2.48416e-11     │ 0.251561 │ 0.240714 │
└───────────┴─────────────────┴──────────┴──────────┘
In the AR model of order one the coefficients are highly significant, and increasing the order does not meaningfully change the Information Criteria. However, the residuals only barely pass the normality test (H0 Normal = 0.086), so we could consider transforming the data to improve on that. Given the large noise in the model, any improvement from a transformation would not be dramatic, so we will continue with a simple AR model of order one for our forecasting.
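To make the fitted model concrete, the sketch below writes the AR(1) relation out by hand using the rounded coefficients from the output (Φ0 = 69.954, Φ1 = 0.503). The function names are invented for this illustration and are not part of the Forecast package. Because the coefficients are rounded, the results are approximate.

```julia
# AR(1) with constant: x[t] = phi0 + phi1 * x[t-1] + e[t]
phi0, phi1 = 69.954, 0.503      # rounded coefficients from the model output

# One-step-ahead prediction given the previous year's count (hypothetical helper).
predict_next(x_prev) = phi0 + phi1 * x_prev

# Long-run mean implied by the coefficients; well defined because |phi1| < 1.
long_run_mean = phi0 / (1 - phi1)

# Share of a deviation from the long-run mean expected to persist after k years.
persistence(k) = phi1^k

println("long-run mean: ", round(long_run_mean, digits = 2))
println("share of a shock left after 3 years: ", round(persistence(3), digits = 3))
```

The implied long-run mean lands close to the sample mean of 141.014, and roughly half of any departure from it is expected to carry over to the following year.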
Forecasting Earthquakes
fc_qk = forecast(ar_qk,10);
plot(fc_qk)
The plot shows the forecast for the next ten years. As we can see, the large noise in the model does not allow for very accurate forecasts, but we can at least say that there is a reasonable chance of having more big earthquakes in 2021 and 2022 than we had in 2020.
Forecast Information

Forecasting ar(X, order=1, constant=true)

Mean Forecasting
┌────────────┬─────────────┐
│ year       │ mean_quakes │
├────────────┼─────────────┤
│ 2021-01-01 │ 131.351     │
│ 2022-01-01 │ 136.058     │
│ 2023-01-01 │ 138.426     │
│ 2024-01-01 │ 139.618     │
│ 2025-01-01 │ 140.218     │
│ 2026-01-01 │ 140.52      │
│ 2027-01-01 │ 140.672     │
│ 2028-01-01 │ 140.748     │
│ 2029-01-01 │ 140.787     │
│ 2030-01-01 │ 140.806     │
└────────────┴─────────────┘

Prediction Intervals alpha at: (0.8, 0.95)

Upper:
┌────────────┬───────────────┬───────────────┐
│ year       │ upper1_quakes │ upper2_quakes │
├────────────┼───────────────┼───────────────┤
│ 2021-01-01 │ 161.887       │ 178.052       │
│ 2022-01-01 │ 170.243       │ 188.339       │
│ 2023-01-01 │ 173.475       │ 192.028       │
│ 2024-01-01 │ 174.882       │ 193.55        │
│ 2025-01-01 │ 175.536       │ 194.233       │
│ 2026-01-01 │ 175.852       │ 194.556       │
│ 2027-01-01 │ 176.007       │ 194.713       │
│ 2028-01-01 │ 176.085       │ 194.791       │
│ 2029-01-01 │ 176.123       │ 194.83        │
│ 2030-01-01 │ 176.143       │ 194.849       │
└────────────┴───────────────┴───────────────┘

Lower:
┌────────────┬───────────────┬───────────────┐
│ year       │ lower1_quakes │ lower2_quakes │
├────────────┼───────────────┼───────────────┤
│ 2021-01-01 │ 100.816       │ 84.6508       │
│ 2022-01-01 │ 101.873       │ 83.7765       │
│ 2023-01-01 │ 103.377       │ 84.8237       │
│ 2024-01-01 │ 104.354       │ 85.6862       │
│ 2025-01-01 │ 104.899       │ 86.2029       │
│ 2026-01-01 │ 105.187       │ 86.4837       │
│ 2027-01-01 │ 105.336       │ 86.6303       │
│ 2028-01-01 │ 105.412       │ 86.7054       │
│ 2029-01-01 │ 105.45        │ 86.7436       │
│ 2030-01-01 │ 105.469       │ 86.7628       │
└────────────┴───────────────┴───────────────┘
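The width of these intervals can be approximately reproduced by hand from Φ1 and Σ2, which helps explain why they widen over the first few years and then level off. The sketch below uses the textbook forecast-error variance of an AR(1) process and Gaussian quantiles (1.2816 for 80%, 1.96 for 95%). It is an assumption that the package computes its intervals this way, and the helper names are made up for the illustration.

```julia
phi1   = 0.503      # rounded AR coefficient from the model output
sigma2 = 567.742    # residual variance (Σ2) from the model output

# Variance of the h-step-ahead forecast error for an AR(1) process:
# sigma2 * (1 + phi1^2 + phi1^4 + ... + phi1^(2(h-1)))
forecast_var(h) = sigma2 * sum(phi1^(2j) for j in 0:h-1)

# Half-width of a Gaussian prediction interval at horizon h.
# z = 1.2816 gives an 80% interval, z = 1.96 a 95% interval.
half_width(h; z = 1.96) = z * sqrt(forecast_var(h))

for h in (1, 2, 10)
    println("h = ", h,
            "  80%: ±", round(half_width(h; z = 1.2816), digits = 2),
            "  95%: ±", round(half_width(h), digits = 2))
end
```

For 2021 (h = 1) the 80% half-width comes out at about 30.5, in line with 161.887 − 131.351 in the table. As h grows, the forecast variance approaches sigma2 / (1 − phi1^2), so the intervals stop widening once the forecast has settled at the long-run mean.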