Olympics: 200m Men Final

Continuing on with fun, if silly, analyses: a little voice in my head
suggests a time series model and, rather than putting any thought
into it, I'll use some R goodness.

Setting up the data as Rui provided, we need to add some NAs to
account for the Games cancelled during WWII:

library(zoo)
golddata.ts <- as.ts(zoo(golddata[,6], order.by = golddata[,1]))
# Here we let Gabor and Achim think about how to get the NAs in there smoothly
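
For anyone who would rather see the padding spelled out, here is a minimal base-R sketch of the same idea: lay the observed years onto a regular 4-year grid and leave NA wherever no Games were held. The years and times below are made-up placeholders, not Rui's data, and the names (toy.ts and friends) are hypothetical.

```r
# Placeholder data (NOT the real golddata): Games years with 1940 and 1944 absent
years <- c(1928, 1932, 1936, 1948, 1952)
times <- c(21.8, 21.2, 20.7, 21.1, 20.7)

# Regular 4-year grid spanning the observed years
all_years <- seq(min(years), max(years), by = 4)

# Start from all-NA and drop the observed times into their slots
padded <- rep(NA_real_, length(all_years))
padded[match(years, all_years)] <- times

# One observation every 4 years, so deltat = 4
toy.ts <- ts(padded, start = min(years), deltat = 4)
print(toy.ts)
```

The result is a regular series with explicit NA entries for the missing Games, which is the shape a time series fit expects.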

library(forecast)
golddata.model <- auto.arima(golddata.ts)

# Prof Hyndman has forgotten more about time series than I will ever know

summary(golddata.model) # ARIMA(2,1,0)+drift seems a bit heavy-handed,
                        # but that's what it gives
forecast(golddata.model, 1)

     Point Forecast    Lo 80    Hi 80    Lo 95  Hi 95
2012       19.66507 19.23618 20.09396 19.00915 20.321
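
The interval columns can be sanity-checked by hand, assuming they are the usual normal-based intervals (point forecast plus or minus a normal quantile times the forecast standard error). A small sketch using only the 2012 numbers printed for the first model; the variable names are just for illustration:

```r
# Values copied from the printed forecast for 2012
point <- 19.66507
lo80  <- 19.23618
hi80  <- 20.09396

# Back out the one-step forecast standard error from the 80% interval
se <- (hi80 - lo80) / (2 * qnorm(0.90))

# Rebuild the 95% interval from that standard error
lo95 <- point - qnorm(0.975) * se
hi95 <- point + qnorm(0.975) * se
print(round(c(se = se, lo95 = lo95, hi95 = hi95), 3))
```

The rebuilt bounds should agree with the printed Lo 95 and Hi 95 to rounding, and the implied standard error of about a third of a second shows how wide the uncertainty is relative to the differences being discussed.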

# But looking at a graph, this seems to have an odd jump up
plot(forecast(golddata.model))

# Maybe we overfit -- let's kill the drift

golddata.model2 <- auto.arima(golddata.ts, allowdrift = FALSE)

summary(golddata.model2) # ARIMA(1,1,0) seems better
plot(forecast(golddata.model2)) # I like the graph more too
forecast(golddata.model2, 1)

     Point Forecast    Lo 80    Hi 80    Lo 95    Hi 95
2012       19.56139 19.04134 20.08145 18.76604 20.35674

Not so very good at all, but a little bit of R fun nevertheless ;-)

And in the category of "how good is your prediction when you already
know the answer and don't care at all about statistical rigor", it
seems that "regress on year" might still be winning. Anyone want to
take some splines out for a spin?
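
A rough starting point for that comparison, as a sketch only: it runs on synthetic data (a simulated noisy downward trend, not the real results), and the data frame toy, the fit names, and the choice of df = 3 are all arbitrary assumptions.

```r
library(splines)  # ships with R

# Synthetic stand-in for the real results: noisy linear decline in winning time
set.seed(1)
toy <- data.frame(year = seq(1900, 2008, by = 4))
toy$time <- 22 - 0.02 * (toy$year - 1900) + rnorm(nrow(toy), sd = 0.15)

# "Regress on year" versus a natural cubic spline in year
fit.lm <- lm(time ~ year, data = toy)
fit.ns <- lm(time ~ ns(year, df = 3), data = toy)

# One-step-ahead prediction for 2012 with 95% prediction intervals
new <- data.frame(year = 2012)
pred.lm <- predict(fit.lm, newdata = new, interval = "prediction")
pred.ns <- predict(fit.ns, newdata = new, interval = "prediction")
print(rbind(linear = pred.lm[1, ], spline = pred.ns[1, ]))
```

Swapping toy for the real data (year and winning time columns) would give numbers comparable to the ARIMA forecasts; note that ns() extrapolates linearly beyond its boundary knots, so the 2012 prediction leans heavily on the last segment.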

Cheers,
Michael
On Thu, Aug 9, 2012 at 11:31 PM, Mark Leeds <markleeds2 at gmail.com> wrote: