Hello,

Have you seen the log-linear prediction of the 100m winning time in R, mailed to the list yesterday by David Smith under the subject "Revolutions Blog: July roundup"?

"A log-linear regression in R predicted the gold-winning Olympic 100m sprint time to be 9.68 seconds (it was actually 9.63 seconds): http://bit.ly/QfChUh"

The original by Markus Gesmann can be found at
http://lamages.blogspot.pt/2012/07/london-olympics-and-prediction-for-100m.html

I've done the same, just changing the address to the 200m historical data, and the predicted time was 19.27. Usain Bolt has just run 19.32. If you want to check it, the address is below, and the 'which' argument is 3:

url <- "http://www.databasesports.com/olympics/sport/sportevent.htm?sp=ATH&enum=120"

Plus a change in the y-axis arguments of the graphics functions, so that times about twice as long can be plotted and seen.

#
# Original by Markus Gesmann:
# http://lamages.blogspot.pt/2012/07/london-olympics-and-prediction-for-100m.html
library(XML)
library(drc)

# Read the third HTML table on the page (the historical results)
url <- "http://www.databasesports.com/olympics/sport/sportevent.htm?sp=ATH&enum=120"
data <- readHTMLTable(readLines(url), which=3, header=TRUE)

# Keep the gold medallists and convert the columns from factor to numeric
golddata <- subset(data, Medal %in% "GOLD")
golddata$Year <- as.numeric(as.character(golddata$Year))
golddata$Result <- as.numeric(as.character(golddata$Result))
tail(golddata,10)

# Four-parameter logistic fit and log-linear fit, from 1900 onwards
logistic <- drm(Result~Year, data=subset(golddata, Year>=1900), fct = L.4())
log.linear <- lm(log(Result)~Year, data=subset(golddata, Year>=1900))

# Back-transform the log-linear predictions to seconds
years <- seq(1896,2012, 4)
predictions <- exp(predict(log.linear, newdata=data.frame(Year=years)))

plot(logistic, xlim=c(1896,2012),
     ylim=range(golddata$Result) + c(-0.5, 0.5),
     xlab="Year", main="Olympic 200 metre",
     ylab="Winning time for the 200m men's final (s)")
points(golddata$Year, golddata$Result)
lines(years, predictions, col="red")
points(2012, predictions[length(years)], pch=19, col="red")
text(2012 - 0.5, predictions[length(years)] - 0.5,
     round(predictions[length(years)],2))

Rui Barradas
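The code above gives only a point prediction for 2012. As a minimal sketch of how one might attach an uncertainty range to it, the block below fits the same kind of log-linear model and asks `predict()` for a 95% prediction interval. The data are simulated so that the block runs without the web page; they are not Olympic results. The helper name `predict_winning_time` is hypothetical, and the data frame is assumed to have `Year` and `Result` columns like `golddata`.

```r
# Sketch: log-linear prediction with a 95% prediction interval.
# The data below are SIMULATED for illustration; they are not Olympic results.
set.seed(1)
sim <- data.frame(Year = seq(1900, 2008, by = 4))
sim$Result <- exp(3.1 - 0.0012 * (sim$Year - 1900) +
                  rnorm(nrow(sim), sd = 0.01))

# Hypothetical helper: fit log(Result) ~ Year and predict one future year.
predict_winning_time <- function(d, year, level = 0.95) {
  fit <- lm(log(Result) ~ Year, data = d)
  p <- predict(fit, newdata = data.frame(Year = year),
               interval = "prediction", level = level)
  # Back-transform fit, lower and upper bound from the log scale to seconds.
  exp(p)
}

pred <- predict_winning_time(sim, 2012)
print(round(pred, 2))
```

The result is a one-row matrix with columns `fit`, `lwr` and `upr`. Because the model is fitted on the log scale, the back-transformed `fit` estimates the median winning time rather than the mean.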
Hi Rui:

I hate to sound like a pessimist or a cynic, and I should state up front that I didn't look at any of the analysis by you or the other person. But my question (for anyone who wants to chime in) is this: given that all these Olympic 100m and 200m runners post times that are generally within 0.1-0.3 seconds of each other, or even less, doesn't it stand to reason that a model fitted to the historical times is going to predict well?

I don't know what the statistical term for this is, but intuitively, if there's extremely little variation in the responses, then there's going to be extremely little variation in the predictions, and the result is that you won't ever be too far off as long as your predictors (weight, past performances, height, whatever) are not too strange.

Anyone can feel free to chime in and tell me I'm wrong, but if you're going to do that, I'd appreciate statistical reasoning, even though I don't have any.

Thanks.

Mark

On Thu, Aug 9, 2012 at 4:23 PM, Rui Barradas <ruipbarradas@sapo.pt> wrote:
> I've made the same, just changing the address to the 200m historical data,
> and the predicted time was 19.27. Usain Bolt has just made 19.32.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
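One way to put Mark's question to the test is to compare the log-linear model against a naive forecast that simply repeats the previous winning time. If the model does not clearly beat that baseline, its closeness says more about the low variation in the data than about the model. The block below is a rough sketch of such a rolling-origin comparison; the data are simulated, not Olympic results, and the starting point of the 11th Games is an arbitrary assumption.

```r
# Sketch: is the log-linear model better than a naive forecast?
# SIMULATED data for illustration only; not real Olympic results.
set.seed(2)
sim <- data.frame(Year = seq(1900, 2008, by = 4))
sim$Result <- exp(3.1 - 0.0012 * (sim$Year - 1900) +
                  rnorm(nrow(sim), sd = 0.01))

# Rolling-origin evaluation: for each Games from the 11th onwards,
# fit on the earlier Games only and predict that one.
idx <- 11:nrow(sim)
err <- t(sapply(idx, function(i) {
  past <- sim[seq_len(i - 1), ]
  fit <- lm(log(Result) ~ Year, data = past)
  model_pred <- unname(exp(predict(fit,
                                   newdata = sim[i, "Year", drop = FALSE])))
  naive_pred <- past$Result[i - 1]   # previous winning time
  c(model = abs(model_pred - sim$Result[i]),
    naive = abs(naive_pred - sim$Result[i]))
}))

# Mean absolute error in seconds for each approach.
mae <- colMeans(err)
print(round(mae, 3))
```

Running the same loop on `golddata` instead of `sim` would show whether the 0.05 second miss is better than what the naive forecast achieves over the same years.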
Hi,

Can't we use predictions like these, based on statistical principles, to predict payment gateway usage, transaction flows, etc.? Sometimes our gateways fail, and there could be a mean time between failures. Is there a roadmap (a book?) for learning this type of prediction? We are novice capacity planners.

Thanks,
Mohan

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Rui Barradas
Sent: Friday, August 10, 2012 1:54 AM
To: r-help
Subject: [R] Olympics: 200m Men Final

I've made the same, just changing the address to the 200m historical data, and the predicted time was 19.27. Usain Bolt has just made 19.32.
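As a very small starting point for the mean-time-between-failures part of the question, the sketch below estimates the MTBF from a vector of failure timestamps and, assuming failures arrive at a constant rate (exponentially distributed gaps), the chance of at least one failure in the next 24 hours. The timestamps are hypothetical and made up for illustration, and the constant-rate assumption would need checking against real gateway logs.

```r
# Sketch: mean time between failures (MTBF) from failure timestamps.
# The timestamps are HYPOTHETICAL, made up purely for illustration.
failures <- as.POSIXct(c("2012-07-01 03:10:00", "2012-07-09 14:45:00",
                         "2012-07-15 08:00:00", "2012-07-28 21:30:00",
                         "2012-08-05 11:20:00"), tz = "UTC")

# Gaps between consecutive failures, in hours.
gaps <- as.numeric(diff(failures), units = "hours")
mtbf <- mean(gaps)

# Assumption: failures arrive at a constant rate (exponential gaps).
# Probability of at least one failure in the next 24 hours.
p_fail_24h <- pexp(24, rate = 1 / mtbf)

cat("MTBF (hours):", round(mtbf, 1), "\n")
cat("P(failure within 24 h):", round(p_fail_24h, 3), "\n")
```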