Below is some working code that, generally speaking, accomplishes what I want,
but I am looking for a necessary improvement in the final step. The code below
scrapes data from a website (thousands of pages, actually) and organizes
athletes' scores in a data frame. The final variable, called Workout05 in the
original data, is a timed event. So I use strsplit() to pull out the data I want
in that column and format it using as.POSIXct(), as you can see in the code below
(a regular expression would, I'm sure, improve on how I pull out those data
in the column, but that is not my primary question).
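As an aside on the regular-expression idea, here is a minimal sketch of how it might look. It assumes the Workout05 values carry the time in parentheses (for example "12 (5:35)"), which is only inferred from the strsplit() calls in the code; the sample vector and the helper name extract_time are made up for illustration.

```r
# Hypothetical helper: return the text inside the first "(...)" of each
# element, or NA when an element has no parenthesised part.
extract_time <- function(x) {
  has_paren <- grepl("\\([^)]*\\)", x)
  ifelse(has_paren, sub(".*?\\(([^)]*)\\).*", "\\1", x, perl = TRUE), NA)
}

# Made-up values standing in for dat$Workout05 (format is an assumption)
workout <- c("12 (5:35)", "340 (6:10)", "--")
extract_time(workout)
```

Elements without parentheses come back as NA rather than being dropped, so the result keeps the same length as the input column.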
After I have all the data, I want to find the empirical CDF of the data, so I use
ecdf() on those data just as I would on other variables. Now, the main issue I'm
interested in is the final step, where you plug in a specific time to find its
percentile:
## These are below in context of the real problem as well
fn <- ecdf(dat$score5)
fn(dat$score5[1])
This works, but not in the way I want. What I want is for a user to easily be
able to enter their time in "lay" terms such as 5:35 and have it
return the percentile rank.
So, I'd like something like the following to work:
fn(5:35)
The larger context for why I want this can be seen if you visit
my web app built using shiny. I've built a site where athletes can build
customized reports based on their performance on certain events by entering
data. This specific issue is found on the "get my percentile" tab, where a
user can use the text input box to enter their time in the way humans typically
write it; the time then gets passed to the R fn() function that runs in the
background and builds the plot for them.
https://hdoran.shinyapps.io/openAnalysis/
So, my question is: how can I structure this so that a time can be entered
simply as minutes:seconds (e.g., 4:52) in a text box and still
return a percentile rank as I've described here?
Thanks
library(XML)
i = 1; j = 0; division = 1
## Build the leaderboard URL for page i and region j
url <- paste0('http://games.crossfit.com/scores/leaderboard.php?stage=5&sort=0&page=', i,
              '&division=1&region=', j,
              '&numberperpage=100&competition=0&frontpage=0&expanded=1&year=15&full=1&showtoggles=0&hidedropdowns=0&showathleteac=1&=&is_mobile=0')
tmp <- try(readHTMLTable(readLines(url), which=1, header=TRUE))
if(!is.null(dim(tmp))){ # new part here
  ## Strip newlines and spaces from the column names, and newlines from the cells
  names(tmp) <- gsub("\\n", "", names(tmp))
  names(tmp) <- gsub(" +", "", names(tmp))
  tmp[] <- lapply(tmp, function(x) gsub("\\n", "", x))
  tmp$region <- j
}
dat <- tmp
## Keep only the text between "(" and ")" in Workout05
aa <- strsplit(dat$Workout05, split = '\\(')
bb <- sapply(aa, function(x) x[2])
dat$score5 <- as.character(sapply(strsplit(bb, split = '\\)'),
                                  function(x) x[1]))
dat$score5 <- as.POSIXct(dat$score5, format="%M:%S")
fn <- ecdf(dat$score5)
fn(dat$score5[1])
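
One direction that might work, sketched under assumptions rather than offered as a definitive answer: convert the "m:ss" strings to plain seconds, build the ECDF on those numbers, and wrap fn() so it accepts the text-box string. Working in seconds also sidesteps the fact that as.POSIXct() with format "%M:%S" fills in the current date, so times parsed on different days would not be comparable. The helper names to_seconds and percentile_for are hypothetical, and the times vector is made-up data standing in for the scraped scores.

```r
# Hypothetical helper: convert "m:ss" strings to total seconds.
# Anything not of the form "minutes:seconds" becomes NA.
to_seconds <- function(x) {
  parts <- strsplit(as.character(x), ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 2L) return(NA_real_)
    as.numeric(p[1]) * 60 + as.numeric(p[2])
  }, numeric(1))
}

# Made-up times standing in for the strings extracted from Workout05
times <- c("4:52", "5:35", "6:10", "7:03", "5:35")

# ECDF on seconds instead of POSIXct values
fn <- ecdf(to_seconds(times))

# Hypothetical wrapper: takes the string a user types, e.g. "5:35"
percentile_for <- function(txt) fn(to_seconds(txt))

percentile_for("5:35")
```

In a shiny app the wrapper would presumably be called on the text input value as a string. Note that ecdf() gives the proportion of times less than or equal to the entered time; for a timed event where lower is better, 1 - percentile_for(txt) may be the more natural rank.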