Hi,

I'm trying to query the GitHub API, and I'm running into some data munging issues, so I was hoping someone on the list might advise.

Here's my code. To run it you need to replace client_id and client_secret with your own authorization information for GitHub.

    library(github)
    library(RCurl)
    library(httpuv)
    library(jsonlite)

    # Set up the query
    ctx = interactive.login("client_id", "client_secret")

    pull <- function(i){
      get.pull.request.files(owner = "rails", repo = "rails", id = i, ctx = get.github.context(), per_page=1000)
    }

    data <- read.csv(getURL("https://gist.githubusercontent.com/aronlindberg/a3d135a303664046c94a/raw/e42a0734ec4542eccf5f4d5bdeed5afbdd1720e9/pull_ids"), sep = "\n")
    list <- read.csv(textConnection(data), header = FALSE)

    pull_lists <- lapply(list$V1, pull)

    get_files <- function(pull_lists){
      sapply(pull_lists$content, "[[", "filename")
    }

    file_lists <- lapply(pull_lists, get_files)

Everything works fine until the last command, which generates:

    Error in FUN(X[[1L]], ...) : subscript out of bounds

I've read here:

http://stackoverflow.com/questions/18461499/subscript-out-of-bounds-on-character-vector

which leads me to believe that the reason for the error is that when I run

    file_lists <- lapply(pull_lists, get_files)

some of the entries are missing. However, I cannot figure out how to clean up the data. I have tried something along the lines of:

    clean_files <- function(pull_lists){
      pull_lists$content[which(nchar(pull_lists$content)==NULL)]<-NA
    }

    clean_lists <- lapply(pull_lists, clean_files)

But that simply replaces *every* value with NA (similarly if I change ==NULL to <1, or <2).

How can I make this code work?

Best,
Aron

-- 
Aron Lindberg
Doctoral Candidate, Information Systems
Weatherhead School of Management
Case Western Reserve University
aronlindberg.github.io
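
One possible reading of the error, sketched with made-up data rather than real API output: on a list, "[[" with a name that is absent returns NULL, but on an atomic vector such as a character string it raises "subscript out of bounds". The failure would therefore be expected whenever an element of content is a plain string instead of a per-file list. That a failed request comes back in that shape (here a single message field) is an assumption, and the names ok_pull, bad_pull, mock_pull_lists and get_files_safe are hypothetical. The sketch may also explain why clean_files yields NA everywhere: a function returns the value of its last expression, and the value of an assignment is its right-hand side, here NA.

```r
# Mock data standing in for pull_lists. The shape of the failed response
# (content holding a plain string) is an assumption, not observed API output.
ok_pull  <- list(content = list(list(filename = "a.rb"), list(filename = "b.rb")))
bad_pull <- list(content = list(message = "Not Found"))
mock_pull_lists <- list(ok_pull, bad_pull)

# Reproduce the failure: "[[" by name on a character string is an error.
err_msg <- tryCatch(
  sapply(bad_pull$content, "[[", "filename"),
  error = function(e) conditionMessage(e)
)

# Hypothetical guarded variant of get_files: index only into elements that
# are lists carrying a "filename" entry, and return NA for anything else.
get_files_safe <- function(pull) {
  vapply(
    pull$content,
    function(f) {
      if (is.list(f) && !is.null(f[["filename"]])) f[["filename"]] else NA_character_
    },
    character(1),
    USE.NAMES = FALSE
  )
}

file_lists <- lapply(mock_pull_lists, get_files_safe)

# The value of an assignment is its right-hand side, so a function whose
# last expression is `x[...] <- NA` returns NA, whatever was modified.
returns_na <- function(x) {
  x$content[integer(0)] <- NA
}
na_result <- returns_na(ok_pull)
```

If the real responses do have this shape, swapping get_files for a guarded version along these lines should let the final lapply run through, with NA marking the pulls whose files could not be read.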