Dear R buddies,

This weekend I became interested in solving Google Code Jam problems using R. I think R could work very well for this kind of contest, but reading the input files has been a problem for me. Take this case for example (http://code.google.com/codejam/contest/dashboard?c=agdjb2RlamFtchALEghjb250ZXN0cxjRzBQM). The files are usually of the form:

A(number of lines for group 1)
a11 a12 a13
a21 a22 a23
...
B(number of lines for group 2)
b11 b12 b13
b21 b22 b23
...

I guess SAS would handle this kind of situation pretty well with a data step, but I don't know how to handle it in R. Any suggestions? Thanks a lot.

Best wishes,
-- 
Hesen Peng
http://hesen.peng.googlepages.com/
Have you looked at the R Data Import/Export manual (http://cran.r-project.org/doc/manuals/R-data.pdf) and the help pages for the read functions?

?read.table
?readLines
?count.fields

This is pretty basic stuff. After an extremely cursory look at that problem, my guess is that it is more like linear programming than statistics.

-- 
David Winsemius

On Nov 29, 2008, at 6:51 PM, Hesen Peng wrote:
> This weekend I became interested in solving Google Code Jam problems
> using R. [...]
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
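For what it's worth, here is a minimal sketch of how readLines and count.fields could be combined for this layout. It assumes the file starts with one line giving the number of groups (hence skip = 1) and that every group header is a single integer giving the number of rows that follow. The sample values and the helper name read_groups are made up for illustration; read_groups is not part of R.

```r
# Made-up sample in the layout from the question, with a leading group count.
# For a real file this vector would come from readLines().
txt <- c("2", "2", "1 2 3", "4 5 6", "1", "7 8 9")

# count.fields() gives the number of fields on each line;
# header lines have a single field, data lines have three.
con <- textConnection(txt)
n_fields <- count.fields(con)
close(con)

# read_groups() is a hypothetical helper: it walks through the lines,
# reads a header, then takes that many rows as a numeric matrix.
read_groups <- function(lines, skip = 1) {
  groups <- list()
  pos <- skip + 1                      # first group header
  while (pos <= length(lines)) {
    n_rows <- as.integer(lines[pos])   # declared size of this group
    rows <- lines[pos + seq_len(n_rows)]
    groups[[length(groups) + 1]] <-
      matrix(scan(text = rows, quiet = TRUE), nrow = n_rows, byrow = TRUE)
    pos <- pos + n_rows + 1            # jump to the next header
  }
  groups
}

groups <- read_groups(txt)
```

With a file on disk, something like read_groups(readLines("input.txt")) should do the same, where input.txt is a placeholder name. Walking the declared counts, rather than guessing headers from the field counts, also copes with groups whose rows happen to have only one field.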
Try this. First we read the data in using fill = TRUE, so that lines with only one number get padded with NAs. The first line is T, so we assign the first cell to T and create DF0, which does not have that line. Then we split the data into a list of data frames, starting each group at a line with an NA in column 3. Finally we lapply over that list with na.omit, which removes the first row of each group (the row that contains the NAs).

Lines <- "3
3
10000 0 0
0 10000 0
0 0 10000
3
5000 0 0
0 2000 0
0 0 4000
5
0 1250 0
3000 0 3000
1000 1000 1000
2000 1000 2000
1000 3000 2000"

# short lines are padded with NA
DF <- read.table(textConnection(Lines), fill = TRUE)
# number of groups (note: this masks T, the shorthand for TRUE)
T <- DF[1,1]
# drop the first line
DF0 <- DF[-1,]
# a new group starts at each row with NA in column 3; na.omit drops that row
DFs <- lapply(split(DF0, cumsum(is.na(DF0[,3]))), na.omit)
DFs

On Sat, Nov 29, 2008 at 6:51 PM, Hesen Peng <hesen.peng at emory.edu> wrote:
> This weekend I became interested in solving Google Code Jam problems
> using R. [...]
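As an optional sanity check on the split-on-NA approach, one could compare the row count declared in each group header with the number of rows that actually end up in each data frame. This is only a sketch under the same assumptions as above: three columns per data row, a single number per header line, and a made-up sample. It uses read.table(text = ...), which needs a reasonably recent R; textConnection works the same way.

```r
# Made-up sample: 2 groups, with 2 rows and 1 row respectively
Lines <- "2
2
1 2 3
4 5 6
1
7 8 9"

DF <- read.table(text = Lines, fill = TRUE)
DF0 <- DF[-1, ]                        # drop the leading group count

is_header <- is.na(DF0[, 3])           # header rows were padded with NA
declared <- DF0[is_header, 1]          # row counts stated in the file

DFs <- lapply(split(DF0, cumsum(is_header)), na.omit)
actual <- unname(sapply(DFs, nrow))    # row counts after splitting

# Stops with an error if the file and the split disagree
stopifnot(length(DFs) == DF[1, 1], all(declared == actual))
```

If the check fails, the likely cause is a data row with fewer than three fields being mistaken for a header, in which case walking the declared counts line by line is the safer route.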