Daisy Englert Duursma
2010-Apr-19 00:41 UTC
[R] selecting rows based on number that occurs after letter
Hello,

I am trying to cycle through a CSV file and compute some summary statistics. I need to select rows based on the number in the row name that comes after the letter 'y'. For example, BA1y1 would equal 1, C3A2r3y1 would equal 1, and MA3r3y1r3 would equal 1.

My code currently selects on the 5th character, but my row names have variable length and the 'y' can occur at different positions in the name.

Here is my current code:

sdat <- read.csv(paste(data.dir, "/summary.data.csv", sep=""))

year <- c("1", "2", "3")
for (y in year) {
  sdat2 <- sdat[sapply(strsplit(as.character(sdat$GCM), ""),
                       function(zzz) zzz[5] == y), ]
}

Thanks,
Daisy Englert Duursma
Bioclimatic Modeller
Macquarie University
Sydney, NSW, Australia
jim holtman
2010-Apr-19 00:55 UTC
[R] selecting rows based on number that occurs after letter
Use regular expressions:

> x <- c("BA1y1", "C3A2r3y1", "MA3r3y1r3", "MA23r34y123z99")
> # extract the number after the 'y'
> num <- sub(".*y(\\d+).*", "\\1", x)
> num
[1] "1"   "1"   "1"   "123"

-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?
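To connect the regular expression to the row selection in the original question, here is a minimal sketch. It assumes sdat is a data frame with a character (or factor) column GCM, as in the question; the example rows and the "value" column are made up for illustration and are not from the real summary.data.csv.

```r
# Hypothetical stand-in for summary.data.csv; only the GCM column
# name comes from the thread, the contents are invented.
sdat <- data.frame(
  GCM   = c("BA1y1", "BA2y3", "C3A2r3y1", "MA3r3y2r3"),
  value = c(10, 20, 30, 40),
  stringsAsFactors = FALSE
)

# Pull out the digits that follow 'y', wherever it occurs in the name.
yr <- sub(".*y(\\d+).*", "\\1", as.character(sdat$GCM))

# Group the rows by that number instead of testing the 5th character.
by_year <- split(sdat, yr)

for (y in names(by_year)) {
  sdat2 <- by_year[[y]]
  cat("year", y, ":", nrow(sdat2), "rows\n")
}
```

Two caveats with this pattern: because the leading ".*" is greedy, a name with more than one 'y' followed by digits yields the number after the last one, and a name with no such 'y' is returned unchanged by sub(), so it may be worth checking for that case first, for example with grepl("y\\d", ...).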
Daisy Englert Duursma
2010-Apr-27 05:10 UTC
[R] selecting rows based on number that occurs after letter
Hello,

Thanks for your help. I have played around with the suggestion a bit, but I still cannot subset in the way I need.

If I have a matrix x as follows:

> x <- matrix(c("BA1y1","BA2y3","C3A1r1y1","C3A2r2y2t4","C3r2y1y1",1,2,3,11,12),
+             nrow=5, ncol=2, dimnames=list(c("a","b","c","d","e"), c("GCM","y")))
> x
  GCM          y
a "BA1y1"      "1"
b "BA2y3"      "2"
c "C3A1r1y1"   "3"
d "C3A2r2y2t4" "11"
e "C3r2y1y1"   "12"

and I want to loop through 3 subsets based on the numeric value after the 'y', how do I do this? What I currently have is:

> year <- c("1", "2", "3")
> for (y in year) {
+   subx <- x[sapply(strsplit(as.character(x$GCM), ""), function(zzz) zzz[5] == y), ]
+ }

Basically this loops through and subsets the rows where the 5th character has the defined y value (1, 2, or 3). The problem is that the 'y' can occur anywhere in the GCM value.

Thanks for the help.
Daisy

-- 
Daisy Englert Duursma
Room E8C156
Dept. Biological Sciences
Macquarie University NSW 2109
Australia
Tel +61 2 9850 9256
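One possible way to combine the regular expression suggested earlier in the thread with this matrix is sketched below. It is only a sketch: it assumes x is the character matrix built above, and the list name "subsets" is introduced here for illustration. Since x is a matrix rather than a data frame, the GCM column is reached with x[, "GCM"]; x$GCM fails on a matrix.

```r
x <- matrix(c("BA1y1", "BA2y3", "C3A1r1y1", "C3A2r2y2t4", "C3r2y1y1",
              1, 2, 3, 11, 12),
            nrow = 5, ncol = 2,
            dimnames = list(c("a", "b", "c", "d", "e"), c("GCM", "y")))

# Extract the digits after 'y' from the GCM column of the matrix.
yr <- sub(".*y(\\d+).*", "\\1", x[, "GCM"])

year <- c("1", "2", "3")
subsets <- list()  # hypothetical container, one matrix per year
for (y in year) {
  # drop = FALSE keeps a one-row result as a matrix instead of a vector.
  subsets[[y]] <- x[yr == y, , drop = FALSE]
}

subsets[["1"]]  # rows a, c and e
```

Storing each subset in a list avoids overwriting subx on every pass through the loop; the summary statistics could equally be computed inside the loop body.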