Daisy Englert Duursma
2010-Apr-19 00:41 UTC
[R] selecting rows based on number that occurs after letter
Hello,
I am trying to cycle through a csv and make some summary statistics.
I need to select rows based on the number in the row name that comes
after the letter 'y'. For example, BA1y1 would equal 1, C3A2r3y1
would equal 1 and MA3r3y1r3 would equal 1.
I currently have my code cycling through by the 5th character but my
rows have variable length and the y can occur in several different
locations in the row name.
Here is my current code
sdat <- read.csv(paste(data.dir, "/summary.data.csv", sep = ""))
year <- c("1", "2", "3")
for (y in year) {
  # keep the rows whose 5th character of GCM equals the current year
  sdat2 <- sdat[sapply(strsplit(as.character(sdat$GCM), ""),
                       function(zzz) zzz[5] == y), ]
}
Thanks
Daisy Englert Duursma
Bioclimatic Modeller
Macquarie University
Sydney, NSW, Australia
jim holtman
2010-Apr-19 00:55 UTC
[R] selecting rows based on number that occurs after letter
Use regular expressions:

> x <- c("BA1y1", "C3A2r3y1", "MA3r3y1r3", "MA23r34y123z99")
> # extract the number after the 'y'
> num <- sub(".*y(\\d+).*", "\\1", x)
> num
[1] "1"   "1"   "1"   "123"

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390
What is the problem that you are trying to solve?
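A minimal sketch of how the pattern in the reply above behaves, under two assumptions that are not stated in the thread: the leading `.*` is greedy, so the digits after the last 'y' that is followed by a digit are the ones captured, and `sub()` returns its input unchanged when nothing matches. The helper name `year_after_y` and the test string "noyear" are hypothetical, introduced only for illustration.

```r
# Hypothetical helper (name is an assumption, not from the thread):
# return the digits that follow the last 'y' in each string,
# or NA when no 'y' is followed by a digit.
year_after_y <- function(s) {
  s <- as.character(s)
  has_year <- grepl("y[0-9]+", s)          # does a 'y<digits>' part exist?
  out <- rep(NA_character_, length(s))
  out[has_year] <- sub(".*y([0-9]+).*", "\\1", s[has_year])
  out
}

# The four strings from the reply plus one made-up name without a year.
x <- c("BA1y1", "C3A2r3y1", "MA3r3y1r3", "MA23r34y123z99", "noyear")
year_after_y(x)
```

Without the `grepl()` guard, a name with no 'y' followed by a digit would come back unchanged from `sub()`, which could be mistaken for a year when it is compared later.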
Daisy Englert Duursma
2010-Apr-27 05:10 UTC
[R] selecting rows based on number that occurs after letter
Hello,
Thanks for your help. I have played around with the suggestion a bit,
but I still cannot subset in the way I need.
If I have a matrix x as follows:
> x <- matrix(c("BA1y1", "BA2y3", "C3A1r1y1", "C3A2r2y2t4", "C3r2y1y1",
                1, 2, 3, 11, 12),
              nrow = 5, ncol = 2,
              dimnames = list(c("a", "b", "c", "d", "e"),
                              c("GCM", "y")))
> x
  GCM          y
a "BA1y1"      "1"
b "BA2y3"      "2"
c "C3A1r1y1"   "3"
d "C3A2r2y2t4" "11"
e "C3r2y1y1"   "12"
and I want to loop through 3 subsets based on the numeric value after
the y, how do I do this?
What I currently have is:
> year <- c("1", "2", "3")
> for (y in year) {
    subx <- x[sapply(strsplit(as.character(x[, "GCM"]), ""),
                     function(zzz) zzz[5] == y), ]
  }
Basically this loops through and subsets the rows when the 5th
character has the defined y value (1, 2, or 3). The problem is that the
'y' can occur anywhere in the GCM value.
Thanks for the help.
Daisy
--
Daisy Englert Duursma
Room E8C156
Dept. Biological Sciences
Macquarie University NSW 2109
Australia
Tel +61 2 9850 9256
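
One possible way to sketch the subsetting asked about in this message, assuming the year is the run of digits after the last 'y' in each GCM name and that every name contains one: extract that number once with the `sub()` pattern suggested earlier in the thread, then compare it inside the loop instead of looking at the 5th character. The names `gcm_year`, `yr`, and `subsets` are illustrative, not from the thread.

```r
x <- matrix(c("BA1y1", "BA2y3", "C3A1r1y1", "C3A2r2y2t4", "C3r2y1y1",
              1, 2, 3, 11, 12),
            nrow = 5, ncol = 2,
            dimnames = list(c("a", "b", "c", "d", "e"), c("GCM", "y")))

# Digits after the last 'y' in each GCM name
# (assumption: every name contains a 'y' followed by digits).
gcm_year <- sub(".*y([0-9]+).*", "\\1", x[, "GCM"])

year <- c("1", "2", "3")
subsets <- list()
for (yr in year) {
  # x is a matrix, so index the column with x[, "GCM"] rather than x$GCM;
  # drop = FALSE keeps a one-row result as a matrix.
  subsets[[yr]] <- x[gcm_year == yr, , drop = FALSE]
}

subsets[["1"]]   # rows whose GCM name ends in year 1
```

The loop variable is called `yr` here only to keep it distinct from the matrix column named "y". For a data frame such as `sdat`, the same comparison should work with `sdat$GCM` in place of `x[, "GCM"]`.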