Hi R Users,

I have been trying to work out how to rename column names using grep. Basically, I have generated these column names using tapply:

  [1] "NAME" "X1.1" "X2.1" "X3.1" "X4.1" "X5.1" "X6.1" "X7.1" "X8.1"
 [10] "X1.2" "X2.2" "X3.2" "X4.2" "X5.2" "X6.2" "X7.2" "X8.2" "X1.3"
 [19] "X2.3" "X3.3" "X4.3" "X5.3" "X6.3" "X7.3" "X8.3" "X1.5" "X2.5"
 [28] "X3.5" "X4.5" "X5.5" "X6.5" "X7.5" "X8.5" "X1.6" "X2.6" "X3.6"
 [37] "X4.6" "X5.6" "X6.6" "X7.6" "X8.6" "X1.8" "X2.8" "X3.8" "X4.8"
 [46] "X5.8" "X6.8" "X7.8" "X8.8" "X1.9" "X2.9" "X3.9" "X4.9" "X5.9"
 [55] "X6.9" "X7.9" "X8.9" "X1.10" "X2.10" "X3.10" "X4.10" "X5.10" "X6.10"
 [64] "X7.10" "X8.10" "X1.12" "X2.12" "X3.12" "X4.12" "X5.12" "X6.12" "X7.12"
 [73] "X8.12" "X1.13" "X2.13" "X3.13" "X4.13" "X5.13" "X6.13" "X7.13" "X8.13"
 [82] "X1.14" "X2.14" "X3.14" "X4.14" "X5.14" "X6.14" "X7.14" "X8.14" "X1.15"
 [91] "X2.15" "X3.15" "X4.15" "X5.15" "X6.15" "X7.15" "X8.15" "X1.16" "X2.16"
[100] "X3.16" "X4.16" "X5.16" "X6.16" "X7.16" "X8.16" "X1.17" "X2.17" "X3.17"
[109] "X4.17" "X5.17" "X6.17" "X7.17" "X8.17" "X1.18" "X2.18" "X3.18" "X4.18"
[118] "X5.18" "X6.18" "X7.18" "X8.18" "X1.19" "X2.19" "X3.19" "X4.19" "X5.19"
[127] "X6.19" "X7.19" "X8.19" "X1.20" "X2.20" "X3.20" "X4.20" "X5.20" "X6.20"
[136] "X7.20" "X8.20" "X1.21" "X2.21" "X3.21" "X4.21" "X5.21" "X6.21" "X7.21"
[145] "X8.21" "X1.22" "X2.22" "X3.22" "X4.22" "X5.22" "X6.22" "X7.22" "X8.22"
[154] "X1.23" "X2.23" "X3.23" "X4.23" "X5.23" "X6.23" "X7.23" "X8.23" "X1.24"
[163] "X2.24" "X3.24" "X4.24" "X5.24" "X6.24" "X7.24" "X8.24" "X1.25" "X2.25"
[172] "X3.25" "X4.25" "X5.25" "X6.25" "X7.25" "X8.25" "X1.26" "X2.26" "X3.26"
[181] "X4.26" "X5.26" "X6.26" "X7.26" "X8.26" "X1.27" "X2.27" "X3.27" "X4.27"
[190] "X5.27" "X6.27" "X7.27" "X8.27" "X1.28" "X2.28" "X3.28" "X4.28" "X5.28"
[199] "X6.28" "X7.28" "X8.28" "X1.29" "X2.29" "X3.29" "X4.29" "X5.29" "X6.29"
[208] "X7.29" "X8.29" "X1.30" "X2.30" "X3.30" "X4.30" "X5.30" "X6.30" "X7.30"
[217] "X8.30" "X1.31" "X2.31" "X3.31" "X4.31" "X5.31" "X6.31" "X7.31" "X8.31"
[226] "X1.32" "X2.32" "X3.32" "X4.32" "X5.32" "X6.32" "X7.32" "X8.32" "X1.33"
[235] "X2.33" "X3.33" "X4.33" "X5.33" "X6.33" "X7.33" "X8.33"

The names mean behaviour.day; the X is not important to the data, it is the numbers I am trying to select on.

So I want to split the data by day, i.e. selecting on the number after the decimal point.

I am using this code (where scananal is the data) without looping, so I change the number following the decimal point manually (NB the data have been changed to character):

    DAY <- grep("(X[[:digit:]]+).3",colnames(scananal))

However, this will select for day 3, 30, 31, 32, etc. I have tried to use fixed = TRUE, but that just returns integer(0). But if I use 30, it will select only 30. Not sure what I'm doing wrong here; I assumed that fixed = T would fix this, but it doesn't.

I have tried to loop this too, but with no luck, so if anyone can point me in the right direction about how to loop using grep I would be most grateful!

The main problem I have is where to put the loop, for example:

    for(i in 1:33){
    print(i)
    DAY[[i]] <- grep("(X[[:digit:]]+).[[i]]",colnames(scananal))
    }

which doesn't work, and no doubt there are obvious reasons for this! Any help would be much appreciated,

All the best,

Ross

--
View this message in context: http://r.789695.n4.nabble.com/grep-problem-decimal-points-looping-tp2319773p2319773.html
Sent from the R help mailing list archive at Nabble.com.

On Aug 10, 2010, at 9:17 AM, RCulloch wrote:

> I am using this code (where scananal is the data) with out looping
> so the number following the decimal I change manually (NB the data
> have been changed to character):
>
> DAY <- grep("(X[[:digit:]]+).3",colnames(scananal))
>
> However, this will select for day 3, 30, 31, 32, etc I have tried to
> use fixed = TRUE, but that just returns integer(0).

You need to learn the special character "$", which matches the end of the string without consuming a character. After creating a replica of your column names with scan and grep:

    inp <- scan(what="character")
    inX <- inp[grep("X", inp)]

    > DAY <- grep("(X[[:digit:]]+).3$",inX)
    > inX[DAY]
    [1] "X1.3" "X2.3" "X3.3" "X4.3" "X5.3" "X6.3" "X7.3" "X8.3"

> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius, MD
West Hartford, CT
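
The two symptoms in the question (day 3 also selecting days 30 to 33, and fixed = TRUE returning integer(0)) can be reproduced on a small example. This is a minimal sketch, assuming a made-up vector `nms` that stands in for colnames(scananal); the escaped dot and the leading "^" are additions for illustration and are not part of the pattern given in the reply above.

```r
# Hypothetical stand-in for colnames(scananal): behaviours 1-8 on days 3, 30 and 31.
nms <- c("NAME", paste0("X", 1:8, ".", rep(c(3, 30, 31), each = 8)))

# Unanchored and unescaped: "." matches any character, and nothing stops the
# match from being followed by more digits, so days 30 and 31 match as well.
loose <- grep("(X[[:digit:]]+).3", nms, value = TRUE)

# fixed = TRUE treats the whole pattern as a literal string, including the
# text "[[:digit:]]+", which never occurs in the names, so nothing matches.
literal <- grep("(X[[:digit:]]+).3", nms, fixed = TRUE)

# Escaped dot plus "^" and "$" anchors: only the day-3 columns match.
day3 <- grep("^X[[:digit:]]+\\.3$", nms, value = TRUE)
print(day3)
```

Escaping the dot does not change the result for these particular names, but it guards against names where another character happens to sit in that position.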

On Aug 10, 2010, at 9:51 AM, David Winsemius wrote:

> On Aug 10, 2010, at 9:17 AM, RCulloch wrote:
>
>> The main problem I have is where to put the loop, for example:
>>
>> for(i in 1:33){
>> print(i)
>> DAY[[i]] <- grep("(X[[:digit:]]+).[[i]]",colnames(scananal))
>> }

Hit the send button a bit prematurely. I have not figured out what sort of process or result you hope to achieve, but perhaps showing how to improve the use of grep inside a loop will help:

    DAY <- vector("list", 33)   # start from an empty list; days with no match stay NULL
    for(i in 1:33){
        patt <- paste("(X[[:digit:]]+).", i, "$", sep="")   # build the pattern for day i
        if (length(inX[grep(patt,inX)]) > 0) { DAY[i] <- list( grep(patt,inX) ) }
    }

    > DAY[1:5]
    [[1]]
    [1] 1 2 3 4 5 6 7 8

    [[2]]
    [1]  9 10 11 12 13 14 15 16

    [[3]]
    [1] 17 18 19 20 21 22 23 24

    [[4]]
    NULL

    [[5]]
    [1] 25 26 27 28 29 30 31 32

This first constructs a pattern from the loop index. It also needs to test whether there are any results at each iteration, because there are no days=="4". Unless you wrap the result of grep() in a list, the assignment DAY[i] <- ... keeps only the first matching index, so you would get just the starting location for each day.

Maybe if you clarified what you will be doing with this DAY construct, there might be more of a target to shoot for. You could use lapply on those column numbers at the moment.

--
David Winsemius, MD
West Hartford, CT
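
The grouping that the loop builds can also be obtained without a loop, by extracting the day from each name and splitting the column indices on it. This is a minimal sketch, not the method proposed in the thread; `nms` is a made-up stand-in for colnames(scananal), and `is_data`, `day` and `cols_by_day` are hypothetical names.

```r
# Hypothetical stand-in for colnames(scananal): behaviours 1-8 on days 1, 2, 3 and 5.
nms <- c("NAME", paste0("X", 1:8, ".", rep(c(1, 2, 3, 5), each = 8)))

# Keep only names of the form X<behaviour>.<day>; this drops "NAME".
is_data <- grepl("^X[[:digit:]]+\\.[[:digit:]]+$", nms)

# Strip everything up to and including the dot, leaving the day number.
day <- as.integer(sub("^X[[:digit:]]+\\.", "", nms[is_data]))

# Column positions in the full name vector, grouped by day.
cols_by_day <- split(which(is_data), day)
print(names(cols_by_day))
print(cols_by_day[["3"]])
```

Days that never occur (day 4 here) simply have no entry, so no emptiness test is needed. Assuming scananal is a data frame, something like lapply(cols_by_day, function(j) scananal[, j, drop = FALSE]) would then give one block of columns per day.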

Try this also:

    nm <- scan('clipboard', what = '')
    transform(structure(do.call(rbind, strsplit(nm[-1], "\\.")),
                        .Dimnames = list(NULL, c('V1', 'V2'))),
              V1 = gsub("X", "", V1))

On Tue, Aug 10, 2010 at 10:17 AM, RCulloch <ross.culloch@dur.ac.uk> wrote:

> What the names mean are behaviour.day the X is not important to the data,
> it is the numbers I am trying to select on.
>
> So I want to split the data by day i.e. selecting for the number after the
> decimal.

--
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
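
To make the splitting approach above concrete, here is a minimal sketch on a four-element vector. The vector `nm` and the column names `behaviour` and `day` are assumptions for illustration; the integer conversion is an addition and is not in the suggestion above.

```r
# Hypothetical miniature of the column names; the first element is dropped below.
nm <- c("NAME", "X1.3", "X2.3", "X1.30")

# Split each name on a literal dot; fixed = TRUE avoids regex escaping here.
parts <- do.call(rbind, strsplit(nm[-1], ".", fixed = TRUE))

# One row per column name: behaviour (the X removed) and day as integers.
lookup <- data.frame(
  behaviour = as.integer(sub("X", "", parts[, 1], fixed = TRUE)),
  day       = as.integer(parts[, 2])
)
print(lookup)
```

With the day held as a number, selecting day 3 becomes lookup$day == 3, which cannot pick up day 30 by accident.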