Hi everyone,

I have a data frame that looks *sort of* like this:

name <- letters[1:5]
signal.1 <- c("12", "bad signal", "noise", "10", "X")
length.signal.1 <- 5:9
intensity.signal.1 <- 3:7
signal.2 <- c("13", "noise", "19.2", "X", "V")
length.signal.2 <- 2:6
intensity.signal.2 <- 1:5
signal.3 <- c("NA", "15.4", "error", "NA", "17")
length.signal.3 <- c("NA", 2, 3, "NA", 4)
intensity.signal.3 <- c("NA", 4, 5, "NA", 5)

# (there are actually up to 16 signals and 50 names, but I made this short for the example)

df <- data.frame(cbind(name, signal.1, length.signal.1, intensity.signal.1,
                       signal.2, length.signal.2, intensity.signal.2,
                       signal.3, length.signal.3, intensity.signal.3))

I need to "fish out" some values and have them in a new data frame.

I am only interested in the values in columns 2, 5 and 8 (actually seq(2, 50, 3) in my real df). I want the values that are not:

"bad signal"
"noise"
"error"
"NA"
"V"

This is the output I want (the name column is unimportant for my purposes; it's just there as a reference for the example).

(name)   S1      S2
A        12      13
B        15.4    (another value found in the other signals >3, not shown in the example)
C        19.2    (another value found in the other signals >3, not shown in the example)
D        10      X
E        X       17

I do know that there will always be exactly 2 values that do not match the exclusions named above, or none at all.

I have tried different approaches (grep, matching, %nin%...), but as I am not an advanced user, I am very likely doing something wrong, because I either get a vector, or I get a matrix of TRUE/FALSE, and usually I get the whole rows, and I don't want that :(

I have also been searching the list for answers, to no avail. Any suggestions? Examples including syntax are appreciated (syntax is a major weak point for me).

Laura
Hi Laura,

On Thu, Aug 11, 2011 at 7:01 AM, Lali <laurafe at gmail.com> wrote:
> Any suggestions? Examples including syntax are appreciated (syntax is a
> major weak point for me).

Here is a solution using the reshape and plyr packages:

library(reshape)
library(plyr)  # provides ddply()

# Keep name plus the signal columns, then stack the signals into long format
dfm <- melt(df[c(1, 2, 5, 8)], id = 1)
# Drop the excluded values
dfm.r <- dfm[!dfm$value %in% c("bad signal", "noise", "error", "NA", "V"), ]
# Number the remaining values within each name (S1, S2, ...)
dfm.r <- ddply(dfm.r, .(name), transform, index = paste("S", 1:length(name), sep = ""))
# Spread back to wide format: one row per name, one column per index
cast(dfm.r, name ~ index)

Best,
Ista

> Laura
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org
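For readers without reshape and plyr installed, the same filtering can be sketched in base R. This is not from the thread; it is a minimal illustration that assumes the signal columns can be found by name and that at most two values per row survive the exclusions, as Laura describes. The helper keep_valid and the shortened data frame are hypothetical.

```r
# Rebuild a shortened version of the example data as character columns.
# (Only one length column is kept; the others are omitted for brevity.)
df <- data.frame(
  name            = letters[1:5],
  signal.1        = c("12", "bad signal", "noise", "10", "X"),
  length.signal.1 = as.character(5:9),
  signal.2        = c("13", "noise", "19.2", "X", "V"),
  signal.3        = c("NA", "15.4", "error", "NA", "17"),
  stringsAsFactors = FALSE
)

exclude <- c("bad signal", "noise", "error", "NA", "V")

# Hypothetical helper: keep the first n values of a row that are not
# excluded, padding with NA when fewer than n remain.
keep_valid <- function(x, n = 2) {
  kept <- x[!is.na(x) & !x %in% exclude]
  length(kept) <- n   # truncates, or pads with NA
  kept
}

# Select the signal columns by name rather than by position.
signal_cols <- grep("^signal\\.", names(df))

# apply() returns one column per row of df, so transpose the result.
picked <- unname(t(apply(df[signal_cols], 1, keep_valid)))

result <- data.frame(name = df$name, S1 = picked[, 1], S2 = picked[, 2],
                     stringsAsFactors = FALSE)
result
```

With the real data, the positional selection from the original post (seq(2, 50, 3)) could be used in place of the grep() call, provided the column layout really is that regular.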
On Fri, Aug 12, 2011 at 3:54 AM, Lali <laurafe at gmail.com> wrote:
> Hi Ista,
> Thanks for your suggestion. I am still trying to wrap my head around the
> functions you used, as I am not familiar with any of them, but it works
> perfectly!
> I do want to understand the code, so if you don't mind I would like to ask a
> few questions.
> In this line:
> dfm <- melt(df[c(1, 2, 5, 8)], id = 1)
> What does the id = 1 do? The variables are already specified in
> df[c(1, 2, 5, 8)], right?

id = 1 marks the first column (name) as an identifier variable, so it is retained as-is in dfm. The remaining columns are collapsed into a single column named "value", with a "variable" column recording which original column each value came from.

> What does this line do:
> dfm.r <- ddply(dfm.r, .(name), transform, index = paste("S",
> 1:length(name), sep = ""))

It adds a variable named "index": within each level of "name", the rows are labelled Si, where i = 1:length(name), that is S1, S2, ... up to the number of rows for that name.

HTH,
Ista

> Thanks again for the help.
> Laura
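To make the two answers above concrete, here is a rough base R walk-through of what each step produces. It only approximates the behaviour of melt(), ddply() and cast() for this example and is not how those packages are implemented; the data frame is a shortened, hypothetical version of the one in the original post.

```r
df <- data.frame(
  name     = letters[1:5],
  signal.1 = c("12", "bad signal", "noise", "10", "X"),
  signal.2 = c("13", "noise", "19.2", "X", "V"),
  signal.3 = c("NA", "15.4", "error", "NA", "17"),
  stringsAsFactors = FALSE
)

# Step 1, roughly melt(df, id = 1): 'name' is repeated once per measured
# column, and the measured columns are stacked into a single 'value' column.
measured <- names(df)[-1]
dfm <- data.frame(
  name     = rep(df$name, times = length(measured)),
  variable = rep(measured, each = nrow(df)),
  value    = unlist(df[measured], use.names = FALSE),
  stringsAsFactors = FALSE
)

# Step 2: drop the excluded values.
exclude <- c("bad signal", "noise", "error", "NA", "V")
dfm.r <- dfm[!dfm$value %in% exclude, ]

# Step 3, roughly the ddply(..., transform, index = ...) call: number the
# surviving rows within each name and prefix the number with "S".
within_name <- ave(seq_along(dfm.r$name), dfm.r$name, FUN = seq_along)
dfm.r$index <- paste0("S", within_name)

# Step 4, roughly cast(dfm.r, name ~ index): one row per name, one column
# per index, filled by (name, index) lookup; missing cells stay NA.
wide <- matrix(NA_character_,
               nrow = length(unique(dfm.r$name)), ncol = 2,
               dimnames = list(sort(unique(dfm.r$name)), c("S1", "S2")))
wide[cbind(dfm.r$name, dfm.r$index)] <- dfm.r$value
wide
```

Printing dfm and dfm.r after each step is probably the quickest way to see how the long format and the index column fit together.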