Hi all,

I'm having a problem once again, trying to do something very simple. Consider the following data frame:

x <- read.table(textConnection("locus1 locus2 locus3
A T C
A T NA
T C C
A T G"), header = TRUE)
closeAllConnections()

I am trying to make a new data frame, replacing "A" with "A/A", "T" with "T/T", "G" with "G/G", and "C" with "C/C". Note also the presence of an "NA" (missing data) in the data frame, which should be carried over to the new data frame.

Here is what I am trying, which fails miserably:

x2 <- data.frame(matrix(nrow = nrow(x), ncol = ncol(x)))

for (i in 1:nrow(x)){
    for (j in 1:ncol(x)){
        if(x[i, j] == 'A') {x2[i, j] <- 'A/A'} else{
            if(x[i, j] == 'T') {x2[i, j] <- 'T/T'} else{
                if(x[i, j] == 'G') {x2[i, j] <- 'G/G'} else{
                    if(x[i, j] == 'G') {x2[i, j] <- 'G/G'} else{x2[i, j] <- NA}
                }
            }
        }
    }
}

I get the following error message:

Error in if (x[i, j] == "A") { : missing value where TRUE/FALSE needed

So what am I doing wrong? If you can provide me with specific code that fixes the problem and gets the job done, that would be the most useful.

Thanks very much in advance for your help!

Sincerely,
-----------------------------------
Josh Banta, Ph.D
Center for Genomics and Systems Biology
New York University
100 Washington Square East
New York, NY 10003
Tel: (212) 998-8465
http://plantevolutionaryecology.org
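For readers wondering where the error message comes from: comparing NA with a string gives NA rather than FALSE, and if() stops when its condition is NA. The snippet below is a minimal sketch of that behaviour. The names cell, outcome and recoded are made up for illustration and do not appear in the original post.

```r
# 'cell' is a hypothetical stand-in for the missing entry x[2, 3] in the question
cell <- NA_character_

# Comparing a missing value with "A" yields NA, not FALSE
comparison <- cell == "A"

# if() needs a single TRUE or FALSE, so an NA condition raises an error;
# tryCatch() captures the error object so the script keeps running
outcome <- tryCatch(
  if (cell == "A") "matched" else "not matched",
  error = function(e) e
)

# Testing is.na() first keeps the comparison away from missing values
recoded <- if (is.na(cell)) NA_character_ else paste(cell, cell, sep = "/")
```

Here comparison is NA, outcome holds the captured error, and recoded stays NA, which is the carry-over the question asks for.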
Henrique Dallazuanna
2011-Feb-17 17:22 UTC
[R] Find and replace all the elements in a data frame
Try this:

xNew <- as.data.frame(mapply(paste, x, x, sep = "/"))
xNew[is.na(x)] <- NA
xNew

-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O
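The second line of that answer matters because paste() turns a missing value into the text "NA". A step-by-step sketch of the same idea follows. It assumes the data frame is built directly with character columns instead of read.table(), and the intermediate name pasted is introduced here only for illustration.

```r
# Same data as in the question, built directly so the columns are character
x <- data.frame(
  locus1 = c("A", "A", "T", "A"),
  locus2 = c("T", "T", "C", "T"),
  locus3 = c("C", NA, "C", "G"),
  stringsAsFactors = FALSE
)

# mapply() pastes each column with itself and returns a character matrix
pasted <- mapply(paste, x, x, sep = "/")

# paste() converts NA to the string "NA", so the missing cell is now "NA/NA"
na_cell_before <- pasted[2, "locus3"]

# Convert back to a data frame, then restore NA wherever x was missing
xNew <- as.data.frame(pasted, stringsAsFactors = FALSE)
xNew[is.na(x)] <- NA
```

Without the last assignment, the missing genotype would silently survive as the literal text "NA/NA".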
You can write it like this:

for (i in 1:nrow(x)){
    for (j in 1:ncol(x)){
        # skip missing cells; x2 was created full of NA, so they stay NA
        if (!is.na(x[i, j])) {
            if (x[i, j] == 'A') {x2[i, j] <- 'A/A'}
            else if (x[i, j] == 'T') {x2[i, j] <- 'T/T'}
            else if (x[i, j] == 'G') {x2[i, j] <- 'G/G'}
            # the original chain tested 'G' twice and never reached 'C'
            else if (x[i, j] == 'C') {x2[i, j] <- 'C/C'}
            else {x2[i, j] <- NA}
        }
    }
}

-- 
Gong-Yi Liao
Department of Statistics
University of Connecticut
215 Glenbrook Road U4120
Storrs, CT 06269-4120
860-486-9478
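If the loop is only there to map each letter to its doubled form, a named lookup vector can stand in for the whole if/else chain. This is a sketch of a different technique from the one in the reply above, not a restatement of it. The name lookup is hypothetical, and the data frame is built directly with character columns.

```r
# Same data as in the question, built directly so the columns are character
x <- data.frame(
  locus1 = c("A", "A", "T", "A"),
  locus2 = c("T", "T", "C", "T"),
  locus3 = c("C", NA, "C", "G"),
  stringsAsFactors = FALSE
)

# Named vector: names are the old values, elements are the replacements
lookup <- c(A = "A/A", T = "T/T", G = "G/G", C = "C/C")

# Index the lookup by each column's values; an NA index returns NA,
# so missing data carries over without a special case.
# Assigning into x2[] keeps x2 a data frame with the original column names.
x2 <- x
x2[] <- lapply(x, function(col) unname(lookup[as.character(col)]))
```

Any value that is not a name in lookup also comes back as NA, which matches the final else branch of the loop.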
Josh, you've made it far too complicated. Here's one simpler way (note that I changed your read.table statement to make the values NOT factors, since I wouldn't think you want that).

> x <- read.table(textConnection("locus1 locus2 locus3
+ A T C
+ A T NA
+ T C C
+ A T G"), header = TRUE, as.is=TRUE)
> closeAllConnections()
>
> x2 <- x
> x2[x2 == "A"] <- "A/A"
> x2[x2 == "T"] <- "T/T"
> x2[x2 == "G"] <- "G/G"
> x2[x2 == "C"] <- "C/C"
> x2
  locus1 locus2 locus3
1    A/A    T/T    C/C
2    A/A    T/T   <NA>
3    T/T    C/C    C/C
4    A/A    T/T    G/G

If you do for some reason want a factor, you'll need to adjust the levels for each column before doing this.

Sarah

-- 
Sarah Goslee
http://www.functionaldiversity.org
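For the factor case mentioned at the end of that reply, one possible route is to rewrite the levels of each column rather than its values. The sketch below assumes every column is a factor and that each level should simply be doubled. It is an illustration of that idea, not code taken from the thread.

```r
# Same data as in the question, this time stored as factors
x <- data.frame(
  locus1 = c("A", "A", "T", "A"),
  locus2 = c("T", "T", "C", "T"),
  locus3 = c("C", NA, "C", "G"),
  stringsAsFactors = TRUE
)

# Replace each level "X" with "X/X"; NA is not a level, so it is left alone
x2 <- x
x2[] <- lapply(x, function(f) {
  levels(f) <- paste(levels(f), levels(f), sep = "/")
  f
})
```

Because only the levels are touched, the work per column depends on the number of distinct alleles rather than the number of rows.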
baptiste auguie
2011-Feb-17 17:29 UTC
[R] Find and replace all the elements in a data frame
Hi,

You could use car::recode to change the levels of the factors,

library(car)

transform(x, locus1 = recode(locus1, "'A' = 'A/A' ; else = 'T/T'"),
          locus2 = recode(locus2, "'T'='T/T' ; 'C' = 'C/C'"),
          locus3 = recode(locus3, "'C'='C/C' ; 'G' = 'G/G'"))

HTH,

baptiste