Francesca
2024-Sep-12 07:42 UTC
[R] "And" condition spanning over multiple columns in data frame
Dear contributors,

I need to create a set of columns based on conditions on a data frame, as follows. I have managed to do the trick for one column, but I cannot find a good example where the condition is extended to the whole data frame.

I have this data frame, called c10Dt:

      id cp1 cp2 cp3 cp4 cp5 cp6 cp7 cp8 cp9 cp10 cp11 cp12
    1  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
    2  4   8  18  15  10  12  11   9  18   8   16   15   NA
    3  3   8   5   5   4  NA   5  NA   6  NA   10   10   10
    4  3   5   5   4   4   3   2   1   3   2    1    1    2
    5  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
    6  2   5   5  10  10   9  10  10  10  NA   10    9   10

Columns are id, cp1, cp2, and so on.

What I need to do is the following, shown here for just one column:

    c10Dt <- mutate(c10Dt, exit1= ifelse(is.na(cp1) & id!=1, 1, 0))

So, I create a new variable, called exit1, in which the program selects cp1, checks if it is NA, and if it is NA but also the value of the column "id" is not 1, then it gives back a 1, otherwise 0.
So, what I want is that it selects all the cases in which the id=2,3, or 4 is not NA in the corresponding values of the matrix.
I managed to do it manually column by column, but I feel there should be something smarter here.

The problem is that I need to replicate this over all the columns from cp2 to cp12, while keeping the id column fixed.

I have tried with

    c10Dt %>%
      mutate(x=across(starts_with("cp"), ~ifelse(. == NA)) & id!=1,1,0 )

but the problem with across is that it will implement the condition only on the cp_ columns. How do I tell R to use the column id with all the other columns?

Thanks for any help provided.

Francesca
Ivan Krylov
2024-Sep-12 08:42 UTC
[R] "And" condition spanning over multiple columns in data frame
On Thu, 12 Sep 2024 09:42:57 +0200, Francesca <francesca.pancotto at gmail.com> wrote:

> c10Dt <- mutate(c10Dt, exit1= ifelse(is.na(cp1) & id!=1, 1, 0))

> So, I create a new variable, called exit1, in which the program
> selects cp1, checks if it is NA, and if it is NA but also the value
> of the column "id" is not 1, then it gives back a 1, otherwise 0.
> So, what I want is that it selects all the cases in which the id=2,3,
> or 4 is not NA in the corresponding values of the matrix.

Since all your columns except the first one are the desired "cp*" columns, you can obtain your "exit" columns in bulk:

    (
      c10Dt$id != 1 & # will be recycled column-wise, as we need
      is.na(c10Dt[-1])
    ) |>
      # ...and then convert back into a data.frame,
      as.data.frame() |>
      # rename the columns...
      (\(x) setNames(x, sub('cp', 'exit', names(x))))() |>
      # ...and finally attach to the original data.frame
      cbind(c10Dt)

-- 
Best regards,
Ivan
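The recycling that this answer relies on can be checked on a small case. What follows is only a minimal sketch, assuming a hypothetical three-row data frame `toy` (not the poster's c10Dt): a logical vector with one element per row, combined with a logical matrix via `&`, is reused down each column, so each row's `id` test is applied to all of that row's `cp` cells.

```r
# Hypothetical toy data for illustration, not the c10Dt from the thread
toy <- data.frame(id  = c(1, 2, 3),
                  cp1 = c(NA, NA, 5),
                  cp2 = c(NA, 7, NA))

na_mat <- is.na(toy[-1])   # 3 x 2 logical matrix, one column per cp variable
keep   <- toy$id != 1      # length-3 logical vector, one element per row

# Matrices are stored column by column, so `keep` is reused once per column
flags <- keep & na_mat

# The same result built explicitly, one column at a time
by_col <- sapply(toy[-1], function(col) toy$id != 1 & is.na(col))

print(flags)
```

Because the vector length equals the number of rows, the recycling lines up with the rows; with a vector of a different length the alignment would no longer hold.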
Eric Berger
2024-Sep-12 08:44 UTC
[R] "And" condition spanning over multiple columns in data frame
Hi,

To rephrase what you are trying to do, you want a copy of all the cp columns, in which all the NAs become 1s and any other value becomes a zero. There is an exception for the first row, where the NAs should become 0s.

    a <- c10Dt
    b <- matrix(as.numeric(is.na(a[,-1])), nrow=nrow(a))
    b[1,] <- 0 # first row gets special treatment
    colnames(b) <- paste0("exit",1:ncol(b))
    d <- cbind(a,b)
    d
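In the posted data, `id` equals 1 in rows 1 and 5, so zeroing only the first row by position may not cover every case. The following is a hedged variant of the same matrix approach, assuming the intent is to zero every row whose `id` is 1. To keep it short, the data frame `a` here holds only the id, cp1, cp2 and cp5 columns copied from the post, and the renaming via `sub()` is a substitution made for this sketch so that the exit columns keep the cp numbers.

```r
# Subset of the posted data: id, cp1, cp2 and cp5 only
a <- data.frame(id  = c(1, 4, 3, 3, 1, 2),
                cp1 = c(NA, 8, 8, 5, NA, 5),
                cp2 = c(NA, 18, 5, 5, NA, 5),
                cp5 = c(NA, 12, NA, 3, NA, 9))

# 1 where a cp value is NA, 0 otherwise
b <- matrix(as.numeric(is.na(a[, -1])), nrow = nrow(a))

# Zero every row with id == 1, selected by value rather than by position
b[a$id == 1, ] <- 0

# Name the new columns exit1, exit2, exit5 after the cp columns they come from
colnames(b) <- sub("cp", "exit", names(a)[-1])

d <- cbind(a, b)
d
```

Selecting rows with `a$id == 1` keeps the rule tied to the data, so it still applies if rows are reordered or more `id == 1` rows appear.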
Rui Barradas
2024-Sep-12 14:36 UTC
[R] "And" condition spanning over multiple columns in data frame
Hello,

Something like this?

1. If an ifelse instruction is meant to create a binary result, coerce the logical condition to integer instead. You can make it clearer by substituting as.integer for the plus sign below;
2. the .names argument is used to create new columns while keeping the original ones.

    df1 <- read.table(text = "id cp1 cp2 cp3 cp4 cp5 cp6 cp7 cp8 cp9 cp10 cp11 cp12
    1  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
    2  4   8  18  15  10  12  11   9  18   8   16   15   NA
    3  3   8   5   5   4  NA   5  NA   6  NA   10   10   10
    4  3   5   5   4   4   3   2   1   3   2    1    1    2
    5  1  NA  NA  NA  NA  NA  NA  NA  NA  NA   NA   NA   NA
    6  2   5   5  10  10   9  10  10  10  NA   10    9   10",
    header = TRUE)
    df1

    library(dplyr)

    df1 %>%
      mutate(across(starts_with("cp"), ~ +(is.na(.) & id != 1), .names = "{col}_new"))

Hope this helps,

Rui Barradas
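The coercion mentioned in point 1 can be checked in base R without dplyr. This is a minimal sketch, assuming hypothetical vectors `cp` and `id` made up for illustration: unary plus, `as.integer()` and `ifelse()` all give the same 0/1 values from a logical condition, whereas comparing with `== NA` only returns NA, which is why the test has to go through `is.na()`.

```r
# Hypothetical vectors for illustration, not taken from the thread's data
cp <- c(NA, 8, NA, 5)
id <- c(1, 4, 3, 3)

# TRUE only where cp is missing and id is not 1
cond <- is.na(cp) & id != 1

via_plus    <- +cond                 # unary plus coerces logical to integer
via_integer <- as.integer(cond)      # the more explicit spelling
via_ifelse  <- ifelse(cond, 1, 0)    # same values, stored as doubles

# Comparing anything with NA yields NA, never TRUE or FALSE
eq_na <- cp == NA

print(via_plus)
print(eq_na)
```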