Hi Jim,

I tried it:

> b$pheno<-ifelse(b$PLASER==2 | b$FLASER==2 |is.na(b$PLASER) & b$FLASER == 2,2,1)
> table(b$pheno,exclude = NULL)

   1    2 <NA>
 859  828   11

> b$pheno<-ifelse(b$PLASER==2 | b$FLASER==2 |is.na(b$FLASER) & b$PLASER == 2,2,1)
> table(b$pheno,exclude = NULL)

   1    2 <NA>
 859  828   11

Am I doing something wrong?

Thanks
Ana

On Fri, Jun 12, 2020 at 8:06 PM Jim Lemon <drjimlemon at gmail.com> wrote:
>
> Hi Ana,
> From your desired result, it looks like those two NA values in PLASER
> are the ones you want to drop.
> If so, try this:
>
> b$pheno<-ifelse(b$PLASER==2 | b$FLASER==2 |
>  is.na(b$PLASER) & b$FLASER == 2,2,1)
>
> and if I have it the wrong way round, swap FLASER and PLASER in the
> bit I have added.
>
> Jim
>
> On Sat, Jun 13, 2020 at 10:46 AM Ana Marija <sokovic.anamarija at gmail.com> wrote:
> >
> > Hello
> >
> > I have a data frame like this:
> >
> > > head(b)
> >        FID   IID FLASER PLASER
> > 1: fam1000 G1000      1      1
> > 2: fam1001 G1001      1      1
> > 3: fam1003 G1003      1      2
> > 4: fam1005 G1005      1      1
> > 5: fam1009 G1009      1      1
> > 6: fam1052 G1052      1      1
> > ...
> >
> > > table(b$PLASER,b$FLASER, exclude = NULL)
> >
> >          1   2   3 <NA>
> >   1    836  14   0    0
> >   2    691  70  43    2
> >   3      2   7  21    0
> >   <NA>   4   1   0    7
> >
> > I am trying to make a new column "pheno" so that I reduce the number of NAs
> >
> > right now I am doing:
> >
> > > b$pheno=ifelse(b$PLASER==2 | b$FLASER==2,2,1)
> > > table(b$pheno, exclude = NULL)
> >
> >    1    2 <NA>
> >  859  828   11
> >
> > I would like to reduce this number of NAs to be 7
> > so I would like to have in "pheno column"
> > 7 NAs
> > 825 2s (825=691+14+70+7+43)
> > and the rest would be 1s (866=1698-7-825)
> >
> > How can I change the above command to get these numbers?
> >
> > Thanks
> > Ana
> >
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
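
A note on why both attempts return the same table. Reading the posted cross-table, the 11 NAs in "pheno" are the 4 rows with PLASER missing and FLASER equal to 1, plus the 7 rows with both columns missing. The rows that the added term targets (one column missing, the other equal to 2) were already coded 2, because NA | TRUE is TRUE in R. The sketch below only illustrates that three-valued logic on a single made-up row; the names orig and added are illustrative and not from the thread.

```r
# Three-valued logic in R: NA stands for "unknown".
NA | TRUE    # TRUE: the answer is TRUE whatever the unknown value is
NA | FALSE   # NA:   the answer depends on the unknown value
NA & FALSE   # FALSE
NA & TRUE    # NA

# One of the troublesome rows: PLASER missing, FLASER equal to 1.
PLASER <- NA_integer_
FLASER <- 1L

orig  <- PLASER == 2 | FLASER == 2          # NA | FALSE, which is NA
added <- is.na(PLASER) & FLASER == 2        # TRUE & FALSE, which is FALSE

# Adding the extra term cannot turn the NA into a definite answer.
orig | added
ifelse(orig | added, 2, 1)
```

So an extra term only helps if it is TRUE for the rows that are currently NA; a term that requires the other column to equal 2 never is.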
Obviously my guess was wrong. I thought you wanted to impute the value
of "pheno" from FLASER if PLASER was missing. From just your summary
table, it's hard to guess the distribution of NA values. My guess was
that the two undesirable NAs were cases where PLASER was missing and
FLASER was 2. My tactic at this point would be to look at the cases
where either FLASER or PLASER was missing and work out the logic to
impute the ones that are giving you trouble.

Jim

On Sat, Jun 13, 2020 at 11:16 AM Ana Marija <sokovic.anamarija at gmail.com> wrote:
>
> Hi Jim,
>
> I tried it:
>
> > b$pheno<-ifelse(b$PLASER==2 | b$FLASER==2 |is.na(b$PLASER) & b$FLASER == 2,2,1)
> > table(b$pheno,exclude = NULL)
>
>    1    2 <NA>
>  859  828   11
>
> > b$pheno<-ifelse(b$PLASER==2 | b$FLASER==2 |is.na(b$FLASER) & b$PLASER == 2,2,1)
> > table(b$pheno,exclude = NULL)
>
>    1    2 <NA>
>  859  828   11
>
> Am I am doing something wrong?
>
> Thanks
> Ana
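
One way to follow that tactic without access to the real data is to rebuild a stand-in data frame from the posted cross-table and list the rows where either column is missing. This is only a sketch: the cell counts are copied from the table(b$PLASER, b$FLASER, exclude = NULL) output in the thread, and the names cells, miss and combo are made up for illustration.

```r
# Cell counts copied from the posted cross-table (rows PLASER, columns FLASER).
lev   <- c(1, 2, 3, NA)
cells <- expand.grid(PLASER = lev, FLASER = lev)   # PLASER varies fastest
cells$n <- c(836, 691,  2, 4,    # FLASER == 1
              14,  70,  7, 1,    # FLASER == 2
               0,  43, 21, 0,    # FLASER == 3
               0,   2,  0, 7)    # FLASER missing

# Expand to one row per subject and apply the original command.
b <- cells[rep(seq_len(nrow(cells)), cells$n), c("PLASER", "FLASER")]
b$pheno <- ifelse(b$PLASER == 2 | b$FLASER == 2, 2, 1)
table(b$pheno, exclude = NULL)   # reproduces the 859 / 828 / 11 split

# Look only at rows where either column is missing.
miss  <- b[is.na(b$PLASER) | is.na(b$FLASER), ]
combo <- paste0("PLASER=", miss$PLASER,
                " FLASER=", miss$FLASER,
                " pheno=",  miss$pheno)
table(combo)
```

On the real data the same subset, b[is.na(b$PLASER) | is.na(b$FLASER), ], shows which combinations still end up as NA: here the 4 rows with PLASER missing and FLASER 1, and the 7 rows with both missing.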
On 2020-06-13 11:30 +1000, Jim Lemon wrote:
> On Fri, Jun 12, 2020 at 8:06 PM Jim Lemon wrote:
> > On Sat, Jun 13, 2020 at 10:46 AM Ana Marija wrote:
> > >
> > > I am trying to make a new column
> > > "pheno" so that I reduce the number
> > > of NAs
> >
> > it looks like those two NA values in
> > PLASER are the ones you want to drop.
>
> From just your summary table, it's hard to
> guess the distribution of NA values.

Dear Ana,

This small sample

b <- read.table(text="FLASER;PLASER
1;2
;2
;
1;
2;
2;2
3;2
3;3
1;1", sep=";", header=TRUE)
table(b$PLASER,b$FLASER, exclude = NULL)

yields the same combinations you showed earlier:

         1 2 3 <NA>
    1    1 0 0    0
    2    1 1 1    1
    3    0 0 1    0
    <NA> 1 1 0    1

If you want to eliminate the four <NA>-based combinations completely,
this line

b$pheno <- ifelse(b$PLASER==2 | b$FLASER==2 |
  is.na(b$PLASER) |
  is.na(b$PLASER) & b$FLASER %in% 1:2 |
  is.na(b$FLASER) & b$PLASER == 2, 2, 1)
table(b$pheno, exclude = NULL)

will do it:

    1 2
    2 7

Best,
Rasmus
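
The line above removes every NA, including the rows where both columns are missing, whereas the original question asked to keep 7 NAs. A hedged sketch of two alternative rules follows, using the same nine rows as the small sample. Which rule is actually intended is an assumption; the names is2, both_na, complete, pheno_a and pheno_b are illustrative and not from the thread.

```r
# The nine rows of the small sample, typed in directly.
b <- data.frame(FLASER = c(1, NA, NA,  1,  2, 2, 3, 3, 1),
                PLASER = c(2,  2, NA, NA, NA, 2, 2, 3, 1))

# %in% never returns NA, so a missing value simply counts as "not 2".
is2      <- b$PLASER %in% 2 | b$FLASER %in% 2
both_na  <- is.na(b$PLASER) & is.na(b$FLASER)
complete <- !is.na(b$PLASER) & !is.na(b$FLASER)

# Rule A (assumption): keep NA only when both columns are missing;
# otherwise code 2 if either observed value is 2, else 1.
b$pheno_a <- ifelse(both_na, NA, ifelse(is2, 2, 1))

# Rule B (assumption): as rule A, but a row with exactly one missing
# value is always coded 1, even if the observed value is 2.
b$pheno_b <- ifelse(both_na, NA, ifelse(complete & is2, 2, 1))

table(b$pheno_a, exclude = NULL)
table(b$pheno_b, exclude = NULL)
```

Working from the counts in the posted cross-table, rule A would give 863 ones, 828 twos and 7 NAs, while rule B would give 866 ones, 825 twos and 7 NAs. Only rule B matches the requested 825 = 691+14+70+7+43, since that sum leaves out the three rows where one column is missing and the other equals 2.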