Since you have only a few troublesome NA values, if you look at them,
or even better, post them:

b[is.na(b$FLASER) | is.na(b$PLASER),]

perhaps we can work out the appropriate logic to get rid of only the
ones you don't want.

Jim

On Sat, Jun 13, 2020 at 12:50 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
>
> Hi Rasmus,
>
> thank you for getting back to me, the command you provided seems to
> add all 11 NAs to 2s
> > b$pheno <-
> +   ifelse(b$PLASER==2 |
> +     b$FLASER==2 |
> +     is.na(b$PLASER) |
> +     is.na(b$PLASER) & b$FLASER %in% 1:2 |
> +     is.na(b$FLASER) & b$PLASER == 2,
> +     2, 1)
> > table(b$pheno, exclude = NULL)
>
>   1   2
> 859 839
>
> Once again my desired result is to keep these 7 NAs as NAs
> > table(b$PLASER,b$FLASER, exclude = NULL)
>
>          1   2   3 <NA>
>   1    836  14   0    0
>   2    691  70  43    2
>   3      2   7  21    0
>   <NA>   4   1   0    7
>
> and have
> 825 2s (825=691+14+70+7+43)
> and the rest would be 1s (866=1698-7-825)
>
> On Fri, Jun 12, 2020 at 9:29 PM Rasmus Liland <jral at posteo.no> wrote:
> >
> > On 2020-06-13 11:30 +1000, Jim Lemon wrote:
> > > On Fri, Jun 12, 2020 at 8:06 PM Jim Lemon wrote:
> > > > On Sat, Jun 13, 2020 at 10:46 AM Ana Marija wrote:
> > > > >
> > > > > I am trying to make a new column
> > > > > "pheno" so that I reduce the number
> > > > > of NAs
> > > >
> > > > it looks like those two NA values in
> > > > PLASER are the ones you want to drop.
> > >
> > > From just your summary table, it's hard to
> > > guess the distribution of NA values.
> >
> > Dear Ana,
> >
> > This small sample
> >
> > b <- read.table(text="FLASER;PLASER
> > 1;2
> > ;2
> > ;
> > 1;
> > 2;
> > 2;2
> > 3;2
> > 3;3
> > 1;1", sep=";", header=TRUE)
> >
> > table(b$PLASER,b$FLASER, exclude = NULL)
> >
> > yields the same combinations you showed
> > earlier:
> >
> >        1 2 3 <NA>
> >   1    1 0 0    0
> >   2    1 1 1    1
> >   3    0 0 1    0
> >   <NA> 1 1 0    1
> >
> > If you want to eliminate the four <NA>-based
> > combinations completely, this line
> >
> > b$pheno <-
> >   ifelse(b$PLASER==2 |
> >     b$FLASER==2 |
> >     is.na(b$PLASER) |
> >     is.na(b$PLASER) & b$FLASER %in% 1:2 |
> >     is.na(b$FLASER) & b$PLASER == 2,
> >     2, 1)
> > table(b$pheno, exclude = NULL)
> >
> > will do it:
> >
> > 1 2
> > 2 7
> >
> > Best,
> > Rasmus
> > ______________________________________________
> > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
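As a check on the numbers quoted above, the 1698 rows can be rebuilt from the PLASER-by-FLASER count table and run through the quoted ifelse() call. This is only a sketch: `counts`, `levels_used`, `grid` and `rebuilt` are names made up here, and the rows are reconstructed from the posted counts, not taken from the real data.

```r
# Counts copied from the table(b$PLASER, b$FLASER, exclude = NULL) output.
# Rows are PLASER (1, 2, 3, NA); columns are FLASER (1, 2, 3, NA).
counts <- matrix(c(836, 14,  0, 0,
                   691, 70, 43, 2,
                     2,  7, 21, 0,
                     4,  1,  0, 7),
                 nrow = 4, byrow = TRUE)
levels_used <- c(1, 2, 3, NA)

# expand.grid() varies its first argument fastest, which matches the
# column-major order of as.vector(counts) (PLASER within FLASER).
grid <- expand.grid(PLASER = levels_used, FLASER = levels_used)

# Repeat each (PLASER, FLASER) combination according to its count.
rebuilt <- grid[rep(seq_len(nrow(grid)), times = as.vector(counts)), ]

# The condition from the quoted ifelse() call, unchanged.
rebuilt$pheno <- with(rebuilt,
  ifelse(PLASER == 2 | FLASER == 2 | is.na(PLASER) |
           is.na(PLASER) & FLASER %in% 1:2 |
           is.na(FLASER) & PLASER == 2,
         2, 1))

print(nrow(rebuilt))                          # total number of rows
print(table(rebuilt$pheno, useNA = "ifany"))  # tally of 1s and 2s
print(691 + 14 + 70 + 7 + 43)                 # the number of 2s asked for
print(1698 - 7 - 825)                         # the number of 1s asked for
```

On the rebuilt rows this should give the same 859/839 split that was reported, with no NA left in pheno, and the two sums agree with the 825 and 866 stated in the request.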
Great idea!
Here it is:

> b[is.na(b$FLASER) | is.na(b$PLASER),]
        FID   IID FLASER PLASER pheno
 1: fam1837 G1837      1     NA     2
 2: fam2410 G2410     NA     NA     2
 3: fam2838 G2838     NA      2     2
 4: fam3367 G3367      1     NA     2
 5: fam3410 G3410      1     NA     2
 6: fam3492 G3492      1     NA     2
 7: fam3834 G3834      2     NA     2
 8: fam4708 G4708     NA      2     2
 9: fam5162 G5162     NA     NA     2
10: fam5274 G5274     NA     NA     2
11: fam0637  G637     NA     NA     2
12: fam0640  G640     NA     NA     2
13: fam0743  G743     NA     NA     2
14: fam0911  G911     NA     NA     2

On Fri, Jun 12, 2020 at 10:29 PM Jim Lemon <drjimlemon at gmail.com> wrote:
>
> Since you have only a few troublesome NA values, if you look at them,
> or even better, post them:
>
> b[is.na(b$FLASER) | is.na(b$PLASER),]
>
> perhaps we can work out the appropriate logic to get rid of only the
> ones you don't want.
>
> Jim
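One reading of why all fourteen rows end up as 2: R's logical operators are three-valued, and `NA | TRUE` is TRUE. The bare is.na(b$PLASER) term in the ifelse() condition quoted earlier in the thread is therefore enough to turn every row with a missing PLASER into a 2, whatever the other terms evaluate to (the two remaining rows have PLASER equal to 2 anyway). A minimal sketch, using a made-up three-row data frame `d`, not the real data:

```r
# NA means "unknown" in R's logical operators.
print(NA | TRUE)    # TRUE: true whatever the unknown value is
print(NA | FALSE)   # NA: depends on the unknown value
print(NA & FALSE)   # FALSE: false whatever the unknown value is

# Hypothetical rows: PLASER missing with FLASER 1, both missing, both known.
d <- data.frame(FLASER = c(1, NA, 1), PLASER = c(NA, NA, 1))

# Without the bare is.na() term, the first two rows are unknown (NA) ...
without_term <- d$PLASER == 2 | d$FLASER == 2
# ... with it, they become TRUE, so ifelse() returns 2 for them.
with_term <- without_term | is.na(d$PLASER)

print(without_term)
print(ifelse(with_term, 2, 1))
```

Any rule that is meant to leave some rows as NA has to avoid OR-ing in a term that is TRUE for exactly those rows.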
Right, back from shopping. Since you have fourteen rows containing NAs
and you only want seven, we can infer that half of them must go. As
they are neatly divided into seven rows in which only one NA appears
and seven in which two stare meaninglessly out at us, I will assume
that the latter are the ones to be discarded. As your condition for
calculating "pheno" stated that a 2 in either FLASER or PLASER should
result in a 2 in pheno, the following statement closely conforms to
that:

b<-read.table(text="FID IID FLASER PLASER
fam1837 G1837 1 NA
fam2410 G2410 NA NA
fam2838 G2838 NA 2
fam3367 G3367 1 NA
fam3410 G3410 1 NA
fam3492 G3492 1 NA
fam3834 G3834 2 NA
fam4708 G4708 NA 2
fam5162 G5162 NA NA
fam5274 G5274 NA NA
fam0637 G637 NA NA
fam0640 G640 NA NA
fam0743 G743 NA NA
fam0911 G911 NA NA",
header=TRUE,stringsAsFactors=FALSE)

b$pheno<-ifelse(b$FLASER == 2 | b$PLASER == 2,2,1)
# use the valid FLASER values when PLASER is NA
b[is.na(b$pheno),]$pheno<-ifelse(!is.na(b[is.na(b$pheno),]$FLASER),
 b[is.na(b$pheno),]$FLASER,NA)
# use the valid PLASER values when FLASER is NA
b[is.na(b$pheno),]$pheno<-ifelse(!is.na(b[is.na(b$pheno),]$PLASER),
 b[is.na(b$pheno),]$PLASER,NA)
b

I could write that mess in one straitjacket of conditional statements
but my brain hurts enough.

Jim

On Sat, Jun 13, 2020 at 1:59 PM Ana Marija <sokovic.anamarija at gmail.com> wrote:
>
> Great idea!
> Here it is:
> > b[is.na(b$FLASER) | is.na(b$PLASER),]
>         FID   IID FLASER PLASER pheno
>  1: fam1837 G1837      1     NA     2
>  2: fam2410 G2410     NA     NA     2
>  3: fam2838 G2838     NA      2     2
>  4: fam3367 G3367      1     NA     2
>  5: fam3410 G3410      1     NA     2
>  6: fam3492 G3492      1     NA     2
>  7: fam3834 G3834      2     NA     2
>  8: fam4708 G4708     NA      2     2
>  9: fam5162 G5162     NA     NA     2
> 10: fam5274 G5274     NA     NA     2
> 11: fam0637  G637     NA     NA     2
> 12: fam0640  G640     NA     NA     2
> 13: fam0743  G743     NA     NA     2
> 14: fam0911  G911     NA     NA     2
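The three assignments above can be folded into one nested ifelse(). This is one possible sketch in base R, not necessarily how the author would write it; `either2` is a name introduced here, and the fourteen rows are copied from the listing posted earlier in the thread.

```r
b <- read.table(text = "FID IID FLASER PLASER
fam1837 G1837 1 NA
fam2410 G2410 NA NA
fam2838 G2838 NA 2
fam3367 G3367 1 NA
fam3410 G3410 1 NA
fam3492 G3492 1 NA
fam3834 G3834 2 NA
fam4708 G4708 NA 2
fam5162 G5162 NA NA
fam5274 G5274 NA NA
fam0637 G637 NA NA
fam0640 G640 NA NA
fam0743 G743 NA NA
fam0911 G911 NA NA",
header = TRUE, stringsAsFactors = FALSE)

# TRUE if either column is 2, FALSE if both are known and neither is 2,
# NA if that cannot be decided because of a missing value.
either2 <- b$FLASER == 2 | b$PLASER == 2

b$pheno <- ifelse(is.na(either2),
                  # undecided: fall back to FLASER, then PLASER
                  # (the result stays NA when both are missing)
                  ifelse(is.na(b$FLASER), b$PLASER, b$FLASER),
                  # decided: 2 if either column is 2, otherwise 1
                  ifelse(either2, 2, 1))

print(table(b$pheno, useNA = "ifany"))
```

Testing is.na(either2) first keeps the outer condition free of NA, so the seven rows with both values missing stay NA only because the fallback itself is NA. On these fourteen rows that should leave four 1s, three 2s and seven NAs.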