Hi All,

I have a data frame with several columns and I want to create another column from the values of the other columns. My problem is that some rows have missing values in some columns, and I could not get the result I wanted.

Here is a sample of my data and my attempt.

vdat <- read.table(text="obs, Year, x1, x2, x3
1, 2001, 25 ,10, 10
2, 2001, , 15, 25
3, 2001, 50, 10,
4, 2001, 20, , 60", sep=",", header=TRUE, stringsAsFactors=FALSE)
vdat$xy <- 0
vdat$xy <- 2*(vdat$x1) + 5*(vdat$x2) + 3*(vdat$x3)
vdat

  obs Year x1 x2 x3  xy
1   1 2001 25 10 10 130
2   2 2001 NA 15 25  NA
3   3 2001 50 10 NA  NA
4   4 2001 20 NA 60  NA

The desired result is this:

  obs Year x1 x2 x3  xy
1   1 2001 25 10 10 130
2   2 2001 NA 15 25 150
3   3 2001 50 10 NA 150
4   4 2001 20 NA 60 220

How do I get my desired result?

Thank you
Hi Val,

For this particular problem, you can just replace NAs with zeros.

vdat[is.na(vdat)] <- 0
vdat$xy <- 2*(vdat$x1) + 5*(vdat$x2) + 3*(vdat$x3)
vdat

  obs Year x1 x2 x3  xy
1   1 2001 25 10 10 130
2   2 2001  0 15 25 150
3   3 2001 50 10  0 150
4   4 2001 20  0 60 220

Note that this is not a general solution to the problem of NA values.

Jim

On Sun, Apr 14, 2019 at 12:54 PM Val <valkremk at gmail.com> wrote:
> How do I get my desired result?
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
If the NAs are really 0s, replace them with 0 before doing the calculation (see ?is.na). If they are not 0s, think again about doing this, as the results would probably mislead.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Sat, Apr 13, 2019 at 7:54 PM Val <valkremk at gmail.com> wrote:
> How do I get my desired result?
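If replacing with 0 is appropriate, one way to follow the ?is.na pointer without overwriting the original data is to do the replacement on a working copy of the value columns only. This is a minimal sketch, not part of the replies above; the names `xcols` and `tmp` are illustrative assumptions, and the data frame is rebuilt by hand from the sample in the question.

```r
# Sample data from the question, built directly so the NAs are explicit
vdat <- data.frame(obs  = 1:4,
                   Year = 2001,
                   x1   = c(25, NA, 50, 20),
                   x2   = c(10, 15, 10, NA),
                   x3   = c(10, 25, NA, 60))

xcols <- c("x1", "x2", "x3")   # columns that enter the formula (assumed)
tmp   <- vdat[xcols]           # working copy; vdat itself keeps its NAs
tmp[is.na(tmp)] <- 0           # is.na() on a data frame gives a logical matrix

vdat$xy <- 2 * tmp$x1 + 5 * tmp$x2 + 3 * tmp$x3
vdat
```

Restricting the replacement to `xcols` also avoids touching NAs in unrelated columns such as obs or Year.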
Hi Bert and Jim,

Thank you for the suggestion. However, those missing values should not be replaced by 0s. I want to exclude the missing values from the calculation and create the index using only the non-missing values.

On Sat, Apr 13, 2019 at 10:14 PM Jim Lemon <drjimlemon at gmail.com> wrote:
> For this particular problem, you can just replace NAs with zeros.
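One possible way to build the index from the non-missing values only, while leaving the NAs in the data, is to weight each column and sum across rows with na.rm = TRUE. This is a hedged sketch rather than an answer given in the thread; the weights 2, 5 and 3 come from the formula in the question, the names `w` and `weighted` are illustrative, and returning NA for a row with no observed values is an assumption about the desired behaviour.

```r
# Sample data from the question, built directly so the NAs are explicit
vdat <- data.frame(obs  = 1:4,
                   Year = 2001,
                   x1   = c(25, NA, 50, 20),
                   x2   = c(10, 15, 10, NA),
                   x3   = c(10, 25, NA, 60))

# Weights taken from the formula 2*x1 + 5*x2 + 3*x3
w <- c(x1 = 2, x2 = 5, x3 = 3)

# Multiply each column by its weight; NAs stay NA at this stage
weighted <- sweep(as.matrix(vdat[names(w)]), 2, w, `*`)

# Sum across each row, skipping the missing terms
vdat$xy <- unname(rowSums(weighted, na.rm = TRUE))

# Assumption: a row with no observed values should give NA, not 0
vdat$xy[rowSums(!is.na(weighted)) == 0] <- NA

vdat
```

For a plain weighted sum the result is numerically the same as replacing NA with 0, but x1, x2 and x3 keep their NAs. If the index were instead a weighted mean, the missing terms would also have to be dropped from the denominator, which zero-replacement would not do.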