An embedded text encoded in an unknown character set was scrubbed...
Name: not available
URL: <https://stat.ethz.ch/pipermail/r-help/attachments/20131003/e1fb7f66/attachment.pl>
On 13-10-03 6:42 PM, Jesse Gervais wrote:
> Hello there,
>
> I try to construct a variable with R, but I have some difficulties.
>
> Assume that I use a data set named = mydata. I want to create a variable
> that is the mean (totmean) or the sum (totsum) of 6 variables (var1, var2,
> var3, var4, var5, var6). However, I want only participants who have
> responded to at least 4 items to be include. Those who have one or two
> missing for var1-var6 should be coded NA for totmean and totsum.
>
> How I do that?
>

Write a function that computes what you want for a vector of 6 values. Then put the data into an array, and use

apply( the_array, 1, your_function)

to compute it for the full dataset.

Duncan Murdoch
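A minimal sketch of the approach described above, assuming the six items are the columns var1 to var6 of mydata and that "at least 4 answered items" is the rule wanted. The names score_row and min_valid are made up for illustration, and the small data frame is invented test data.

```r
# Hypothetical helper: mean and sum of one participant's items,
# or NA for both when fewer than min_valid items were answered.
score_row <- function(x, min_valid = 4) {
  if (sum(!is.na(x)) < min_valid) {
    return(c(totmean = NA_real_, totsum = NA_real_))
  }
  c(totmean = mean(x, na.rm = TRUE), totsum = sum(x, na.rm = TRUE))
}

# Invented example data: three participants, six items
mydata <- data.frame(
  var1 = c(1, 2, NA), var2 = c(2, NA, NA), var3 = c(3, 4, NA),
  var4 = c(4, NA, 1), var5 = c(5, 3, 2),   var6 = c(6, 5, 3)
)

items <- as.matrix(mydata[, paste0("var", 1:6)])

# apply() over rows returns one column per participant, so transpose
scores <- t(apply(items, 1, score_row))
mydata <- cbind(mydata, scores)
mydata
```

The third participant answered only three items, so both totmean and totsum come out as NA for that row.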
Hi Jesse,

Here is one approach:

score <- function(dat, minimumN) {
  # get the number of columns (variables)
  k <- ncol(dat)
  # take the row means, excluding missing
  row_mean <- rowMeans(dat, na.rm=TRUE)
  # get the number missing for each row
  nmiss <- rowSums(is.na(dat))
  # if nmiss is greater than threshold, set to missing
  # k = number of columns, minimumN = minimum number of non-missing values required
  row_mean[nmiss > (k - minimumN)] <- NA
  # calculate the sum (reweighted by missing)
  row_sum <- row_mean * k
  # put results in a dataframe and return
  data.frame(Mean = row_mean, Sum = row_sum, Nmissing = nmiss)
}

# score 4 variables, requiring at least 3
score(mtcars[, c("mpg", "disp", "hp", "wt")], 3)

# or put back into dataset, just the mean
mtcars$ScaleMean <- score(mtcars[, c("mpg", "disp", "hp", "wt")], 3)$Mean

It will be somewhat faster than using apply() because all you need are rowMeans() and rowSums(), which use more optimized code.

Cheers,

Josh

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com
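The Sum returned by score() is the row mean multiplied by the number of columns, so a participant with missing items gets a total scaled up to the full item count rather than the plain total of the answered items. A small sketch of the difference, using invented values (the names reweighted_sum and raw_sum are only for illustration):

```r
# Invented data: the second participant answered 3 of 4 items
dat <- data.frame(a = c(1, 2), b = c(2, NA), c = c(3, 4), d = c(4, 6))

k <- ncol(dat)
row_mean <- rowMeans(dat, na.rm = TRUE)

reweighted_sum <- row_mean * k          # same calculation as Sum in score()
raw_sum <- rowSums(dat, na.rm = TRUE)   # plain total of the answered items

data.frame(reweighted_sum, raw_sum)
```

The two agree for complete rows and differ only where something is missing, so which one is appropriate depends on how the scale is meant to be scored.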
On 10/04/2013 08:42 AM, Jesse Gervais wrote:
> Hello there,
>
> I try to construct a variable with R, but I have some difficulties.
>
> Assume that I use a data set named = mydata. I want to create a variable
> that is the mean (totmean) or the sum (totsum) of 6 variables (var1, var2,
> var3, var4, var5, var6). However, I want only participants who have
> responded to at least 4 items to be include. Those who have one or two
> missing for var1-var6 should be coded NA for totmean and totsum.
>
> How I do that?
>

Hi Jesse,

Say your data looks like this:

mydata <- data.frame(var1=rnorm(100), var2=rnorm(100),
                     var3=rnorm(100), var4=rnorm(100),
                     var5=rnorm(100), var6=rnorm(100))
# set 20 randomly chosen values to missing in each variable
mydata$var1[sample(1:100,20)] <- NA
mydata$var2[sample(1:100,20)] <- NA
mydata$var3[sample(1:100,20)] <- NA
mydata$var4[sample(1:100,20)] <- NA
mydata$var5[sample(1:100,20)] <- NA
mydata$var6[sample(1:100,20)] <- NA

# number of non-missing values in a vector
valid.n <- function(x) return(sum(!is.na(x)))
# TRUE for participants with at least 4 valid responses
ge4 <- apply(as.matrix(mydata), 1, FUN=valid.n) >= 4
# overall mean and sum over those participants only
totmean <- mean(unlist(mydata[ge4,]), na.rm=TRUE)
totsum <- sum(unlist(mydata[ge4,]), na.rm=TRUE)

Jim
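If the goal is one totmean and one totsum per participant, as the original question seems to ask, rather than a single overall value, a vectorized sketch along these lines may be closer. It assumes the "at least 4 answered items" rule; the seed, the names n_valid and enough, and the simulated data are only there to make the example reproducible.

```r
set.seed(1)  # arbitrary seed, only for reproducibility

# Simulated data: 100 participants, 6 items, 20 values missing per item
items <- paste0("var", 1:6)
mydata <- as.data.frame(
  matrix(rnorm(600), ncol = 6, dimnames = list(NULL, items))
)
for (v in items) mydata[[v]][sample(100, 20)] <- NA

# Number of answered items per participant
n_valid <- rowSums(!is.na(mydata[, items]))
enough <- n_valid >= 4

# Per-participant scores; NA when fewer than 4 items were answered
mydata$totmean <- ifelse(enough, rowMeans(mydata[, items], na.rm = TRUE), NA)
mydata$totsum  <- ifelse(enough, rowSums(mydata[, items], na.rm = TRUE), NA)

head(mydata)
```

Here totsum is the plain total of the answered items; multiply totmean by 6 instead if a total rescaled to the full item count is wanted.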