On 13-10-03 6:42 PM, Jesse Gervais wrote:
> Hello there,
>
> I try to construct a variable with R, but I have some difficulties.
>
> Assume that I use a data set named = mydata. I want to create a variable
> that is the mean (totmean) or the sum (totsum) of 6 variables (var1, var2,
> var3, var4, var5, var6). However, I want only participants who have
> responded to at least 4 items to be include. Those who have one or two
> missing for var1-var6 should be coded NA for totmean and totsum.
>
> How I do that?
>

Write a function that computes what you want for a vector of 6 values. Then put the data into an array, and use

apply(the_array, 1, your_function)

to compute it for the full dataset.

Duncan Murdoch
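Duncan's suggestion could be sketched roughly as follows. The helper name `score_row`, its `min_valid` argument, and the toy values in `mydata` are assumptions made for illustration; none of them appear in the original message.

```r
# Hypothetical helper: mean and sum of one participant's items,
# or NA for both when fewer than `min_valid` items were answered.
score_row <- function(x, min_valid = 4) {
  if (sum(!is.na(x)) < min_valid) {
    return(c(totmean = NA_real_, totsum = NA_real_))
  }
  c(totmean = mean(x, na.rm = TRUE), totsum = sum(x, na.rm = TRUE))
}

# Toy data standing in for mydata (values are made up)
mydata <- data.frame(
  var1 = c(1, 2, NA), var2 = c(2, NA, NA), var3 = c(3, 4, NA),
  var4 = c(4, NA, 1), var5 = c(5, 3, 2),   var6 = c(6, 5, 3)
)

items <- as.matrix(mydata[, paste0("var", 1:6)])
# apply() over rows returns one column per participant, so transpose
scores <- t(apply(items, 1, score_row))
mydata$totmean <- scores[, "totmean"]
mydata$totsum  <- scores[, "totsum"]
```

In this toy data the third participant answered only three items, so both totmean and totsum come out as NA for that row.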
Hi Jesse,
Here is one approach:
score <- function(dat, minimumN) {
  # number of columns (variables)
  k <- ncol(dat)
  # row means, excluding missing values
  row_mean <- rowMeans(dat, na.rm = TRUE)
  # number of missing values in each row
  nmiss <- rowSums(is.na(dat))
  # a row needs at least minimumN valid values, i.e. at most
  # (k - minimumN) missing; otherwise set its mean to NA
  row_mean[nmiss > (k - minimumN)] <- NA
  # sum prorated for missing values (row mean times number of columns)
  row_sum <- row_mean * k
  # put results in a data frame and return
  data.frame(Mean = row_mean, Sum = row_sum, Nmissing = nmiss)
}
# score 4 variables, requiring at least 3
score(mtcars[, c("mpg", "disp", "hp", "wt")], 3)
# or put back into dataset, just the mean
mtcars$ScaleMean <- score(mtcars[, c("mpg", "disp", "hp", "wt")], 3)$Mean
It will be somewhat faster than using apply() because all you need are
rowMeans() and rowSums(), which use more optimized code.
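Note that the Sum returned by score() is the row mean multiplied by the number of columns, so it is prorated for missing values and is not the plain sum of the observed values. A small sketch with made-up numbers (the matrix `x` and the names `raw_sum` and `prorated_sum` are illustrative assumptions) shows the difference:

```r
# Made-up responses: 2 participants, 4 items, one missing value in row 2
x <- rbind(c(1, 2, 3, 4),
           c(2, 2, NA, 2))

k <- ncol(x)                                   # number of items
raw_sum      <- rowSums(x, na.rm = TRUE)       # sum of observed values only
prorated_sum <- rowMeans(x, na.rm = TRUE) * k  # mean of observed values times k

raw_sum       # 10  6
prorated_sum  # 10  8
```

The two agree for complete rows and differ only where something is missing, so which one is appropriate depends on how the scale is meant to be scored.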
Cheers,
Josh
On Thu, Oct 3, 2013 at 3:42 PM, Jesse Gervais <jgervais89@gmail.com> wrote:
> Hello there,
>
> I try to construct a variable with R, but I have some difficulties.
>
> Assume that I use a data set named = mydata. I want to create a variable
> that is the mean (totmean) or the sum (totsum) of 6 variables (var1, var2,
> var3, var4, var5, var6). However, I want only participants who have
> responded to at least 4 items to be include. Those who have one or two
> missing for var1-var6 should be coded NA for totmean and totsum.
>
> How I do that?
>
> Thank you!
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://joshuawiley.com/
Senior Analyst - Elkhart Group Ltd.
http://elkhartgroup.com
On 10/04/2013 08:42 AM, Jesse Gervais wrote:
> Hello there,
>
> I try to construct a variable with R, but I have some difficulties.
>
> Assume that I use a data set named = mydata. I want to create a variable
> that is the mean (totmean) or the sum (totsum) of 6 variables (var1, var2,
> var3, var4, var5, var6). However, I want only participants who have
> responded to at least 4 items to be include. Those who have one or two
> missing for var1-var6 should be coded NA for totmean and totsum.
>
> How I do that?
>

Hi Jesse,
Say your data looks like this:

mydata<-data.frame(var1=rnorm(100),var2=rnorm(100),
 var3=rnorm(100),var4=rnorm(100),
 var5=rnorm(100),var6=rnorm(100))
mydata$var1[sample(1:100,20)]<-NA
mydata$var2[sample(1:100,20)]<-NA
mydata$var3[sample(1:100,20)]<-NA
mydata$var4[sample(1:100,20)]<-NA
mydata$var5[sample(1:100,20)]<-NA
mydata$var6[sample(1:100,20)]<-NA
# count the non-missing values in a vector
valid.n<-function(x) return(sum(!is.na(x)))
# TRUE for rows with at least 4 valid responses
gt4<-unlist(apply(as.matrix(mydata),1,FUN=valid.n))>=4
# overall mean and sum over all rows with at least 4 valid responses
totmean<-mean(unlist(mydata[gt4,]),na.rm=TRUE)
totsum<-sum(unlist(mydata[gt4,]),na.rm=TRUE)

Jim
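The code above yields a single overall mean and sum. If one value per participant is wanted instead, as the question seems to ask, a vectorized sketch along the same lines might look like this. The seed, the smaller sample size, the names `items`, `n_valid` and `enough`, and the choice of the plain sum of observed items for totsum are all assumptions made for illustration.

```r
set.seed(1)  # arbitrary seed so the example is reproducible
n <- 10
mydata <- as.data.frame(matrix(rnorm(n * 6), nrow = n,
                               dimnames = list(NULL, paste0("var", 1:6))))
# knock out three values at random in each column
for (v in names(mydata)) mydata[[v]][sample(n, 3)] <- NA

items   <- mydata[, paste0("var", 1:6)]
n_valid <- rowSums(!is.na(items))   # answered items per participant
enough  <- n_valid >= 4             # at least 4 of the 6 answered

# per-participant scores, NA where fewer than 4 items were answered
mydata$totmean <- ifelse(enough, rowMeans(items, na.rm = TRUE), NA)
mydata$totsum  <- ifelse(enough, rowSums(items, na.rm = TRUE), NA)
```

Here totmean and totsum become new columns of mydata with one entry per row, which can then be inspected with, for example, head(mydata).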