Hello to all,

I recently downloaded R to my PC and am enjoying getting acquainted with it. Thank you to everyone involved in the R-project!

I am interested in doing a log-linear analysis with R on a data set with dichotomous variables. There are 11 variables (columns) and around 1000 subjects (rows). How do I aggregate my data, i.e. how do I make a new dataset that includes the variable giving the counts for rows with the same configuration of responses? I know this is possible using the package 'cfa' (configural frequency analysis, a contributed package in R) but I can't coerce the output from the cfa command into a data frame.

Any suggestions would be greatly appreciated.

Thanks!

Jagat Sheth
Washington University in St. Louis
Department of Psychiatry
40 N. Kingshighway
St. Louis, MO 63108
Tel: (314) 286-2253
Fax: (314) 286-2265
email: shethj at epi.wustl.edu

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

> Date: Mon, 06 Nov 2000 08:48:04 -0600
> From: "Jagat Sheth" <shethj at epi.wustl.edu>
>
> Hello to all,
>
> I recently downloaded R to my PC and am enjoying getting acquainted with it. Thank you to everyone involved in the R-project!
>
> I am interested in doing a log-linear analysis with R on a data set with dichotomous variables. There are 11 variables (columns) and around 1000 subjects (rows). How do I aggregate my data, i.e. how do I make a new dataset that includes the variable giving the counts for rows with the same configuration of responses? I know this is possible using the package 'cfa' (configural frequency analysis, a contributed package in R) but I can't coerce the output from the cfa command into a data frame.

I'm not fully sure I understand. You have a data frame, one row per subject with 11 columns? What's the response here?

If one of the columns is the response (and it is dichotomous) then you can just use logistic regression without any transformation.

To do a log-linear analysis using a few of the variables as a joint response you can use multinom from package nnet on the original data, and it will summarize the data as you request en route.

To use loglin all you need to do is use table on the data frame

    do.call("table", dataset)

*BUT* I am not sure that is at all appropriate as an analysis.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
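
The table-then-loglin route mentioned in this reply can be sketched roughly as follows. This is only an illustration under assumptions: the toy `dataset`, its column names `a`, `b`, `c`, and the choice of a mutual-independence model are all made up here and are not from the thread.

```r
# Hypothetical stand-in for the real data: 200 subjects, 3 dichotomous items
set.seed(1)
dataset <- as.data.frame(matrix(rbinom(600, 1, 0.5), ncol = 3))
names(dataset) <- c("a", "b", "c")

# Cross-classify every column: do.call() passes each column of the
# data frame to table() as a separate argument
tab <- do.call("table", dataset)

# Fit a main-effects (mutual independence) log-linear model by
# iterative proportional fitting; the margins are an illustrative choice
fit <- loglin(tab, margin = list(1, 2, 3), fit = TRUE, print = FALSE)

fit$lrt  # likelihood-ratio statistic
fit$df   # residual degrees of freedom
```

With 11 dichotomous variables the same call would produce a table of 2^11 = 2048 cells, many of them likely empty with only about 1000 subjects.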

>>> Prof Brian Ripley <ripley at stats.ox.ac.uk> 11/06 9:01 AM >>>
> > Date: Mon, 06 Nov 2000 08:48:04 -0600
> > From: "Jagat Sheth" <shethj at epi.wustl.edu>
> >
> > Hello to all,
> >
> > I recently downloaded R to my PC and am enjoying getting acquainted with it. Thank you to everyone involved in the R-project!
> >
> > I am interested in doing a log-linear analysis with R on a data set with dichotomous variables. There are 11 variables (columns) and around 1000 subjects (rows). How do I aggregate my data, i.e. how do I make a new dataset that includes the variable giving the counts for rows with the same configuration of responses? I know this is possible using the package 'cfa' (configural frequency analysis, a contributed package in R) but I can't coerce the output from the cfa command into a data frame.
>
> I'm not fully sure I understand. You have a data frame, one row per subject with 11 columns? What's the response here?

Sorry for my ambiguity! The response variable I want is not in my original dataset, which has one row per observation and 11 columns. I would like to make a new dataset having one row per 'configuration' from my original dataset and 12 columns. The 12th column will be the dependent variable I want, namely 'freq', the number of times the given 'configuration' appeared in the original data set. I would like to do a log-linear analysis on this new data set, e.g. loglm( freq ~., newdataset).

If I try to set 'freq' equal to 1 for each row in my original data set, then I am prompted by R to increase the heapsize for memory when running loglin or loglm.

Thanks again.
J. Sheth

> If one of the columns is the response (and it is dichotomous) then you can just use logistic regression without any transformation.
>
> To do a log-linear analysis using a few of the variables as a joint response you can use multinom from package nnet on the original data, and it will summarize the data as you request en route.
>
> To use loglin all you need to do is use table on the data frame
>
>     do.call("table", dataset)
>
> *BUT* I am not sure that is at all appropriate as an analysis.
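
One way to build the "one row per configuration plus a freq column" data set described above is to convert the cross-classification table back into a data frame. The sketch below is an assumption-laden illustration: the toy `dataset` (4 items instead of 11, default column names `V1` to `V4`) is invented, and the Poisson `glm` is shown only as a base-R way of fitting a main-effects log-linear model, not as what the poster ran.

```r
# Hypothetical stand-in for the real data: 1000 subjects, 4 dichotomous items
set.seed(2)
dataset <- as.data.frame(matrix(rbinom(1000 * 4, 1, 0.5), ncol = 4))

# table() accepts a data frame and cross-classifies all its columns;
# as.data.frame() then gives one row per cell with the count in 'freq'
newdataset <- as.data.frame(table(dataset), responseName = "freq")
head(newdataset)

# Main-effects log-linear model fitted as a Poisson regression on the counts.
# Cells with freq == 0 are kept on purpose: they are part of the table.
fit <- glm(freq ~ ., family = poisson, data = newdataset)
```

Because the counts are aggregated first, the model is fitted to at most 2^k rows (k being the number of variables) rather than to one row per subject.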

Hi Jagat,

If I have correctly understood your problem, this might give you a start:

sum.response.patterns<-function(mat) {
 nrows<-dim(mat)[1]
 # sort the rows so that identical patterns end up adjacent
 # (note: only the first five columns are used as sort keys)
 sorted.mat<-mat[order(mat[,1],mat[,2],mat[,3],mat[,4],mat[,5]),]
 # one counter per possible pattern; only the first j are used
 pattern.count<-rep(1,nrows)
 j<-1
 for(i in 1:(nrows-1)) {
  # adjacent rows are identical when their absolute differences sum to zero
  if(sum(abs(sorted.mat[i,]-sorted.mat[i+1,])) == 0)
   pattern.count[j]<-pattern.count[j]+1
  else j<-j+1
 }
 return(pattern.count[1:j])
}

I tried it with this:

test.mat<-matrix(as.numeric(runif(500) > 0.5),ncol=5)

and it seemed to do what you requested. You have to use a matrix with this one. Perhaps one of the experts can get rid of that loop (sigh!)

Jim

OOPS! This is what I get for reading the R help list at the end of the day. I forgot to solve the "sort" problem in the following function:

sum.response.patterns<-function(mat) {
 nrows<-dim(mat)[1]
 sorted.mat<-mat[order(mat[,1],mat[,2],mat[,3],mat[,4],mat[,5]),]
 pattern.count<-rep(1,nrows)
 j<-1
 for(i in 1:(nrows-1)) {
  if(sum(abs(sorted.mat[i,]-sorted.mat[i+1,])) == 0)
   pattern.count[j]<-pattern.count[j]+1
  else j<-j+1
 }
 return(pattern.count[1:j])
}

The call to order() is hard-wired to five columns. So, this morning I have fooled around trying to get something like I would expect from a UNIX 'sort' function and, while I have gotten the argument list as a character string:

paste(paste("mat[,",1:dim(mat)[2],"]",sep=""),collapse=",")

I have drawn a blank on how to make this into an argument list. Might I beg assistance?

Jim
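
One possible answer to the question above, not taken from the thread: rather than building the argument list as a string, the columns can be put into a list and handed to order() with do.call(). The matrix `mat` below is a made-up example with seven columns.

```r
# Hypothetical test matrix: 10 rows, 7 dichotomous columns
set.seed(3)
mat <- matrix(as.numeric(runif(70) > 0.5), ncol = 7)

# Build an unnamed list holding one column per element ...
cols <- lapply(seq_len(ncol(mat)), function(j) mat[, j])

# ... and let do.call() pass each element to order() as its own argument,
# which is equivalent to order(mat[,1], mat[,2], ..., mat[,7])
ord <- do.call(order, cols)

sorted.mat <- mat[ord, ]
```

This works for any number of columns, so the sort no longer needs to know the width of the matrix in advance.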

Apologies for this extended correspondence with myself. I worked out another solution.

# encode a 0/1 vector as a single number so that rows can be sorted on one key
binvec2dec<-function(binvec) {
 result<-0
 for(i in 1:length(binvec)) result<-result+binvec[i]*2^i
 return(result)
}

sum.response.patterns<-function(mat) {
 matdim<-dim(mat)
 # sort the rows by their numeric code so identical patterns are adjacent
 sorted.mat<-mat[order(apply(mat,1,binvec2dec)),]
 # append a count column, initially 1 for every row
 sorted.mat<-cbind(sorted.mat,rep(1,matdim[1]))
 print(sorted.mat)   # debugging output
 j<-1
 rindex<-1:matdim[2]
 count.index<-matdim[2]+1
 for(i in 1:(matdim[1]-1)) {
  if(sum(abs(sorted.mat[i,rindex]-sorted.mat[i+1,rindex])) == 0)
   # same pattern as the previous row: increment its count
   sorted.mat[j,count.index]<-sorted.mat[j,count.index]+1
  else {
   # new pattern: move it up to the next free output row
   j<-j+1
   sorted.mat[j,]<-sorted.mat[i+1,]
  }
 }
 # the first j rows hold the distinct patterns and their counts
 return(sorted.mat[1:j,])
}

I do hope that this is the right answer...

Jim
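
As for getting rid of the loop, one loop-free variant (a sketch, not something posted in the thread) is to collapse each row into a character key and let table() do the counting. The names `key`, `counts`, `first` and `result` are illustrative.

```r
# Same kind of test matrix as used earlier in the thread
set.seed(4)
test.mat <- matrix(as.numeric(runif(500) > 0.5), ncol = 5)

# Collapse each row into a string such as "01101"; equal rows give equal keys
key <- apply(test.mat, 1, paste, collapse = "")

# Count how often each key occurs
counts <- table(key)

# Keep the first occurrence of every pattern and look up its count by name
first <- !duplicated(key)
result <- cbind(test.mat[first, , drop = FALSE],
                count = as.vector(counts[key[first]]))
```

Here the rows of `result` appear in order of first occurrence rather than sorted; sorting them afterwards is optional.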