A Singh
2009-Jul-15 17:02 UTC
[R] Differing Variable Length Inconsistencies in Random Effects/Regression Models
Dear All,

I am quite new to R and am having a problem trying to run a linear model with random effects / a regression. In particular, my variable lengths differ and the models refuse to compute any further.

The code I have been using is as follows:

> vc<-read.table("P:\\R\\Testvcomp10.txt",header=T)
> attach(vc)
>
> family<-factor(family)
> colms<-(vc)[,4:13]  ## this to assign the 10 columns containing marker
>                     ## data to a new variable, as column names are themselves
>                     ## not in any recognisable sequence
>
> vcdf<-data.frame(family,peg.no,ec.length,syll.length,colms)
> library(lme4)
>
> for (c in levels(family))
+ { for (i in 1:length(colms))
+ { fit<-lmer(peg.no~1 + (1|c/i), vcdf)
+ }
+ summ<-summary(fit)
+ av<-anova(fit)
+ print(summ)
+ print(av)
+ }

This gives me:

Error in model.frame.default(data = vcdf, formula = peg.no ~ 1 + (1 + :
  variable lengths differ (found for 'c')

I had posted a similar message on the R mixed model list a few days ago, with respect to my fundamental methods, and Jerome Goudet had kindly referred me to an alternative approach: obtain residuals from a random effects model in lmer(), and then run regressions on those [residuals being the dependent variable and my marker data columns the independent variable].

The code for that is as follows:

> vc<-read.table("P:\\R\\Text Files\\Testvcomp10.txt",header=T,sep="",dec=".",na.strings=NA,strip.white=T)
> attach(vc)
>
> family<-factor(family)
> colms<-(vc)[,4:13]
>
> names(vc)
 [1] "male.parent"  "family"       "offspring.id" "P1L55"        "P1L73"
 [6] "P1L74"        "P1L77"        "P1L91"        "P1L96"        "P1L98"
[11] "P1L100"       "P1L114"       "P1L118"       "peg.no"       "ec.length"
[16] "syll.length"
>
> vcdf<-data.frame(family, colms, peg.no, ec.length, syll.length)
>
> library(lme4)
> famfit<-lmer(peg.no~1 + (1|family), na.action=na.omit, vcdf)
> resfam<-residuals(famfit)
> for( i in 1:length(colms))
+ {
+ print ("Marker", i)
+ regfam<-abline(lm(resfam~i))
+ print(regfam)
+ }

This again gives me the error:

[1] "Marker"
Error in model.frame.default(formula = resfam ~ i, drop.unused.levels = TRUE) :
  variable lengths differ (found for 'i')

My variables all have missing values somewhere or other. The missing values are not consistent across individuals, i.e. some individuals have some values missing, others have others, and as much as I have tried to use na.action=na.omit and na.rm=T, the differing variable length problem is dogging me persistently.

I also tried to isolate the residuals, save them in a new variable (called 'resfam' here), and add that to the data frame 'vcdf' that I had created earlier.

The problem with that was that when the residuals were computed, lmer() dropped the rows with missing data in 'peg.no' or 'family', which are obviously not the same rows that are missing for another variable, e.g. 'ec.length', leading again to an inconsistency in variable lengths. data.frame() would then not accept that addition to the previous set at all. This was fairly obvious right from the start, but I decided to try it anyway. Didn't work.

I apologise if the solution to working with these different variable lengths is obvious and I don't know it, but I don't know R that well at all.

My data files can be downloaded at the following location:

<http://www.filesanywhere.com/fs/v.aspx?v=896d6b88616173be71ab> (excel- .xlsx)

<http://www.filesanywhere.com/fs/v.aspx?v=896d6b88616174a76e9e> (.txt file)

Any pointers would be greatly appreciated, as this is holding me up loads.

Thanks a ton for your help,

Aditi

----------------------
A Singh
Aditi.Singh at bristol.ac.uk
School of Biological Sciences
University of Bristol
Mark Difford
2009-Jul-15 21:14 UTC
[R] Differing Variable Length Inconsistencies in Random Effects/Regression Models
Hi Aditi,

Parts of _your_ code for the solution offered by Jerome Goudet are wrong; see my comments.

> famfit<-lmer(peg.no~1 + (1|family), na.action=na.omit, vcdf)  ## use: na.action=na.exclude
> resfam<-residuals(famfit)
> for( i in 1:length(colms))
+ {
+ print ("Marker", i)
+ regfam<-abline(lm(resfam~i))  ## you need to use: abline(lm(resfam~colms[,i]))
+ print(regfam)
+ }

Corrected code:

famfit<-lmer(peg.no~1 + (1|family), na.action=na.exclude, vcdf)
resfam<-residuals(famfit)
for( i in 1:length(colms))
{
print ("Marker", i)
regfam<-abline(lm(resfam~colms[,i]))
}

This should work.

Regards, Mark.
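For readers following along without the data file, here is a minimal sketch of why the two changes above matter. It is not the poster's data or model: the `toy` data frame is made up, and `lm()` stands in for `lmer()` so that only base R is needed. The point it illustrates is that `na.omit` shortens the residual vector, whereas `na.exclude` pads it back to the original number of rows, and that a bare loop index `i` is a vector of length 1 rather than a data column.

```r
# Hypothetical toy data (not the poster's file): NAs fall in different rows
# for the response and for the marker column.
toy <- data.frame(
  y      = c(1.2, NA, 3.1, 4.0, 5.2, 6.1),
  group  = factor(c("a", "a", "b", "b", "c", "c")),
  marker = c(0, 1, NA, 1, 0, 1)
)

# lm() is used as a stand-in for lmer() so the sketch runs with base R only.
fit_omit    <- lm(y ~ group, data = toy, na.action = na.omit)
fit_exclude <- lm(y ~ group, data = toy, na.action = na.exclude)

res_omit    <- residuals(fit_omit)     # row 2 dropped
res_exclude <- residuals(fit_exclude)  # row 2 kept as NA

length(res_omit)     # 5
length(res_exclude)  # 6, same as nrow(toy)

# Shortened residuals no longer line up with the marker column.
err_omit <- tryCatch(lm(res_omit ~ toy$marker), error = function(e) e)

# A bare loop index is a length-1 vector, so it fails in the same way.
i <- 1
err_index <- tryCatch(lm(res_exclude ~ i), error = function(e) e)

# Padded residuals regress cleanly; lm() drops rows with NA on either side.
fit_marker <- lm(res_exclude ~ toy$marker)
nobs(fit_marker)     # 4
```

Both `err_omit` and `err_index` hold a "variable lengths differ" error, the same message reported in the original post. With lme4 installed, the same pattern should carry over to the `lmer()` fit, as the reply above indicates.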
Mark Difford
2009-Jul-15 21:31 UTC
[R] Differing Variable Length Inconsistencies in Random Effects/Regression Models
Perhaps I should have added the following: To see that it "works," run the following:

famfit<-lmer(peg.no~1 + (1|family), na.action=na.exclude, vcdf)
resfam<-residuals(famfit)
for( i in 1:length(colms))
{
print(coef(lm(resfam~colms[,i])))
}

Regards, Mark.
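One possible way to keep track of which coefficients belong to which marker is to loop over column names and collect the results in a matrix. The following is only a sketch under stated assumptions: the `toy` data are invented (the column names merely mimic the poster's layout), and `lm()` with `family` as a fixed effect again stands in for the `lmer()` fit so that the example runs with base R alone.

```r
# Hypothetical toy data; names mimic the thread, values are made up.
toy <- data.frame(
  family = factor(rep(c("f1", "f2", "f3"), each = 4)),
  peg.no = c(10, 12, NA, 11, 15, 14, 16, NA, 9, 8, 10, 11),
  P1L55  = c(0, 1, 1, NA, 0, 1, 0, 1, 1, 0, NA, 1),
  P1L73  = c(1, 1, 0, 0, NA, 1, 0, 1, 0, 0, 1, 1)
)
markers <- toy[, c("P1L55", "P1L73")]

# Stand-in for lmer(peg.no ~ 1 + (1 | family), ...); na.exclude keeps the
# residual vector aligned with the rows of the data frame.
famfit <- lm(peg.no ~ family, data = toy, na.action = na.exclude)
resfam <- residuals(famfit)   # length 12, NA where peg.no is missing

# One simple regression per marker column, labelled by column name.
coefs <- sapply(names(markers), function(m) coef(lm(resfam ~ markers[[m]])))
rownames(coefs) <- c("intercept", "slope")

print(t(coefs))   # one row per marker
```

Labelling by column name also sidesteps two details in the loops above: `print("Marker", i)` passes `i` as the `digits` argument of `print()`, so only "Marker" is shown, and `abline()` draws on an already open plot rather than returning anything useful to print.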