Paula Couto
2018-Feb-26 01:07 UTC
[R] glm package - Negative binomial regression model - Error
Hi there,

I am running this model as a negative binomial regression, using glm. I had no problems running the model with one data set, but now that I'm trying to run it on a new one, I always get the same error when running the regression:

> #Run Regression
> x=cbind(factor2ind(d$year),factor2ind(d$month_week))
>
> out<- glm(cbind(influenza, n_sample) ~ x, family=quasibinomial,
> data=d)
>
> d$prop<-out$fitted.values
Error in `$<-.data.frame`(`*tmp*`, prop, value = c(0.0486530542835839, :
replacement has 208 rows, data has 365
> d$n_p1<-d$prop*d$factor*10
>
> obs<-aggregate(d$prop, by = list(d$month_week), FUN=summary)
> pred<-aggregate(d$n_p1, by = list(d$month_week), FUN=summary)

By the way, I previously prepared the data set and defined:

> d$factor<-sapply(d$year,f)
> d$n_sample<-(d$n_muestras*d$factor*10)
> d$prop<-(d$influenza/d$n_sample)

But I still don't understand why it keeps saying that the replacement has fewer rows than the data frame. Could anybody help me with this?

Many thanks!
P
Thierry Onkelinx
2018-Feb-26 08:02 UTC
[R] glm package - Negative binomial regression model - Error
Dear Paula,

There are probably missing observations in your data set. Read the na.action part of the glm help file. na.exclude is most likely what you are looking for.

Best regards,

ir. Thierry Onkelinx
Statisticus / Statistician
Vlaamse Overheid / Government of Flanders
INSTITUUT VOOR NATUUR- EN BOSONDERZOEK / RESEARCH INSTITUTE FOR NATURE AND FOREST
Team Biometrie & Kwaliteitszorg / Team Biometrics & Quality Assurance
thierry.onkelinx at inbo.be
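The advice above can be illustrated with a small sketch. It is not the poster's code: the data frame, the column names (week, cases, samples) and the values are made up for illustration, and it assumes the mismatch comes from rows with missing values, which glm drops by default (na.action = na.omit). The same would apply if the NAs were in the predictors rather than the response.

```r
# Toy data standing in for the poster's `d`; names and values are hypothetical.
set.seed(1)
d <- data.frame(
  week    = factor(rep(1:4, times = 5)),
  cases   = rbinom(20, size = 50, prob = 0.1),
  samples = 50
)
d$cases[c(3, 8, 15)] <- NA      # three rows with a missing response

sum(!complete.cases(d))         # 3 incomplete rows

# Default na.action (na.omit): incomplete rows are dropped before fitting,
# so the fitted values are shorter than the data frame.
fit_omit <- glm(cbind(cases, samples - cases) ~ week,
                family = quasibinomial, data = d)
length(fit_omit$fitted.values)  # 17
nrow(d)                         # 20
# d$prop <- fit_omit$fitted.values  # fails: replacement has 17 rows, data has 20

# na.exclude: same fit, but fitted() pads the dropped rows with NA.
fit_excl <- glm(cbind(cases, samples - cases) ~ week,
                family = quasibinomial, data = d,
                na.action = na.exclude)
d$prop <- fitted(fit_excl)      # length 20, NA where the row was dropped
```

Note that the padding is done by the fitted() extractor (and likewise residuals() and predict()); the raw component fit_excl$fitted.values still has one entry per row actually used in the fit, so assigning it directly would fail in the same way.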
Paula Couto
2018-Feb-26 14:55 UTC
[R] glm package - Negative binomial regression model - Error
Thank you so much, Thierry!! I will try that now and see if that solves the issue.

Best,
Paula