Kum-Hoe Hwang
2010-Feb-16 08:24 UTC
[R] Error of Stepwise Regression with number of rows in use has changed: remove missing values?
Howdy, R Gurus

I have enjoyed R, but there is one problem I cannot solve easily. Please help me with it.
When I ran the R script below, I got the following error. The input data file was exported from Excel spreadsheet software.

Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +  :
  number of rows in use has changed: remove missing values?

Could you direct me to a solution for this error?
Thanks in advance,

############### outputs from R console ###############
> pop <- step(
+ lm(pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant)
+ + as.numeric(do.grant) + as.numeric(city.grant) + as.numeric(DMZ.dist) + as.numeric(Seoul.dist), data=borderI.data, na.action = na.omit)
+ )
Start:  AIC=494.27
pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
    as.numeric(do.grant) + as.numeric(city.grant) + as.numeric(DMZ.dist) +
    as.numeric(Seoul.dist)

                           Df Sum of Sq    RSS    AIC
- as.numeric(do.grant)      1      0.71 6622.9 492.28
- as.factor(policy)         1      1.21 6623.4 492.29
- as.numeric(DMZ.dist)      1      1.91 6624.1 492.30
- as.numeric(city.grant)    1      5.07 6627.3 492.36
- as.numeric(nation.grant)  1     11.51 6633.7 492.47
- as.numeric(year)          1     29.58 6651.8 492.80
<none>                                  6622.2 494.27
- as.numeric(Seoul.dist)    1    673.22 7295.4 503.79

Step:  AIC=492.28
pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
    as.numeric(city.grant) + as.numeric(DMZ.dist) + as.numeric(Seoul.dist)

                           Df Sum of Sq    RSS    AIC
- as.factor(policy)         1      1.99 6624.9 490.32
- as.numeric(DMZ.dist)      1      2.09 6625.0 490.32
- as.numeric(city.grant)    1      7.18 6630.1 490.41
- as.numeric(nation.grant)  1     20.08 6643.0 490.64
- as.numeric(year)          1     28.89 6651.8 490.80
<none>                                  6622.9 492.28
- as.numeric(Seoul.dist)    1    697.46 7320.4 502.20

Step:  AIC=490.32
pop.rate ~ as.numeric(year) + as.numeric(nation.grant) + as.numeric(city.grant) +
    as.numeric(DMZ.dist) + as.numeric(Seoul.dist)

                           Df Sum of Sq    RSS    AIC
- as.numeric(DMZ.dist)      1      2.08 6627.0 488.35
- as.numeric(city.grant)    1     10.65 6635.6 488.51
- as.numeric(nation.grant)  1     31.30 6656.2 488.88
- as.numeric(year)          1     31.44 6656.4 488.88
<none>                                  6624.9 490.32
- as.numeric(Seoul.dist)    1    732.88 7357.8 500.80

Step:  AIC=488.35
pop.rate ~ as.numeric(year) + as.numeric(nation.grant) + as.numeric(city.grant) +
    as.numeric(Seoul.dist)

                           Df Sum of Sq    RSS    AIC
- as.numeric(city.grant)    1      9.86 6636.9 486.53
- as.numeric(year)          1     31.42 6658.4 486.92
- as.numeric(nation.grant)  1     33.33 6660.3 486.95
<none>                                  6627.0 488.35
- as.numeric(Seoul.dist)    1    754.40 7381.4 499.18

Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +  :
  number of rows in use has changed: remove missing values?

--
Kum-Hoe Hwang, Ph.D.
Phone : 82-31-250-3516
Email : phdhwang at gmail.com
Mohamed Lajnef
2010-Feb-16 10:48 UTC
[R] Error of Stepwise Regression with number of rows in use has changed: remove missing values?
Hi Kum,

If you look at the code of the step function (by typing step in the R console), you will find the check

  if (length(fit$residuals) != n)

Here n is the number of residuals of the starting model. When a candidate model ends up fitted on a different number of rows, this condition is true and step() stops with the message you see. That explains the error.

I hope this helps.

Regards,
M

--
Mohamed Lajnef, IE INSERM U955 eq 15
Pôle de Psychiatrie, Hôpital CHENEVIER
40, rue Mesly
94010 CRETEIL Cedex FRANCE
Mohamed.lajnef at inserm.fr
tel : 01 49 81 31 31 (poste 18470)
Sec : 01 49 81 32 90
fax : 01 49 81 30 99
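A minimal sketch of the row-count comparison described in the reply above, run on made-up data rather than on the poster's file. The data frame d and its columns y, x1, x2 are hypothetical, and the comparison is written out by hand; it only imitates the check quoted from step() and is not step()'s own code.

```r
# Toy data (hypothetical): x2 has missing values, y and x1 do not.
set.seed(1)
d <- data.frame(y = rnorm(20), x1 = rnorm(20), x2 = rnorm(20))
d$x2[c(3, 7, 11)] <- NA

# Full model: na.omit drops the 3 rows where x2 is NA.
full <- lm(y ~ x1 + x2, data = d, na.action = na.omit)

# Reduced model: x2 is no longer in the formula, so all 20 rows are used again.
reduced <- update(full, . ~ . - x2)

n     <- length(full$residuals)     # 17
n_new <- length(reduced$residuals)  # 20

# Same comparison as the quoted check: TRUE means the rows in use have changed.
rows_changed <- n_new != n
print(c(n = n, n_new = n_new))
print(rows_changed)
```

Dropping a predictor that carries the missing values brings rows back into the fit, which is exactly the situation the error message is warning about.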
Peter Ehlers
2010-Feb-16 11:09 UTC
[R] Error of Stepwise Regression with number of rows in use has changed: remove missing values?
On 2010-02-16 1:24, Kum-Hoe Hwang wrote:
> When I tried the R script, I got the following Error. This error
> results from input data file exported through a Excel spreadsheet
> software.
>
> Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> as.numeric(nation.grant) + :
> number of rows in use has changed: remove missing values?
>
> Could you direct me to solve the Error?
> Thanks in advance,

This is a common situation when you use step() on data where the predictors have missing values. A case (row) is included in a model only if all the predictors for that model are non-missing for the case. As you vary which predictors are in the model, the included cases will vary, resulting in models based on different data. (Think of your cases as subjects; you want all your models to be based on the same set of subjects.)

Finally: (re-)read the help page for step() (?step) and note the 'Warning' section.
-Peter Ehlers

--
Peter Ehlers
University of Calgary
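One way to act on the advice above is to remove incomplete rows once, before fitting, so that every model step() tries is based on the same cases. The following is a sketch on made-up data; the names d, d_cc, vars and the columns y, x1, x2, x3 are hypothetical stand-ins and do not come from the original data file.

```r
# Toy data (hypothetical): two predictors contain missing values.
set.seed(1)
d <- data.frame(y = rnorm(30), x1 = rnorm(30), x2 = rnorm(30), x3 = rnorm(30))
d$x2[c(3, 7, 11)] <- NA
d$x3[c(5, 7)] <- NA

# Keep only the rows that are complete on every variable of the largest model.
vars <- c("y", "x1", "x2", "x3")
d_cc <- na.omit(d[, vars])

# Fit the full model on the complete cases, then run the stepwise search.
full <- lm(y ~ x1 + x2 + x3, data = d_cc)
sel  <- step(full, trace = 0)   # trace = 0 suppresses the step-by-step printout

print(nrow(d_cc))                             # 26 rows remain (3, 5, 7, 11 dropped)
print(length(sel$residuals) == nrow(d_cc))    # the selected model uses the same rows
```

For the data in this thread, the same idea would be to apply na.omit() to borderI.data restricted to the columns that appear in the formula and pass the result as data= to lm(). Note that this discards every row with a missing value in any candidate predictor, so it is worth checking how many rows are lost.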