Hi,

I split a data set into two partitions (80 and 42), use the first as the training set in glm and the second as testing set in glm predict. But when I call glm.predict, I get the warning message:

Warning message:
'newdata' had 42 rows but variable(s) found have 80 rows

---------------------
s = sample(1:122)
glm.my.data = glm(my.data.class[s[1:80]] ~ t(my.data)[s[1:80],1:60], family="binomial")
pred.my.data = predict(glm.gse13355, as.data.frame(t(my.data)[s[81:122],1:60]), type="response")
Warning message:
'newdata' had 42 rows but variable(s) found have 80 rows

length(pred.my.data)
[1] 80

Thanks

Carol
Charles Berry
2012-May-03 16:19 UTC
[R] warning with glm.predict, wrong number of data rows
carol white <wht_crl <at> yahoo.com> writes:

>
> Hi,
> I split a data set into two partitions (80 and 42), use the first as the training set in glm and the second as
> testing set in glm predict. But when I call glm.predict, I get the warning message:
>
> Warning message:
> 'newdata' had 42 rows but variable(s) found have 80 rows
> ---------------------

[snip]

The warning correctly diagnoses the problem.

The posting guide asks for a 'reproducible example', but you did not give us one. There is one below. Note what happens when predict() tries to reconstruct the variable 'x[1:4]' as dictated by the formula.

How many elements can 'x[1:4]' have when newdata has (say) nrowsNew?

Use the subset argument to select a subset of observations.

> y <- sample(factor(1:2),80,repl=T)
> y <- sample(factor(1:2),5,repl=T)
> x <- 1:4
> fit <- glm( y[1:4] ~ x[1:4], family = binomial)
> fit

Call:  glm(formula = y[1:4] ~ x[1:4], family = binomial)

Coefficients:
(Intercept)       x[1:4]
 -1.110e-16    0.000e+00

Degrees of Freedom: 3 Total (i.e. Null);  2 Residual
Null Deviance:     5.545
Residual Deviance: 5.545    AIC: 9.545
> predict(fit,newdata=data.frame(x=1:2))
            1             2             3             4
-1.110223e-16 -1.110223e-16            NA            NA
Warning message:
'newdata' had 2 rows but variable(s) found have 4 rows
> predict(fit,newdata=data.frame(x=1:5))
            1             2             3             4
-1.110223e-16 -1.110223e-16 -1.110223e-16 -1.110223e-16
Warning message:
'newdata' had 5 rows but variable(s) found have 4 rows
>

HTH,

Chuck

[rest deleted]
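
For completeness, here is a minimal sketch of the subset approach suggested above. It is not the original poster's data: the data frame 'dat', its columns 'x1', 'x2', 'x3' and 'cls', and the seed are all made up for illustration. The point it tries to show is that when the formula refers to plain column names and the rows are chosen with 'subset', predict() can find the same names in 'newdata' and returns one value per new row.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical stand-in for the poster's data: 122 observations, 3 predictors.
n   <- 122
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$cls <- factor(rbinom(n, 1, plogis(0.8 * dat$x1 - 0.5 * dat$x2)))

s     <- sample(n)    # random permutation of the row indices
train <- s[1:80]      # 80 training rows
test  <- s[81:122]    # 42 test rows

# The formula names columns of 'dat'; 'subset' selects the training rows,
# so no indexing expression such as x[1:80] ends up inside the formula.
fit <- glm(cls ~ x1 + x2 + x3, data = dat, subset = train, family = binomial)

# 'newdata' carries the same column names, so predict() evaluates the
# formula variables there instead of picking up the training vectors.
pred <- predict(fit, newdata = dat[test, ], type = "response")

length(pred)  # one fitted probability per test row
```

With the poster's objects, the analogous step would presumably be to build one data frame from t(my.data)[, 1:60] plus the class vector before fitting, so that the training and test rows share column names.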