Moi!

A student here has been getting a bit irritated with some side effects of scale() (OS is Windows XP; the behaviour occurs in R 2.0.0, but not 1.7.1). The problem is that she scales a variable in a data frame, then does a regression, and tries to get some predictions for some new data. However, at this point she gets an error (see the example below). This seems to be because the scaled variable in the new data frame does not have the center and scale attributes, but the one in the old data frame does.

The work-around is to put the scaled variable into a new data frame, which again won't have the attributes. But it seems odd to me that whether a scale()'d variable has attributes depends on where it's placed. I presume that this is because I'm not understanding something about the way R is working, rather than it being a bug. Would anyone care to enlighten me?

> Data1=data.frame(xx=1:10, yy=2.1:12)
> Data1$xx=scale(Data1$xx)
>
> reg1=lm(yy~xx, data=Data1)
> New=data.frame(xx=2:4)
> b=predict(reg1, New, se.fit=T)
Error: variable 'xx' was fitted with nmatrix.1 but numeric was supplied
>
> New=data.frame(xx=scale(2:4, center=5.5, scale=3.02))
> b=predict(reg1, New, se.fit=T)
Error: variable 'xx' was fitted with nmatrix.1 but numeric was supplied
>

Bob

--
Bob O'Hara
Department of Mathematics and Statistics
P.O. Box 68 (Gustaf Hällströmin katu 2b)
FIN-00014 University of Helsinki
Finland

Telephone: +358-9-191 51479
Mobile: +358 50 599 0540
Fax: +358-9-191 51400
WWW: http://www.RNI.Helsinki.FI/~boh/
Journal of Negative Results - EEB: www.jnr-eeb.org
Hi Bob,

note that `scale()' returns a matrix! This should work:

# scale() returns a one-column matrix, so xx is stored as a matrix
Data1$xx <- scale(Data1$xx)
sapply(Data1, data.class)

# c() drops the dim attribute, so xx becomes a plain numeric vector
Data1$xx <- c(scale(Data1$xx))
sapply(Data1, data.class)

I hope it helps.

Best,
Dimitris

----
Dimitris Rizopoulos
Ph.D. Student
Biostatistical Centre
School of Public Health
Catholic University of Leuven

Address: Kapucijnenvoer 35, Leuven, Belgium
Tel: +32/16/396887
Fax: +32/16/337015
Web: http://www.med.kuleuven.ac.be/biostat/
     http://www.student.kuleuven.ac.be/~m0390867/dimitris.htm

----- Original Message -----
From: "Anon." <bob.ohara at helsinki.fi>
To: <r-help at stat.math.ethz.ch>
Sent: Wednesday, October 20, 2004 3:56 PM
Subject: [R] Odd behaviour with scale()
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide!
http://www.R-project.org/posting-guide.html
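To carry the c(scale(...)) suggestion through to the prediction step, here is a minimal sketch. It assumes the new values arrive on the raw scale and must be shifted and divided by the same center and scale that were used for the fit. Since c() drops the "scaled:center" and "scaled:scale" attributes, they are saved first. The names sc, centre and spread are only illustrative.

```r
Data1 <- data.frame(xx = 1:10, yy = 2.1:12)

# Scale once, and keep the centring and scaling constants before they are lost.
sc     <- scale(Data1$xx)
centre <- attr(sc, "scaled:center")
spread <- attr(sc, "scaled:scale")

# as.numeric() (like c()) drops the dim attribute, leaving a plain numeric vector.
Data1$xx <- as.numeric(sc)

reg1 <- lm(yy ~ xx, data = Data1)

# New raw values 2, 3, 4, transformed with the constants from the fit.
New <- data.frame(xx = ((2:4) - centre) / spread)
b   <- predict(reg1, New, se.fit = TRUE)
b$fit
```

Because xx is now an ordinary numeric column in both data frames, predict() no longer complains about nmatrix.1 versus numeric.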
The problem is that scale() returns a matrix, even if only a vector is supplied. Thus the regression model actually has a matrix (scaled xx) as the predictor. See:

> str(Data1)
`data.frame':   10 obs. of  2 variables:
 $ xx: num [1:10, 1] -1.486 -1.156 -0.826 -0.495 -0.165 ...
  ..- attr(*, "scaled:center")= num 5.5
  ..- attr(*, "scaled:scale")= num 3.03
 $ yy: num  2.1 3.1 4.1 5.1 6.1 7.1 8.1 9.1 10.1 11.1

and note that xx is not `num [1:10]', but `num [1:10, 1]'. If you give a single-column matrix as xx to predict(), it works:

> predict(reg1, data.frame(xx=I(matrix(2:4, ncol=1))), se.fit=T)
$fit
       1        2        3
12.65530 15.68295 18.71060

$se.fit
           1            2            3
4.582710e-16 6.513913e-16 8.510747e-16

$df
[1] 8

$residual.scale
[1] 6.210772e-16

HTH,
Andy
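One caveat about the single-column matrix route: predict() takes the supplied values as already being on the scaled axis, so the raw 2:4 above are read as scaled values 2, 3 and 4. The sketch below, which is not from the thread itself, compares two routes that both start from raw new values. The first follows the matrix approach but rescales the new values by hand. The second puts scale() inside the formula, which in current versions of R lets predict() reuse the center and scale from the fit. The names Data_m, raw_new, centre and spread are only illustrative.

```r
Data1   <- data.frame(xx = 1:10, yy = 2.1:12)
raw_new <- 2:4

# Route 1: keep the matrix predictor, supply a one-column matrix on the scaled axis.
Data_m    <- Data1
Data_m$xx <- scale(Data_m$xx)
reg1      <- lm(yy ~ xx, data = Data_m)
centre    <- attr(Data_m$xx, "scaled:center")
spread    <- attr(Data_m$xx, "scaled:scale")
new_m     <- data.frame(xx = I(matrix((raw_new - centre) / spread, ncol = 1)))
fit_matrix <- predict(reg1, new_m)

# Route 2: scale inside the formula and pass raw values as newdata.
reg2        <- lm(yy ~ scale(xx), data = Data1)
fit_formula <- predict(reg2, data.frame(xx = raw_new))

fit_matrix
fit_formula
```

Both routes should give the same fitted values for the raw inputs. The second keeps the data frame itself unscaled, which avoids the matrix column altogether.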