Hello All,

I am an R newbie and am probably misinterpreting something really obvious...

In the Rcmdr package there is a scatter3d() function that can fit a curve and also provide coefficients for the model. If I'm understanding this right, I think it's calling the lower level stats package function lm(), which is the part that actually does the curve fitting.

Anyway, what has me perplexed is that the model summary from scatter3d() has different coefficients than the one generated by lm(). However, the actual surface plotted by scatter3d() looks like the function generated by lm().

In the scatter3d() docs I didn't see anything about transforming the coefficients or changing them somehow - perhaps I have not been looking in the right place?

I'm using a Linux box: 2.6.17-1.2187_FC5smp, R version 2.3.1, Rcmdr version 1.2-0, in case that helps.

Thanks very much for any enlightenment!

anja

Here's an example of the output on the same data by both functions. If anyone wants the dataset, let me know:

> scatter3d(samples$x1, samples$y, samples$x2, fit="linear",
    residuals=TRUE, bg="white", axis.scales=TRUE, grid=TRUE,
    ellipsoid=FALSE, xlab="x1", ylab="y", zlab="x2",
    model.summary=TRUE)
$linear

Call:
lm(formula = y ~ x + z)

Residuals:
      Min        1Q    Median        3Q       Max
-0.096984 -0.022303  0.004758  0.029354  0.091188

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.708945   0.007005  101.20   <2e-16 ***
x            0.278540   0.011262   24.73   <2e-16 ***
z           -0.688175   0.011605  -59.30   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03936 on 105 degrees of freedom
Multiple R-Squared: 0.972, Adjusted R-squared: 0.9715
F-statistic: 1822 on 2 and 105 DF, p-value: < 2.2e-16

> summary(lm(formula=samples$y~samples$x1+samples$x2))

Call:
lm(formula = samples$y ~ samples$x1 + samples$x2)

Residuals:
    Min      1Q  Median      3Q     Max
-7865.0 -1808.6   385.8  2380.5  7394.9

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 92204.502   1323.217   69.68   <2e-16 ***
samples$x1    225.882      9.133   24.73   <2e-16 ***
samples$x2   -558.076      9.411  -59.30   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3192 on 105 degrees of freedom
Multiple R-Squared: 0.972, Adjusted R-squared: 0.9715
F-statistic: 1822 on 2 and 105 DF, p-value: < 2.2e-16

--
Angela M. Baldo
Computational Biologist
USDA, ARS
Plant Genetic Resources Unit
& Grape Genetics Research Unit
New York State Agricultural Experiment Station
630 W. North Street
Geneva, NY 14456-0462 USA

voice 315 787-2413 or 607 254-9413
fax 315 787-2339 or 607 254-9339

angela.baldo at ars.usda.gov
http://www.ars.usda.gov/NAA/Geneva
Dear Anja,

As you suggest, models in scatter3d() are fit via lm() and also via gam() in the mgcv package. scatter3d() rescales the three variables to fit in the unit cube, which is why the coefficients in its model summary differ from those of lm() applied to the original data. I believe that the new version of rgl makes the rescaling unnecessary, so eventually I'll probably rework scatter3d() to avoid it.

It would be better if ?scatter3d mentioned this; I've made that change in the development version of the package. BTW, a nice thing about R is that the source code is there, so you can look to see what a function does.

I hope this helps,
John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: r-help-bounces at stat.math.ethz.ch
> [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of angela baldo
> Sent: Friday, September 29, 2006 5:23 PM
> To: r-help at stat.math.ethz.ch
> Subject: [R] scatter3d() model.summary coefficients?
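
For anyone who wants to check the rescaling explanation against their own data, here is a minimal sketch. It assumes that "fit in the unit cube" means a min-max rescaling of each variable to [0, 1]; that is a guess, not something read from the scatter3d() source. The data frame below is simulated and only stands in for the poster's samples, and unit() is a hypothetical helper. Under that assumption each slope on the original scale equals the rescaled slope times range(y)/range(x), while the t values and R-squared stay the same. This is consistent with the two summaries in the original post, where the t values (24.73 and -59.30) and R-squared (0.972) agree and both slopes differ by the same factor of about 811.

```r
# Simulated stand-in for the poster's `samples` data frame (hypothetical values).
set.seed(1)
n <- 108
samples <- data.frame(x1 = runif(n, 0, 100), x2 = runif(n, 0, 100))
samples$y <- 92000 + 225 * samples$x1 - 558 * samples$x2 + rnorm(n, sd = 3000)

# Min-max rescaling to [0, 1]; an ASSUMPTION about how scatter3d() rescales.
unit <- function(v) (v - min(v)) / diff(range(v))

# Fit on the original scale and on the rescaled variables.
raw <- lm(y ~ x1 + x2, data = samples)
scaled <- lm(y ~ x + z,
             data = data.frame(y = unit(samples$y),
                               x = unit(samples$x1),
                               z = unit(samples$x2)))

# Ranges used to map the rescaled coefficients back to the original scale.
ry <- diff(range(samples$y))
rx <- c(diff(range(samples$x1)), diff(range(samples$x2)))
mx <- c(min(samples$x1), min(samples$x2))

# Slopes: multiply by range(y) / range(x).
b <- coef(scaled)[c("x", "z")]
back_slopes <- b * ry / rx

# Intercept: undo the shift in y and in each predictor.
back_intercept <- min(samples$y) + ry * (coef(scaled)[["(Intercept)"]] - sum(b * mx / rx))

# Side-by-side comparison with the fit on the original data.
print(rbind(original = coef(raw),
            recovered = c(back_intercept, back_slopes)))

# The slope t values are unaffected by the rescaling.
print(cbind(raw = summary(raw)$coefficients[2:3, "t value"],
            scaled = summary(scaled)$coefficients[2:3, "t value"]))
```

The intercept's t value is not expected to match, since shifting the variables changes what the intercept measures; that also agrees with the two summaries in the original post (101.20 versus 69.68).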