Hello.

I'm a new user of R, and I have a question regarding the use of the aov and lm functions. I'm doing a fractional factorial experiment at our production site, and I need to familiarize myself with the analysis before I conduct the experiment. I've been working my way through the example provided at http://www.itl.nist.gov/div898/handbook/pri/section4/pri472.htm, but I can't reproduce the results given in the trial model calculations. Why is this? Here is how I have tried to do it:

> data.catapult=read.table("Fractional.txt",header=T) #Read the data in the table provided in the example.
> data.catapult
   Distance    h  s b l  e
1     28.00 3.25  0 1 0 80
2     99.00 4.00 10 2 2 62
3    126.50 4.75 20 2 4 80
4    126.50 4.75  0 2 4 45
5     45.00 3.25 20 2 4 45
6     35.00 4.75  0 1 0 45
7     45.00 4.00 10 1 2 62
8     28.25 4.75 20 1 0 80
9     85.00 4.75  0 1 4 80
10     8.00 3.25 20 1 0 45
11    36.50 4.75 20 1 4 45
12    33.00 3.25  0 1 4 45
13    84.50 4.00 10 2 2 62
14    28.50 4.75 20 2 0 45
15    33.50 3.25  0 2 0 45
16    36.00 3.25 20 2 0 80
17    84.00 4.75  0 2 0 80
18    45.00 3.25 20 1 4 80
19    37.50 4.00 10 1 2 62
20   106.00 3.25  0 2 4 80

> aov.catapult = aov(Distance~h+s+b+l+e+h*s+h*b+h*l+h*e+s*b+s*l+s*e+b*l+b*e+l*e,data=data.catapult)
> summary(aov.catapult)
            Df Sum Sq Mean Sq F value  Pr(>F)
h            1   2909    2909  15.854 0.01638 *
s            1   1964    1964  10.701 0.03076 *
b            1   7537    7537  41.072 0.00305 **
l            1   6490    6490  35.369 0.00401 **
e            1   2297    2297  12.518 0.02406 *
h:s          1    122     122   0.667 0.45998
h:b          1    345     345   1.878 0.24247
h:l          1    354     354   1.929 0.23724
h:e          1      0       0   0.001 0.97578
s:b          1    161     161   0.877 0.40199
s:l          1     20      20   0.107 0.75966
s:e          1    114     114   0.622 0.47427
b:l          1    926     926   5.049 0.08795 .
b:e          1    124     124   0.677 0.45689
l:e          1    158     158   0.860 0.40623
Residuals    4    734     184
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

This seems just about right to me. However, when I attempt to fit the linear model, based on main factors and two-factor interactions, I get a completely different result:

> lm.catapult = lm(Distance~h+s+b+l+e+h*s+h*b+h*l+h*e+s*b+s*l+s*e+b*l+b*e+l*e,data=data.catapult)
> summary(lm.catapult)

Call:
lm(formula = Distance ~ h + s + b + l + e + h * s + h * b + h *
    l + h * e + s * b + s * l + s * e + b * l + b * e + l * e,
    data = data.catapult)

Residuals:
      1       2       3       4       5       6       7       8       9      10
-0.8100 22.3875 -3.6763 -3.8925 -3.8925 -0.8576  7.0852 -0.8100 -0.8100 -0.8576
     11      12      13      14      15      16      17      18      19      20
-0.8576 -0.8576  7.8875 -3.8925 -3.8925 -3.6763 -3.6763 -0.8100 -0.4148 -3.6763

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  25.031042 100.791955   0.248   0.8161
h            -3.687500  22.466457  -0.164   0.8776
s             0.475446   2.446791   0.194   0.8554
b           -39.417973  44.906164  -0.878   0.4296
l           -18.938988  12.233954  -1.548   0.1965
e            -0.158449   1.230683  -0.129   0.9038
h:s          -0.368750   0.451546  -0.817   0.4600
h:b          12.375000   9.030925   1.370   0.2425
h:l           3.135417   2.257731   1.389   0.2372
h:e           0.008333   0.258026   0.032   0.9758
s:b          -0.634375   0.677319  -0.937   0.4020
s:l          -0.055469   0.169330  -0.328   0.7597
s:e           0.015268   0.019352   0.789   0.4743
b:l           7.609375   3.386597   2.247   0.0879 .
b:e           0.318397   0.387008   0.823   0.4569
l:e           0.089732   0.096760   0.927   0.4062
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 13.55 on 4 degrees of freedom
Multiple R-squared: 0.9697,    Adjusted R-squared: 0.8563
F-statistic: 8.545 on 15 and 4 DF,  p-value: 0.02559

This result is nothing like the results provided in the example. Why is this? Any help is very much appreciated.

Regards, Ståle.

--
View this message in context: http://r.789695.n4.nabble.com/Fractional-Factorial-Wrong-values-using-lm-function-tp4634400.html
Sent from the R help mailing list archive at Nabble.com.
Ståle Nordås
2012-Jun-25 12:43 UTC
[R] Fractional Factorial - Wrong values using lm-function
Hello.

Thank you for the help. However, I'm not sure your reply answers my question. Let me rephrase: I'm trying to reproduce the values in the second table in the example on http://www.itl.nist.gov/div898/handbook/pri/section4/pri472.htm. That table shows the summary of the linear model, and those are the values I'm trying to reproduce, using the input in the example. When I use the lm function on the data, I get values completely different from those given in the example (I've provided these values in my first post). Obviously I'm missing something. Why can't I reproduce the values in the example using the following?

> lm.catapult = lm(Distance~h+s+b+l+e+h*s+h*b+h*l+h*e+s*b+s*l+s*e+b*l+b*e+l*e,data=data.catapult)
> summary(lm.catapult)

I hope this was clearer.

Regards, Ståle Nordås

-----Original Message-----
From: arun [mailto:smartpink111 at yahoo.com]
Sent: 25 June 2012 13:50
To: Ståle Nordås
Cc: R help
Subject: Re: [R] Fractional Factorial - Wrong values using lm-function

Hi,

You need to use,

anova(lm.catapult)
Analysis of Variance Table

Response: Distance
          Df Sum Sq Mean Sq F value   Pr(>F)
h          1 2909.3  2909.3 15.8538 0.016378 *
s          1 1963.6  1963.6 10.7005 0.030755 *
b          1 7536.9  7536.9 41.0720 0.003046 **
l          1 6490.3  6490.3 35.3687 0.004010 **
e          1 2297.0  2297.0 12.5177 0.024056 *
h:s        1  122.4   122.4  0.6669 0.459978
h:b        1  344.6   344.6  1.8777 0.242467
h:l        1  353.9   353.9  1.9286 0.237236
h:e        1    0.2     0.2  0.0010 0.975783
s:b        1  161.0   161.0  0.8772 0.401991
s:l        1   19.7    19.7  0.1073 0.759658
s:e        1  114.2   114.2  0.6225 0.474270
b:l        1  926.4   926.4  5.0486 0.087946 .
b:e        1  124.2   124.2  0.6769 0.456887
l:e        1  157.8   157.8  0.8600 0.406226
Residuals  4  734.0   183.5

#the summary result you got is the summary of linear model, while the summary of aov is the anova summary.

A.K.
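The advice above can be checked directly on the posted data. What follows is only a minimal sketch: the 20 runs are typed in by hand from the first post, the object names (`catapult`, `fit_lm`, `fit_aov`, `tab_lm`, `tab_aov`) are illustrative, and `(h + s + b + l + e)^2` is used as shorthand for the same main-effects-plus-two-factor-interactions formula written out in the post.

```r
# Catapult runs as posted in the thread, typed in by hand
# (the object name `catapult` is illustrative).
catapult <- data.frame(
  Distance = c(28, 99, 126.5, 126.5, 45, 35, 45, 28.25, 85, 8,
               36.5, 33, 84.5, 28.5, 33.5, 36, 84, 45, 37.5, 106),
  h = c(3.25, 4, 4.75, 4.75, 3.25, 4.75, 4, 4.75, 4.75, 3.25,
        4.75, 3.25, 4, 4.75, 3.25, 3.25, 4.75, 3.25, 4, 3.25),
  s = c(0, 10, 20, 0, 20, 0, 10, 20, 0, 20,
        20, 0, 10, 20, 0, 20, 0, 20, 10, 0),
  b = c(1, 2, 2, 2, 2, 1, 1, 1, 1, 1,
        1, 1, 2, 2, 2, 2, 2, 1, 1, 2),
  l = c(0, 2, 4, 4, 4, 0, 2, 0, 4, 0,
        4, 4, 2, 0, 0, 0, 0, 4, 2, 4),
  e = c(80, 62, 80, 45, 45, 45, 62, 80, 80, 45,
        45, 45, 62, 45, 45, 80, 80, 80, 62, 80)
)

# (h + s + b + l + e)^2 expands to all main effects plus all two-factor
# interactions, i.e. the same model as the long formula in the post.
fit_lm  <- lm(Distance ~ (h + s + b + l + e)^2, data = catapult)
fit_aov <- aov(Distance ~ (h + s + b + l + e)^2, data = catapult)

tab_lm  <- anova(fit_lm)          # ANOVA table computed from the lm fit
tab_aov <- summary(fit_aov)[[1]]  # ANOVA table computed from the aov fit

# Both routes should give the same sequential sums of squares and F values.
same_ss <- isTRUE(all.equal(tab_lm[["Sum Sq"]], tab_aov[["Sum Sq"]]))
same_f  <- isTRUE(all.equal(tab_lm[["F value"]], tab_aov[["F value"]]))
print(c(same_ss = same_ss, same_f = same_f))

# summary() of the lm fit answers a different question:
# one t-test per estimated coefficient.
print(round(coef(summary(fit_lm)), 4))
```

The two fits are the same model; only the summary method differs. `anova()` on the lm fit gives the table that `summary()` gives for the aov fit, while `summary()` on the lm fit reports coefficient estimates and their t-tests.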
----- Original Message -----
From: Staleno <sn at bergen-plastics.no>
To: r-help at r-project.org
Sent: Monday, June 25, 2012 5:26 AM
Subject: [R] Fractional Factorial - Wrong values using lm-function
Bert Gunter
2012-Jun-25 14:09 UTC
[R] Fractional Factorial - Wrong values using lm-function
Staleno:

As always, you need to read the Help file carefully. From ?aov:

"The main difference from lm is in the way print, summary and so on handle the fit: this is expressed in the traditional language of the analysis of variance rather than that of linear models."

summary.aov() computes sequential sums of squares (SS). summary() of an lm() fit reports the t-statistics for the estimated coefficients. The two are not the same if non-orthogonal contrasts are used. Of course, the coefficients and fits **are** identical.

If you don't know what this means, consult a statistician or linear models references.

-- Bert

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
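The claims in this reply can be checked on the posted data. This is a minimal sketch, again with the runs typed in by hand. The object names, the helper `compare_t_f()`, and the mean-centering step are illustrative additions, not something done in the thread. The sketch compares squared t-statistics from summary() of the lm fit with the sequential F values, first on the raw factor levels and then after centering each factor at its mean, which is one way (an assumption here) to bring the model columns closer to orthogonal.

```r
# Catapult runs as posted in the thread, typed in by hand
# (the object name `catapult` is illustrative).
catapult <- data.frame(
  Distance = c(28, 99, 126.5, 126.5, 45, 35, 45, 28.25, 85, 8,
               36.5, 33, 84.5, 28.5, 33.5, 36, 84, 45, 37.5, 106),
  h = c(3.25, 4, 4.75, 4.75, 3.25, 4.75, 4, 4.75, 4.75, 3.25,
        4.75, 3.25, 4, 4.75, 3.25, 3.25, 4.75, 3.25, 4, 3.25),
  s = c(0, 10, 20, 0, 20, 0, 10, 20, 0, 20,
        20, 0, 10, 20, 0, 20, 0, 20, 10, 0),
  b = c(1, 2, 2, 2, 2, 1, 1, 1, 1, 1,
        1, 1, 2, 2, 2, 2, 2, 1, 1, 2),
  l = c(0, 2, 4, 4, 4, 0, 2, 0, 4, 0,
        4, 4, 2, 0, 0, 0, 0, 4, 2, 4),
  e = c(80, 62, 80, 45, 45, 45, 62, 80, 80, 45,
        45, 45, 62, 45, 45, 80, 80, 80, 62, 80)
)

# Main effects plus all two-factor interactions, as in the post.
form    <- Distance ~ (h + s + b + l + e)^2
fit_lm  <- lm(form, data = catapult)
fit_aov <- aov(form, data = catapult)

# Coefficients and fitted values are the same for lm() and aov().
same_coef <- isTRUE(all.equal(coef(fit_lm), coef(fit_aov)))
same_fit  <- isTRUE(all.equal(fitted(fit_lm), fitted(fit_aov)))
print(c(same_coef = same_coef, same_fit = same_fit))

# Hypothetical helper: squared t-statistics next to sequential F values.
compare_t_f <- function(fit) {
  t_sq  <- coef(summary(fit))[-1, "t value"]^2        # drop the intercept row
  f_seq <- anova(fit)[["F value"]][seq_along(t_sq)]   # drop the Residuals row
  data.frame(t_squared = t_sq, sequential_F = f_seq)
}

raw_cmp <- compare_t_f(fit_lm)
print(round(raw_cmp, 3))

# Illustration only (not from the thread): center each factor at its mean
# and refit. The fitted values do not change, but the main-effect tests do.
factors  <- c("h", "s", "b", "l", "e")
centered <- catapult
centered[factors] <- lapply(catapult[factors], function(x) x - mean(x))
fit_centered <- lm(form, data = centered)
centered_cmp <- compare_t_f(fit_centered)
print(round(centered_cmp, 3))
```

The posted output already shows the pattern on the raw levels: for the interaction rows the squared t-statistic matches the sequential F (for b:l, 2.247^2 is about 5.049), while for the main effects it does not (for h, (-0.164)^2 is about 0.027 against F = 15.854). In a model with interactions fitted on uncentered levels, each main-effect coefficient is the slope when the other factors equal zero, so its t-test is not the test that the sequential table reports. After centering, the two columns should move much closer together, though this design need not make them agree exactly.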