Dylan Beaudette
2008-Jul-15 04:34 UTC
[R] meaning of tests presented in anova(ols(...)) {Design package}
Hi,

I am curious about how to interpret the table produced by anova(ols(...)) from the Design package. I have a multiple linear regression model, with some interactions, defined by:

ols(formula = log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) +
    log(conc) * pol(sand, 3), data = sm.clean, x = TRUE, y = TRUE)

   n  Model L.R.  d.f.   R2  Sigma
1834        1203    14  0.48    1.2

Residuals:
   Min     1Q  Median     3Q    Max
-5.033 -0.859   0.016  0.739  4.868

Coefficients:
                        Value  Std. Error      t        Pr(>|t|)
Intercept          11.3886790   2.0220171   5.63 0.0000000205580
sar                -4.3991263   1.0157588  -4.33 0.0000156609226
activity          -40.0591221   5.6907822  -7.04 0.0000000000027
activity^2         33.0570116   5.0578520   6.54 0.0000000000819
activity^3         -8.1645147   1.3750370  -5.94 0.0000000034548
conc                0.3841260   0.0813200   4.72 0.0000024942478
sand               -0.0096212   0.0327415  -0.29 0.7689032898947
sand^2              0.0008495   0.0008589   0.99 0.3227487169683
sand^3              0.0000025   0.0000066   0.39 0.6994987342042
sar * activity     12.8134698   2.9513942   4.34 0.0000149300007
sar * activity^2   -9.9981381   2.6310765  -3.80 0.0001494462966
sar * activity^3    2.1481278   0.7168339   3.00 0.0027662261037
conc * sand        -0.0157426   0.0076013  -2.07 0.0384966958735
conc * sand^2       0.0003419   0.0001989   1.72 0.0857381555491
conc * sand^3      -0.0000027   0.0000015  -1.77 0.0777025949762

Looking at what I think are "marginal p-values", i.e. results of a test of coef_i != 0, there are several terms with non-significant coefficients (at p < 0.05). Does a non-significant coefficient warrant removal from the model, or perhaps a mention in the discussion?

Compared to the above, what tests are performed when calling anova() on this object? Here is the output in R:

Analysis of Variance    Response: log(ksat * 60 * 60)

Factor                                         d.f. Partial SS     MS     F      P
sar  (Factor+Higher Order Factors)                4     168.43  42.11  27.0 <.0001
 All Interactions                                 3     142.13  47.38  30.4 <.0001
activity  (Factor+Higher Order Factors)           6     536.84  89.47  57.3 <.0001
 All Interactions                                 3     142.13  47.38  30.4 <.0001
 Nonlinear (Factor+Higher Order Factors)          4     257.25  64.31  41.2 <.0001
conc  (Factor+Higher Order Factors)               4     443.02 110.75  71.0 <.0001
 All Interactions                                 3      76.74  25.58  16.4 <.0001
sand  (Factor+Higher Order Factors)               6    1906.29 317.71 203.6 <.0001
 All Interactions                                 3      76.74  25.58  16.4 <.0001
 Nonlinear (Factor+Higher Order Factors)          4     263.00  65.75  42.1 <.0001
sar * activity  (Factor+Higher Order Factors)     3     142.13  47.38  30.4 <.0001
 Nonlinear                                        2      95.32  47.66  30.5 <.0001
 Nonlinear Interaction : f(A,B) vs. AB            2      95.32  47.66  30.5 <.0001
conc * sand  (Factor+Higher Order Factors)        3      76.74  25.58  16.4 <.0001
 Nonlinear                                        2       4.98   2.49   1.6  0.203
 Nonlinear Interaction : f(A,B) vs. AB            2       4.98   2.49   1.6  0.203
TOTAL NONLINEAR                                   8     455.20  56.90  36.5 <.0001
TOTAL INTERACTION                                 6     218.87  36.48  23.4 <.0001
TOTAL NONLINEAR + INTERACTION                    10     573.36  57.34  36.7 <.0001
REGRESSION                                       14    2631.53 187.97 120.4 <.0001
ERROR                                          1819    2839.25   1.56

Are more of the 'terms' significant (at p < 0.05) because model terms are pooled? I have looked through Frank's book on the topic, but can't quite wrap my head around what the table above is telling me. I am mostly interested in presenting the model as an applied tool, so interpretation of the terms and interactions is very important.

Thanks,

Dylan
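P.S. For completeness, a rough sketch of the calls behind the output above (the Design package loaded, sm.clean being the cleaned data frame, and the fitted object called fit here):

library(Design)   # provides ols(), pol(), and the anova method for ols fits

## refit as above; x = TRUE, y = TRUE keep the design matrix and response
## stored with the fit for later use by Design's methods
fit <- ols(log(ksat * 60 * 60) ~ log(sar) * pol(activity, 3) +
             log(conc) * pol(sand, 3),
           data = sm.clean, x = TRUE, y = TRUE)

print(fit)    # coefficient table with marginal t-tests (first table above)
anova(fit)    # partial F tests pooling each factor with its higher-order terms (second table)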
Mark Difford
2008-Jul-15 09:24 UTC
[R] meaning of tests presented in anova(ols(...)) {Design package}
Hi Dylan,

>> I am curious about how to interpret the table produced by
>> anova(ols(...)), from the Design package.

Frank will perhaps come in with more detail, but if he doesn't then you can get an understanding of what's being tested by doing the following on the saved object from your OLS call (see ?anova.Design):

print(anova(ols$obj), which="sub")
plot(anova(ols$obj))

HTH, Mark.
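For example, with the fitted object saved as, say, fit (a rough sketch):

an <- anova(fit)                  # the same table as in the original post
print(an, which = "subscripts")   # also lists which coefficients enter each pooled test
plot(an)                          # dot chart of the anova results (see ?plot.anova.Design)

The which="subscripts" display makes it explicit that each "Factor+Higher Order Factors" line pools a main-effect coefficient together with all of its nonlinear and interaction coefficients into one multi-degree-of-freedom test.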
Frank E Harrell Jr
2008-Jul-16 01:25 UTC
[R] meaning of tests presented in anova(ols(...)) {Design package}
Dylan Beaudette wrote:

> Looking at what I think are "marginal p-values", i.e. results of a
> test of coef_i != 0, there are several terms with non-significant
> coefficients (at p < 0.05). Does a non-significant coefficient warrant
> removal from the model, or perhaps a mention in the discussion?

No

> Compared to the above, what tests are performed when calling
> anova() on this object?

Mark Difford gave a nice response for that.

Frank
-- 
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University