Dear Robert,
Anova() calls linearHypothesis(), also in the car package, to compute
sums of squares and df, supplying appropriate hypothesis matrices.
linearHypothesis() usually tries to express the hypothesis matrix in
symbolic equation form for printing, but won't do this if coefficient
names include arithmetic operators, in your case - and +, which can
confuse it.
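
To see where the operators come from, here is a minimal base-R sketch. The toy data frame `toy` and the regular expression are illustrative assumptions, not the check that car itself performs. R builds coefficient names by pasting the factor name onto the level label, so levels such as CD271- and CD271+ carry their operators into the coefficient names; relabelling the levels is one possible workaround.

```r
# Toy data (hypothetical), mimicking the level labels in the question
set.seed(1)
toy <- data.frame(
  Treatment  = factor(rep(c("Control", "Dabrafenib"), each = 8)),
  Expression = factor(rep(c("CD271-", "CD271+"), times = 8),
                      levels = c("CD271-", "CD271+")),
  Viability  = rnorm(16, mean = 50, sd = 10)
)

mod_toy <- lm(Viability ~ Treatment * Expression, data = toy)
coef_names <- names(coef(mod_toy))
print(coef_names)

# Illustrative check (an assumption, not car's code): which names
# contain an arithmetic operator?
has_operator <- grepl("[-+*/]", coef_names)
print(has_operator)

# Possible workaround: relabel the levels so that no operator appears
toy2 <- toy
levels(toy2$Expression) <- c("CD271neg", "CD271pos")
coef_names2 <- names(coef(lm(Viability ~ Treatment * Expression, data = toy2)))
print(coef_names2)
```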
The symbolic form of the hypothesis isn't really relevant for Anova(),
which doesn't use the printed representation of each hypothesis, and so,
despite the warnings, you get the correct ANOVA table. In your case,
where the data are balanced, with 4 cases per cell, Anova(mod) and
summary(mod) are equivalent, which makes me wonder why you would use
Anova() in the first place.
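
The equivalence in a balanced design can be checked with base R alone. The sketch below uses simulated data (`sim`, hypothetical) with 4 cases per cell and shows that the sequential sums of squares from anova() do not depend on the order in which the main effects enter the model, which is why sequential and Type II tests coincide in this situation.

```r
# Simulated balanced 4 x 2 layout with 4 cases per cell (hypothetical data)
set.seed(123)
sim <- expand.grid(rep = 1:4,
                   A = c("a1", "a2", "a3", "a4"),
                   B = c("b1", "b2"))
sim$y <- rnorm(nrow(sim), mean = 50, sd = 20)

# Confirm the balance: every cell should hold 4 observations
cell_counts <- table(sim$A, sim$B)
print(cell_counts)

# Sequential sums of squares with the main effects in either order
tab_AB <- anova(lm(y ~ A * B, data = sim))
tab_BA <- anova(lm(y ~ B * A, data = sim))

ss_AB <- setNames(tab_AB[["Sum Sq"]], rownames(tab_AB))
ss_BA <- setNames(tab_BA[["Sum Sq"]], rownames(tab_BA))

# In a balanced design, SS(A) equals SS(A | B) and SS(B) equals SS(B | A)
print(rbind(A_first = ss_AB[c("A", "B")], B_first = ss_BA[c("A", "B")]))
```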
To elaborate a bit, linearHypothesis() does tolerate arithmetic
operators in coefficient names if you specify the hypothesis
symbolically rather than as a hypothesis matrix. For example, to test
the interaction:
------- snip --------
> linearHypothesis(mod,
+                  c("TreatmentDabrafenib:ExpressionCD271+ = 0",
+                    "TreatmentTrametinib:ExpressionCD271+ = 0",
+                    "TreatmentCombination:ExpressionCD271+ = 0"))
Linear hypothesis test

Hypothesis:
TreatmentDabrafenib:ExpressionCD271+ = 0
TreatmentTrametinib:ExpressionCD271+ = 0
TreatmentCombination:ExpressionCD271+ = 0

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828
------- snip --------
Alternatively:
------- snip --------
> H <- matrix(0, 3, 8)
> H[1, 6] <- H[2, 7] <- H[3, 8] <- 1
> H
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    0    0    0    0    1    0    0
[2,]    0    0    0    0    0    0    1    0
[3,]    0    0    0    0    0    0    0    1

> linearHypothesis(mod, H)
Linear hypothesis test

Hypothesis:

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828
Warning message:
In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include
     arithmetic operators in their names;
  the printed representation of the hypothesis will be omitted
------- snip --------
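
The column positions 6 to 8 above depend on the order of the coefficients. A sketch that instead locates the interaction coefficients by name, and computes the F statistic by hand, might look like this in base R. The simulated data and all object names are hypothetical, and the formula F = (Hb)' [H V H']^(-1) (Hb) / q is the standard Wald form for a linear model, not a transcription of the car code.

```r
# Simulated balanced data with the same layout as the question (hypothetical)
set.seed(42)
sim <- expand.grid(rep = 1:4,
                   Treatment = c("Control", "Dabrafenib",
                                 "Trametinib", "Combination"),
                   Expression = c("CD271-", "CD271+"))
sim$y <- rnorm(nrow(sim), mean = 50, sd = 20)
fit <- lm(y ~ Treatment * Expression, data = sim)

# Build the hypothesis matrix by name: one row per interaction coefficient
b   <- coef(fit)
idx <- grep(":", names(b), fixed = TRUE)
H   <- matrix(0, nrow = length(idx), ncol = length(b))
H[cbind(seq_along(idx), idx)] <- 1

# Wald F statistic for H b = 0, with q = nrow(H) numerator df
Hb     <- H %*% b
F_wald <- drop(t(Hb) %*% solve(H %*% vcov(fit) %*% t(H)) %*% Hb) / nrow(H)
p_wald <- pf(F_wald, nrow(H), df.residual(fit), lower.tail = FALSE)

# Cross-check against a direct comparison with the additive model
fit0 <- lm(y ~ Treatment + Expression, data = sim)
cmp  <- anova(fit0, fit)
print(c(F_wald = F_wald, F_anova = cmp[["F"]][2]))
```

Because the rows of H are selected by name, the same code keeps working if the factor levels or the order of the model terms change.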
There's no good reason that linearHypothesis() should try to express
each hypothesis symbolically for Anova(), since Anova() doesn't use that
information. When I have some time, I'll arrange to avoid the warning.
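
Until then, this one warning can be silenced without hiding others. The helper below is a hypothetical sketch in base R: `quiet_hypothesis_warning()` and the stand-in `noisy_table()` are not part of car, and matching on the message text quoted above is an assumption that may not hold across car versions.

```r
# Muffle only warnings whose message mentions the omitted printed hypothesis
quiet_hypothesis_warning <- function(expr) {
  withCallingHandlers(
    expr,
    warning = function(w) {
      if (grepl("printed representation of the hypothesis",
                conditionMessage(w), fixed = TRUE)) {
        invokeRestart("muffleWarning")
      }
    }
  )
}

# Stand-in (hypothetical) for a call such as Anova(mod): it raises the
# warning of interest plus an unrelated one, then returns a value
noisy_table <- function() {
  warning("one or more coefficients in the hypothesis include\n",
          "  arithmetic operators in their names;\n",
          "  the printed representation of the hypothesis will be omitted")
  warning("some other warning")
  "result"
}

# Record whatever warnings get past the helper
caught <- character(0)
res <- withCallingHandlers(
  quiet_hypothesis_warning(noisy_table()),
  warning = function(w) {
    caught <<- c(caught, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)
print(caught)
```

With car loaded, the intended use would be along the lines of quiet_hypothesis_warning(Anova(mod, type = 2)).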
Best,
John
--
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-16 4:39 p.m., Robert Baer wrote:
> Caution: External email.
>
>
> When doing Anova using the car package, I get a print warning that is
> unexpected. It seemingly involves having my flow cytometry factor levels
> named CD271+ and CD271-. But I am not sure this warning is
> intended behavior. Any explanation about whether I'm doing something
> wrong? Why can't I have CD271+ and CD271- as factor levels? It's legal
> text, isn't it?
>
> library(car)
> mod = aov(Viability ~ Treatment*Expression, data = dat1)
> Anova(mod, type =2)
>
> Anova Table (Type II tests)
>
> Response: Viability
>                       Sum Sq Df F value    Pr(>F)
> Treatment            19447.3  3  9.2942 0.0002927 ***
> Expression            2669.8  1  3.8279 0.0621394 .
> Treatment:Expression  2226.3  3  1.0640 0.3828336
> Residuals            16739.3 24
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Warning messages:
> 1: In printHypothesis(L, rhs, names(b)) :
>   one or more coefficients in the hypothesis include
>      arithmetic operators in their names;
>   the printed representation of the hypothesis will be omitted
> 2: In printHypothesis(L, rhs, names(b)) :
>   one or more coefficients in the hypothesis include
>      arithmetic operators in their names;
>   the printed representation of the hypothesis will be omitted
> 3: In printHypothesis(L, rhs, names(b)) :
>   one or more coefficients in the hypothesis include
>      arithmetic operators in their names;
>   the printed representation of the hypothesis will be omitted
>
>
> The code to reproduce:
>
> ```
>
>
> dat1 <- structure(list(
>   Treatment = structure(
>     c(1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
>       2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L),
>     levels = c("Control", "Dabrafenib", "Trametinib", "Combination"),
>     class = "factor"),
>   Expression = structure(
>     c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L,
>       1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L),
>     levels = c("CD271-", "CD271+"),
>     class = "factor"),
>   Viability = c(
>     128.329809725159, 24.2360176821065,
>     76.3597924274457, 11.0128771862387, 21.4683836248318,
>     140.784162982894, 87.4303286565443,
>     118.181818181818, 53.603690178743,
>     51.2973284643475, 5.47760907168941,
>     27.1574091870075, 50.8360561214684,
>     56.5250816836441, 28.6949836632712,
>     93.2731116663463, 71.900826446281,
>     32.2314049586777, 24.2360176821065,
>     27.4649240822602, 24.0822602344801,
>     26.542379396502, 30.693830482414,
>     27.772438977513, 13.4729963482606,
>     8.24524312896406, 18.5469921199308,
>     13.9342686911397, 13.3192389006342,
>     19.9308091485681, 17.6244474341726,
>     16.2406304055353)),
>   row.names = c(NA, -32L),
>   class = c("tbl_df", "tbl", "data.frame"))
>
> mod = aov(Viability ~ Treatment*Expression, data = dat1)
> summary(mod)
> library(car)
> Anova(mod, type =2)
>
> ```
>
>
> > sessionInfo()
> R version 4.3.1 (2023-06-16 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 11 x64 (build 25951)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.utf8  LC_CTYPE=English_United States.utf8
>     LC_MONETARY=English_United States.utf8
> [4] LC_NUMERIC=C                           LC_TIME=English_United States.utf8
>
> time zone: America/Chicago
> tzcode source: internal
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] car_3.1-2     carData_3.0-5 tidyr_1.3.0   readr_2.1.4   readxl_1.4.3
>     ggplot2_3.4.3 dplyr_1.1.3
>
> loaded via a namespace (and not attached):
>  [1] crayon_1.5.2     vctrs_0.6.3       cli_3.6.1        rlang_1.1.1
>      purrr_1.0.2      generics_0.1.3    labeling_0.4.3
>  [8] bit_4.0.5        glue_1.6.2        colorspace_2.1-0 hms_1.1.3
>      scales_1.2.1     fansi_1.0.4       grid_4.3.1
> [15] cellranger_1.1.0 abind_1.4-5       munsell_0.5.0    tibble_3.2.1
>      tzdb_0.4.0       lifecycle_1.0.3   compiler_4.3.1
> [22] pkgconfig_2.0.3  rstudioapi_0.15.0 farver_2.1.1     R6_2.5.1
>      tidyselect_1.2.0 utf8_1.2.3        parallel_4.3.1
> [29] vroom_1.6.3      pillar_1.9.0      magrittr_2.0.3   bit64_4.0.5
>      tools_4.3.1      withr_2.5.0       gtable_0.3.4
>
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

Thanks John. Appreciate the insights.

On 9/17/2023 9:43 AM, John Fox wrote:
> Dear Robert,
> [...]

Also, I would guess that the code predates the use of backticks for non-syntactic names. Could they be deployed here?

- Peter

> On 17 Sep 2023, at 16:43 , John Fox <jfox at mcmaster.ca> wrote:
>
> Dear Robert,
> [...]

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com