Dear Robert,

Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it.

The symbolic form of the hypothesis isn't really relevant for Anova(), which doesn't use the printed representation of each hypothesis, and so, despite the warnings, you get the correct ANOVA table. In your case, where the data are balanced, with 4 cases per cell, Anova(mod) and summary(mod) are equivalent, which makes me wonder why you would use Anova() in the first place.

To elaborate a bit, linearHypothesis() does tolerate arithmetic operators in coefficient names if you specify the hypothesis symbolically rather than as a hypothesis matrix. For example, to test the interaction:

------- snip --------

> linearHypothesis(mod,
+                  c("TreatmentDabrafenib:ExpressionCD271+ = 0",
+                    "TreatmentTrametinib:ExpressionCD271+ = 0",
+                    "TreatmentCombination:ExpressionCD271+ = 0"))
Linear hypothesis test

Hypothesis:
TreatmentDabrafenib:ExpressionCD271+ = 0
TreatmentTrametinib:ExpressionCD271+ = 0
TreatmentCombination:ExpressionCD271+ = 0

Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828

------- snip --------

Alternatively:

------- snip --------

> H <- matrix(0, 3, 8)
> H[1, 6] <- H[2, 7] <- H[3, 8] <- 1
> H
     [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
[1,]    0    0    0    0    0    1    0    0
[2,]    0    0    0    0    0    0    1    0
[3,]    0    0    0    0    0    0    0    1

> linearHypothesis(mod, H)
Linear hypothesis test

Hypothesis:


Model 1: restricted model
Model 2: Viability ~ Treatment * Expression

  Res.Df   RSS Df Sum of Sq     F Pr(>F)
1     27 18966
2     24 16739  3    2226.3 1.064 0.3828
Warning message:
In printHypothesis(L, rhs, names(b)) :
  one or more coefficients in the hypothesis include
     arithmetic operators in their names;
  the printed representation of the hypothesis will be omitted

------- snip --------

There's no good reason that linearHypothesis() should try to express each hypothesis symbolically for Anova(), since Anova() doesn't use that information. When I have some time, I'll arrange to avoid the warning.

Best,
 John

-- 
John Fox, Professor Emeritus
McMaster University
Hamilton, Ontario, Canada
web: https://www.john-fox.ca/

On 2023-09-16 4:39 p.m., Robert Baer wrote:
> When doing Anova using the car package, I get a print warning that is
> unexpected. It seemingly involves having my flow cytometry factor levels
> named CD271+ and CD271-. But I am not sure this warning should be
> intended behavior. Any explanation about whether I'm doing something
> wrong? Why can't I have CD271+ and CD271- as factor levels? It's legal
> text, isn't it?
>
> library(car)
> mod = aov(Viability ~ Treatment*Expression, data = dat1)
> Anova(mod, type =2)
>
> Anova Table (Type II tests)
>
> Response: Viability
>                       Sum Sq Df F value    Pr(>F)
> Treatment            19447.3  3  9.2942 0.0002927 ***
> Expression            2669.8  1  3.8279 0.0621394 .
> Treatment:Expression  2226.3  3  1.0640 0.3828336
> Residuals            16739.3 24
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Warning messages:
> 1: In printHypothesis(L, rhs, names(b)) :
>   one or more coefficients in the hypothesis include arithmetic
>   operators in their names; the printed representation of the
>   hypothesis will be omitted
> 2: In printHypothesis(L, rhs, names(b)) :
>   one or more coefficients in the hypothesis include arithmetic
>   operators in their names; the printed representation of the
>   hypothesis will be omitted
> 3: In printHypothesis(L, rhs, names(b)) :
>   one or more coefficients in the hypothesis include arithmetic
>   operators in their names; the printed representation of the
>   hypothesis will be omitted
>
> The code to reproduce:
>
> ```
> dat1 <- structure(list(
>   Treatment = structure(c(1L, 1L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>       2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L,
>       4L, 4L, 4L, 4L),
>     levels = c("Control", "Dabrafenib", "Trametinib", "Combination"),
>     class = "factor"),
>   Expression = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L,
>       2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L,
>       1L, 1L, 1L, 1L),
>     levels = c("CD271-", "CD271+"), class = "factor"),
>   Viability = c(128.329809725159, 24.2360176821065,
>     76.3597924274457, 11.0128771862387, 21.4683836248318,
>     140.784162982894, 87.4303286565443,
>     118.181818181818, 53.603690178743,
>     51.2973284643475, 5.47760907168941,
>     27.1574091870075, 50.8360561214684,
>     56.5250816836441, 28.6949836632712,
>     93.2731116663463, 71.900826446281,
>     32.2314049586777, 24.2360176821065,
>     27.4649240822602, 24.0822602344801,
>     26.542379396502, 30.693830482414,
>     27.772438977513, 13.4729963482606,
>     8.24524312896406, 18.5469921199308,
>     13.9342686911397, 13.3192389006342,
>     19.9308091485681, 17.6244474341726,
>     16.2406304055353)),
>   row.names = c(NA, -32L),
>   class = c("tbl_df", "tbl", "data.frame"))
>
> mod = aov(Viability ~ Treatment*Expression, data = dat1)
> summary(mod)
> library(car)
> Anova(mod, type =2)
> ```
>
> > sessionInfo()
> R version 4.3.1 (2023-06-16 ucrt)
> Platform: x86_64-w64-mingw32/x64 (64-bit)
> Running under: Windows 11 x64 (build 25951)
>
> Matrix products: default
>
> locale:
> [1] LC_COLLATE=English_United States.utf8
>     LC_CTYPE=English_United States.utf8
>     LC_MONETARY=English_United States.utf8
> [4] LC_NUMERIC=C
>     LC_TIME=English_United States.utf8
>
> time zone: America/Chicago
> tzcode source: internal
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> other attached packages:
> [1] car_3.1-2 carData_3.0-5 tidyr_1.3.0 readr_2.1.4 readxl_1.4.3
>     ggplot2_3.4.3 dplyr_1.1.3
>
> loaded via a namespace (and not attached):
>  [1] crayon_1.5.2 vctrs_0.6.3 cli_3.6.1 rlang_1.1.1 purrr_1.0.2
>      generics_0.1.3 labeling_0.4.3
>  [8] bit_4.0.5 glue_1.6.2 colorspace_2.1-0 hms_1.1.3 scales_1.2.1
>      fansi_1.0.4 grid_4.3.1
> [15] cellranger_1.1.0 abind_1.4-5 munsell_0.5.0 tibble_3.2.1 tzdb_0.4.0
>      lifecycle_1.0.3 compiler_4.3.1
> [22] pkgconfig_2.0.3 rstudioapi_0.15.0 farver_2.1.1 R6_2.5.1
>      tidyselect_1.2.0 utf8_1.2.3 parallel_4.3.1
> [29] vroom_1.6.3 pillar_1.9.0 magrittr_2.0.3 bit64_4.0.5 tools_4.3.1
>      withr_2.5.0 gtable_0.3.4
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
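
The hypothesis matrix in the message above is filled in by position (columns 6, 7 and 8). As a minimal sketch, the same matrix can be built by looking up the interaction coefficients by name, which avoids hard-coding positions. The data here are simulated stand-ins with the same 4 x 2 design, not the poster's dat1, and the Wald F computed at the end is the textbook formula using base R only; it is offered as an illustration of what such a test amounts to, not as car's internal code.

```r
# Hypothetical stand-in data (NOT the poster's dat1): same 4 x 2 design,
# 4 cases per cell, with simulated responses.
set.seed(1)
trt  <- c("Control", "Dabrafenib", "Trametinib", "Combination")
expr <- c("CD271-", "CD271+")
dat <- expand.grid(rep = 1:4,
                   Treatment  = factor(trt, levels = trt),
                   Expression = factor(expr, levels = expr))
dat$Viability <- rnorm(nrow(dat), mean = 50, sd = 20)

mod <- aov(Viability ~ Treatment * Expression, data = dat)
b <- coef(mod)

# Locate the interaction coefficients by name instead of by position.
inter <- grep(":", names(b), fixed = TRUE)

# One row per interaction coefficient, with a single 1 in its column.
H <- matrix(0, nrow = length(inter), ncol = length(b),
            dimnames = list(NULL, names(b)))
H[cbind(seq_along(inter), inter)] <- 1

# Textbook Wald F statistic for the hypothesis H b = 0.
Hb <- H %*% b
Fstat <- drop(t(Hb) %*% solve(H %*% vcov(mod) %*% t(H), Hb)) / nrow(H)

# For comparison: the interaction F from the sequential ANOVA table.
F_anova <- anova(mod)[["F value"]][3]
all.equal(Fstat, F_anova)
```

With this design the interaction is the last term in the model, so the Wald F and the sequential-table F should agree up to rounding. Passing H to linearHypothesis(mod, H) would still be expected to trigger the printing warning discussed in the thread, since the coefficient names are unchanged.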
Thanks John. Appreciate the insights.

On 9/17/2023 9:43 AM, John Fox wrote:
> There's no good reason that linearHypothesis() should try to express
> each hypothesis symbolically for Anova(), since Anova() doesn't use
> that information. When I have some time, I'll arrange to avoid the
> warning.
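
Until the warning is suppressed in car itself, one possible workaround is to relabel the factor levels so that coefficient names no longer contain arithmetic operators. The sketch below uses base R only. The helper has_operator() and its regular expression are assumptions made for illustration (the thread does not say exactly which characters car checks for), and the replacement labels "CD271neg" and "CD271pos" are hypothetical.

```r
# A small factor with the level names from the thread.
expr <- factor(c("CD271-", "CD271+", "CD271+", "CD271-"),
               levels = c("CD271-", "CD271+"))

# Hypothetical helper: flags names containing common arithmetic operators.
# The pattern is an approximation, not car's own check.
has_operator <- function(x) grepl("[-+*/^]", x)

has_operator(levels(expr))

# Relabel the levels in place; the underlying integer codes are unchanged,
# so the data and any fitted model are the same apart from the names.
expr2 <- expr
levels(expr2) <- c("CD271neg", "CD271pos")   # hypothetical labels

has_operator(levels(expr2))

# Coefficient names derived from the relabelled factor are operator-free.
colnames(model.matrix(~ expr2))
```

The trade-off is cosmetic: plots and tables then show the substitute labels, so the original names would need to be restored for presentation.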
Also, I would guess that the code precedes the use of backticks in non-syntactic names. Could they be deployed here?

- Peter

> On 17 Sep 2023, at 16:43 , John Fox <jfox at mcmaster.ca> wrote:
>
> Anova() calls linearHypothesis(), also in the car package, to compute sums of squares and df, supplying appropriate hypothesis matrices. linearHypothesis() usually tries to express the hypothesis matrix in symbolic equation form for printing, but won't do this if coefficient names include arithmetic operators, in your case - and +, which can confuse it.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
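
To make the backtick idea concrete, here is a small base-R sketch of how the parser treats a coefficient name such as the ones in this thread, with and without backticks. It only shows R's own parsing behaviour. Whether linearHypothesis() accepts backtick-quoted names in a symbolic hypothesis is not established in this thread, and the coefficient value used is made up for illustration.

```r
nm <- "TreatmentDabrafenib:ExpressionCD271+"

# Wrapped in backticks, the whole string parses as a single symbol.
sym <- str2lang(paste0("`", nm, "`"))
is.name(sym)
identical(as.character(sym), nm)

# Without backticks the trailing "+" leaves the expression incomplete,
# so parsing fails.
unquoted_ok <- tryCatch({ str2lang(nm); TRUE }, error = function(e) FALSE)
unquoted_ok

# A backtick-quoted name can be used inside a larger expression and
# evaluated against a named list of coefficients (value is made up).
b <- c("TreatmentDabrafenib:ExpressionCD271+" = 1.5)
eval(str2lang("`TreatmentDabrafenib:ExpressionCD271+` * 2"), as.list(b))
```

The point of the sketch is only that backticks give the parser an unambiguous way to separate a non-syntactic name from the operators around it.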