thr3ads.net - R devel - [Rd] generating contrast names [Dec 2002]

If this information is useful, please help other people find it:
Share via:

John Fox

2002-Dec-01 18:00 UTC

[Rd] generating contrast names

Dear R-devel list members,

I'd like to suggest a more flexible procedure for generating contrast 
names. I apologise for a relatively long message -- I want my proposal to 
be clear.

I've never liked the current approach. For example, the names generated by 
contr.treatment paste factor to level names with no separation between the 
two; contr.sum simply numbers contrasts (I recall an exchange on the list 
about this question); none of contr.* explicitly indicates what kind of 
contrasts are generated. There are ways around these problems -- e.g., 
beginning level names with an uppercase character -- but in my experience 
it's common to see things like "regionmidwest".

I propose that the current behaviour remain the default, but that the 
contr.* functions use optional "prefix" and "suffix"
characters around
level names (or their equivalent) and an optional identifier string to 
indicate what kind of contrasts are generated. As well, I propose that 
contr.sum optionally identify contrasts by level names. All of these 
behaviours could be controlled by options which, if unset, would produce 
the current behaviour. There might be problems with my implementation of 
these ideas, or with the ideas themselves -- but I expect that these will 
arise in discussion. (Of course, there's nothing to prevent me from using 
the functions that I show below, but I thought that these questions might 
be of more general interest.)

Here, for example, are rewrites of contr.treatment and contr.sum (renamed 
contr.Treatment and contr.Sum because of name-space issues) that implement 
my suggestions, together with illustrations of their use:

------------------------------------------------------------------------------------------------------------------------------

     contr.Treatment <- function (n, base = 1, contrasts = TRUE) {
         if (is.numeric(n) && length(n) == 1)
             levs <- 1:n
         else {
             levs <- n
             n <- length(n)
         }
         lev.opt <- getOption("decorate.factor.levels")
         pre <- if (is.null(lev.opt)) "" else lev.opt[1]
         suf <- if (is.null(lev.opt)) "" else lev.opt[2]
         dec <- getOption("decorate.contr.Treatment")
         dec <- if (is.null(dec)) "" else dec
         contr.names <- paste(pre, dec, levs, suf, sep="")
         contr <- array(0, c(n, n), list(levs, contr.names))
         diag(contr) <- 1
         if (contrasts) {
             if (n < 2)
                 stop(paste("Contrasts not defined for", n - 1,
"degrees of
freedom"))
             if (base < 1 | base > n)
                 stop("Baseline group number out of range")
             contr <- contr[, -base, drop = FALSE]
         }
         contr
     }

     contr.Sum <- function (n, contrasts = TRUE)
     {
         if (length(n) <= 1) {
             if (is.numeric(n) && length(n) == 1 && n > 1)
                 levels <- 1:n
             else stop("Not enough degrees of freedom to define
contrasts")
         }
         else levels <- n
         lenglev <- length(levels)
         lev.opt <- getOption("decorate.factor.levels")
         pre <- if (is.null(lev.opt)) "" else lev.opt[1]
         suf <- if (is.null(lev.opt)) "" else lev.opt[2]
         dec <- getOption("decorate.contr.Sum")
         dec <- if (is.null(dec)) "" else dec
         show.lev <- getOption("contr.Sum.show.levels")
         contr.names <- if ((!is.null(show.lev)) && show.lev)
paste(pre,
dec, levels, suf, sep="")
         if (contrasts) {
             cont <- array(0, c(lenglev, lenglev - 1), list(levels,
                 contr.names[-lenglev]))
             cont[col(cont) == row(cont)] <- 1
             cont[lenglev, ] <- -1
         }
         else {
             cont <- array(0, c(lenglev, lenglev), list(levels, levels))
             cont[col(cont) == row(cont)] <- 1
         }
         cont
     }

     > library(car)
        . . .

     > data(Prestige)
     > attach(Prestige)
     > contrasts(type) <- "contr.Treatment"
     >
     > lm(prestige ~ (income + education)*type) # default behaviour

     Call:
     lm(formula = prestige ~ (income + education) * type)

     Coefficients:
         (Intercept)              income           education 
typeprof              typewc
             2.275753            0.003522            1.713275 
15.351896          -33.536652
     income:typeprof       income:typewc  education:typeprof 
education:typewc
             -0.002903           -0.002072            1.387809 
4.290875

     >
     > options(decorate.factor.levels=c("[", "]")) #
using brackets
     > lm(prestige ~ (income + education)*type)

     Call:
     lm(formula = prestige ~ (income + education) * type)

     Coefficients:
             (Intercept)                income             education 
     type[prof]              type[wc]
                 2.275753              0.003522              1.713275 
       15.351896            -33.536652
     income:type[prof]       income:type[wc]  education:type[prof] 
education:type[wc]
             -0.002903             -0.002072              1.387809 
     4.290875

     >
     > options(decorate.contr.Treatment="T.") # naming contrast
type
     > lm(prestige ~ (income + education)*type)

     Call:
     lm(formula = prestige ~ (income + education) * type)

     Coefficients:
             (Intercept)                  income               education 
         type[T.prof]
                 2.275753                0.003522                1.713275 
             15.351896
                 type[T.wc]     income:type[T.prof]       income:type[T.wc] 
  education:type[T.prof]
                 -33.536652               -0.002903               -0.002072 
                1.387809
     education:type[T.wc]
                 4.290875

     >
     > options(decorate.contr.Treatment=NULL)
     > options(decorate.factor.levels=c("#", "")) #
alternate style, using hash
     > lm(prestige ~ (income + education)*type)

     Call:
     lm(formula = prestige ~ (income + education) * type)

     Coefficients:
             (Intercept)               income            education 
   type#prof              type#wc
             2.275753             0.003522             1.713275 
15.351896           -33.536652
     income:type#prof       income:type#wc  education:type#prof 
education:type#wc
             -0.002903            -0.002072             1.387809 
  4.290875

     >
     > lm(prestige ~ (income + education)*type, 
contrasts=list(type="contr.Sum")) # default behaviour

     Call:
     lm(formula = prestige ~ (income + education) * type, contrasts = 
list(type = "contr.Sum"))

     Coefficients:
         (Intercept)           income        education            type1 
        type2     income:type1     income:type2
         -3.785832         0.001864         3.606169         6.061585 
  21.413481         0.001658        -0.001244
     education:type1  education:type2
         -1.892895        -0.505086

     >
     > options(contr.Sum.show.levels=TRUE)
     > options(decorate.factor.levels=c("[", "]"))
     > options(decorate.contr.Sum="S.")  # showing levels,
brackets,
contrast type
     > lm(prestige ~ (income + education)*type, 
contrasts=list(type="contr.Sum"))

     Call:
     lm(formula = prestige ~ (income + education) * type, contrasts = 
list(type = "contr.Sum"))

     Coefficients:
             (Intercept)                  income               education 
           type[S.bc]
                 -3.785832                0.001864                3.606169 
               6.061585
             type[S.prof]       income:type[S.bc]     income:type[S.prof] 
  education:type[S.bc]
                 21.413481                0.001658               -0.001244 
              -1.892895
     education:type[S.prof]
                 -0.505086

------------------------------------------------------------------------------------------------------------------------------

I look forward to people's comments.
  John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox@mcmaster.ca
phone: 905-525-9140x23604
web: www.socsci.mcmaster.ca/jfox
-----------------------------------------------------

Torsten Hothorn

2002-Dec-02 09:40 UTC

head link

[Rd] generating contrast names

> Dear R-devel list members,
> 
> I'd like to suggest a more flexible procedure for generating contrast 
> names. I apologise for a relatively long message -- I want my proposal to 
> be clear.
> 
> I've never liked the current approach. For example, the names generated
by
> contr.treatment paste factor to level names with no separation between the 
> two; contr.sum simply numbers contrasts (I recall an exchange on the list 
> about this question); none of contr.* explicitly indicates what kind of 
> contrasts are generated. There are ways around these problems -- e.g., 
> beginning level names with an uppercase character -- but in my experience 
> it's common to see things like "regionmidwest".
> 
> I propose that the current behaviour remain the default, but that the 
> contr.* functions use optional "prefix" and "suffix"
characters around
> level names (or their equivalent) and an optional identifier string to 
> indicate what kind of contrasts are generated. As well, I propose that 
> contr.sum optionally identify contrasts by level names. All of these 
> behaviours could be controlled by options which, if unset, would produce 
> the current behaviour. There might be problems with my implementation of 
> these ideas, or with the ideas themselves -- but I expect that these will 
> arise in discussion. (Of course, there's nothing to prevent me from
using
> the functions that I show below, but I thought that these questions might 
> be of more general interest.)
> 
very interesting suggestion. We had the same problem within the multcomp
package and we decided to paste the factor names, levels and kind of
contrasts used into one single name where possible, for example Dunnett: 
> ci <- simint(minutes ~ blanket, data=recovery, conf.level=0.9,
+       alternative="less",eps=0.0001)> summary(ci)
        Simultaneous 90% confidence intervals: Dunnett contrasts

Call: 
simint.formula(formula = minutes ~ blanket, data = recovery,
    conf.level = 0.9, alternative = "less", eps = 1e-04)

         Dunnett contrasts for factor blanket

Contrast matrix:
                      blanketb0 blanketb1 blanketb2 blanketb3
blanketb1-blanketb0 0        -1         1         0         0
blanketb2-blanketb0 0        -1         0         1         0
blanketb3-blanketb0 0        -1         0         0         1

where the factor is `blanket' at levels b0,b1,b2,b3. Of course this is
restricted to simple contrasts (like the difference here) but breaks for
more complicated situations (tetrade constrasts for example).

best,

Torsten



> Here, for example, are rewrites of contr.treatment and contr.sum (renamed 
> contr.Treatment and contr.Sum because of name-space issues) that implement 
> my suggestions, together with illustrations of their use:
> 
>
------------------------------------------------------------------------------------------------------------------------------
> 
>      contr.Treatment <- function (n, base = 1, contrasts = TRUE) {
>          if (is.numeric(n) && length(n) == 1)
>              levs <- 1:n
>          else {
>              levs <- n
>              n <- length(n)
>          }
>          lev.opt <- getOption("decorate.factor.levels")
>          pre <- if (is.null(lev.opt)) "" else lev.opt[1]
>          suf <- if (is.null(lev.opt)) "" else lev.opt[2]
>          dec <- getOption("decorate.contr.Treatment")
>          dec <- if (is.null(dec)) "" else dec
>          contr.names <- paste(pre, dec, levs, suf, sep="")
>          contr <- array(0, c(n, n), list(levs, contr.names))
>          diag(contr) <- 1
>          if (contrasts) {
>              if (n < 2)
>                  stop(paste("Contrasts not defined for", n - 1,
"degrees of
> freedom"))
>              if (base < 1 | base > n)
>                  stop("Baseline group number out of range")
>              contr <- contr[, -base, drop = FALSE]
>          }
>          contr
>      }
> 
>      contr.Sum <- function (n, contrasts = TRUE)
>      {
>          if (length(n) <= 1) {
>              if (is.numeric(n) && length(n) == 1 && n >
1)
>                  levels <- 1:n
>              else stop("Not enough degrees of freedom to define
contrasts")
>          }
>          else levels <- n
>          lenglev <- length(levels)
>          lev.opt <- getOption("decorate.factor.levels")
>          pre <- if (is.null(lev.opt)) "" else lev.opt[1]
>          suf <- if (is.null(lev.opt)) "" else lev.opt[2]
>          dec <- getOption("decorate.contr.Sum")
>          dec <- if (is.null(dec)) "" else dec
>          show.lev <- getOption("contr.Sum.show.levels")
>          contr.names <- if ((!is.null(show.lev)) && show.lev)
paste(pre,
> dec, levels, suf, sep="")
>          if (contrasts) {
>              cont <- array(0, c(lenglev, lenglev - 1), list(levels,
>                  contr.names[-lenglev]))
>              cont[col(cont) == row(cont)] <- 1
>              cont[lenglev, ] <- -1
>          }
>          else {
>              cont <- array(0, c(lenglev, lenglev), list(levels, levels))
>              cont[col(cont) == row(cont)] <- 1
>          }
>          cont
>      }
> 
>      > library(car)
>         . . .
> 
>      > data(Prestige)
>      > attach(Prestige)
>      > contrasts(type) <- "contr.Treatment"
>      >
>      > lm(prestige ~ (income + education)*type) # default behaviour
> 
>      Call:
>      lm(formula = prestige ~ (income + education) * type)
> 
>      Coefficients:
>          (Intercept)              income           education 
> typeprof              typewc
>              2.275753            0.003522            1.713275 
> 15.351896          -33.536652
>      income:typeprof       income:typewc  education:typeprof 
> education:typewc
>              -0.002903           -0.002072            1.387809 
> 4.290875
> 
>      >
>      > options(decorate.factor.levels=c("[", "]")) #
using brackets
>      > lm(prestige ~ (income + education)*type)
> 
>      Call:
>      lm(formula = prestige ~ (income + education) * type)
> 
>      Coefficients:
>              (Intercept)                income             education 
>      type[prof]              type[wc]
>                  2.275753              0.003522              1.713275 
>        15.351896            -33.536652
>      income:type[prof]       income:type[wc]  education:type[prof] 
> education:type[wc]
>              -0.002903             -0.002072              1.387809 
>      4.290875
> 
>      >
>      > options(decorate.contr.Treatment="T.") # naming
contrast type
>      > lm(prestige ~ (income + education)*type)
> 
>      Call:
>      lm(formula = prestige ~ (income + education) * type)
> 
>      Coefficients:
>              (Intercept)                  income               education 
>          type[T.prof]
>                  2.275753                0.003522                1.713275 
>              15.351896
>                  type[T.wc]     income:type[T.prof]       income:type[T.wc]
>   education:type[T.prof]
>                  -33.536652               -0.002903               -0.002072
>                 1.387809
>      education:type[T.wc]
>                  4.290875
> 
>      >
>      > options(decorate.contr.Treatment=NULL)
>      > options(decorate.factor.levels=c("#", "")) #
alternate style, using hash
>      > lm(prestige ~ (income + education)*type)
> 
>      Call:
>      lm(formula = prestige ~ (income + education) * type)
> 
>      Coefficients:
>              (Intercept)               income            education 
>    type#prof              type#wc
>              2.275753             0.003522             1.713275 
> 15.351896           -33.536652
>      income:type#prof       income:type#wc  education:type#prof 
> education:type#wc
>              -0.002903            -0.002072             1.387809 
>   4.290875
> 
>      >
>      > lm(prestige ~ (income + education)*type, 
> contrasts=list(type="contr.Sum")) # default behaviour
> 
>      Call:
>      lm(formula = prestige ~ (income + education) * type, contrasts = 
> list(type = "contr.Sum"))
> 
>      Coefficients:
>          (Intercept)           income        education            type1 
>         type2     income:type1     income:type2
>          -3.785832         0.001864         3.606169         6.061585 
>   21.413481         0.001658        -0.001244
>      education:type1  education:type2
>          -1.892895        -0.505086
> 
>      >
>      > options(contr.Sum.show.levels=TRUE)
>      > options(decorate.factor.levels=c("[", "]"))
>      > options(decorate.contr.Sum="S.")  # showing levels,
brackets,
> contrast type
>      > lm(prestige ~ (income + education)*type, 
> contrasts=list(type="contr.Sum"))
> 
>      Call:
>      lm(formula = prestige ~ (income + education) * type, contrasts = 
> list(type = "contr.Sum"))
> 
>      Coefficients:
>              (Intercept)                  income               education 
>            type[S.bc]
>                  -3.785832                0.001864                3.606169 
>                6.061585
>              type[S.prof]       income:type[S.bc]     income:type[S.prof] 
>   education:type[S.bc]
>                  21.413481                0.001658               -0.001244 
>               -1.892895
>      education:type[S.prof]
>                  -0.505086
> 
>
------------------------------------------------------------------------------------------------------------------------------
> 
> I look forward to people's comments.
>   John
> -----------------------------------------------------
> John Fox
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada L8S 4M4
> email: jfox@mcmaster.ca
> phone: 905-525-9140x23604
> web: www.socsci.mcmaster.ca/jfox
> -----------------------------------------------------
> 
> ______________________________________________
> R-devel@stat.math.ethz.ch mailing list
> http://www.stat.math.ethz.ch/mailman/listinfo/r-devel
>

Seemingly Similar Threads

Search for more reasonably related threads

R devel - Dec 2002 - generating contrast names

[Rd] generating contrast names

[Rd] generating contrast names

Seemingly Similar Threads