Kwok, Heemun
2011-Mar-27 06:09 UTC
[R] Hmisc summary.formula formats for binary and continuous variables
Hello,
I am using Hmisc summary.formula, latex and Sweave to produce tables for
publication. Is it possible to change the formats for binary and continuous
variables? I would prefer to show 35 (10%) and 1.5 (1.2-1.8) rather than 10%
(35) and 1.2 / 1.5 / 1.8. Here is a simple example:
library(Hmisc)
sex <- factor(sample(c("m","f"), 500, replace=TRUE))
age <- rnorm(500, 50, 5)
treatment <- factor(sample(c("Drug","Placebo"), 500, replace=TRUE))
s1 <- summary(~sex + age)
s2 <- summary(treatment ~ sex + age, method="reverse")
print(s1); print(s2)
Descriptive Statistics  (N=500)

+-------+-----------------+
|       |                 |
+-------+-----------------+
|sex : m|    46% (232)    |
+-------+-----------------+
|age    |47.22/50.31/53.37|
+-------+-----------------+

Descriptive Statistics by treatment

+-------+-----------------+-----------------+
|       |Drug             |Placebo          |
|       |(N=257)          |(N=243)          |
+-------+-----------------+-----------------+
|sex : m|    47% (122)    |    45% (110)    |
+-------+-----------------+-----------------+
|age    |47.35/50.00/52.68|46.78/50.92/53.97|
+-------+-----------------+-----------------+

Thanks,
Heemun
-------------------------------------------------
Heemun Kwok, M.D.
Research Fellow
Harbor-UCLA Department of Emergency Medicine
1000 West Carson Street, Box 21
Torrance, CA 90509-2910
office 310-222-3501, fax 310-212-6101
Joshua Wiley
2011-Mar-27 09:44 UTC
[R] Hmisc summary.formula formats for binary and continuous variables
I played around with this for a while and did not get very far. I did
not see any arguments in summary.formula or its print methods to
reorder (happy to be corrected). Another approach I toyed with was to
create a custom function to pass to summary.formula() that would
itself create (something like) the desired output.
foo <- function(x) {
  n <- length(x)
  pct <- n/5  # percent of the 500 observations, i.e. 100 * n / 500
  c(FOO = paste(n, "(", round(pct, digits = 0), "%)", sep = ''))
}

> summary(treatment ~ sex + age, fun = foo, method = "response")
treatment    N=500

+-------+-----------+---+---------+
|       |           |N  |FOO      |
+-------+-----------+---+---------+
|sex    |f          |273|273(55%) |
|       |m          |227|227(45%) |
+-------+-----------+---+---------+
|age    |[36.8,46.7)|125|125(25%) |
|       |[46.7,50.0)|125|125(25%) |
|       |[50.0,53.3)|125|125(25%) |
|       |[53.3,67.5]|125|125(25%) |
+-------+-----------+---+---------+
|Overall|           |500|500(100%)|
+-------+-----------+---+---------+

However, it does not work with method = "reverse". Also, this
approach would seem to require either defining a very flexible
function or multiple ones for each different situation you come
across. Looking at print.summary.formula.reverse, the magic seems to
happen on lines 47-50:
cs <- formatCats(stats[[i]], nam, tr, type[i],
                 if (length(x$group.freq)) x$group.freq else x$n[i],
                 npct, pctdig, exclude1, long, prtest,
                 pdig = pdig, eps = eps)
which led me to explore formatCats(). A small tweak to the order of
the paste() call on lines 25-33 (after copying the altered version,
plus print.summary.formula.reverse, into the global environment)
got me:
print.summary.formula.reverse(summary(treatment ~ sex + age,
method="reverse"))
Descriptive Statistics by treatment

+-------+--------------+--------------+
|       |Drug          |Placebo       |
|       |(N=262)       |(N=238)       |
+-------+--------------+--------------+
|sex : m|  (118) 45%   |  (114) 48%   |
+-------+--------------+--------------+
|age    |46.5/50.0/53.8|46.6/49.5/52.6|
+-------+--------------+--------------+

which has the percentage info on the right side, though I did not take
the time to get the parentheses moved over. Still, it seems like
adding an argument that just flipped the order might not take that
much work/code.
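
To make the "flip the order" idea concrete, here is a minimal base-R sketch of a formatter with such an argument. It is not Hmisc's formatCats(); the function name format_count_pct and its arguments count_first and pctdig are hypothetical, chosen only for illustration.

```r
# Hypothetical helper, not part of Hmisc: format a count and its percentage,
# with an argument that flips which of the two comes first.
format_count_pct <- function(n, denom, count_first = TRUE, pctdig = 0) {
  pct <- paste0(round(100 * n / denom, pctdig), "%")
  if (count_first) {
    paste0(n, " (", pct, ")")   # e.g. "122 (47%)"
  } else {
    paste0(pct, " (", n, ")")   # e.g. "47% (122)"
  }
}

format_count_pct(122, 257)                       # "122 (47%)"
format_count_pct(122, 257, count_first = FALSE)  # "47% (122)"
```

The counts 122 and 257 are taken from the Drug column of the original table. Something equivalent inside formatCats() would still need the print method to pass the new argument through.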
Cheers,
Josh
(Though I cannot help but wonder if in response to "I want to cross
the street" I just said "we could start building a two-lane,
underground tunnel with...." and someone is probably going to come
along and point out the cross walk 10 feet down the street)
--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
Frank Harrell
2011-Mar-27 17:20 UTC
[R] Hmisc summary.formula formats for binary and continuous variables
If by 35 (10%) you mean that 35 is the numerator, this is not such a good
idea. That's because it emphasizes something that is not a scientific
quantity. A scientific quantity is something that has a meaning outside the
current sample. The numerator is dependent on the denominator.
Regarding the other formatting issue, summary.formula with method='reverse'
is not flexible enough to allow that.
Frank
-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
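
If the requested layout is still wanted, one possible workaround is to build the table cells outside summary.formula. The following is only a base-R sketch under assumptions: the helper names n_pct and med_iqr are hypothetical, the seed is arbitrary, and the rounding (whole percents, one decimal for quantiles) is a guess at what the poster wants. It does not use or modify anything in Hmisc.

```r
# Hypothetical base-R helpers (not part of Hmisc).
n_pct <- function(x, level) {
  x <- x[!is.na(x)]
  n <- sum(x == level)                         # count at the given level
  paste0(n, " (", round(100 * n / length(x)), "%)")
}
med_iqr <- function(x, digits = 1) {
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), na.rm = TRUE, names = FALSE)
  q <- formatC(q, format = "f", digits = digits)
  paste0(q[2], " (", q[1], "-", q[3], ")")     # median (Q1-Q3)
}

set.seed(1)  # arbitrary seed, only so the example is reproducible
sex <- factor(sample(c("m", "f"), 500, replace = TRUE))
age <- rnorm(500, 50, 5)
treatment <- factor(sample(c("Drug", "Placebo"), 500, replace = TRUE))

# One formatted cell per treatment group, one row per variable.
tab <- rbind(
  "sex : m" = sapply(split(sex, treatment), n_pct, level = "m"),
  "age"     = sapply(split(age, treatment), med_iqr)
)
print(tab, quote = FALSE)
```

The result is a plain character matrix with one column per treatment group, so the reporting of count versus percentage remains the analyst's choice rather than something the sketch decides.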