Dear R Users,
I have started to compile some useful hacks for the generation of nice
descriptive statistics. I hope that these functions & hacks are useful
to the wider R community. I hope that package developers also get some
inspiration from the code or from these ideas.
I have started to review various packages focused on descriptive
statistics - although I am still at the very beginning.
### Hacks / Code
- split table headers in 2 rows;
- split results over 2 rows: view.gtsummary(...);
- add abbreviations as footnotes: add.abbrev(...);
The results are exported as a web page (using shiny) and can be printed
as a pdf document. See the following pdf example:
https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.Example_1.pdf
### Example
# currently focused on package gtsummary
library(gtsummary)
library(xml2)
mtcars %>%
    # rename2():
    # - see file Tools.Data.R;
    # - behaves in most cases the same as dplyr::rename();
    rename2("HP" = "hp", "Displ" = disp, "Wt (klb)" = "wt", "Rar" = drat) %>%
    # as.factor.df():
    # - see file Tools.Data.R;
    # - encode as (ordered) factor;
    as.factor.df("cyl", "Cyl ") %>%
    # the Descriptive Statistics:
    tbl_summary(by = cyl) %>%
    modify_header(update = header) %>%
    add_p() %>%
    add_overall() %>%
    modify_header(update = header0) %>%
    # Hack: split long statistics !!!
    view.gtsummary(view=FALSE, len=8) %>%
    add.abbrev(
        c("Displ", "HP", "Rar", "Wt (klb)" = "Wt"),
        c("Displacement (in^3)", "Gross horsepower", "Rear axle ratio",
          "Weight (1000 lbs)"));
The required functions are on Github:
https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.R
The functions rename2() & as.factor.df() are only data helpers and can
also be found on Github:
https://github.com/discoleo/R/blob/master/Stat/Tools.Data.R
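For readers who do not want to source Tools.Data.R, the two data-preparation steps of the example can be approximated in base R. This is only a sketch under assumptions: it presumes that rename2() takes new = old name pairs and that the second argument of as.factor.df() ("Cyl ") is a label prefix; the actual implementations on Github may differ.

```r
# Base R approximation of the data-preparation steps (a sketch, not the
# code from Tools.Data.R).
dat <- mtcars

# Rename selected columns; the vector is written as new name = old name.
new.names <- c("HP" = "hp", "Displ" = "disp", "Wt (klb)" = "wt", "Rar" = "drat")
idx <- match(new.names, names(dat))
names(dat)[idx] <- names(new.names)

# Encode cyl as an ordered factor.
# Assumption: "Cyl " is used as a prefix for the factor labels.
lvl <- sort(unique(dat$cyl))
dat$cyl <- factor(dat$cyl, levels = lvl,
                  labels = paste0("Cyl ", lvl), ordered = TRUE)

str(dat[, c("cyl", "HP", "Displ", "Wt (klb)", "Rar")])
```

The resulting data frame can then be passed to tbl_summary(by = cyl) in the same way as in the example above.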
Note:
1.) The function add.abbrev() operates on the generated html-code:
- the functionality is more generic and could be used easily with other
packages that export web pages as well;
2.) Split statistics: this is an ugly hack. I plan to redesign the
functionality using xml-technologies, but I already have too many
side-projects.
3.) as.factor.df(): traditionally, one would create derived data-sets or
add a new column with the variable as factor (as the user may need the
numeric values for further analysis). But it looked nicer as a single
block of code.
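The idea in note 1.), post-processing the generated html rather than the table object, can be illustrated with xml2. The helper below is hypothetical and is not the author's add.abbrev(); it assumes that the footnote should simply be a paragraph placed directly after the first table of the page.

```r
library(xml2)

# Hypothetical helper (NOT add.abbrev() from Tools.DescriptiveStatistics.R):
# append a paragraph with the abbreviations after the first <table>.
add_abbrev_sketch <- function(html, abbr, label) {
  doc <- read_html(html)
  tbl <- xml_find_first(doc, "//table")
  txt <- paste0(abbr, " = ", label, collapse = "; ")
  # Create <p class="abbrev">txt</p> as the next sibling of the table;
  # unnamed arguments become the node text, named ones become attributes.
  xml_add_sibling(tbl, "p", txt, class = "abbrev", .where = "after")
  doc
}

html <- "<html><body><table><tr><td>HP</td></tr></table></body></html>"
doc <- add_abbrev_sketch(html,
  c("HP", "Rar"), c("Gross horsepower", "Rear axle ratio"))
cat(as.character(doc))
```

Because the helper only needs an html string, the same approach would work with any package that can export its table as html.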
Sincerely,
Leonard
If you think what you are doing is useful, why do you not put it in a package?! That is, after all, the whole purpose of packages.
I can only speak for myself, of course, but I doubt that posting long involved messages with code here is going to have anything like the utility of providing a package with carefully written and tested code and documented functionality.
If you have suggestions about how to improve a *particular* package, a better alternative is probably to contact the package maintainer.
Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
On Sat, Oct 2, 2021 at 3:00 PM Leonard Mada via R-help <r-help at r-project.org> wrote:
> [...]
Dear R users,
In the meantime, I wrote a new function:
apply.html(html, XPATH, FUN, ...)
This function applies FUN to the nodes selected using XPATH.
However, I wonder if it is possible to use simpler selectors (e.g. jQuery). Although I am not an expert with jQuery, it may be easier for end users than XPATH.
Package htmltools does not seem to offer support to import a native html file, nor do I see any functions using jQuery selectors. I do not seem to find any such packages. I would be glad for any hints.
Many thanks,
Leonard
======
Latest code is on Github:
https://github.com/discoleo/R/blob/master/Stat/Tools.DescriptiveStatistics.R
Notes:
1.) as.html() currently imports only a few types, but it could be easily extended to fully generic html;
Note: the export as shiny app may not work with a fully generic html; I have not yet explored all the implications!
2.) I am still struggling to understand how to best design the option: with.tags = TRUE.
3.) llammas.FUN: Was implemented at great expense and at the last minute, but unfortunately is still incomplete and important visual styles are missing. Help is welcomed.
On 10/3/2021 1:00 AM, Leonard Mada wrote:
> [...]
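A function with the signature apply.html(html, XPATH, FUN, ...) could look roughly like the sketch below. It is written with xml2 and is only a guess at the behaviour; the name apply_html_sketch() is hypothetical and the author's version on Github may work differently. xml2::read_html() also accepts a file path, so it can import a native html file. For selectors closer to jQuery, rvest::html_elements() accepts CSS selectors, which may be worth a look.

```r
library(xml2)

# Hypothetical sketch of apply.html(): call FUN on every node matched
# by the XPATH expression. xml2 nodes have reference semantics, so
# changes made inside FUN are visible in the original document.
apply_html_sketch <- function(html, XPATH, FUN, ...) {
  nodes <- xml_find_all(html, XPATH)
  for (i in seq_along(nodes)) {
    FUN(nodes[[i]], ...)
  }
  invisible(html)
}

doc <- read_html("<table><tr><td>a</td><td>b</td></tr></table>")

# Example: add a prefix to the text of every table cell.
apply_html_sketch(doc, "//td", function(node, prefix) {
  xml_text(node) <- paste0(prefix, xml_text(node))
}, prefix = "x-")

xml_text(xml_find_all(doc, "//td"))
```

Additional arguments are forwarded to FUN through `...`, which mirrors the convention of lapply().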