thr3ads.net - R help - [R] ggplot: add percentage for each element in legend and remove tick mark [Aug 2021]

Kai Yang

2021-Aug-13 21:48 UTC

[R] ggplot: add percentage for each element in legend and remove tick mark

Hello John,
I put my testing data below. I'm not sure how to use dupt() function. would
you please give me an example?
Thanks,
Kai

| ethnicity | individuals |
| Caucasian | 36062 |
| Ashkenazi Jewish | 4309 |
| Multiple | 3193 |
| Hispanic | 2113 |
| Asian. not specified | 1538 |
| Chinese | 1031 |
| African | 643 |
| Unknown | 510 |
| Filipino | 222 |
| Japanese | 129 |
| Native American | 116 |
| Indian | 111 |
| Pacific Islander | 23 |



On Friday, August 13, 2021, 06:21:29 AM PDT, John Kane <jrkrideau at gmail.com> wrote:

Would you supply some sample data please? A handy way to supply sample
data is to use the dput() function. See ?dput. If you have a very
large data set then something like head(dput(myfile), 100) will likely
supply enough data for us to work with.
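
A note on the call order in that suggestion: dput() runs before head(), so head(dput(myfile), 100) prints the whole object first. A minimal sketch of the form that limits the output, assuming a data frame named myfile (a made-up stand-in here, not the poster's data):

```r
# Hypothetical stand-in for a large data set
myfile <- data.frame(id = 1:500, value = seq(0.5, 250, by = 0.5))

# Take the first 100 rows first, then print them in a form
# that can be pasted straight into an email or an R session
small <- head(myfile, 100)
dput(small)
```

The printed structure(...) text is what a reader would paste back into R to rebuild the same 100 rows.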

On Thu, 12 Aug 2021 at 11:45, Kai Yang via R-help <r-help at r-project.org> wrote:
>
> Hello List,
> I use the following code to generate a donut plot.
> # Compute percentages
> eth$fraction = eth$individuals / sum(eth$individuals)
> # Compute the cumulative percentages (top of each rectangle)
> eth$ymax = cumsum(eth$fraction)
> # Compute the bottom of each rectangle
> eth$ymin = c(0, head(eth$ymax, n=-1))
> # Make the plot using percentage
> ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
>   geom_rect() +
>   coord_polar(theta="y") +
>   xlim(c(2, 4))
>
> I want to improve the plot in two ways:
> 1. the legend: I need to add the percentage (eth$fraction * 100, followed by %) for each element.
> 2. remove all the numbers (tick marks?) around the plot
> Please help
> Thank you,
> Kai
>
>        [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


-- 
John Kane
Kingston ON Canada
  
	[[alternative HTML version deleted]]

Bert Gunter

2021-Aug-13 22:02 UTC

[R] ggplot: add percentage for each element in legend and remove tick mark

It's dput()  *not* dupt() .  ?dput tells you how to use it (as usual).

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )


Avi Gross

2021-Aug-14 01:29 UTC

[R] ggplot: add percentage for each element in legend and remove tick mark

Kai,

It is easier to want to help someone if they generally know what they are doing
and are stuck on something. Less so when they do not know enough to explain to
us what they want, show what they did, and so on.

I modified the data you showed and hopefully it can be recreated this way:

library(tidyverse)

df <- tribble(
  ~ethnicity, ~individuals,
  "Caucasian", 36062,
  "Ashkenazi Jewish", 4309,
  "Multiple", 3193,
  "Hispanic", 2113,
  "Asian. not specified", 1538,
  "Chinese", 1031,
  "African", 643,
  "Unknown", 510,
  "Filipino", 222,
  "Japanese", 129,
  "Native American", 116,
  "Indian", 111,
  "Pacific Islander", 23)

If it was not clear, assuming you already had your data in some variable with a
name, like my df, you could do this:
> dput(df)
structure(list(
  ethnicity = c(
    "Caucasian",
    "Ashkenazi Jewish",
    "Multiple",
    "Hispanic",
    "Asian. not specified",
    "Chinese",
    "African",
    "Unknown",
    "Filipino",
    "Japanese",
    "Native American",
    "Indian",
    "Pacific Islander"
  ),
  individuals = c(36062, 4309, 3193, 2113,
                  1538, 1031, 643, 510, 222, 129, 116, 111, 23)
), row.names = c(NA, -13L), class = c("tbl_df", "tbl", "data.frame"))

The above structure can be used to recreate the data somewhat portably including
a cut and paste like this:

Restoring <- the.above.put.here
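
As a concrete sketch of that cut and paste, here is the same idea with the structure shortened to its first three rows. The name restored is arbitrary, and the class is reduced to a plain "data.frame" so the snippet runs without the tibble package; both are choices made for this illustration only:

```r
# Paste the dput() output on the right-hand side of an assignment
restored <- structure(list(
  ethnicity = c("Caucasian", "Ashkenazi Jewish", "Multiple"),
  individuals = c(36062, 4309, 3193)
), row.names = c(NA, -3L), class = "data.frame")

# The object is now an ordinary data frame again
print(restored)
```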

The question you ask may better be answered by CHANGING what is in df before
calling ggplot.

Be that as it may, with lots of work on your badly formatted code as shown in
plain text, I have this:
> eth
# A tibble: 13 x 5
   ethnicity            individuals fraction  ymax  ymin
   <chr>                      <dbl>    <dbl> <dbl> <dbl>
 1 Caucasian                  36062  0.721   0.721 0
 2 Ashkenazi Jewish            4309  0.0862  0.807 0.721
 3 Multiple                    3193  0.0639  0.871 0.807
 4 Hispanic                    2113  0.0423  0.914 0.871
 5 Asian. not specified        1538  0.0308  0.944 0.914
 6 Chinese                     1031  0.0206  0.965 0.944
 7 African                      643  0.0129  0.978 0.965
 8 Unknown                      510  0.0102  0.988 0.978
 9 Filipino                     222  0.00444 0.992 0.988
10 Japanese                     129  0.00258 0.995 0.992
11 Native American              116  0.00232 0.997 0.995
12 Indian                       111  0.00222 1.00  0.997
13 Pacific Islander              23  0.00046 1     1.00
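
For anyone following along, the table above can be rebuilt from the raw counts with the three assignments from the original post. This is a sketch that uses a plain data.frame rather than a tibble so that it needs no extra packages; the column names match the thread:

```r
eth <- data.frame(
  ethnicity = c("Caucasian", "Ashkenazi Jewish", "Multiple", "Hispanic",
                "Asian. not specified", "Chinese", "African", "Unknown",
                "Filipino", "Japanese", "Native American", "Indian",
                "Pacific Islander"),
  individuals = c(36062, 4309, 3193, 2113, 1538, 1031, 643, 510,
                  222, 129, 116, 111, 23)
)

# Share of the total for each group
eth$fraction <- eth$individuals / sum(eth$individuals)
# Top of each ring segment: running total of the shares
eth$ymax <- cumsum(eth$fraction)
# Bottom of each segment: the previous segment's top, starting at 0
eth$ymin <- c(0, head(eth$ymax, n = -1))

print(eth)
```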

I used your ggplot code, reformatted so people can read and run it as:

ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
  geom_rect() +
  coord_polar(theta="y")  +
  xlim(c(2, 4))

It shows a donut plot that I am not sure I can easily share here. You want to
change the legend by adding more to it. Sure, there are tons of ways to do that,
BUT I am not sure what you actually want.

ONE WAY to do what you want is to make a new column like this:
> eth$label <- paste(eth$ethnicity, " ", eth$fraction*100, "%", sep="")
> eth
# A tibble: 13 x 6
   ethnicity            individuals fraction  ymax  ymin label
   <chr>                      <dbl>    <dbl> <dbl> <dbl> <chr>
 1 Caucasian                  36062  0.721   0.721 0     Caucasian 72.124%
 2 Ashkenazi Jewish            4309  0.0862  0.807 0.721 Ashkenazi Jewish 8.618%
 3 Multiple                    3193  0.0639  0.871 0.807 Multiple 6.386%
 4 Hispanic                    2113  0.0423  0.914 0.871 Hispanic 4.226%
 5 Asian. not specified        1538  0.0308  0.944 0.914 Asian. not specified 3.076%
 6 Chinese                     1031  0.0206  0.965 0.944 Chinese 2.062%
 7 African                      643  0.0129  0.978 0.965 African 1.286%
 8 Unknown                      510  0.0102  0.988 0.978 Unknown 1.02%
 9 Filipino                     222  0.00444 0.992 0.988 Filipino 0.444%
10 Japanese                     129  0.00258 0.995 0.992 Japanese 0.258%
11 Native American              116  0.00232 0.997 0.995 Native American 0.232%
12 Indian                       111  0.00222 1.00  0.997 Indian 0.222%
13 Pacific Islander              23  0.00046 1     1.00  Pacific Islander 0.046%

Now once you make the labels look like the exact way you want, you need to ask
ggplot to substitute your labels, and make sure they line up right. It may be
tricky and may require making factors properly. You may also want to round the
percentages to all be the same. You can also use scale_fill_discrete to change
other things like replace "ethnicity" with another phrase and so on.
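
To make the rounding and the "line up right" points concrete, here is one possible sketch, not the only way to do it. It formats every percentage to two decimals with sprintf(), turns ethnicity into a factor whose levels follow the data order (otherwise ggplot sorts a character column alphabetically in the legend), and keeps the labels in a vector named by ethnicity. The name legend_labels and the "(…%)" label style are choices made for this example:

```r
eth <- data.frame(
  ethnicity = c("Caucasian", "Ashkenazi Jewish", "Multiple", "Hispanic",
                "Asian. not specified", "Chinese", "African", "Unknown",
                "Filipino", "Japanese", "Native American", "Indian",
                "Pacific Islander"),
  individuals = c(36062, 4309, 3193, 2113, 1538, 1031, 643, 510,
                  222, 129, 116, 111, 23)
)
eth$fraction <- eth$individuals / sum(eth$individuals)

# Same number of decimals for every entry; "%%" prints a literal percent sign
eth$label <- sprintf("%s (%.2f%%)", eth$ethnicity, 100 * eth$fraction)

# Names tie each label to its own ethnicity, whatever order the legend uses
legend_labels <- setNames(eth$label, eth$ethnicity)

# Keep the legend in data order (largest group first) instead of alphabetical
eth$ethnicity <- factor(eth$ethnicity, levels = eth$ethnicity)

print(legend_labels)
```

A vector like legend_labels can then be handed to the labels argument of scale_fill_discrete().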

Here is the additional part of ggplot that makes the change:

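ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
  geom_rect() +
  coord_polar(theta="y") +
  xlim(c(2, 4)) +
  # Name the labels by ethnicity so each one is matched to its own legend
  # entry; an unnamed vector is applied in the legend's (alphabetical) order.
  scale_fill_discrete(labels = setNames(eth$label, eth$ethnicity))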

Removing the tick mark text can be done by setting the right elements of a theme
as in the following:

ggplot(eth, aes(ymax=ymax, ymin=ymin, xmax=4, xmin=3, fill=ethnicity)) +
  geom_rect() +
  coord_polar(theta="y") +
  xlim(c(2, 4)) +
  # Named labels are matched to legend entries by ethnicity, not by position
  scale_fill_discrete(labels = setNames(eth$label, eth$ethnicity)) +
  theme(axis.ticks = element_blank(),
        axis.text = element_blank())

Only one of the two above is actually needed, and you can experiment.
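
For completeness, here is a self-contained sketch that puts the pieces together. It swaps in theme_void(), a stock ggplot2 theme that drops axis text, ticks, grid lines and background in one call, in place of the explicit theme() settings above; that swap, the legend title "Ethnicity" and the one-decimal label format are choices made for this illustration, not something from the original question:

```r
library(ggplot2)

eth <- data.frame(
  ethnicity = c("Caucasian", "Ashkenazi Jewish", "Multiple", "Hispanic",
                "Asian. not specified", "Chinese", "African", "Unknown",
                "Filipino", "Japanese", "Native American", "Indian",
                "Pacific Islander"),
  individuals = c(36062, 4309, 3193, 2113, 1538, 1031, 643, 510,
                  222, 129, 116, 111, 23)
)
eth$fraction <- eth$individuals / sum(eth$individuals)
eth$ymax <- cumsum(eth$fraction)
eth$ymin <- c(0, head(eth$ymax, n = -1))
eth$label <- sprintf("%s (%.1f%%)", eth$ethnicity, 100 * eth$fraction)

p <- ggplot(eth, aes(ymax = ymax, ymin = ymin, xmax = 4, xmin = 3,
                     fill = ethnicity)) +
  geom_rect() +
  coord_polar(theta = "y") +
  xlim(c(2, 4)) +
  # Labels are named by ethnicity so they match the right legend keys
  scale_fill_discrete(name = "Ethnicity",
                      labels = setNames(eth$label, eth$ethnicity)) +
  # Removes the numbers around the ring along with the rest of the axes
  theme_void()

print(p)
```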

I can send you personally an attachment showing the output as this is a text
only setup.




