Bert Gunter
2021-Aug-17 00:49 UTC
[R] Including percentage values inside columns of a histogram
I may well misunderstand, but the proffered solutions seem more complicated
than necessary.
Note that the return value of hist() can be saved as a list of class "histogram"
and then plotted with plot.histogram(), which already has a "labels"
argument that seems to be what you want. A simple example is:
dat <- runif(50, 0, 10)
myhist <- hist(dat, freq = TRUE, breaks ="Sturges")
plot(myhist, col = "darkgray",
     labels = as.character(round(myhist$density*100,1) ),
     ylim = c(0, 1.1*max(myhist$counts)))
## note that this is plot.histogram because myhist has class "histogram"
Note that I expanded the y axis a bit to be sure to include the labels. You
can, of course, plot your separate years as Rui has indicated or via e.g.
?layout.
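For the separate-years case, a layout() version might look like the sketch below. The data frame toy and its columns grp and amount are made up for illustration and only stand in for the real data. The labels here use counts/sum(counts), so they add up to roughly 100% within each panel.

```r
# Hypothetical stand-in data: two groups of amounts (not the poster's data)
set.seed(1)
toy <- data.frame(
  grp = rep(c("A", "B"), each = 30),
  amount = c(runif(30, 0, 10), runif(30, 5, 15))
)
groups <- unique(as.character(toy$grp))

# One panel per group, side by side
layout(matrix(seq_along(groups), nrow = 1))
for (g in groups) {
  h <- hist(toy$amount[toy$grp == g], plot = FALSE)   # compute bins only
  pct <- round(100 * h$counts / sum(h$counts), 1)     # percent of cases per bin
  plot(h, col = "darkgray", main = g, xlab = "amount",
       labels = paste0(pct, "%"),
       ylim = c(0, 1.1 * max(h$counts)))              # headroom for the labels
}
layout(1)  # back to a single panel
```

Calling hist() with plot = FALSE first keeps the binning and the drawing separate, so the labels can be computed before anything is plotted.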
Apologies if I have misunderstood. Just ignore this in that case.
Otherwise, I leave it to you to fill in details.
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Mon, Aug 16, 2021 at 4:14 PM Paul Bernal <paulbernal07 at gmail.com> wrote:
> Dear Jim,
>
> Thank you so much for your kind reply. Yes, this is what I am looking for;
> however, I can't see clearly how the bars correspond to the bins on the
> x-axis. Maybe there is a way to align the amounts so that they match the
> columns. Sorry if I sound picky, but I just want to learn if there is a way
> to accomplish this.
>
> Best regards,
>
> Paul
>
> On Mon, 16 Aug 2021 at 17:57, Jim Lemon (<drjimlemon at gmail.com>) wrote:
>
> > Hi Paul,
> > I just worked out your first request:
> >
> > datasetregs <- structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L,
> > 2L,
> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class = "factor"),
> > Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
> > 15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
> > 15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
> > 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
> > 15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
> > 16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
> > 15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
> > 15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
> > 15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class = "data.frame")
> > # hist() has no groups or scale arguments, so they are omitted here
> > histval <- with(datasetregs, hist(Amount, breaks="Sturges", col="darkgray"))
> > library(plotrix)
> > histpcts <- paste0(round(100*histval$counts/sum(histval$counts),1),"%")
> > barlabels(histval$mids,histval$counts,histpcts)
> >
> > I think that's what you asked for:
> >
> > Jim
> >
> > On Tue, Aug 17, 2021 at 8:44 AM Paul Bernal <paulbernal07 at gmail.com> wrote:
> > >
> > > This is way better. Now, how could I put the frequency labels in the
> > > columns as percentages, instead of presenting them as counts?
> > >
> > > Thank you so much.
> > >
> > > Paul
> > >
> > > On Mon, 16 Aug 2021 at 17:33, Rui Barradas (<ruipbarradas at sapo.pt>) wrote:
> > >
> > > > Hello,
> > > >
> > > > You forgot to cc the list.
> > > >
> > > > Here are two ways; both apply hist() and text() to Amount split
> > > > by Date. The return value of hist() is saved because it is a list whose
> > > > members include the bars' midpoints and the counts. Those are used to
> > > > know where to put the text labels.
> > > > A vector lbls is created to get rid of counts of zero.
> > > >
> > > > The main difference between the two ways is the histograms' titles.
> > > >
> > > >
> > > > old_par <- par(mfrow = c(1, 3))
> > > > h_list <- with(datasetregs, tapply(Amount, Date, function(x){
> > > >   h <- hist(x)
> > > >   lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
> > > >   text(h$mids, h$counts/2, labels = lbls)
> > > > }))
> > > > par(old_par)
> > > >
> > > >
> > > > old_par <- par(mfrow = c(1, 3))
> > > > sp <- split(datasetregs, datasetregs$Date)
> > > > h_list <- lapply(seq_along(sp), function(i){
> > > >   hist_title <- paste("Histogram of", names(sp)[i])
> > > >   h <- hist(sp[[i]]$Amount, main = hist_title)
> > > >   lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
> > > >   text(h$mids, h$counts/2, labels = lbls)
> > > > })
> > > > par(old_par)
> > > >
> > > >
> > > > Hope this helps,
> > > >
> > > > Rui Barradas
> > > >
> > > > At 23:16 on 16/08/21, Paul Bernal wrote:
> > > > > Dear Rui,
> > > > >
> > > > > The hist() function comes from the graphics package, from what I could
> > > > > see. The thing is that I want to divide the Amount column into several
> > > > > bins and then generate three different histograms, one for each AF
> > > > > period (AF refers to fiscal years). As you can see, the data contains
> > > > > three fiscal years (2017, 2020 and 2021). I want to see the percentage
> > > > > of cases that fall into different amount categories: 15,000 and
> > > > > below, 16,000 to 17,000, 18,000 to 19,000, and so on.
> > > > >
> > > > > Thanks for your kind help.
> > > > >
> > > > > Paul
> > > > >
> > > > > On Mon, 16 Aug 2021 at 17:07, Rui Barradas (<ruipbarradas at sapo.pt>) wrote:
> > > > >
> > > > > Hello,
> > > > >
> > > > > The function Hist comes from what package?
> > > > >
> > > > > Are you sure you don't want a bar plot?
> > > > >
> > > > >
> > > > > > agg <- aggregate(Amount ~ Date, datasetregs, sum)
> > > > > bp <- barplot(Amount ~ Date, agg)
> > > > > with(agg, text(bp, Amount/2, labels = Amount))
> > > > >
> > > > >
> > > > > Hope this helps,
> > > > >
> > > > > Rui Barradas
> > > > >
> > > > > > At 22:54 on 16/08/21, Paul Bernal wrote:
> > > > > > > Hello everyone,
> > > > > > >
> > > > > > > I am currently working with R version 4.1.0 and I am trying to include
> > > > > > > (inside the columns of the histogram) the percentage distribution, and I
> > > > > > > want to generate three histograms, one for each fiscal year (in the Date
> > > > > > > column, there are three fiscal years: AF 2017, AF 2020 and AF 2021). However,
> > > > > > > I can't seem to accomplish this.
> > > > > >
> > > > > > Here is my data:
> > > > > >
> > > > > > > structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
> > > > > > > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > > > > > > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > > > > > > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > > > > > > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > > > > > > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"), class = "factor"),
> > > > > > > Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
> > > > > > > 15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
> > > > > > > 15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
> > > > > > > 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
> > > > > > > 15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
> > > > > > > 16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
> > > > > > > 15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
> > > > > > > 15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
> > > > > > > 15000, 15000, 15000, 15000)), row.names = c(NA, -74L), class = "data.frame")
> > > > > >
> > > > > > I would like to modify the following script:
> > > > > >
> > > > > > > > with(datasetregs, Hist(Amount, groups=Date, scale="frequency",
> > > > > > > + breaks="Sturges", col="darkgray"))
> > > > > > >
> > > > > > > # The only thing missing here is the percentage corresponding to each bin
> > > > > > > # (I would like to see the percentages inside each column, or on top,
> > > > > > > # outside, if possible)
> > > > > >
> > > > > > Any help will be greatly appreciated.
> > > > > >
> > > > > > Best regards,
> > > > > >
> > > > > > Paul.
> > > > > >
> > > > > > > [[alternative HTML version deleted]]
> > > > > > >
> > > > > > > ______________________________________________
> > > > > > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > > > > > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > > > > > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > > > > > > and provide commented, minimal, self-contained, reproducible code.
> > > > > >
> > > > >
> > > >
> > >
> > > [[alternative HTML version deleted]]
> > >
> > > ______________________________________________
> > > R-help at r-project.org mailing list -- To UNSUBSCRIBE and more,
see
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible
code.
> >
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
[[alternative HTML version deleted]]
Paul Bernal
2021-Aug-17 01:04 UTC
[R] Including percentage values inside columns of a histogram
Thank you very much, Mr. Gunter. I will give it a try.
Cheers,
Paul
Rui Barradas
2021-Aug-17 11:09 UTC
[R] Including percentage values inside columns of a histogram
Hello,
I had forgotten about plot.histogram; it does make everything simpler.
To put percentages on the bars, the code below uses package scales.
Note that, it seems to me, you do not want densities. To get
percentages, the proportions of counts are given by either of
h$counts/sum(h$counts)
h$density*diff(h$breaks)
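A quick check of that statement, using made-up numbers (the vector x and the unequal breaks are illustrative only): the two expressions agree, whereas density*100 is a percentage only when every bin has width 1.

```r
# Illustrative data with deliberately unequal bin widths
x <- c(1, 2, 2, 3, 7, 8, 12, 18)
h <- hist(x, breaks = c(0, 5, 10, 20), plot = FALSE)

p1 <- h$counts / sum(h$counts)      # proportion of cases per bin
p2 <- h$density * diff(h$breaks)    # density times bin width
all.equal(p1, p2)                   # compares the two vectors

h$density * 100                     # not percentages here: the bins are wider than 1
```

The density is defined so that the bar areas sum to 1, which is why it has to be multiplied by the bin width before it can be read as a proportion.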
# One histogram for all dates
h <- hist(datasetregs$Amount, plot = FALSE)
plot(h, labels = scales::percent(h$counts/sum(h$counts)),
     ylim = c(0, 1.1*max(h$counts)))

# Histograms by date
sp <- split(datasetregs, datasetregs$Date)
old_par <- par(mfrow = c(1, 3))
h_list <- lapply(seq_along(sp), function(i){
  hist_title <- paste("Histogram of", names(sp)[i])
  h <- hist(sp[[i]]$Amount, plot = FALSE)
  plot(h, main = hist_title, xlab = "Amount",
       labels = scales::percent(h$counts/sum(h$counts)),
       ylim = c(0, 1.1*max(h$counts)))
})
par(old_par)
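If installing scales is not an option, similar labels can be built in base R with sprintf(). This is only a minimal sketch: pct_label is a hypothetical helper name (not part of any package) and the amounts are invented.

```r
# pct_label(): hypothetical helper that formats proportions as percent strings
pct_label <- function(p, digits = 1) {
  sprintf(paste0("%.", digits, "f%%"), 100 * p)
}

x <- c(15000, 15000, 15700, 16600, 20100, 25600, 40100)  # made-up amounts
h <- hist(x, plot = FALSE)
plot(h, labels = pct_label(h$counts / sum(h$counts)),
     ylim = c(0, 1.1 * max(h$counts)))
```

The number of decimals is fixed by the digits argument, so the labels may differ slightly from what scales::percent() prints by default.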
Hope this helps,
Rui Barradas