Rui Barradas
2021-Aug-16 22:33 UTC
[R] Including percentage values inside columns of a histogram
Hello,

You forgot to cc the list.

Here are two ways; both apply hist() and text() to Amount split by Date. The return value of hist() is saved because it is a list whose members include the bars' midpoints (mids) and the counts (counts). Those are used to know where to put the text labels. A vector lbls is created so that bins with a count of zero get no label.

The main difference between the two ways is the histograms' titles.


old_par <- par(mfrow = c(1, 3))
h_list <- with(datasetregs, tapply(Amount, Date, function(x){
  h <- hist(x)
  # NA labels are not drawn, so empty bins stay blank
  lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
  text(h$mids, h$counts/2, labels = lbls)
  h  # return the histogram object so that h_list keeps it
}))
par(old_par)


old_par <- par(mfrow = c(1, 3))
sp <- split(datasetregs, datasetregs$Date)
h_list <- lapply(seq_along(sp), function(i){
  # one title per fiscal year, taken from the names of the split
  hist_title <- paste("Histogram of", names(sp)[i])
  h <- hist(sp[[i]]$Amount, main = hist_title)
  lbls <- ifelse(h$counts == 0, NA_integer_, h$counts)
  text(h$mids, h$counts/2, labels = lbls)
  h  # return the histogram object so that h_list keeps it
})
par(old_par)


Hope this helps,

Rui Barradas

At 23:16 on 16/08/21, Paul Bernal wrote:
> Dear Rui,
>
> The hist() function comes from the graphics package, from what I could
> see. The thing is that I want to divide the Amount column into several
> bins and then generate three different histograms, one for each AF
> period (AF refers to fiscal years). As you can see, the data contains
> three fiscal years (2017, 2020 and 2021). I want to see the percentage
> of cases that fall into different amount categories, from 15,000 and
> below, 16,000 to 17,000, from 18,000 to 19,000, and so on.
>
> Thanks for your kind help.
>
> Paul
>
> On Mon, 16 Aug 2021 at 17:07, Rui Barradas (<ruipbarradas at sapo.pt>) wrote:
> > Hello,
> >
> > The function Hist comes from what package?
> >
> > Are you sure you don't want a bar plot?
> >
> >
> > agg <- aggregate(Amount ~ Date, datasetregs, sum)
> > bp <- barplot(Amount ~ Date, agg)
> > with(agg, text(bp, Amount/2, labels = Amount))
> >
> >
> > Hope this helps,
> >
> > Rui Barradas
> >
> > At 22:54 on 16/08/21, Paul Bernal wrote:
> > > Hello everyone,
> > >
> > > I am currently working with R version 4.1.0 and I am trying to include
> > > the percentage distribution inside the columns of the histogram. I
> > > want to generate three histograms, one for each fiscal year (in the
> > > Date column, there are three fiscal years: AF 2017, AF 2020 and
> > > AF 2021). However, I can't seem to accomplish this.
> > >
> > > Here is my data:
> > >
> > > structure(list(Date = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 2L,
> > > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > > 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
> > > 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > > 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L,
> > > 3L, 3L, 3L), .Label = c("AF 2017", "AF 2020", "AF 2021"),
> > >     class = "factor"),
> > >     Amount = c(40100, 101100, 35000, 40100, 15000, 45100, 40200,
> > >     15000, 35000, 35100, 20300, 40100, 15000, 67100, 17100, 15000,
> > >     15000, 50100, 35100, 15000, 15000, 15000, 15000, 15000, 15000,
> > >     15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000, 15000,
> > >     15000, 15000, 20100, 15000, 15000, 15000, 15000, 15000, 15000,
> > >     16600, 15000, 15000, 15700, 15000, 15000, 15000, 15000, 15000,
> > >     15000, 15000, 15000, 15000, 20200, 21400, 25100, 15000, 15000,
> > >     15000, 15000, 15000, 15000, 25600, 15000, 15000, 15000, 15000,
> > >     15000, 15000, 15000, 15000)), row.names = c(NA, -74L),
> > >     class = "data.frame")
> > >
> > > I would like to modify the following script:
> > >
> > > with(datasetregs, Hist(Amount, groups=Date, scale="frequency",
> > >   breaks="Sturges", col="darkgray"))
> > >
> > > # The only thing missing here are the percentages corresponding to
> > > # each bin (I would like to see the percentages inside each column,
> > > # or on top outside if possible)
> > >
> > > Any help will be greatly appreciated.
> > >
> > > Best regards,
> > >
> > > Paul.
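For readers who want to see what hist() hands back before drawing any labels, here is a minimal sketch. The vector x and the breaks are made-up toy values chosen for illustration, not the data from this thread; plot = FALSE is used only so that the bins can be inspected without opening a graphics device.

```r
# Toy values, made up for illustration (not the thread's data set)
x <- c(15000, 15000, 15000, 17100, 20300, 35000, 40100)

# plot = FALSE computes the bins without drawing anything
h <- hist(x, breaks = seq(10000, 50000, by = 10000), plot = FALSE)

h$breaks  # bin boundaries
h$counts  # number of observations in each bin
h$mids    # bin midpoints, usable as x positions for text()
```

The members counts and mids have one element per bin, which is why they can be passed straight to text() as label values and x coordinates.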
Paul Bernal
2021-Aug-16 22:43 UTC
[R] Including percentage values inside columns of a histogram
This is way better. Now, how could I put the frequency labels in the columns as a percentage, instead of presenting them as counts?

Thank you so much.

Paul
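One possible way to show percentages instead of counts is to divide each bin's count by the total for that histogram before building the labels. The sketch below is only an illustration under stated assumptions: pct_labels() is a hypothetical helper written for this example, and toy is a small made-up data frame standing in for datasetregs. Percentages are computed within each fiscal year, which may or may not be the denominator that is wanted.

```r
# Hypothetical helper: one percentage label per bin, NA for empty bins
pct_labels <- function(counts, digits = 1) {
  pct <- 100 * counts / sum(counts)
  ifelse(counts == 0, NA_character_, paste0(round(pct, digits), "%"))
}

# Made-up toy data standing in for datasetregs
toy <- data.frame(
  Date = factor(rep(c("AF 2017", "AF 2020"), times = c(4, 6))),
  Amount = c(40100, 35000, 15000, 45100,
             15000, 15000, 20300, 15000, 67100, 15000)
)

old_par <- par(mfrow = c(1, 2))
sp <- split(toy, toy$Date)
h_list <- lapply(seq_along(sp), function(i) {
  h <- hist(sp[[i]]$Amount,
            main = paste("Histogram of", names(sp)[i]),
            xlab = "Amount")
  # NA labels are not drawn, so empty bins stay blank
  text(h$mids, h$counts / 2, labels = pct_labels(h$counts))
  h  # keep the histogram object in h_list
})
par(old_par)
```

To put the labels on top of the bars rather than inside them, the y coordinate h$counts / 2 could be replaced by h$counts together with pos = 3 in text(); the ylim argument of hist() may then need some extra headroom.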