Hi!

I have a dataset that looks like this:

0.0 14
0.0 3
0.9 12
0.73 15
0.78 2
1.0 15
0.3 2
0.32 8

...and so on.

I.e. a value between 0 and 1, and a number.

I would like to plot this in a histogram-like manner. I would like to have a set of bins, each 0.1 wide, and plot the sum of values in column 2 that falls within each bin. I.e., in this case I would like the first bin, 0.0, to have the value 17, the second, 0.1, to have the value 0 and so on, until the last bin which has the value 15. I am sadly uncertain of both how to sum these together, and also on which plot type to use.

Thanks in advance!

Karin

--
Karin Lagesen, Ph.D.
Centre for Ecological and Evolutionary Synthesis (CEES)
University of Oslo, Dept. of Biology
P.O. Box 1066 Blindern 0316 Oslo, Norway
Ph. +47 22844132 Fax. +47 22854001
Email karin.lagesen at bio.uio.no
http://folk.uio.no/karinlag

On Tue, Mar 29, 2011 at 11:05:08AM +0200, Karin Lagesen wrote:
> Hi!
>
> I have a dataset that looks like this:
>
> 0.0 14
> 0.0 3
> 0.9 12
> ...and so on.
>
> I would like to plot this in a histogram-like manner.

One way would be to re-create the original data and then simply use hist:

    dat <- data.frame(x=c(0,0,0.9,0.73,0.78,1,0.3,0.32), freq=c(14,3,12,15,2,15,2,8))
    hist(with(dat, rep(x, times=freq)))

My example does not take your special binning wishes into account, but you can easily customize that with the breaks argument to hist.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
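
To make the remark about the breaks argument concrete, here is a minimal sketch on the example data from the original post. It rests on two assumptions that are not stated in the thread: that the numbers in column 2 are non-negative integers (rep() needs that), and that the request for a last bin with the value 15 means 1.0 should get a bin of its own, which gives 11 left-closed bins rather than 10.

```r
# Example data from the original post
dat <- data.frame(x    = c(0, 0, 0.9, 0.73, 0.78, 1, 0.3, 0.32),
                  freq = c(14, 3, 12, 15, 2, 15, 2, 8))

# Assumption: 11 left-closed bins [0, 0.1), [0.1, 0.2), ..., [1.0, 1.1],
# so that the value 1.0 lands in a bin of its own
breaks <- (0:11) / 10

# Repeat each value freq times, then let hist() do the counting
h <- hist(with(dat, rep(x, times = freq)),
          breaks = breaks, right = FALSE,
          main = "", xlab = "value", ylab = "sum of column 2")

# Per-bin sums, in bin order
h$counts
```

With right = FALSE a value that sits exactly on a break, such as 0.3 or 0.9, is counted in the bin that starts there. If 1.0 should instead be merged into the 0.9 bin, breaks = (0:10) / 10 with the default include.lowest = TRUE would do that.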

Look at the cut, tapply, and barplot functions. There is probably also a nice way to do this using the ggplot2 package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Karin Lagesen
> Sent: Tuesday, March 29, 2011 3:05 AM
> To: r-help at r-project.org
> Subject: [R] producing histogram-like plot
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
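
The three functions named above can be combined roughly as follows. This is only a sketch on the example data from the original post, not necessarily what the poster had in mind; the 11-bin layout (1.0 in its own bin) and the axis labels are assumptions made for illustration.

```r
# Example data from the original post
dat <- data.frame(x    = c(0, 0, 0.9, 0.73, 0.78, 1, 0.3, 0.32),
                  freq = c(14, 3, 12, 15, 2, 15, 2, 8))

# Assumption: left-closed bins [0, 0.1), ..., [1.0, 1.1), so 1.0 gets its own bin.
# (0:11) / 10 is used instead of seq(0, 1.1, by = 0.1) so that breaks such as
# 0.3 compare equal to the data value 0.3.
breaks <- (0:11) / 10

# cut() turns each value into a factor level naming its bin
dat$bin <- cut(dat$x, breaks = breaks, right = FALSE)

# tapply() sums column 2 within each bin; empty bins come back as NA
sums <- tapply(dat$freq, dat$bin, sum)
sums[is.na(sums)] <- 0

# barplot() draws one bar per bin, labelled with the lower bin edge
barplot(sums, names.arg = head(breaks, -1), space = 0,
        xlab = "value (lower bin edge)", ylab = "sum of column 2")
```

Because cut() keeps every bin as a factor level, empty bins still get a (zero-height) bar, which keeps the x axis evenly spaced.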

Hi:

Here's one way to do it, using ggplot2 and base graphics.

    # Simulate a data frame with values between 0 and 1 and a corresponding frequency
    df <- data.frame(val = round(runif(100), 2), freq = rpois(100, 10))

    # findInterval assigns the values between 0 and 1 to intervals of width 0.1. You need
    # to be careful that you get 10 intervals here rather than 11: the intervals are closed
    # on the left and open on the right, so a rounded value of 1.00 would start an 11th
    # interval unless rightmost.closed = TRUE is set.
    # The breaks are written as (0:10) / 10 because seq(0, 1, by = 0.1) contains 3 * 0.1,
    # which is slightly larger than 0.3 and would push a value of 0.30 into the bin below.
    df$interval <- with(df, findInterval(val, (0:10) / 10, rightmost.closed = TRUE))

    # There are several ways to cumulate the frequencies by interval, but since I want
    # to keep things in data frames (for input to ggplot2), I chose ddply() from the plyr
    # package; tapply(), aggregate() and several functions in other packages would also work.
    library(plyr)     # ddply() and summarise(); recent ggplot2 versions do not attach plyr
    library(ggplot2)
    tab <- ddply(df, 'interval', summarise, cfreq = sum(freq))

    # Generate a bar chart in ggplot2
    ggplot(tab, aes(x = interval/10 - 0.05, y = cfreq)) +
        geom_bar(stat = 'identity', fill = 'orange', color = 'orange') +
        labs(x = 'value', y = 'frequency')

    # Same graph in base graphics, but here I removed the space between bars. The
    # bar midpoints are saved to an object so that the x-axis labels can be added.
    u <- barplot(tab$cfreq, space = 0, col = 'red',
                 ylim = c(0, 1.1 * max(tab$cfreq)),   # headroom above the tallest bar
                 xlab = 'Value', ylab = 'Frequency')
    axis(1, at = u, labels = u/10)
    box()

HTH,
Dennis

On Tue, Mar 29, 2011 at 2:05 AM, Karin Lagesen <karin.lagesen@bio.uio.no> wrote:
> I would like to plot this in a histogram-like manner. I would like to have
> a set of bins, each 0.1 wide, and plot the sum of values in column 2 that
> falls within each bin.
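
For comparison, here is a small base-R sketch of the findInterval step on the eight example rows from the original post, using aggregate() and xtabs() in place of ddply(). It is meant as an illustration under stated assumptions, not as the method used above: the variable names are made up, and the breaks are written as (0:10) / 10 because multiples such as 3 * 0.1 from seq(0, 1, by = 0.1) are slightly larger than 0.3 in floating point, so a value of exactly 0.3 could otherwise land one bin too low.

```r
# Example data from the original post
dat <- data.frame(x    = c(0, 0, 0.9, 0.73, 0.78, 1, 0.3, 0.32),
                  freq = c(14, 3, 12, 15, 2, 15, 2, 8))

breaks <- (0:10) / 10

# Without rightmost.closed, the value 1.0 starts an 11th interval
idx_open   <- findInterval(dat$x, breaks)
# With rightmost.closed = TRUE, 1.0 is folded into the 10th interval
idx_closed <- findInterval(dat$x, breaks, rightmost.closed = TRUE)

range(idx_open)    # spans 1 to 11
range(idx_closed)  # spans 1 to 10

# Store the interval as a factor with all 10 levels so empty bins are not lost
dat$interval <- factor(idx_closed, levels = 1:10)

# aggregate() returns only the bins that actually occur in the data
agg <- aggregate(freq ~ interval, data = dat, FUN = sum)

# xtabs() keeps every factor level and fills empty bins with 0
full <- xtabs(freq ~ interval, data = dat)
```

The difference between agg and full matters for plotting: passing sums for the observed bins only to barplot() would place the bars side by side and shift them away from their true positions, whereas the zero-filled table keeps one slot per bin. Note also that with 10 intervals the last bin sums both 0.9 and 1.0, which differs from the value 15 expected in the original question.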