Hi!

I have a dataset that looks like this:

0.0 14
0.0 3
0.9 12
0.73 15
0.78 2
1.0 15
0.3 2
0.32 8

...and so on.

I.e. a value between 0 and 1, and a number

I would like to plot this in a histogram-like manner. I would like to have a set of bins, each 0.1 wide, and plot the sum of values in column 2 that falls within each bin. I.e, in this case I would like the first bin, 0.0, to have the value 17, the second, 0.1, to have the value 0 and so on, until the last bin which has the value 15. I am sadly uncertain of both how to sum these together, and also on which plot type to use.

Thanks in advance!

Karin
--
Karin Lagesen, Ph.D.
Centre for Ecological and Evolutionary Synthesis (CEES)
University of Oslo, Dept. of Biology
P.O. Box 1066 Blindern 0316 Oslo, Norway
Ph. +47 22844132 Fax. +47 22854001
Email karin.lagesen at bio.uio.no
http://folk.uio.no/karinlag
On Tue, Mar 29, 2011 at 11:05:08AM +0200, Karin Lagesen wrote:
> Hi!
>
> I have a dataset that looks like this:
>
> 0.0 14
> 0.0 3
> 0.9 12
> ...and so on.
>
> I would like to plot this in a histogram-like manner.

One way would be to re-create the original data and then simply use hist:

dat <- data.frame(x=c(0,0,0.9,0.73,0.78,1,0.3,0.32), freq=c(14,3,12,15,2,15,2,8))
hist(with(dat, rep(x, times=freq)))

My example did not take special binning wishes into account, but you can easily customize that with the breaks argument to hist.

cu
Philipp

--
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
Maximus-von-Imhof-Forum 3
85354 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/
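
For readers wondering what the breaks customization might look like, here is a minimal sketch using the eight rows quoted in the question. The choices of right = FALSE and include.lowest = TRUE are assumptions about the desired bin edges, not something stated in the reply above.

```r
# Example rows from the original question
dat <- data.frame(x    = c(0, 0, 0.9, 0.73, 0.78, 1, 0.3, 0.32),
                  freq = c(14, 3, 12, 15, 2, 15, 2, 8))

# Repeat each value as often as its count says, then bin with explicit
# 0.1-wide breaks. right = FALSE makes the bins left-closed, [a, b);
# include.lowest = TRUE then closes the last bin, so x = 1 lands in [0.9, 1].
expanded <- with(dat, rep(x, times = freq))
h <- hist(expanded, breaks = (0:10) / 10, right = FALSE,
          include.lowest = TRUE, xlab = "value", main = "")

# Bar heights, one per bin
h$counts
```

With these settings the ten counts are 17, 0, 0, 10, 0, 0, 0, 17, 0, 27. Note that the value 1.0 shares the last bin with 0.9 here, which differs from the question, where 1.0 is expected to have a bin of its own.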
Look at the cut, tapply, and barplot functions. There is probably also a nice way to do this using the ggplot2 package.

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Karin Lagesen
> Sent: Tuesday, March 29, 2011 3:05 AM
> To: r-help at r-project.org
> Subject: [R] producing histogram-like plot
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
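
The pointer to cut, tapply, and barplot could be sketched roughly as follows, again using the eight rows from the question. The eleven-bin layout (so that 1.0 gets its own bin) and the names breaks, bin, and sums are assumptions made for illustration.

```r
# Example rows from the original question
dat <- data.frame(x    = c(0, 0, 0.9, 0.73, 0.78, 1, 0.3, 0.32),
                  freq = c(14, 3, 12, 15, 2, 15, 2, 8))

# Eleven left-closed bins [0, 0.1), [0.1, 0.2), ..., [1.0, 1.1), so that a
# value of exactly 1.0 falls into a bin of its own, as in the question.
# (0:11) / 10 is used instead of seq(0, 1.1, by = 0.1) because the seq()
# version carries floating-point error (its fourth element is slightly
# above 0.3), which would push 0.3 into the bin below.
breaks <- (0:11) / 10
bin <- cut(dat$x, breaks = breaks, right = FALSE,
           labels = sprintf("%.1f", breaks[-length(breaks)]))

# Sum column 2 within each bin; bins without data come back as NA
sums <- tapply(dat$freq, bin, sum)
sums[is.na(sums)] <- 0

# Bars labelled by the lower edge of each bin
barplot(sums, space = 0, xlab = "bin (lower edge)", ylab = "sum of column 2")
```

For the example rows this gives 17 for bin 0.0, 0 for bin 0.1, and 15 for the last bin, 1.0, matching the values stated in the question.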
Hi:
Here's one way to do it, using ggplot2 and base graphics.
# Simulate a data frame with values between 0 and 1 and a corresponding
# frequency
df <- data.frame(val = round(runif(100), 2), freq = rpois(100, 10))
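# findInterval assigns the values between 0 and 1 to intervals of width 0.1.
# You need to be careful that you get 10 intervals here rather than 11: the
# intervals are closed on the left and open on the right, so a rounded value
# of exactly 1.00 would otherwise start an 11th interval. Setting
# rightmost.closed = TRUE puts it into the last interval, [0.9, 1], instead.
# The breaks are written as (0:10)/10 rather than seq(0, 1, by = 0.1) because
# the seq() version carries floating-point error (its fourth element is
# slightly above 0.3), which would put a value of exactly 0.3 one interval
# too low.
df$interval <- with(df, findInterval(val, (0:10)/10, rightmost.closed = TRUE))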
# There are several ways to sum the frequencies by interval, but since I want
# to keep things in data frames (for input to ggplot2), I chose ddply() from
# the plyr package; tapply(), aggregate() and several functions in other
# packages would also work.
library(ggplot2)
library(plyr)  # needed for ddply(); current ggplot2 versions do not attach plyr
tab <- ddply(df, 'interval', summarise, cfreq = sum(freq))
# Generate a bar chart in ggplot2
ggplot(tab, aes(x = interval/10 - 0.05, y = cfreq)) +
    geom_bar(stat = 'identity', fill = 'orange', color = 'orange') +
    labs(x = 'value', y = 'frequency')
# Same graph in base graphics, but here I removed the space between bars. The
# plot is saved to an object so that the x-axis labels can be replaced.
u <- barplot(tab$cfreq, space = 0, col = 'red', ylim = c(0, 130),
             xlab = 'Value', ylab = 'Frequency')
axis(1, at = u, labels = u/10)
box()
HTH,
Dennis
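
The remark that aggregate() would also work can be illustrated with a base-R-only sketch on the eight rows from the question. The merge step that keeps empty intervals as zero is an added assumption about what the plot should show; the names dat, tab, and full are chosen for illustration.

```r
# Example rows from the original question
dat <- data.frame(val  = c(0, 0, 0.9, 0.73, 0.78, 1, 0.3, 0.32),
                  freq = c(14, 3, 12, 15, 2, 15, 2, 8))

# Ten intervals; rightmost.closed = TRUE puts val == 1 into the last one
dat$interval <- findInterval(dat$val, (0:10) / 10, rightmost.closed = TRUE)

# aggregate() returns one row per interval that actually occurs in the data
tab <- aggregate(freq ~ interval, data = dat, FUN = sum)

# Merge against the full set of intervals so that empty ones appear as 0
full <- merge(data.frame(interval = 1:10), tab, all.x = TRUE)
full$freq[is.na(full$freq)] <- 0
full
```

The resulting full$freq can be passed to barplot() in the same way as tab$cfreq above. Without the merge step, intervals that contain no data are simply missing from the table, and the bars would no longer line up with the axis labels.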