Hello

I need to plot a histogram, but instead of using bars, I'd like to plot the data points. I've been doing it like this so far:

  h <- hist(x, plot = FALSE)
  plot(y = h$counts / sum(h$counts),
       x = h$breaks[2:length(h$breaks)],
       type = "p", log = "xy")

Sometimes I want to have a look at the "raw" data (avoiding any kind of binning). When x only contains integers, it's easy to just use bins of size 1 when generating h with "breaks = seq(0, max(x))".

Is there any way to do something similar when x consists of fractional data? What I'm doing is setting a small bin length (for example, "breaks = seq(0, 1, by = 1e-6)"), but there's still a chance that several points will be grouped in a single bin.

Is there a better way to do this kind of "raw histogram" plotting?

Thanks,
Andre
Take a look at ?stem

There is still a place for handtools in the age of integrated circuits. Of course, avoiding binning isn't really desirable.

Roger Koenker
Department of Economics
University of Illinois
Champaign, IL 61820

On Feb 26, 2008, at 4:10 PM, Andre Nathan wrote:
> I need to plot a histogram, but instead of using bars, I'd like to plot
> the data points.
> [...]
> Is there a better way to do this kind of "raw histogram" plotting?
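For readers who have not used stem(), here is a minimal sketch of what the suggestion looks like in practice. The data are hypothetical (the original x is not shown in the thread), and the scale value is only an illustrative choice.

```r
# Hypothetical fractional data; the original poster's x is not shown.
set.seed(42)
x <- round(runif(50), 3)

# Stem-and-leaf display: printed to the console, not drawn on a graphics
# device, so individual values stay visible up to the displayed precision.
stem(x)

# A larger 'scale' stretches the display over more stems.
stem(x, scale = 2)
```

Because the output is text, this is mostly useful for small to moderate samples.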
On Tue, Feb 26, 2008 at 4:10 PM, Andre Nathan <andre at digirati.com.br> wrote:
> Hello
>
> I need to plot a histogram, but instead of using bars, I'd like to plot
> the data points. I've been doing it like this so far:
>
> h <- hist(x, plot = FALSE)
> plot(y = h$counts / sum(h$counts),
>      x = h$breaks[2:length(h$breaks)],
>      type = "p", log = "xy")

Another approach would be to use ggplot2, where all statistical transformations can be performed separately from their traditional appearance:

  install.packages("ggplot2")
  library(ggplot2)
  qplot(x, stat = "bin", geom = "bar")
  # geom = "point" is assumed here; the call as posted omitted the geom
  qplot(x, stat = "bin", geom = "point")

Hadley

-- 
http://had.co.nz/
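The qplot() calls above date from 2008, and as far as I know later ggplot2 releases no longer accept the stat argument in qplot(). A rough equivalent with the ggplot() interface might look like the sketch below. The data frame df, the sample data, and the choice of 30 bins are all assumptions for illustration, not part of the original reply.

```r
library(ggplot2)

# Hypothetical data; the original x is not shown in the thread.
set.seed(1)
df <- data.frame(x = rexp(500))

# Bin the data as a histogram would, but draw each bin count as a point
# instead of a bar.
p <- ggplot(df, aes(x = x)) +
  stat_bin(geom = "point", bins = 30)

print(p)
```

The binning is still there; only the drawing changes from bars to points.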
On Feb 27, 2008, at 8:16 AM, Andre Nathan wrote:
> On Wed, 2008-02-27 at 14:15 +1300, Peter Alspach wrote:
>> If I understand you correctly, you could try a barplot() on the result
>> of table().
>
> Hmm, table() does the counting exactly the way I want, i.e., just
> counting individual values. Is there a way to extract the counts vs. the
> values from a table, so that I can pass them as the x and y arguments to
> plot()?

  x <- table(rbinom(20, 2, 0.5))
  plot(names(x), x)

should do it. You can also try just plot(x). Use prop.table() on the table if you want the relative frequencies instead.

Haris Skiadas
Department of Mathematics and Computer Science
Hanover College
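Applied to fractional data, the table() approach could be sketched as follows. The sample data and the variable names (tab, values, counts, freqs) are made up for illustration; the explicit as.numeric() and as.vector() conversions are there so that plot() receives plain numeric vectors.

```r
# Hypothetical data, rounded to two decimals so that some values repeat.
set.seed(1)
x <- round(rexp(200, rate = 5), 2)

tab    <- table(x)                    # one count per distinct value, no binning
values <- as.numeric(names(tab))      # distinct values, recovered from the names
counts <- as.vector(tab)              # raw counts
freqs  <- as.vector(prop.table(tab))  # relative frequencies, summing to 1

# Values equal to zero cannot be placed on a log axis, so drop them first.
keep <- values > 0
plot(values[keep], freqs[keep], type = "p", log = "xy",
     xlab = "value", ylab = "relative frequency")
```

With truly continuous data almost every value is unique, so nearly all counts will be 1 and the plot mostly shows where the points are rather than how often they occur.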
Why not use the interactive histogram in iplots?

  ihist(x)

Then you can vary the binwidth interactively and get a very quick idea of the structure of your data by looking at a range of plots with different binwidths. Relying on a single plot to reveal everything about a variable's distribution is not a good idea.

A couple of people suggested estimating the density. That may miss roundings, discretisation or other odd structures. We should never underestimate what Peter Huber called "the rawness of raw data".

Antony Unwin
Professor of Computer-Oriented Statistics and Data Analysis,
Mathematics Institute, University of Augsburg, Germany
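Without iplots, the idea of looking at a range of binwidths can be approximated non-interactively in base R. This is only a sketch under assumptions: the data are simulated, with about half the values rounded to one decimal to mimic the kind of rounding mentioned above, and the four binwidths are arbitrary choices.

```r
# Hypothetical data: continuous values, roughly half of them rounded to
# one decimal place.
set.seed(7)
x <- rnorm(1000, mean = 5)
rounded <- sample(c(TRUE, FALSE), length(x), replace = TRUE)
x[rounded] <- round(x[rounded], 1)

binwidths <- c(1, 0.25, 0.05, 0.01)

op <- par(mfrow = c(2, 2))  # four panels on one page
for (bw in binwidths) {
  # Breaks on a regular grid covering the full range of x.
  breaks <- seq(floor(min(x)), ceiling(max(x)), by = bw)
  hist(x, breaks = breaks, main = paste("binwidth =", bw), xlab = "x")
}
par(op)                     # restore the previous graphics settings
```

With wide bins the distribution looks smooth; the rounded values should only stand out as spikes once the binwidth drops below the rounding step.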