Hi All,

I have a vector of about 15 million numbers which I would like to plot. The goal is to see the distribution. I tired the usual steps.

1. Histogram: never completes; my window freezes without log base 10.
2. Density: I first calculated the kernel density and then plotted it, which worked.

It would be nice to superimpose the histogram on the density, but as of now I am not able to get this data as a histogram. I tried ggplot2, which also hangs.

Are there any efficient methods to play with > 10 million numbers in a vector?

Thanks,
-Abhi
On 2/25/10, Abhishek Pratap <abhishek.vit at gmail.com> wrote:
> Any efficient methods to play with > 10 million numbers in a vector.

Did you try rggobi?

Liviu
Have you considered taking a random subset and plotting that? I'd bet you can get a really good impression of the distribution with a few hundred thousand points at most.

Tim Glover
Senior Environmental Scientist - Geochemistry
Geoscience Department Atlanta Area
MACTEC Engineering and Consulting, Inc.
Kennesaw, Georgia, USA
Office 770-421-3310
Fax 770-421-3486
Email ntglover at mactec.com
Web www.mactec.com

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Abhishek Pratap
Sent: Thursday, February 25, 2010 6:12 PM
To: r-help at r-project.org
Subject: [R] Plotting 15 million points

______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
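The random-subset suggestion above can be sketched in a few lines of R. This is only an illustration: the poster's vector was not shared, so `s` here is a simulated long-tailed stand-in, and the subset size of 200,000 and the log10 scale are assumptions, not values from the thread.

```r
# Stand-in data (assumption): simulated long-tailed values, since the
# original vector was not posted. Increase n to 15e6 to mirror the thread.
set.seed(1)
n <- 2e6
s <- round(rlnorm(n, meanlog = 5.5, sdlog = 1)) + 2

# Draw a random subset of 200,000 values without replacement.
sub <- sample(s, size = 2e5)

# Sanity check: quartiles of the subset should be close to the full vector.
probs <- c(0.25, 0.5, 0.75)
print(rbind(full = quantile(s, probs), subset = quantile(sub, probs)))

# Plot the subset on a log10 scale, since the values span several
# orders of magnitude.
hist(log10(sub), breaks = 100,
     main = "Random subset of s", xlab = "log10(s)")
```

If the subset's quartiles track the full vector closely, the subset histogram should give a fair picture of the overall shape, though very rare extreme values may be missed.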
On Feb 25, 2010, at 6:11 PM, Abhishek Pratap wrote:

> Hi All
>
> I have a vector of about 15 million numbers which I would like to
> plot. The goal is the see the distribution.
> I tired the usual steps.

I get that way after a long day myself.

> 1. Histogram : never gets complete my window freezes w/out log base 10

What expressions?

> 2. Density : I first calculated the kernel density and then plotted
> it which worked.
>
> It would be nice to superimpose histogram with density but as of now I
> am not able to get this data as a histogram.

?cut
?table

> I tried ggplot2 which
> also hangs.
>
> Any efficient methods to play with > 10 million numbers in a vector.

Well, I only have 4.5 million rows (in a hundred plus variable dataframe) but the typical commands seem to work fine. hist() gave a plot almost instantly:

hist(TRdta$ur_procreat, breaks=c(seq(0, 4, by=0.2), 20) )

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
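The ?cut and ?table pointers above amount to binning the values yourself and counting per bin. A minimal sketch, assuming a simulated stand-in vector `s` (the original data were not posted) and an arbitrary choice of roughly 100 bins on a log10 scale:

```r
# Stand-in data (assumption): simulated long-tailed values.
set.seed(1)
s <- round(rlnorm(2e6, meanlog = 5.5, sdlog = 1)) + 2
x <- log10(s)

# Round break points covering the full range of x (about 100 bins requested).
breaks <- pretty(range(x), n = 100)

# cut() assigns each value to a bin; table() counts the values per bin.
bins <- cut(x, breaks = breaks, include.lowest = TRUE)
counts <- as.vector(table(bins))

# Bin midpoints, used as x positions for the bars.
mids <- (breaks[-1] + breaks[-length(breaks)]) / 2

# Draw the counts as vertical lines, a lightweight histogram.
plot(mids, counts, type = "h",
     xlab = "log10(s)", ylab = "Count",
     main = "Counts per bin via cut() and table()")
```

Once the counts are tabulated, only one number per bin is passed to the plotting call, no matter how long the original vector is.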
You need to show us what you did. Generating 15 million random normals and plotting a histogram worked just fine on my desktop in a matter of ~6 seconds.

> x <- rnorm(15e6)
> hist(x)

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204
Hi All,

I should have included this at the outset, and I think I understand the problem: the load on the server where I was running R was heavy, which was slowing everything down.

> summary(s)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
      2     182     263    6086     343 4630000
> length(s)
[1] 16750589

hist(log(s,10),breaks=100)

Thanks!
-Abhi

On Thu, Feb 25, 2010 at 7:38 PM, Nordlund, Dan (DSHS/RDA) <NordlDJ at dshs.wa.gov> wrote:
> You need to show us what you did. Generating 15 million random normals and plotting a histogram worked just fine on my desktop in a matter of ~6 seconds.
>
>> x <- rnorm(15e6)
>> hist(x)
>
> Dan
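For the superimposed histogram and density asked about at the start of the thread, one option is to draw the histogram on the density scale (freq = FALSE) and add the kernel density with lines(). A sketch on simulated stand-in data rather than the poster's vector; the distribution parameters and the colour are assumptions for illustration only:

```r
# Stand-in data (assumption): simulated long-tailed values.
set.seed(1)
s <- round(rlnorm(2e6, meanlog = 5.5, sdlog = 1)) + 2
x <- log10(s)

# Histogram on the density scale so that bar areas sum to 1,
# which puts it on the same vertical scale as a kernel density.
h <- hist(x, breaks = 100, freq = FALSE,
          main = "Histogram with kernel density", xlab = "log10(s)")

# Kernel density estimate with the default bandwidth, drawn on top.
d <- density(x)
lines(d, lwd = 2, col = "red")
```

With freq = TRUE (the default for equally spaced breaks) the bars show counts, and the density curve would sit almost flat along the bottom of the plot.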