If it does finish, it will take a long time. And what for?
If all you want is a plot to look at, why are you using all 33 million
observations? Chances are that a sample of, say, 10000 will give you a plot
about as good as the full ecdf would. Have you tried
plot.ecdf(c(range(myDataVector), sample(myDataVector, 10000)))
for example? (Including range(myDataVector) keeps the minimum and maximum in
the plot.) An alternative would be to sort the vector and take a systematic
sample starting at the first observation. In fact, 10000 is a bit of an overkill.
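
A rough sketch of the systematic-sample alternative, in case it helps. The
simulated vector below only stands in for your data, and k = 1000 is an
assumed choice, not a recommendation; adjust to taste.

```r
# Hypothetical stand-in for the real data (the actual vector has ~33 million values).
set.seed(1)
myDataVector <- rnorm(1e6)

n <- length(myDataVector)
k <- 1000                    # number of points to keep (assumed; pick what looks fine)

xs  <- sort(myDataVector)    # one sort of the full vector
# Evenly spaced positions in the sorted data, from the first to the last observation.
idx <- unique(round(seq(1, n, length.out = k)))

# The ecdf of the thinned, sorted values approximates the ecdf of the full vector.
plot.ecdf(xs[idx], main = "ecdf from a systematic sample")
```

Because idx starts at 1 and ends at n, the minimum and maximum are always
retained, so there is no need to add range() by hand here.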
Bill Venables
CSIRO/CMIS Cleveland Laboratories
-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
On Behalf Of Jonathan
Sent: Friday, 5 March 2010 10:10 AM
To: r-help
Subject: [R] plotting ecdf; R is stalled
Dear R-help:
I am trying to plot the cumulative distribution function of a
vector of around 33 million numeric observations.
> plot.ecdf(myDataVector)
R has been non-responsive for about an hour, and my guess is that it's
probably not going to finish.
Does anybody have a sense whether this is a reasonable experience (and if
so, is there a way to get the desired effect, or am I SOL)? I can't
find anything in the help archives.
OS: Windows 7 64-bit; R version 2.10.1; RAM: 4 GB
Thanks,
Jonathan