I have a data frame (daf1) that holds 80000+ records and 10 variables (i.e. 10 columns and some 80000 rows):

> length(daf1)
[1] 10
> length(daf1[,1])
[1] 83805

I would like to sample() e.g. 10000 records from this. I use:

> daf2 <- sample(daf1, 1000, replace = FALSE, prob = NULL)
Error in `[.data.frame`(x, .Internal(sample(length(x), size, replace, :
  cannot take a sample larger than the population when 'replace = FALSE'

As length(daf1) is 10, it thinks I'm taking 10000 samples from a size 10 population... Arghhh

How do I go about sampling from a data frame?

:-( Martin
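For anyone hitting the same error, here is a minimal sketch of why it happens. It assumes a small made-up data frame, toy, as a hypothetical stand-in for daf1. A data frame is a list of columns, so length() counts columns and sample() draws columns rather than rows.

```r
# Hypothetical stand-in for daf1: 8 rows, 3 columns
toy <- data.frame(id = 1:8, x = letters[1:8], y = (1:8) / 10)

length(toy)   # number of columns, because a data frame is a list of columns
nrow(toy)     # number of rows (records)

# sample() treats the data frame as that list, so asking for more
# items than there are columns fails when replace = FALSE
res <- tryCatch(sample(toy, 5), error = function(e) e)
inherits(res, "error")

# Asking for no more than ncol(toy) items returns shuffled columns, not rows
names(sample(toy, 2))
```

To draw records, the row indices have to be sampled and then used to subset the data frame.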
Hi,

I find it convenient to use a custom function for this:

# Return N randomly chosen rows of df; extra arguments are passed to sample()
sample.df <- function (df, N = 1000, ...) {
  df[sample(nrow(df), N, ...), ]
}

sample.df(daf1, 1000)

Hope this helps,

baptiste

On 25 Aug 2008, at 12:31, Martin Hvidberg wrote:

> I have a data frame (daf1), that holds +80000 records, and 10
> variables (i.e. 10 columns and some 80000 rows)
> [...]
> How do I go about sampeling from a data frame?

_____________________________

Baptiste Auguié
School of Physics
University of Exeter
Stocker Road, Exeter, Devon, EX4 4QL, UK
Phone: +44 1392 264187
http://newton.ex.ac.uk/research/emag
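A short usage sketch of the helper above, assuming a made-up data frame toy in place of daf1. The set.seed() calls and the replace = TRUE example are illustrative additions, not part of the original message.

```r
# Helper from the message above: N random rows; `...` is passed to sample()
sample.df <- function(df, N = 1000, ...) {
  df[sample(nrow(df), N, ...), ]
}

# Hypothetical stand-in for daf1
toy <- data.frame(id = 1:20, value = (1:20)^2)

set.seed(42)                      # fix the RNG so the draw can be repeated
s1 <- sample.df(toy, 5)           # 5 distinct rows, all columns kept

set.seed(42)
s2 <- sample.df(toy, 5)           # same seed, same rows

# replace = TRUE travels through `...`, so N may exceed nrow(toy)
s3 <- sample.df(toy, 50, replace = TRUE)
```

One caveat: with a single-column data frame, df[rows, ] drops to a plain vector unless drop = FALSE is added to the subsetting call.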
Hi

You are not overly specific about what you really want, but maybe

daf1[sample(1:83805, 10000), ]

This randomly selects 10000 rows from daf1.

Is this what you want?

Regards
Petr

r-help-bounces at r-project.org wrote on 25.08.2008 13:31:15:

> I would like to sample() e.g. 10000 records from this.
> [...]
> How do I go about sampeling from a data frame?
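The one-liner above can be checked on simulated data. This is only a sketch: the daf1 built here is a hypothetical stand-in filled with random numbers, matching just the dimensions quoted in the thread, and seq_len(nrow(daf1)) is swapped in for the hard-coded 1:83805 so the row count need not be typed by hand.

```r
# Hypothetical stand-in for daf1 with the dimensions quoted in the thread
daf1 <- as.data.frame(matrix(rnorm(83805 * 10), nrow = 83805, ncol = 10))

# seq_len(nrow(daf1)) plays the role of 1:83805 without hard-coding the size
rows <- sample(seq_len(nrow(daf1)), 10000)
daf2 <- daf1[rows, ]

dim(daf2)               # rows and columns of the sample
anyDuplicated(rows)     # 0 means no row was drawn twice
```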