Good morning Stavros,
I currently use the following definition in my own environment.
sample.df <- function(df, n = 3) {
  df[sample(nrow(df), min(nrow(df), n)), ]
}
I also added the option of returning n sequential rows, which I used
when examining address files... but I haven't used it in ages :-)
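
A minimal sketch of how such a sequential-rows option might look. The
`sequential` argument name and the random choice of starting row are
assumptions made for illustration, not the definition actually in use;
`drop = FALSE` is also added here so that a one-column data frame stays
a data frame.

```r
# Hypothetical extension of sample.df; `sequential` is an assumed argument.
sample.df <- function(df, n = 3, sequential = FALSE) {
  k <- min(nrow(df), n)
  # Nothing to sample: return an empty data frame with the same columns.
  if (k == 0) return(df[0, , drop = FALSE])
  if (sequential) {
    # Pick a random start such that k consecutive rows still fit.
    start <- sample.int(nrow(df) - k + 1, 1)
    rows <- seq(start, length.out = k)
  } else {
    # Plain random sample of k distinct row indices.
    rows <- sample.int(nrow(df), k)
  }
  df[rows, , drop = FALSE]
}
```

With a hypothetical data frame `addresses`, `sample.df(addresses, 5,
sequential = TRUE)` would return five neighbouring rows, which can be
handy when adjacent records are meant to be compared.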
Kind regards,
Sean O'Riordain
Dublin
Ireland
On Fri, Feb 19, 2010 at 9:05 PM, Stavros Macrakis
<macrakis@alum.mit.edu> wrote:
> Currently, sample of a data.frame is a sample of the columns:
>
> e.g. sample(data.frame(a=1,b=2:3,c=4),2) => data.frame(b=2:3,c=c(4,4))
>
> I'd have thought it would be much more common to want a sample of the
> rows.
>
> It's easy enough to define an appropriate function for this:
>
> sample.data.frame <- function(x,size,replace=FALSE,prob=NULL)
> # no auto-dispatch; sample is not a generic function
> {
> x[sample(nrow(x),size,replace,prob),]
> }
>
> Would it be a bad idea for this to be the standard behavior for sample?
>
> There is always, of course, the backwards-compatibility argument. Is sample
> in fact used in practice to select random columns? I realize it is hard to
> quantify that, but perhaps there is some wisdom in the community about
> that.
>
> -s
>
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>