On Fri, 2009-03-27 at 15:11 -0400, Laura Rodriguez Murillo wrote:
> Hi dear list,
>
> I have a list of around 2000 identifiers arranged in a dataframe in one
> column and I would like to choose a random subset of these. I wonder
> if somebody can tell me if I could do this with R...
Not sure what you mean by identifiers, but to select a subset of the
2000 cells in that column, you could use sample(). See ?sample for
details, but here is an example.
## choose a random subset of 500 out of 2000 entries
## dummy data
dat <- data.frame(identifiers = sample(2000, 2000), X = rnorm(2000))
## set seed so the same rows are drawn on every run
## (dat was generated before the seed was set, so its values will differ from mine)
## comment this out if you want a different subset each time you run
set.seed(1234)
## random subset of 500
want <- sample(2000, 500)
## select out that subset
## head to show only first n of the selected
head(dat$identifiers[want])
Gives:
> head(dat$identifiers[want])
[1] 1327 587 835 430 1422 1687
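The same index vector can also pull out whole rows rather than just the identifier column. A minimal sketch, assuming the dummy `dat` layout used above; `n_keep` and `sub` are illustrative names, not anything from your data:

```r
## dummy data with the same layout as above (illustrative only)
dat <- data.frame(identifiers = sample(2000, 2000), X = rnorm(2000))
set.seed(1234)
## number of rows to keep (assumed value)
n_keep <- 500
## sample row numbers without replacement; nrow() avoids hard-coding 2000
want <- sample(nrow(dat), n_keep)
## keep all columns for the selected rows
sub <- dat[want, ]
dim(sub)
```

Using nrow(dat) means the code keeps working if the number of rows changes.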
This assumes the identifiers are unique.
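If the identifiers can repeat, one option is to sample from the distinct values instead of from row positions. A rough sketch, assuming a made-up data frame `dat2` with duplicated identifiers; `ids`, `picked` and `sub2` are hypothetical names used only for illustration:

```r
## hypothetical data: 2000 rows, at most 800 distinct identifiers
set.seed(1234)
dat2 <- data.frame(identifiers = sample(800, 2000, replace = TRUE),
                   X = rnorm(2000))
## distinct identifiers present in the column
ids <- unique(dat2$identifiers)
## draw 500 of them without replacement
## (assumes there are at least 500 distinct values)
picked <- sample(ids, 500)
## all rows belonging to the chosen identifiers
sub2 <- dat2[dat2$identifiers %in% picked, ]
```

Here sub2 can have more than 500 rows, because every row carrying a chosen identifier is kept.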
HTH
G
>
> Thank you so much!
>
> Laura RM
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson [t] +44 (0)20 7679 0522
ECRC, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT. [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%