thr3ads.net - R help - [R] Chosing a subset of a non-sorted vector [May 2007]


Christoph Scherber

2007-May-21 11:49 UTC

[R] Chosing a subset of a non-sorted vector

Dear all,

I have a tricky problem here:

I have a dataframe with biodiversity data in which subplots are a 
repeated sequence from 1 to 4 (1234,1234,...)

Now, I want to randomly pick two subplots from each diversity level 
(DL).

Sampling the subplots works, but when I try to use the result to subset 
the whole dataframe, I get stuck:

DL=gl(3,4)
subplot=rep(1:4,3)
diversity.data=data.frame(DL,subplot)


subplot.sampled=NULL
for(i in 1:3)
  subplot.sampled=c(subplot.sampled,sort(sample(4,2,replace=F)))

subplot.sampled
[1] 3 4 1 3 1 3
subplot[subplot.sampled]
[1] 3 4 1 3 1 3

## here comes the tricky bit:

diversity.data[subplot.sampled,]
     DL subplot
3    1       3
4    1       4
1    1       1
3.1  1       3
1.1  1       1
3.2  1       3

How can I select those rows of diversity.data that match the exact 
subplots in "subplot.sampled"?


Thank you very much for your help!

Best wishes,
Christoph

(I am using R 2.4.1 on Windows XP)


##
Christoph Scherber
DNPW, Agroecology
University of Goettingen
Waldweg 26
D-37073 Goettingen

+49-(0)551-39-8807

Adaikalavan Ramasamy

2007-May-22 09:38 UTC


You want to select two subplots for each DL value. Try:

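  df <- data.frame( DL=gl(3,4), subplot=rep(1:4,3) )

  # row positions, so that we sample rows rather than subplot values
  df$index <- 1:nrow(df)
  # for each DL level, draw two of its row positions
  ind <- tapply( df$index, df$DL, function(x) sample(x,2) )
  df[ unlist(ind), ]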

You could also have used rownames(df) instead of creating df$index.
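
The rownames(df) variant mentioned above might look like the following. This is only a sketch, assuming the default row names "1" to "12"; the name `picked` is introduced here for illustration and is not from the original post.

```r
df <- data.frame(DL = gl(3, 4), subplot = rep(1:4, 3))

# Sample two row names within each DL level. rownames() returns
# character labels, so no extra index column is needed.
ind <- tapply(rownames(df), df$DL, function(x) sample(x, 2))

# Character indices select rows by name.
picked <- df[unlist(ind), ]
picked
```

One possible advantage of sampling character row names: sample() treats a single positive number x as "sample from 1:x", which could silently pick wrong rows if a DL level had only one row. With character labels that shortcut does not apply.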

OR

  # split by DL, draw two rows from each piece, then stack the pieces again
  tmp <- lapply( split(df, df$DL), function(m) m[sample(1:nrow(m),2),] )
  do.call("rbind", tmp)
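
If the subplots have already been drawn, as in the original post, the rows can also be matched on the (DL, subplot) pair. The original attempt returned only DL 1 rows because the values 1 to 4 in subplot.sampled were used as row numbers. The following is a sketch, assuming subplot.sampled holds two values per DL level in level order (as the loop in the question builds it); `wanted`, `key` and `selected` are hypothetical names used only for illustration.

```r
DL <- gl(3, 4)
subplot <- rep(1:4, 3)
diversity.data <- data.frame(DL, subplot)

# The sampled subplots shown in the original post, two per DL level.
subplot.sampled <- c(3, 4, 1, 3, 1, 3)

# Pair each sampled subplot with the DL level it was drawn for.
wanted <- data.frame(DL = rep(levels(DL), each = 2),
                     subplot = subplot.sampled)

# Build a "DL subplot" text key and keep the rows whose key is wanted.
key <- function(d) paste(d$DL, d$subplot)
selected <- diversity.data[key(diversity.data) %in% key(wanted), ]
selected
```

With the values above this keeps rows 3, 4, 5, 7, 9 and 11, i.e. subplots 3 and 4 from DL 1, and subplots 1 and 3 from DL 2 and DL 3.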

Regards, Adai



