Jon Zadra
2009-Aug-11 19:25 UTC
[R] Prevent sequential repeated values in data frame column
Hi,
I'm trying to randomize a sequence of trials for an experimental
design. The trials consist of values for each of two factors. As it
stands, there are 30 combinations of the two factors, and I want them to be
ordered randomly but with the requirement that for one of the factors,
the value can never be the same as the previous value.
I'm currently randomizing my dataframe by using:
x[sample(1:nrow(x), nrow(x)),]
Output example, on rows 2 and 3 the angle value is the same (the
situation I wish to prevent):
  Distance Angle
         9    90
         9    45
        10    45
        11    30
         8    60
         7     0
         8    30
        10     0
        10    60
       ...   ...
Can anyone recommend a simple way to do this? (Bonus if it could be
implemented for more than a single column!)
Thanks in advance,
Jon
--
Jon Zadra
Department of Psychology
University of Virginia
P.O. Box 400400
Charlottesville VA 22904
(434) 982-4744
email: zadra at virginia.edu
<http://www.google.com/calendar/embed?src=jzadra%40gmail.com>
Scott Sherrill-Mix
2009-Aug-12 04:05 UTC
[R] Prevent sequential repeated values in data frame column
It's a pretty inefficient way to do things (e.g. 50000+ iterations [20
seconds] to find a good sample) but if you're not doing this often I
guess you could do something like:
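checkNeighborEqual <- function(x) {
  # TRUE if any value equals the value right after it.
  # Inf pads the end, assuming the final value is not infinity.
  return(any(x == c(x[-1], Inf)))
}
# fake data
x <- data.frame('distance' = rep(1:10, 3), 'angle' = rep(1:5, 6), 'var3' = rep(1:6, 5))
# columns that should not have adjacent duplicates
noDupeColumns <- c('angle', 'var3')
run <- 1
# run == 1 short-circuits the check on the first pass, before newX exists
while (run == 1 || any(apply(newX[, noDupeColumns], 2, checkNeighborEqual))) {
  message('Scrambling for the ', run, ' time')
  # reshuffle all rows and test the whole ordering again
  newX <- x[sample(1:nrow(x), nrow(x)), ]
  run <- run + 1
}
print(newX)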
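If the number of reshuffles becomes a problem, one possible alternative is to build the order one row at a time, only ever choosing among the remaining rows that differ from the previous row in every restricted column, and starting over when no such row is left. The sketch below is untested against real trial data; the names orderNoRepeats and maxTries are made up for illustration, and unlike reshuffling until a valid order appears, this approach does not pick uniformly among all valid orderings.

```r
# Sketch: grow a random row order step by step, restarting on dead ends.
# orderNoRepeats and maxTries are hypothetical names, not from any package.
orderNoRepeats <- function(x, cols, maxTries = 1000) {
  n <- nrow(x)
  for (attempt in seq_len(maxTries)) {
    remaining <- seq_len(n)
    # the first row can be any row
    pick <- remaining[sample.int(length(remaining), 1)]
    ord <- pick
    remaining <- setdiff(remaining, pick)
    while (length(remaining) > 0) {
      prev <- ord[length(ord)]
      # keep only rows that differ from the previous row in every column of cols
      ok <- rep(TRUE, length(remaining))
      for (col in cols) {
        ok <- ok & (x[[col]][remaining] != x[[col]][prev])
      }
      candidates <- remaining[ok]
      if (length(candidates) == 0) break  # dead end, start a new attempt
      # index with sample.int to avoid sample()'s special case for a single number
      pick <- candidates[sample.int(length(candidates), 1)]
      ord <- c(ord, pick)
      remaining <- setdiff(remaining, pick)
    }
    if (length(ord) == n) return(x[ord, , drop = FALSE])
  }
  stop('no valid ordering found in ', maxTries, ' attempts')
}

# same fake data as above
x <- data.frame('distance' = rep(1:10, 3), 'angle' = rep(1:5, 6), 'var3' = rep(1:6, 5))
newX <- orderNoRepeats(x, c('angle', 'var3'))
print(newX)
```

Passing a single column name, e.g. orderNoRepeats(x, 'angle'), restricts only that column.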
Scott
Scott Sherrill-Mix
Department of Microbiology
University of Pennsylvania
402B Johnson Pavilion
3610 Hamilton Walk
Philadelphia, PA 19104-6076