On Sun, Jul 19, 2009 at 11:32 PM, Marujo A. <A.Marujo at soton.ac.uk> wrote:
> Dear R-helpers
> I have 2 variables
> x1=rgamma(6000, 2, 1) and x2=rgamma(6000, 3,2). I have to sort (descending)
> each one and split it into groups. After this each two groups must be merged
> into one until all population becomes one group. A dummy vector must be created
> for each group (8, 4, 2, 1) being equal to 1 if the individual (i) belongs to
> the group and equal to 0, otherwise.

If I understand correctly, you want to create one factor with 8 levels,
one factor with 4 levels and one factor with 2 levels, based on equal
divisions of the sorted x1 values. If so, it is advantageous to use
the "whole object" approach in R. I would suggest creating a data
frame with the values of x1 and x2, sorting the rows in descending
order of x1, and then adding the factors, which can easily be defined
with the gl() function: gl(n, k) returns a factor with n levels, each
repeated k times in a row. Because whole rows are sorted, the row names
keep the original index i of each individual. On a small example (20
rows, so only the 4-level and 2-level factors are shown) it looks like
> df <- data.frame(x1 = rgamma(20, 2, 1), x2 = rgamma(20, 3, 2))
> df <- df[rev(order(df$x1)), ]
> df$g4 <- gl(4, 5)
> df$g2 <- gl(2, 10)
> df
          x1        x2 g4 g2
17 3.2050060 1.1395147  1  1
14 2.8422283 2.4612637  1  1
2  2.4286087 2.1572067  1  1
16 2.4108377 1.1360309  1  1
20 2.0954746 1.2974074  1  1
12 2.0641932 1.2820681  2  1
18 1.9857902 1.9888521  2  1
1  1.9394710 1.7363564  2  1
7  1.8907038 1.6302374  2  1
10 1.6421862 1.7538054  2  1
11 1.3926248 1.3363230  3  2
13 1.3590006 0.4226191  3  2
6  1.3172306 2.8610896  3  2
4  1.2888751 2.0672638  3  2
5  1.1358279 1.5365895  3  2
15 1.1017541 2.3689916  4  2
19 0.7358496 1.6427665  4  2
9  0.5669082 0.2964689  4  2
3  0.5657076 0.9320564  4  2
8  0.3211136 0.5938290  4  2
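
The same steps carry over to the full-size problem. What follows is only a sketch, assuming 6000 draws as in the question; the explicit id column, the factor names g8, g4 and g2, and the seed are illustrative choices rather than anything from the original post.

```r
set.seed(1)                                   # arbitrary seed, for reproducibility only
n <- 6000
df <- data.frame(id = seq_len(n),             # explicit index i = 1, ..., 6000
                 x1 = rgamma(n, 2, 1),
                 x2 = rgamma(n, 3, 2))
df <- df[order(df$x1, decreasing = TRUE), ]   # sort whole rows by descending x1
df$g8 <- gl(8, n / 8)                         # 8 groups of 750 rows
df$g4 <- gl(4, n / 4)                         # 4 groups of 1500 rows
df$g2 <- gl(2, n / 2)                         # 2 groups of 3000 rows
table(df$g8, df$g4)                           # each g4 level covers two adjacent g8 levels
```

Since every row travels with its id, all 6000 individuals stay in one object and the group sizes never need to be tracked by hand.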
> What I have done was:
> id=(6000)
> x1sort=sort(x1, decreasing=TRUE)
> x1g8_1=x1sort[1:750]
> x1g8_2=x1sort[751:1500]
> x1g8_3=x1sort[1501:2250]
> x1g8_4=x1sort[2251:3000]
> x1g8_5=x1sort[3001:3750]
> x1g8_6=x1sort[3751:4500]
> x1g8_7=x1sort[4501:5250]
> x1g8_8=x1sort[5251:6000]
>
> x1g4_1=c(x1g8_1, x1g8_2)
> x1g4_2=c(x1g8_3, x1g8_4)
> x1g4_3=c(x1g8_5, x1g8_6)
> x1g4_4=c(x1g8_7, x1g8_8)
>
> x1g2_1=c(x1g4_1, x1g4_2)
> x1g2_2=c(x1g4_3, x1g4_4)
>
> x1ng=c(x1g2_1, x1g2_2)
>
> After this I did the dummy vector (the example is for group4)
>
> dum= replace(matrix(0, 4, 1), cbind(4, 1), 0)   # matrix of zeros
> dummy=lapply(1:4, function(i) replace(dum, cbind(i), 1))   # 4 dummy vectors
> s=split(dummy, 1:4)
> ss=rename.vars(s, c("1", "2", "3", "4"), c("dx14_1", "dx14_2", "dx14_3", "dx14_4"))
>
> The problem is when I split into groups each group only identifies 750
> individuals (in the case of x1g8 for instance) only assumes i=1, ..., 750 and I
> need to keep i=1, ..., 6000. Also my option to dummy vectors don't seem to
> work because I get 4 vectors with the number one (1) in each different variable
> and not only one.
>
> So, I need some help on how should I make to keep i=1, ..., 6000 and how to
> create a dummy vector that assumes only the value one (1) when some I belongs to
> some group.
>
> Thank you so much.
> Ana
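
On the dummy vectors: once the grouping is stored as a factor, full-length 0/1 indicators can be derived from it directly. This is a sketch on the 20-row example, assuming a factor named g4 as above; model.matrix() comes from the stats package that R loads by default, and the column names dx14_1, ..., dx14_4 are simply borrowed from the question.

```r
df <- data.frame(x1 = rgamma(20, 2, 1))
df <- df[order(df$x1, decreasing = TRUE), , drop = FALSE]  # descending x1
df$g4 <- gl(4, 5)                                          # 4 groups of 5 rows

# One indicator column per level of g4; "- 1" drops the intercept so that
# all four levels get their own column.
dx14 <- model.matrix(~ g4 - 1, data = df)
colnames(dx14) <- paste0("dx14_", 1:4)

head(dx14)
rowSums(dx14)   # each row has exactly one 1, i.e. belongs to exactly one group
```

Each column has one entry per individual, so the indicators line up with the rows of the data frame. For many modelling functions the factor itself is enough, since they expand it into indicator columns internally.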