thr3ads.net - R help - [R] Can somebody help me with following data manipulation? [Dec 2012]


Christofer Bogaso

2012-Dec-06 19:35 UTC

[R] Can somebody help me with following data manipulation?

Dear all, let's say I have the following data:

dat <- structure(list(V1 = structure(c(1L, 4L, 5L, 3L, 3L, 5L, 6L, 6L,
4L, 3L, 5L, 6L, 5L, 5L, 4L, 4L, 6L, 2L, 3L, 4L, 3L, 3L, 2L, 5L,
3L, 6L, 3L, 3L, 6L, 3L, 6L, 1L, 6L, 5L, 2L, 2L), .Label = c("C",
"G", "I", "O", "R", "T"), class = "factor"), V2 = c(0L, 0L, 0L,
1L, 1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L,
1L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L,
0L), V3 = c(1L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 1L, 0L,
0L, 0L, 0L, 1L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L,
0L, 1L, 0L, 1L, 0L, 1L, 1L)), .Names = c("V1", "V2", "V3"),
class = "data.frame", row.names = c(NA, -36L))

Now I want to get the following kind of data frame out of that:

dat1 <- structure(list(V1 = structure(c(3L, 3L, 1L, 1L, 2L, 2L),
.Label = c("C", "G", "I"), class = "factor"),
V2 = c(0L, 1L, 0L, 1L, 0L, 1L),
V3 = c(0.333333333, 0.428571429, 0.5, NA, 1, NA)),
.Names = c("V1", "V2", "V3"), class = "data.frame",
row.names = c(NA, -6L))

Basically, in 'dat1' the 3rd column comes from: for 'V1 = I' & 'V2 = 0',
what is the proportion of '1' values in "V3", and so on.

Is there any R function to achieve that directly?

Thanks and regards,

Sarah Goslee

2012-Dec-06 20:03 UTC


If I understand what you want correctly, aggregate() should do it.
> aggregate(V3 ~ V1 + V2, "mean", data=dat)
   V1 V2        V3
1   C  0 0.5000000
2   G  0 1.0000000
3   I  0 0.3333333
4   O  0 1.0000000
5   R  0 0.0000000
6   T  0 0.8333333
7   I  1 0.4285714
8   O  1 0.0000000
9   R  1 0.6666667
10  T  1 0.5000000

That returns the combinations that actually exist.
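
As a quick sanity check of what aggregate() is computing here, the sketch below rebuilds the posted data in a more compact form (the helper name v1_codes is only for illustration) and compares one cell, V1 = "I" with V2 = 0, computed by hand against the aggregate() result. It assumes the mean of the 0/1 column V3 is the proportion the question asks for.

```r
# Rebuild the data from the original post in compact form.
# v1_codes is an illustrative helper: the integer codes of the factor V1.
v1_codes <- c(1, 4, 5, 3, 3, 5, 6, 6, 4, 3, 5, 6, 5, 5, 4, 4, 6, 2,
              3, 4, 3, 3, 2, 5, 3, 6, 3, 3, 6, 3, 6, 1, 6, 5, 2, 2)
dat <- data.frame(
  V1 = factor(c("C", "G", "I", "O", "R", "T")[v1_codes]),
  V2 = c(0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0,
         1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0),
  V3 = c(1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1,
         1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1)
)

# By hand: the mean of a 0/1 vector is the proportion of 1s.
in_cell <- dat$V1 == "I" & dat$V2 == 0
by_hand <- mean(dat$V3[in_cell])

# Same cell taken from the aggregate() result.
agg <- aggregate(V3 ~ V1 + V2, data = dat, FUN = mean)
from_agg <- agg$V3[agg$V1 == "I" & agg$V2 == 0]

print(c(by_hand = by_hand, from_agg = from_agg))
```

Both values should agree with the 0.3333333 shown for row 3 of the output above.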

If you convert V1 and V2 to factors, thus setting the possible levels,
all combinations will be returned:
> dat$V1 <- factor(dat$V1)
> dat$V2 <- factor(dat$V2)
> aggregate(V3 ~ V1 + V2, "mean", data=dat)
   V1 V2        V3
1   C  0 0.5000000
2   G  0 1.0000000
3   I  0 0.3333333
4   O  0 1.0000000
5   R  0 0.0000000
6   T  0 0.8333333
7   I  1 0.4285714
8   O  1 0.0000000
9   R  1 0.6666667
10  T  1 0.5000000
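
Note that the output above still lists only ten rows, so empty combinations such as C with V2 = 1 do not appear, while the requested 'dat1' has NA for them. One possible way to get every combination, sketched here with tapply() rather than aggregate(), is to compute the cell means as a matrix and then flatten it. The names v1_codes, cell_means, full and wanted are only illustrative, and the data is rebuilt compactly so the snippet runs on its own.

```r
# Rebuild the data from the original post in compact form.
# v1_codes is an illustrative helper: the integer codes of the factor V1.
v1_codes <- c(1, 4, 5, 3, 3, 5, 6, 6, 4, 3, 5, 6, 5, 5, 4, 4, 6, 2,
              3, 4, 3, 3, 2, 5, 3, 6, 3, 3, 6, 3, 6, 1, 6, 5, 2, 2)
dat <- data.frame(
  V1 = factor(c("C", "G", "I", "O", "R", "T")[v1_codes]),
  V2 = c(0, 0, 0, 1, 1, 1, 1, 0, 1, 0, 1, 0, 0, 1, 1, 1, 0, 0,
         1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0),
  V3 = c(1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1,
         1, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1)
)

# Matrix of cell means: one row per V1 level, one column per V2 value.
# Cells with no observations come back as NA.
cell_means <- with(dat, tapply(V3, list(V1 = V1, V2 = V2), mean))

# Flatten to long format: one row per (V1, V2) combination.
full <- as.data.frame(as.table(cell_means), responseName = "V3")

# Keep only the three V1 levels that appear in the requested 'dat1'.
wanted <- droplevels(full[full$V1 %in% c("C", "G", "I"), ])
print(wanted)
```

The rows come out ordered by V2 and then V1 rather than in the order used in 'dat1', and V2 becomes a factor in the flattened result, so a final reorder or type conversion may still be needed.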

Sarah

