Don't use subset as a function name: it's already the name of a
rather important function. So is data, though at least you aren't
using that one as a function, so it's not quite so bad. Finally, use
dput() when sending data so we get a plain-text, reproducible version.

I'd try something like this:
dats <- structure(list(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                       TX = c(1L, 0L, 1L, 0L, 1L, 0L),
                       AEs = c(3L, 2L, 1L, 2L, 1L, 1L),
                       N = c(5L, 7L, 10L, 7L, 8L, 4L)),
                  .Names = c("Study", "TX", "AEs", "N"),
                  class = "data.frame",
                  row.names = c("1", "2", "3", "4", "5", "6"))
# See how handy dput can be :-)
dats[unlist(mapply(FUN = function(x,y) rep(x, y), 1:NROW(dats), dats$N)), -4]
which isn't super elegant, but others might have something better.
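
Note that the one-liner above repeats each row N times but leaves AEs as the per-arm count. If you want AEs as a 0/1 indicator per subject, as in your desired output, here is a rough sketch of one way it might be done. The names idx and long are just placeholders I made up, and I'm assuming the positives should come first within each arm:

```r
# Same data as above, rebuilt with data.frame() so this sketch stands alone.
dats <- data.frame(Study = c(1L, 1L, 2L, 2L, 3L, 3L),
                   TX    = c(1L, 0L, 1L, 0L, 1L, 0L),
                   AEs   = c(3L, 2L, 1L, 2L, 1L, 1L),
                   N     = c(5L, 7L, 10L, 7L, 8L, 4L))

# Repeat each row index N times: one output row per subject.
idx  <- rep(seq_len(nrow(dats)), times = dats$N)
long <- dats[idx, c("Study", "TX")]

# Within each arm, mark the first AEs subjects as 1 and the rest as 0
# (assumption: the order of subjects within an arm does not matter).
long$AEs <- unlist(Map(function(pos, n) rep(c(1L, 0L), times = c(pos, n - pos)),
                       dats$AEs, dats$N))

# Drop the duplicated row names left over from the indexing.
rownames(long) <- NULL
```

If you would rather keep your own lapply approach, do.call(rbind, Data) should stack the list of matrices it returns into a single matrix, which you can then pass to as.data.frame().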
Best,
Michael
On Tue, May 15, 2012 at 1:24 AM, Cheenghee AM Koh <sigontw at gmail.com> wrote:
> Hello, R-fellows,
>
> I have a question that I really don't know how to solve. I have spent
> hours searching online for possible solutions, but in vain. If anyone
> could help me handle this issue, it would be much appreciated!
>
> I have a "grouped" dataset like this:
>
>> data
>   Study TX AEs  N
> 1     1  1   3  5
> 2     1  0   2  7
> 3     2  1   1 10
> 4     2  0   2  7
> 5     3  1   1  8
> 6     3  0   1  4
>
> where Study is the study id, TX is treatment, AEs is how many people in
> this arm are positive, and N is the number of subjects. So row 1 is the
> treatment arm of study 1, where there are 5 subjects and 3 of them are
> positive. Row 2 is the control arm of study 1, where there are 7
> subjects and 2 of them are positive.
>
> Now I would like to "un-group them", make it like:
>
> Study  TX  AEs
>   1     1    1
>   1     1    1
>   1     1    1
>   1     1    0
>   1     1    0
>   1     0    1
>   1     0    1
>   1     0    0
>   1     0    0
>   1     0    0
>   1     0    0
>   1     0    0
>   2     1    1
>   .....................
>   .....................
>
>
> But I wasn't able to do it. In fact I wrote a small function and used
> "lapply" to get what I want. It worked well, and did give me what I
> want. But I wasn't able to collapse all the returns into one single
> data frame for subsequent analysis.
>
> The function I wrote:
>
> subset = function(i){
> d = c(rep(data[i,1], data[i,4]), rep(data[i,2], data[i,4]), rep(0:1,
> c(data[i,4] - data[i,3],data[i,3])))
> d = matrix(d, data[i,4],3)
> d
> }
>
> then:
>
> Data = lapply(1:6, subset)
> Data
>
> Therefore, I tried to write a loop. But no matter how I tried, I can't
> get what I want.
>
> Any idea?
>
> Thank you so much!
>
> Best,
>
>
> --
> Cheenghee Masaki Koh, MSW, MS(c), PhD Student
> School of Social Service Administration
> Department of Health Studies, Division of Biological Science
> University of Chicago
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.