thr3ads.net - R help - [R] unequal number of observations for longitudinal data [Jan 2007]

gallon li

2007-Jan-27 10:34 UTC

[R] unequal number of observations for longitudinal data

I have a large longitudinal data set. The number of observations for each
subject is not the same across the sample. The largest number of
observations for a subject is 5 and the smallest is 1.

Now I want each subject to have the same number of observations by
filling in zeros, e.g., my original sample is

id x
001 10
001 30
001 20
002 10
002 20
002 40
002 80
002 70
003 20
003 40
004 ......

Now I wish to make the data look like

id x
001 10
001 30
001 20
001 0
001 0
002 10
002 20
002 40
002 80
002 70
003 20
003 40
003 0
003 0
003 0
004 ......

so that each id has exactly 5 observations. Is there a function that
lets me do this quickly?


Chuck Cleland

2007-Jan-27 10:58 UTC

  Filling in with zeros seems like a bad idea, but here is an approach
to filling in with NAs.  I will leave replacing the NAs with zeros to you.

df.long <- data.frame(id = c(1,1,1,2,2,2,2,2,3,3), x = runif(10),
                      time = c(1,2,5,1,2,3,4,5,2,4))

df.long
   id          x time
1   1 0.72888215    1
2   1 0.60893548    2
3   1 0.41347690    5
4   2 0.79388248    1
5   2 0.05810054    2
6   2 0.02451654    3
7   2 0.85464775    4
8   2 0.15970365    5
9   3 0.22856183    2
10  3 0.38291471    4

df.wide <- reshape(df.long, idvar = "id", timevar = "time", v.names = "x",
                   direction = "wide")

df.wide
  id       x.1       x.2       x.5       x.3       x.4
1  1 0.6375135 0.1651258 0.3210223        NA        NA
4  2 0.9878134 0.8909020 0.9853269 0.7747615 0.3834130
9  3        NA 0.3586109        NA        NA 0.8310539

df.long2 <- reshape(df.wide, direction="long")

df.long2
    id time         x
1.1  1    1 0.6375135
2.1  2    1 0.9878134
3.1  3    1        NA
1.2  1    2 0.1651258
2.2  2    2 0.8909020
3.2  3    2 0.3586109
1.5  1    5 0.3210223
2.5  2    5 0.9853269
3.5  3    5        NA
1.3  1    3        NA
2.3  2    3 0.7747615
3.3  3    3        NA
1.4  1    4        NA
2.4  2    4 0.3834130
3.4  3    4 0.8310539

  This assumes that your data in the "long" format has a time
variable. See the help page for reshape() for more details.
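
For the step left to the reader above (turning the NAs into zeros), here is a minimal sketch using the data from the original question. The names dat, wide and long are illustrative only. The time variable is an assumption: the original data has none, so one is built by numbering the observations within each id.

```r
# Data from the original question.
dat <- data.frame(id = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
                  x  = c(10, 30, 20, 10, 20, 40, 80, 70, 20, 40))

# Assumption: there is no time variable, so number observations within each id.
dat$time <- ave(dat$x, dat$id, FUN = seq_along)

# Long -> wide -> long; ids with fewer observations get NA in the gaps.
wide <- reshape(dat, idvar = "id", timevar = "time", v.names = "x",
                direction = "wide")
long <- reshape(wide, direction = "long")

# Replace the NAs introduced by reshaping with zeros, then sort by id and time.
long$x[is.na(long$x)] <- 0
long <- long[order(long$id, long$time), ]
rownames(long) <- NULL
long
```

Whether zeros are appropriate depends on the analysis; keeping the NAs is usually safer if a zero could be mistaken for a real measurement.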
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894

Gabor Grothendieck

2007-Jan-27 11:15 UTC


merge.zoo in the zoo package has an n-way merge supporting zero fill:

library(zoo)

DF <- data.frame(id = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3), x = c(10,
30, 20, 10, 20, 40, 80, 70, 20, 40))

as.data.frame(do.call(merge, c(lapply(unstack(DF, x ~ id), zoo), fill = 0)))

# last line can alternately be

f <- function(DF) zoo(DF$x)
as.data.frame(do.call(merge, c(by(DF, DF$id, f), fill = 0)))
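
The merge above gives one column per id. If the padded data are wanted in the original id/x layout, a base R sketch without zoo might look like the following. This is a different technique from merge.zoo, and pad_zero, n.max and DF.padded are hypothetical names introduced only for illustration.

```r
DF <- data.frame(id = c(1, 1, 1, 2, 2, 2, 2, 2, 3, 3),
                 x  = c(10, 30, 20, 10, 20, 40, 80, 70, 20, 40))

# Largest number of observations for any id (5 for this data).
n.max <- max(table(DF$id))

# Hypothetical helper: append zeros until the vector has n values.
pad_zero <- function(v, n) c(v, rep(0, n - length(v)))

# Split x by id and pad each piece to the same length.
padded <- lapply(split(DF$x, DF$id), pad_zero, n = n.max)

# Reassemble in long form; note that id comes back as character here.
DF.padded <- data.frame(id = rep(names(padded), each = n.max),
                        x  = unlist(padded, use.names = FALSE))
DF.padded
```

The zeros are appended after the existing observations of each id, which matches the layout asked for in the question.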



