thr3ads.net - R help - [R] Handling time-series-Data [Sep 2008]

Kunzler, Andreas

2008-Sep-11 07:37 UTC

[R] Handling time-series-Data

Dear List,

I ran into some problems with time-series data.

Imagine a data structure where observations (x) of test attendants (i) are made
four times (q) a year (y). The data is ordered the following way:
I	y	q	x	
1	2006	1	1
1	2006	3	1
1	2006	4	1
1	2007	1	1
1	2007	2	1
1	2007	3	1
1	2007	4	1
2	2006	1	1
3	2007	1	1
3	2007	2	1

I am looking for a way to count the attendants who have attended at least once
a year. In this case 2 persons, because i=2 has no observation in 2007.

I thought about creating a subset with the duplicated function. But I can't
find a way to control (i) and (y).

subset(data, !duplicated(i[y]))

Thanx so much

Andreas Kunzler
____________________________
Bundeszahnärztekammer (BZÄK)
Chausseestraße 13
10115 Berlin

Tel.: 030 40005-113
Fax:  030 40005-119

E-Mail: a.kunzler at bzaek.de

Achim Zeileis

2008-Sep-11 08:40 UTC

[R] Handling time-series-Data

On Thu, 11 Sep 2008, Kunzler, Andreas wrote:
> Dear List,
>
> I ran into some problems with time-series data.
>
> Imagine a data structure where observations (x) of test attendants (i) are
> made four times (q) a year (y). The data is ordered the following way:
> I	y	q	x
> 1	2006	1	1
> 1	2006	3	1
> 1	2006	4	1
> 1	2007	1	1
> 1	2007	2	1
> 1	2007	3	1
> 1	2007	4	1
> 2	2006	1	1
> 3	2007	1	1
> 3	2007	2	1
>
> I am looking for a way to count the attendants who have attended at least
> once a year. In this case 2 persons, because i=2 has no observation in 2007.

You might want to turn your data into an actual time series with one
series per attendant and then aggregate. I've written a few short
transformations based on the data above and using the "zoo" package.
It's somewhat lengthy but might give you a few useful pointers.

hth,
Z

## read data
x <- read.table(textConnection("I y q x
1       2006    1       1
1       2006    3       1
1       2006    4       1
1       2007    1       1
1       2007    2       1
1       2007    3       1
1       2007    4       1
2       2006    1       1
3       2007    1       1
3       2007    2       1"), header = TRUE)

## store year/qtr as "yearqtr" object
library("zoo")
x$yq <- as.yearqtr(x$y + (x$q-1)/4)
x <- x[,-(2:3)]

## reshape data into wide format (one series per individual)
x <- reshape(x, timevar = "I", idvar = "yq", direction = "wide")

## turn data into zoo series with zeros in quarters without observation
z <- zoo(as.matrix(x[,-1]), x[,1])
z <- merge(zoo(,seq(from = start(z), to = end(z), by = 0.25)), z)
z[is.na(z)] <- 0

## aggregate from quarterly to annual observations
zy <- aggregate(z, function(x) as.numeric(floor(x)), sum)

## aggregate over individuals
rollapply(zy, 1, function(x) sum(x > 0), by.column = FALSE)
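
The per-year count that this pipeline ends with can also be sketched in base R, without zoo. This is only a minimal sketch under assumptions: the names DF, present and per_year are chosen here for illustration, and the data is re-read through the text argument of read.table rather than textConnection.

```r
## re-read the example data (DF is a hypothetical name, not used above)
DF <- read.table(text = "I y q x
1 2006 1 1
1 2006 3 1
1 2006 4 1
1 2007 1 1
1 2007 2 1
1 2007 3 1
1 2007 4 1
2 2006 1 1
3 2007 1 1
3 2007 2 1", header = TRUE)

## attendant-by-year table: TRUE where an attendant has at least one
## observation in that year
present <- table(DF$I, DF$y) > 0

## number of attendants observed at least once, per year
per_year <- colSums(present)
per_year   # 2006: 2, 2007: 2
```

The cross-tabulation drops the quarterly detail, so it only fits when the quarter within the year does not matter for the count.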

Gabor Grothendieck

2008-Sep-11 13:00 UTC

[R] Handling time-series-Data

On Thu, Sep 11, 2008 at 3:37 AM, Kunzler, Andreas <a.kunzler at bzaek.de> wrote:
> Dear List,
>
> I ran into some problems with time-series data.
>
> Imagine a data structure where observations (x) of test attendants (i) are
> made four times (q) a year (y). The data is ordered the following way:
> I       y       q       x
> 1       2006    1       1
> 1       2006    3       1
> 1       2006    4       1
> 1       2007    1       1
> 1       2007    2       1
> 1       2007    3       1
> 1       2007    4       1
> 2       2006    1       1
> 3       2007    1       1
> 3       2007    2       1
>
> I am looking for a way to count the attendants who have attended at least
> once a year. In this case 2 persons, because i=2 has no observation in 2007.
>

Don't you mean 1 person, not 2 persons? In the data
- attendant 1 appears in both years,
- attendant 2 appears only in 2006, and
- attendant 3 appears only in 2007,
so only attendant 1 appears in both years, i.e. 1 person.

Assuming DF is your data frame:

u <- unique(DF[1:2])
with(u, sum(tapply(y, I, length) == length(unique(y))))  # 1
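
The same idea can be written with duplicated(), which is what the original question was reaching for. This is a minimal sketch, assuming the example data is typed in by hand as DF; the names years_per_attendant, in_all_years and n_all_years are made up for illustration. The key point is that duplicated() is applied to the two columns I and y together, so each (attendant, year) pair is kept once.

```r
## example data from the question, entered by hand
DF <- data.frame(
  I = c(1, 1, 1, 1, 1, 1, 1, 2, 3, 3),
  y = c(2006, 2006, 2006, 2007, 2007, 2007, 2007, 2006, 2007, 2007),
  q = c(1, 3, 4, 1, 2, 3, 4, 1, 1, 2),
  x = 1
)

## keep the first row of every (attendant, year) pair
u <- DF[!duplicated(DF[c("I", "y")]), c("I", "y")]

## number of distinct years in which each attendant was observed
years_per_attendant <- table(u$I)

## attendants observed in every year that occurs in the data
n_years <- length(unique(u$y))
in_all_years <- names(years_per_attendant)[as.vector(years_per_attendant) == n_years]
n_all_years <- length(in_all_years)
n_all_years   # 1 (attendant 1)
```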
