thr3ads.net - R help - time series processing - count of datestamp delta's, per group [Mar 2014]

Martin Tomko

2014-Mar-23 00:32 UTC

time series processing - count of datestamp delta's, per group

Apologies if the question is a bit naive; I am a novice at handling time series
data in R.

I have the following type of data, in long format (as the spacetime vignette
calls it; the table also contains spatial columns, not shown here):

User |  Date | Otherdata |
A | 01/01/2014 | aa
A | 01/01/2014 | bb
A | 01/01/2014 | cc
B | 01/01/2014 | aa
B | 05/01/2014 | cc
A | 07/01/2014 | aa
C | 05/02/2014 | xx
C | 20/02/2014 | yy

Etc.
[A,B,C,…] are user IDs (strings).
The Date column is converted to the Date class (e.g. 2013-10-15).

The table is sorted by User and then by Date, and is over 800K records long.
There are about 20K users.

User |  Date | Otherdata |
A | 2014-01-01 | aa
A | 2014-01-01  | bb
A | 2014-01-01  | cc
A | 2014-01-07  | aa
B | 2014-01-01  | aa
B | 2014-01-05  | cc
C | 2014-02-05  | xx
C | 2014-02-20  | yy
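
The conversion and sorting described above can be sketched in base R as follows. This is only a minimal illustration: the data frame name `dat` is an assumption, and the rows are simply the sample shown above, not the real 800K-record table.

```r
# Hypothetical sample mirroring the first table above; `dat` is an assumed name.
dat <- data.frame(
  User = c("A", "A", "A", "B", "B", "A", "C", "C"),
  Date = c("01/01/2014", "01/01/2014", "01/01/2014", "01/01/2014",
           "05/01/2014", "07/01/2014", "05/02/2014", "20/02/2014"),
  Otherdata = c("aa", "bb", "cc", "aa", "cc", "aa", "xx", "yy"),
  stringsAsFactors = FALSE
)

# Parse the day/month/year strings into R's Date class.
dat$Date <- as.Date(dat$Date, format = "%d/%m/%Y")

# Sort by user, then by date within each user.
dat <- dat[order(dat$User, dat$Date), ]
rownames(dat) <- NULL
```

The explicit `format` matters here: without it, `as.Date` would not read "05/01/2014" as 5 January 2014.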

I want to:
Get a frequency table (and ultimately a plot) of the counts of differences (in
days) between the records of each user. That is, I would first get the unique
days recorded per user:

A | 2014-01-01
A | 2014-01-07
B | 2014-01-01
B | 2014-01-05
C | 2014-02-05
C | 2014-02-20

I then want to compute the differences, in days, between consecutive dates
within each group defined by the user:
A | 6
B | 4
C | 15
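
One possible way to get from the raw records to these per-user gaps, sketched in base R. The names `dat`, `days` and `deltas` are assumptions made for illustration, and the data is just the sample from the tables above.

```r
# Hypothetical sample, already converted to Date and sorted (assumed name `dat`).
dat <- data.frame(
  User = c("A", "A", "A", "A", "B", "B", "C", "C"),
  Date = as.Date(c("2014-01-01", "2014-01-01", "2014-01-01", "2014-01-07",
                   "2014-01-01", "2014-01-05", "2014-02-05", "2014-02-20")),
  stringsAsFactors = FALSE
)

# Keep one row per user and recorded day.
days <- unique(dat[, c("User", "Date")])
days <- days[order(days$User, days$Date), ]

# Split the dates by user and take consecutive differences in days.
deltas <- lapply(split(days$Date, days$User),
                 function(d) as.numeric(diff(d)))

print(unlist(deltas))  # A = 6, B = 4, C = 15 for this sample
```

A user with only one recorded day simply yields an empty vector, so such users contribute no gaps.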

With tens of thousands of records, I then want a table of the counts of each
difference across all users (in our case the differences would be 6, 4 and 15,
each with count = 1).
In the larger sample, something like this:
DeltaDays | Count
1 | 150
2 | 320
…
N | X
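
A frequency table of this shape could be built with `table()`, as in the rough sketch below. The data is made up for illustration (user A has three recorded days and user D only one), and the names `days`, `all_deltas` and `freq` are assumptions.

```r
# Hypothetical unique (User, Date) pairs; values invented for illustration.
days <- data.frame(
  User = c("A", "A", "A", "B", "B", "C", "C", "D"),
  Date = as.Date(c("2014-01-01", "2014-01-07", "2014-01-13",
                   "2014-01-01", "2014-01-05",
                   "2014-02-05", "2014-02-20",
                   "2014-03-01")),
  stringsAsFactors = FALSE
)

# Consecutive gaps in days per user, pooled across all users.
deltas <- lapply(split(days$Date, days$User),
                 function(d) as.numeric(diff(d)))
all_deltas <- unlist(deltas, use.names = FALSE)

# Count how often each gap length occurs.
freq <- as.data.frame(table(DeltaDays = all_deltas), responseName = "Count")
print(freq)
```

Calling `barplot(table(all_deltas))` afterwards would give a quick plot of the same counts. Note that `DeltaDays` comes back as a factor in `freq`.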

I know there are all sorts of packages for time series analysis, but I could not
find a simple function like this (including searching here:
http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf
). I assume that something working on a simple data frame would be sufficient,
but I am happy (or would even prefer) to use TS. I would appreciate any hints.
The ultimate analysis also involves space, so hints in the direction of
space-time are welcome. Ultimately, I would like to separate the records for
each user into a dataset that can be handled separately, but splitting the table
into a large number of files does not seem wise. Any hints on this are also
appreciated.
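
For the per-user separation, one option that avoids writing many files is to keep the pieces in memory as a named list, as in this minimal sketch. The names `dat`, `by_user` and `n_days` are assumptions, and the data is again only the sample from above.

```r
# Hypothetical sample in the sorted form shown above (assumed name `dat`).
dat <- data.frame(
  User = c("A", "A", "A", "A", "B", "B", "C", "C"),
  Date = as.Date(c("2014-01-01", "2014-01-01", "2014-01-01", "2014-01-07",
                   "2014-01-01", "2014-01-05", "2014-02-05", "2014-02-20")),
  Otherdata = c("aa", "bb", "cc", "aa", "aa", "cc", "xx", "yy"),
  stringsAsFactors = FALSE
)

# One data frame per user, held in a named list instead of separate files.
by_user <- split(dat, dat$User)

# Each user's records can then be handled on their own ...
print(by_user[["A"]])

# ... or the same function can be applied to every user at once,
# here counting the distinct recorded days per user.
n_days <- sapply(by_user, function(d) length(unique(d$Date)))
print(n_days)
```

Whether a list of about 20K small data frames is practical for the full table would need to be checked on the real data.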

Thanks,
Martin


