On 13-10-22 06:00 AM, Weiwu Zhang <zhangweiwu at realss.com> wrote:
> My data is sampled once per minute.
At the same second each minute or not? Regularly spaced would mean
exactly one minute between observations.
> There are invalid samples, leaving
> a lot of holes in the samples, successful sample is around 80% of all
> minutes in a day. and during the last 4 months sampling, one month's
> data was stored on a harddisk that failed, leaving a month's gap in
> between.
This is called "missing observations". With regular spacing you need to
fill in the holes with NA. With irregular spacing you can either drop
the missing observations or, if you know the times at which they were
missed, fill them in with NA.
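To make the "fill in the holes with NA" option concrete, here is a
minimal base R sketch. The data frame obs, its column names, and the
timestamps are made up for illustration, and it assumes the samples
fall on whole minutes:

```r
# Hypothetical data: one sample per minute, with some minutes missing.
obs <- data.frame(
  time  = as.POSIXct(c("2013-10-22 06:00:00", "2013-10-22 06:01:00",
                       "2013-10-22 06:03:00", "2013-10-22 06:06:00"),
                     tz = "UTC"),
  value = c(1.2, 1.4, 1.1, 1.7)
)

# Complete one-minute grid from the first to the last observation.
grid <- data.frame(time = seq(min(obs$time), max(obs$time), by = "1 min"))

# Left join onto the grid: minutes with no sample get value = NA.
padded <- merge(grid, obs, by = "time", all.x = TRUE)
```

After this, padded has one row per minute and could be handed to ts().
If the samples are not taken at the same second each minute, the
timestamps would need rounding first, which is a modelling decision in
itself.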
>
> So am I working with regularly spaced time series or not? Should I
> pad all missing data with NAs, and start with ts(), and followed by
> forecast package (which seems to have all the functions I need in the
> beginning) or should I start with a library with irregular time series
> in mind?
>
> Also, ts() manual didn't say how to create time-series with one minute
> as deltat. It seems to assume time-series is about dates. So the data
> I have with me, is it really time series at all?
The ts() representation works best with regularly spaced monthly,
quarterly, or annual data. You can use it for other things if they fit
nicely into regularly spaced observations with a fixed frequency of
observation, such as 12 times per year or 60 times per hour. This
usually only makes sense if the frequency has something to do with your
problem, like seasonality questions. You can also use frequency 1 for
one observation per period, like annual data, which in your case would
be once per minute. I'm inclined to think that a zoo (see package zoo)
representation would fit your problem better.
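As a rough sketch of both routes, assuming the zoo package is installed
and using made-up timestamps and values, something along these lines
should work:

```r
library(zoo)  # assumption: package zoo is installed

# Hypothetical irregular samples (minutes 06:02, 06:04, 06:05 missing).
times  <- as.POSIXct(c("2013-10-22 06:00:00", "2013-10-22 06:01:00",
                       "2013-10-22 06:03:00", "2013-10-22 06:06:00"),
                     tz = "UTC")
values <- c(1.2, 1.4, 1.1, 1.7)

# zoo keeps the actual time stamps, so gaps need no special handling.
z <- zoo(values, order.by = times)

# If a regular series is wanted, merge with an empty zoo series on a
# one-minute grid; the missing minutes become NA.
grid  <- seq(start(z), end(z), by = "1 min")
z_reg <- merge(z, zoo(, grid))

# ts() with 60 observations per period, i.e. the period is one hour.
x <- ts(coredata(z_reg), frequency = 60)
```

Whether frequency = 60 is the right choice depends on whether an hourly
cycle means anything for your data; with frequency = 1 each minute is
simply its own period.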
HTH,
Paul
>
> Newbie question indeed. Thanks.
>