soura at iastate.edu
2018-Dec-17 16:50 UTC
[R] Functional data analysis for unequal length and unequal width time series
Dear All,
I apologize if you have already seen this on Stack Overflow. I
have not received any response there, so I am posting for help here.
I have data on 1318 time series. Many of these series are of unequal
length. In addition, many of the observation times differ from one
series to the next. For example, consider the
following four series:
t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
-0.0273, -0.0589)
ser1 <- cbind(t1, V1)
t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)
ser2 <- cbind(t2, V2)
t3 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.65,
25.88, 25.97, 25.99)
V3 <- c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
0.0550, -0.0536, 0.0185, -0.0295, -0.0324)
ser3 <- cbind(t3, V3)
t4 <- c(24.5, 24.67, 24.71, 24.98, 25.17)
V4 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231)
ser4 <- cbind(t4, V4)
Here t1, t2, t3, t4 are the time points and V1, V2, V3, V4 are the
observations made at those time points. The time points in the
actual data are Julian dates, so they look like these except that they
are much larger decimal figures such as 2452450.6225.
I am trying to cluster these time series using a functional data approach,
for which I am using the "funFEM" package in R. The examples provided are
for equispaced, equal-length time series, so I am not sure how to use
the package for my data. Initially I tried making all the time
series equal in length to the series with the highest number of
observations (here ser3) by adding NAs to the shorter series.
Following this approach, I made ser2 as
t2_n <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38, 25.50, 25.55, 25.65,
25.88, 25.97, 25.99)
V2_na <- c(V2, rep(NA, 6))
ser2_na <- cbind(t2_n, V2_na)
Note that to make t2 equal in length to t3 I took the last 6 time
points from t3. To make V2 equal in length to V3 I added NAs.
Then I created my data matrix as
dat <- rbind(V1_na, V2_na, V3, V4_na)
The code I used was
require(funFEM)
basis<- create.fourier.basis(c(min(t3), max(t3)), nbasis = 25)
fdobj <- smooth.basis(c(min(t3), max(t3)) ,dat, basis)$fd
Note that the range is constructed using the minimum and maximum time
points of the ser3 series.
res <- funFEM(fdobj, K = 2:9, model = "all", crit = "bic",
init = "random")
But this gives me an error
Error in svd(X) : infinite or missing values in 'x'.
Can anyone please help me with how to handle this dataset with
this package, or suggest an alternative package?
Sincerely,
Souradeep
Bert Gunter
2018-Dec-18 03:40 UTC
[R] Functional data analysis for unequal length and unequal width time series
Specialized: Probably need to email the maintainer. See ?maintainer
Cheers,
Bert
Jeff Newmiller
2018-Dec-18 06:53 UTC
[R] Functional data analysis for unequal length and unequal width time series
You will learn something useful if you search for "rolling join". The
zoo package can handle this, as can the data.table package (read the vignette).
Your decision to pad with NA at the end was ill-considered... the first point of
your first series is between the first two points of your second series... you
need to interleave the points somehow.
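To make the interleaving concrete, here is a minimal base-R sketch that places two of the posted series on the sorted union of their time points. It is not the zoo or data.table rolling join itself, only an illustration of why the gaps belong between observations rather than at the end. The helper name align_to is made up for this example.

```r
# Two of the series from the original post
t1 <- c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67)
V1 <- c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
        -0.0273, -0.0589)
t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)

# Sorted union of every time point observed in either series
grid <- sort(unique(c(t1, t2)))

# Hypothetical helper: place a series on the grid, NA where it has no observation.
# Exact matching is safe here because the grid values are copied from t itself.
align_to <- function(t, v, grid) v[match(grid, t)]

aligned <- data.frame(t  = grid,
                      V1 = align_to(t1, V1, grid),
                      V2 = align_to(t2, V2, grid))
print(aligned)
```

The missing values now fall wherever a series lacks an observation at a given time, and how to fill them is the interpolation decision discussed next.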
You will need to decide whether you want to use piecewise linear approximation
(as with the base "approx" function) or the more stable
last-observation-carried-forward ("locf") or cubic splines or
something more exotic like Fourier interpolation to identify the new
interpolated "y" values in each series.
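As a rough comparison of three of these options, the sketch below evaluates one of the posted series at a few new time points using only base R (approx and spline from the stats package). The new time points in xout are arbitrary values chosen inside the range of t2, and the natural cubic spline is just one of several spline variants one could pick.

```r
t2 <- c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38)
V2 <- c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)

# Time points at which values are wanted (arbitrary, all inside range(t2))
xout <- c(24.51, 24.95, 25.10, 25.35)

# Piecewise linear interpolation between neighbouring observations
lin <- approx(t2, V2, xout = xout, method = "linear")$y

# Last observation carried forward: a step function; f = 0 takes the left-hand value
locf <- approx(t2, V2, xout = xout, method = "constant", f = 0)$y

# Natural cubic spline passing through the observed points
cub <- spline(t2, V2, xout = xout, method = "natural")$y

print(data.frame(t = xout, linear = lin, locf = locf, cubic = cub))
```

With the default rule = 1, approx returns NA for any requested time outside the observed range, so extrapolation has to be requested explicitly if it is wanted.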
You can avoid the rolling join if you intend to resample the series to have
points at regular intervals. Just apply your preferred interpolation technique
with your intended mesh of regular time values to each of your series in turn
and then use cbind with the results.
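A minimal sketch of that resampling step, using the four posted series and base R only, might look like the following. Several choices here are assumptions made for illustration: linear interpolation via approx, a mesh of 20 points, and restricting the mesh to the interval covered by every series so that nothing is extrapolated.

```r
# The four series from the original post, each as a list of times and values
series <- list(
  s1 = list(t = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.67),
            v = c(-0.1710, -0.0824, -0.0419, -0.0416, -0.0216, -0.0792, -0.0656,
                  -0.0273, -0.0589)),
  s2 = list(t = c(24.5, 24.67, 24.91, 24.98, 25.14, 25.38),
            v = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231, 0.2264)),
  s3 = list(t = c(24.51, 24.67, 24.91, 24.95, 25.10, 25.35, 25.50, 25.55, 25.65,
                  25.88, 25.97, 25.99),
            v = c(0.0897, -0.0533, -0.3497, -0.5684, -0.4294, -0.1109, 0.0352,
                  0.0550, -0.0536, 0.0185, -0.0295, -0.0324)),
  s4 = list(t = c(24.5, 24.67, 24.71, 24.98, 25.17),
            v = c(-0.0280, -0.1980, -0.2556, 0.3131, 0.3231))
)

# Assumption: keep only the interval covered by every series (no extrapolation)
lo <- max(sapply(series, function(s) min(s$t)))
hi <- min(sapply(series, function(s) max(s$t)))
mesh <- seq(lo, hi, length.out = 20)   # 20 mesh points is an arbitrary choice

# Linear interpolation of one series onto the common mesh
resample <- function(s) approx(s$t, s$v, xout = mesh)$y

# One row per mesh point, one column per series
dat <- do.call(cbind, lapply(series, resample))
print(dim(dat))
print(anyNA(dat))
```

The resulting matrix contains no missing values. Its layout (time points in rows, one curve per column) matches what the fda documentation describes for the y argument of smooth.basis, with mesh as the argvals; this has not been run through funFEM here. Restricting to the common interval discards the tails of the longer series, so a wider mesh with explicit extrapolation is an alternative worth weighing.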
I don't know anything about the package you mention, but getting time series
data aligned is a common preprocessing step for many time series analyses.
Oh, and you should probably be familiar with the CRAN Time Series Task View
[1].
PS you should provide a link back to your original posting when moving the
conversation to a different venue in case the discussion doesn't stay dead
there.
[1] https://cran.r-project.org/web/views/TimeSeries.html