Ateljevich, Eli@DWR
2017-Apr-17 17:38 UTC
[R] ssa gapfill of series with large window and sparse gaps with Rssa
I have several years of univariate wind speed data to which I would like to apply singular spectrum analysis. The data are sampled every 15min and a year is a fundamental periodicity, which suggests L=35,040 values.

I would like to fill the gaps. The missing values are scattered at low density throughout the series. I doubt there is a block of even one month that doesn't have at least a couple of missing values, but I'd be surprised to learn that the total number is prohibitive.

The filling routines in Rssa like igapfill assume a shaped ssa object, so it seems I need to run ssa successfully first before I can fill. When I try this with L=35,040 or anything above about 2,000 I get an error message

    Nothing to decompose: the given field shape is empty

and warnings like

    Some field elements were not covered by shaped window. 42646 elements will be ommited.

This is frustrating, because if I manually fill missing data with the series mean, which I understand as being the first step of igapfill, the decomposition succeeds with L=35,040. The operation seems efficient and the spectral components look as expected. But since I have manually created a series with no missing data, this doesn't help me with gap filling.

To concoct a shaped ssa object with the original missing pattern, I invoked force.decompose=FALSE. At that point I can bring my task to completion, but I don't know what I'm doing. The only examples I see in the docs are not explained and are in 2D.

Can someone familiar with this kind of use case explain the purpose of force.decompose and the best practice given my missing data situation? Are there consequences to my workaround?

Thanks.
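A possible reading of the error above, sketched in base R: this assumes that shaped SSA builds its trajectory matrix only from windows of length L that contain no missing value, so once L exceeds the longest gap-free run there is nothing left to decompose. The helper count_complete_windows and the synthetic series are hypothetical and for illustration only; they are not part of Rssa.

```r
# Count the windows of length L that contain no missing value.
# Assumption: only such complete windows can enter a shaped decomposition.
count_complete_windows <- function(x, L) {
  n <- length(x)
  if (L > n) return(0L)
  # Cumulative NA count with a leading 0, so that a difference of two
  # entries gives the number of NAs inside the corresponding window.
  na_cum <- c(0L, cumsum(is.na(x)))
  na_in_window <- na_cum[(L + 1):(n + 1)] - na_cum[1:(n - L + 1)]
  sum(na_in_window == 0L)
}

# Synthetic stand-in for the wind series: one isolated gap per 1000 samples.
set.seed(1)
n <- 20000
x <- sin(2 * pi * seq_len(n) / 96) + rnorm(n, sd = 0.1)
x[seq(500, n, by = 1000)] <- NA

# Usable windows for a range of window lengths.
window_lengths <- c(96, 500, 999, 1000, 2000)
counts <- sapply(window_lengths, function(L) count_complete_windows(x, L))
print(data.frame(L = window_lengths, complete_windows = counts))
```

In this toy series the longest gap-free run is 999 samples, so the count drops to zero at L=1000 even though only 20 of 20000 values are missing. If the same mechanism applies to the real data, it would be consistent with the failure appearing above roughly L=2,000; Rssa's clplot reports a related quantity (the proportion of complete lag vectors by window length) and may be worth checking.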
Bert Gunter
2017-Apr-17 18:49 UTC
[R] ssa gapfill of series with large window and sparse gaps with Rssa
... "and explain the best practice given my missing data situation?"

I cannot speak to your other issues, but the above is definitely off topic for this list, which is about R programming, not statistical matters. Missing data are certainly a complex issue: you might try a statistical list like stats.stackexchange.com for opinions on the above.

Cheers,
Bert

Bert Gunter
"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
Bert Gunter
2017-Apr-17 19:20 UTC
[R] ssa gapfill of series with large window and sparse gaps with Rssa
I should probably have added that you should have a look at R's time series task view:

https://cran.r-project.org/web/views/TimeSeries.html

including anything there on irregular time series (e.g. irts() from the tseries package) and imputation.

Cheers,
Bert