William Dunlap <wdunlap <at> tibco.com> writes:
> Use the str() function to see the internal structure of most
> objects. In your case it would show something like:
>
>   > Data <- data.frame(theData=round(sin(1:38),1))
>   > x <- ts(Data[[1]], frequency=12) # or Data[,1]
>   > y <- ts(Data, frequency=12)
>   > str(x)
>    Time-Series [1:38] from 1 to 4.08: 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
>   > str(y)
>    ts [1:38, 1] 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
>    - attr(*, "dimnames")=List of 2
>     ..$ : NULL
>     ..$ : chr "theData"
>    - attr(*, "tsp")= num [1:3] 1 4.08 12
>
> 'x' contains a vector of data and 'y' contains a 1-column matrix of
> data. stl(x,"per") and stl(y, "per") give similar results as you
> got.
>
> Evidently, stl() does not know that 1-column matrices can be treated
> much the same as vectors and gives an error message. Thus you must
> extract the one column into a vector: stl(y[,1], "per").

Thanks, William.

Interesting that a 2D matrix of size Nx1 is treated as a different
animal from a length N vector. It's a departure from math convention,
and from what I'm accustomed to in Matlab. R's vector seems more akin
to a list, where the notion of orientation doesn't apply.

I rummaged around the help files for str, summary, dput, args. This
seems like a more complicated language than Matlab, VBA, or even C++'s
STL of old (which was pretty thoroughly documented). A function like
str() returns an object description, and I'm guessing the conventions
with which the object is described depend a lot on the person who
wrote the handling code for the class. The description for the
variable y seems particularly elaborate.

Would I be right in assuming that the notation is ad-hoc and not
documented? For example, the two invocations str(x) and str(y) show a
Time-Series and a ts. And there are many lines of output for str(y)
that are heavy in punctuation.
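A minimal sketch of the vector versus 1-column matrix distinction described above, reusing the Data, x and y objects from the quoted example. The tryCatch() wrapper and the names fit_x, res_y and fit_y1 are additions for illustration only, and whether stl(y, "per") raises an error may depend on the R version in use.

```r
# Rebuild the objects from the quoted example.
Data <- data.frame(theData = round(sin(1:38), 1))
x <- ts(Data[[1]], frequency = 12)   # data stored as a plain vector
y <- ts(Data, frequency = 12)        # data stored as a 38 x 1 matrix

dim(x)   # NULL: a vector carries no dim attribute
dim(y)   # 38 1

# stl() on the vector-based series.
fit_x <- stl(x, "per")

# stl() on the matrix-based series: capture the error message, if any,
# instead of stopping (illustrative wrapper, not part of the thread).
res_y <- tryCatch(stl(y, "per"), error = function(e) conditionMessage(e))

# Workaround from the thread: extract the single column first.
fit_y1 <- stl(y[, 1], "per")
```

Here y[, 1] drops the dim attribute but keeps the time-series attributes, which is why stl() accepts it.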
William Dunlap
2015-Apr-22 02:16 UTC
[R] How numerical data is stored inside ts time series objects
> Interesting that a 2D matrix of size Nx1 is treated as a different
> animal from a length N vector.

I think we can call this a bug in stl().

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Tue, Apr 21, 2015 at 6:39 PM, Paul <Paul.Domaskis at gmail.com> wrote:
> William Dunlap <wdunlap <at> tibco.com> writes:
> > Use the str() function to see the internal structure of most
> > objects. In your case it would show something like:
> >
> >   > Data <- data.frame(theData=round(sin(1:38),1))
> >   > x <- ts(Data[[1]], frequency=12) # or Data[,1]
> >   > y <- ts(Data, frequency=12)
> >   > str(x)
> >    Time-Series [1:38] from 1 to 4.08: 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
> >   > str(y)
> >    ts [1:38, 1] 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
> >    - attr(*, "dimnames")=List of 2
> >     ..$ : NULL
> >     ..$ : chr "theData"
> >    - attr(*, "tsp")= num [1:3] 1 4.08 12
> >
> > 'x' contains a vector of data and 'y' contains a 1-column matrix of
> > data. stl(x,"per") and stl(y, "per") give similar results as you
> > got.
> >
> > Evidently, stl() does not know that 1-column matrices can be treated
> > much the same as vectors and gives an error message. Thus you must
> > extract the one column into a vector: stl(y[,1], "per").
>
> Thanks, William.
>
> Interesting that a 2D matrix of size Nx1 is treated as a different
> animal from a length N vector. It's a departure from math convention,
> and from what I'm accustomed to in Matlab. R's vector seems more akin
> to a list, where the notion of orientation doesn't apply.
>
> I rummaged around the help files for str, summary, dput, args. This
> seems like a more complicated language than Matlab, VBA, or even C++'s
> STL of old (which was pretty thoroughly documented). A function like
> str() returns an object description, and I'm guessing the conventions
> with which the object is described depends a lot on the person who
> wrote the handling code for the class. The description for the
> variable y seems particularly elaborate.
>
> Would I be right in assuming that the notation is ad-hoc and not
> documented? For example, the two invocations str(x) and str(y) show a
> Time-Series and a ts. And there are many lines of output for str(y)
> that is heavy in punctuation.
Martin Maechler
2015-Apr-22 13:15 UTC
[R] How numerical data is stored inside ts time series objects
>>>>> Paul <Paul.Domaskis at gmail.com>
>>>>>     on Wed, 22 Apr 2015 01:39:16 +0000 writes:

    > William Dunlap <wdunlap <at> tibco.com> writes:
    >> Use the str() function to see the internal structure of most
    >> objects. In your case it would show something like:
    >>
    >>   > Data <- data.frame(theData=round(sin(1:38),1))
    >>   > x <- ts(Data[[1]], frequency=12) # or Data[,1]
    >>   > y <- ts(Data, frequency=12)
    >>   > str(x)
    >>    Time-Series [1:38] from 1 to 4.08: 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
    >>   > str(y)
    >>    ts [1:38, 1] 0.8 0.9 0.1 -0.8 -1 -0.3 0.7 1 0.4 -0.5 ...
    >>    - attr(*, "dimnames")=List of 2
    >>     ..$ : NULL
    >>     ..$ : chr "theData"
    >>    - attr(*, "tsp")= num [1:3] 1 4.08 12
    >>
    >> 'x' contains a vector of data and 'y' contains a 1-column matrix of
    >> data. stl(x,"per") and stl(y, "per") give similar results as you
    >> got.
    >>
    >> Evidently, stl() does not know that 1-column matrices can be treated
    >> much the same as vectors and gives an error message. Thus you must
    >> extract the one column into a vector: stl(y[,1], "per").

    > Thanks, William.

    > Interesting that a 2D matrix of size Nx1 is treated as a different
    > animal from a length N vector. It's a departure from math convention,
    > and from what I'm accustomed to in Matlab.

Ha -- Not at all!

The above is exactly the misconception I have been fighting --
mostly in vain -- for years.

Matlab's convention of treating a vector as an N x 1 matrix is
a BIG confusion to much of math teaching:

The vector space |R^n is not at all the same space as the space |R^{n x 1},
even though of course there's a trivial mapping between the
objects (and the metrics) of the two.
A vector *is NOT* a matrix -- but in some matrix calculus
notations there is a convention to *treat* n-vectors as
(n x 1) matrices.
Good linear algebra teaching does distinguish vectors from
one-column or one-row matrices -- I'm sure still the case in all
good math departments around the globe -- but maybe not in math
teaching to engineers and others who only need applied math.

Yes, linear algebra teaching will also make a point that in the
usual matrix product notations, it is convenient and useful to
treat vectors as if they were 1-column matrices.

    > R's vector seems more akin to a list, where the notion of
    > orientation doesn't apply.

Sorry, but again: not at all in the sense 'list's are used in R.

Fortunately, well thought out languages such as S, R, Julia, Python,
all do make a good distinction between vectors and matrices,
i.e. 1D and 2D arrays. If Matlab still does not do that, it's
just another sign that Matlab users should flee and start using
Julia or R or Python.

{and well yes, we could start bitchering about S' and hence R's
 distinction between a 1D array and a vector ... which I think
 has been a clear design error... but that's not the topic here}
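A short sketch of the distinctions drawn above, using only base R. The names v, m and a are made up for illustration: a plain vector, a 1-column matrix, and a 1-d array.

```r
v <- c(0.8, 0.9, 0.1)        # plain vector: no dim attribute
m <- matrix(v, ncol = 1)     # 3 x 1 matrix: dim attribute of length 2
a <- array(v, dim = 3)       # 1-d array: dim attribute of length 1

is.null(dim(v))   # TRUE
dim(m)            # 3 1
dim(a)            # 3

is.matrix(v)      # FALSE
is.matrix(m)      # TRUE
is.matrix(a)      # FALSE
is.vector(a)      # FALSE: the dim attribute disqualifies it

# In matrix products a plain vector is treated as a row or column
# as needed, which is the notational convenience mentioned above.
ip <- v %*% v     # 1 x 1 matrix (inner product)
op <- m %*% t(m)  # 3 x 3 matrix (outer product)

# as.vector() strips the dim attribute and recovers the plain vector.
identical(as.vector(m), v)   # TRUE
```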
David R Forrest
2015-Apr-22 16:06 UTC
[R] How numerical data is stored inside ts time series objects
> On Apr 21, 2015, at 9:39 PM, Paul <Paul.Domaskis at gmail.com> wrote:
...
> I rummaged around the help files for str, summary, dput, args. This
> seems like a more complicated language than Matlab, VBA, or even C++'s
> STL of old (which was pretty thoroughly documented). A function like
> str() returns an object description, and I'm guessing the conventions
> with which the object is described depends a lot on the person who
> wrote the handling code for the class. The description for the
> variable y seems particularly elaborate.
>
> Would I be right in assuming that the notation is ad-hoc and not
> documented? For example, the two invocations str(x) and str(y) show a
> Time-Series and a ts. And there are many lines of output for str(y)
> that is heavy in punctuation.
>

The details of how str() represents your x and y variables are within
the utils::stl.default() function. You can hunt this down and see the
code with:

  methods(class=class(x))   # Find the class-specific handlers -- no str()
  methods(str)              # Find the methods for the generic
  getAnywhere(str.default)  # or getFromNamespace('str.default','utils')

Within the utils::str.default code, this 'Time-Series' specific code
only triggers if the object doesn't match a long list of other items
(for example: is.function(), is.list(), is.vector(object) ||
(is.array(object) && is.atomic(object)) ...)

  else if (stats::is.ts(object)) {
      tsp.a <- stats::tsp(object)
      str1 <- paste0(" Time-Series ", le.str, " from ",
                     format(tsp.a[1L]), " to ", format(tsp.a[2L]), ":")
      std.attr <- c("tsp", "class")
  }

This handling is not dependent on who wrote the ts class, but on who
wrote the str.default function.
A more explicit way to look at the difference without the str()
summarization is with dput(x) and dput(y):

> dput(x)
structure(c(464L, 675L, 703L, 887L, 1139L, 1077L, 1318L, 1260L,
1120L, 963L, 996L, 960L, 530L, 883L, 894L, 1045L, 1199L, 1287L,
1565L, 1577L, 1076L, 918L, 1008L, 1063L, 544L, 635L, 804L, 980L,
1018L, 1064L, 1404L, 1286L, 1104L, 999L, 996L, 1015L), .Tsp = c(1,
3.91666666666667, 12), class = "ts")
> dput(y)
structure(c(464L, 675L, 703L, 887L, 1139L, 1077L, 1318L, 1260L,
1120L, 963L, 996L, 960L, 530L, 883L, 894L, 1045L, 1199L, 1287L,
1565L, 1577L, 1076L, 918L, 1008L, 1063L, 544L, 635L, 804L, 980L,
1018L, 1064L, 1404L, 1286L, 1104L, 999L, 996L, 1015L), .Dim = c(36L,
1L), .Dimnames = list(NULL, "V1"), .Tsp = c(1, 3.91666666666667,
12), class = "ts")

Also, Matlab sometimes needs a squeeze() to drop degenerate
dimensions, and R's drop() is similar, and is less-black-magic looking
than the [[1]] code:

> str(drop(x))
 Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318 1260 1120 963 ...
> str(drop(y))
 Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318 1260 1120 963 ...

stl(drop(x),s.window='per')
stl(drop(y),s.window='per')

Maybe str.default() should do Time-Series interpretation of is.ts()
objects for matrices as well as vectors.

Dave
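As a compact alternative to reading the two dput() listings side by side, the attribute sets can be compared directly. This is only a sketch: it rebuilds x and y from the sin() data used earlier in the thread rather than the integer series shown above, and the names extra and y_flat are illustrative.

```r
# Rebuild a vector-based and a matrix-based series from the same data.
Data <- data.frame(theData = round(sin(1:38), 1))
x <- ts(Data[[1]], frequency = 12)
y <- ts(Data, frequency = 12)

# Attributes that y carries and x does not.
extra <- setdiff(names(attributes(y)), names(attributes(x)))
extra

# The time axis itself is the same for both objects.
all.equal(tsp(x), tsp(y))

# drop() removes the degenerate second dimension.
y_flat <- drop(y)
dim(y_flat)   # NULL
```

The only difference between the two objects is the pair of matrix attributes, which is what drop() removes.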
William Dunlap <wdunlap <at> tibco.com> writes:
> I think we can call this a bug in stl().

Using what I learned from the responses to this thread, I looked at
the code for stl. As they say at Microsoft, "this is expected
behaviour" according to the code. And it doesn't look like an
inadvertent coding oversight.

-----------------------------------------------
Martin Maechler <maechler <at> lynne.stat.math.ethz.ch> writes:
>> Paul <Paul.Domaskis <at> gmail.com> Interesting that a 2D matrix
>> of size Nx1 is treated as a different animal from a length N
>> vector. It's a departure from math convention, and from what I'm
>> accustomed to in Matlab.
>
> The vector space |R^n is not at all the same space as the space
> |R^{n x 1} even though of course there's a trivial mapping between
> the objects (and the metrics) of the two. A vector *is NOT* a
> matrix -- but in some matrix calculus notations there is a
> convention to *treat* n-vectors as (n x 1) matrices.
>
> Good linear algebra teaching does distinguish vectors from
> one-column or one-row matrices -- I'm sure still the case in all
> good math departments around the globe -- but maybe not in math
> teaching to engineers and others who only need applied math. Yes,
> linear algebra teaching will also make a point that in the usual
> matrix product notations, it is convenient and useful to treat
> vectors as if they were 1-column matrices.

The distinction in math is new to me, with academic training in
engineering, even at the post grad level. I haven't seen the
distinction in the math for Comp. Sci., either, and that's in the
meat grinder of Canada. Admittedly, it's not quite as geeky as some
meat grinders in other countries. And admittedly, I only took
C.S. courses that were geared to applications.
So I had always considered such a distinction to be a practicality in
the coding implementation of vector/matrix classes, e.g., in C, a
vector being a single pointer to a number, while a 2D array is a
pointer to a vector and hence a different type.

>> R's vector seems more akin to a list, where the notion of
>> orientation doesn't apply.
>
> Sorry, but again: not at all in the sense 'list's are used in R.

No need to apologize. To clarify, being new to R, I was referring to
the general use of the term "list". Specifically, I was referring to
an ordered collection without orientation, so it is consistent with
what you say above about distinguishing between length N vectors
vs. 2D matrices of size Nx1 or 1xN.

> Fortunately, well thought out languages such as S, R, Julia, Python,
> all do make a good distinction between vectors and matrices i.e. 1D
> and 2D arrays. If Matlab still does not do that, it's just another
> sign that Matlab users should flee and start using julia or R or
> python.

Matlab pretty well only deals with 2D arrays, some of which have size
Nx1 or 1xN. I haven't seen an example of a 1-D data structure that
doesn't have an orientation, implied or otherwise. Though of course,
if someone proves me wrong, then I stand corrected (and smarter
because of it).

> {and well yes, we could start bitchering about S' and hence R's
> distinction between a 1D array and a vector ... which I think has
> been a clear design error... but that's not the topic here}

Big fan of Python's readability, though I've only dabbled. And I
won't start bitchering about R & S cuz I'm a newcomer and it's all an
eye popping wonderland.

-----------------------------------------------
David R Forrest <drf <at> vims.edu> writes:
> The details of how str() represents your x and y variables is within
> the utils::stl.default() function. You can hunt this down and see

I'm assuming that you meant utils::str.default() above. The rest of
your post makes sense if I make that assumption.
I snipped the majority of your response because I'm not responding to
anything specific. However, it was an extremely educational post.
Thank you for that.

> Also, Matlab sometimes needs a squeeze() to drop degenerate
> dimensions, and R's drop() is similar, and is less-black-magic
> looking than the [[1]] code:
>
>   > str(drop(x))
>    Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318
>    1260 1120 963 ...
>   > str(drop(y))
>    Time-Series [1:36] from 1 to 3.92: 464 675 703 887 1139 1077 1318
>    1260 1120 963 ...
>
> stl(drop(x),s.window='per')
> stl(drop(y),s.window='per')
>
> Maybe str.default() should do Time-Series interpretation of is.ts()
> objects for matrices as well as vectors.

I'm assuming that you mean stl(), since str() already works on both?

Maybe it's the version I have, but I find that the R code for stl()
doesn't have a section for is.ts(). Instead, it seems to run through
a series of checks for pathological input, with the check for matrix
data consisting of is.matrix(na.action(as.ts(x))), where x is the
time series. Somehow, the fact that na.action(time series argument)
returns a matrix implies that the time series data is a matrix rather
than a vector. In attempting to get insight, I found that the ts
class has no na.action method, and that the default method for the
generic na.action is not visible using getAnywhere (nor is it visible
by entering it at the command line without brackets).

Anyway, pretty educational. Thanks again.
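One possible reading of the check described above, offered as a sketch rather than a statement about any particular R version: inside stl(), the name na.action may refer to the function's own argument of that name (the documented default is na.fail) and not to the generic stats::na.action(). Under that assumption, the check can be reproduced by hand. The names default_na, checked_matrix and checked_vector are illustrative.

```r
# Look up the na.action argument of stl() and its default value.
default_na <- formals(stats::stl)$na.action
default_na

# Matrix-based series, as in the thread.
y <- ts(data.frame(theData = round(sin(1:38), 1)), frequency = 12)

# na.fail() returns its input unchanged when there are no missing
# values, so the matrix shape of y survives the call.
checked_matrix <- na.fail(as.ts(y))
is.matrix(checked_matrix)    # TRUE

# After extracting the column, the same check sees a plain vector.
checked_vector <- na.fail(as.ts(y[, 1]))
is.matrix(checked_vector)    # FALSE
```

If that reading is right, the matrix test depends only on whether the input carries a dim attribute, not on any na.action method for the ts class.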