I need some help understanding how one of the example data sets is formatted in the basic R installation. If I load the Mauna Loa CO2 data with the command:

> data(co2)

I can view the data with:

> co2

The data are displayed as 11 rows labeled as years (1994-2004) and 12 columns labeled (Jan - Dec). This structure appears to be a data frame; however, if I type the command

> plot(co2)

I get a time series with time on the x axis and CO2 on the y. Also,

> summary(co2)

gives a single Min, Median, Max.

The reason for my confusion is that I created another "similar looking" data set with read.table. In that case, the data look to be in the same format (rows as years, and columns as months). However, the command

> summary(test.data)

gives a summary for each month. Completely different behavior.

If I use the data.frame command:

> data.frame(co2)

I get a single column of CO2 data, while the data.frame command on my test.data data keeps its year-row, month-column format.

Can anyone help me understand the differences in how these data sets are formatted?

Thanks in advance.
> class(co2)
[1] "ts"
> is.data.frame(co2)
[1] FALSE

If you try these functions on your test.data, I am guessing that the results will differ from the above.

-- 
David Winsemius

On Dec 23, 2008, at 6:29 PM, Kirk Wythers wrote:
> I need some help understanding how one of the example data sets is
> formatted in the basic R installation. [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
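The check suggested in this reply can be run side by side on both kinds of object. The following is only a sketch: the poster's actual test.data is not shown in the thread, so a hypothetical stand-in is built here by reshaping the co2 values into a year-by-month data frame, roughly what read.table would return for such a file.

```r
# co2 ships with the 'datasets' package, which is attached by default
data(co2)

# Hypothetical stand-in for test.data (the real one is not shown in the thread):
# the co2 values laid out with one row per year and one column per month.
years <- as.character(start(co2)[1]:end(co2)[1])
test.data <- as.data.frame(matrix(co2, ncol = 12, byrow = TRUE,
                                  dimnames = list(years, month.abb)))

class(co2)                 # "ts"
class(test.data)           # "data.frame"
is.data.frame(co2)         # FALSE
is.data.frame(test.data)   # TRUE

# summary() dispatches on the class, which explains the different output
length(summary(co2))       # one set of statistics for the whole series
dim(summary(test.data))    # one column of statistics per month

# data.frame() on a univariate ts yields a single column, one row per observation
dim(data.frame(co2))
```

The year-by-month grid seen when typing co2 comes from the print method for monthly "ts" objects; the object itself is stored as one vector of observations in time order.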
Dear Kirk,

Actually, co2 isn't a data frame but rather a "ts" (time-series) object. A nice thing about R is that you can query and examine objects:

> class(co2)
[1] "ts"
> str(co2)  # structure of object
 Time-Series [1:468] from 1959 to 1998: 315 316 316 318 318 ...
> unclass(co2)
  [1] 315.42 316.31 316.50 317.56 318.13 318.00 316.39 314 . . .
 [33] 314.83 315.16 315.94 316.85 317.78 318.40 319.53 320.42 . . .
 [65] 322.06 321.73 320.27 318.54 316.54 316.71 317.53 318.55 . . .
 [97] 322.17 322.34 322.88 324.25 324.83 323.93 322.38 320.76 . . .
 . . .
[449] 365.45 365.01 363.70 . . . 360.83 362.49 364.34
attr(,"tsp")
[1] 1959.000 1997.917   12.000

(where . . . represents output that I've elided).

I hope this helps,
 John

------------------------------
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org]
> On Behalf Of Kirk Wythers
> Sent: December-23-08 6:30 PM
> To: r-help at r-project.org
> Subject: [R] beginner data.frame question
>
> I need some help understanding how one of the example data sets is
> formatted in the basic R installation. [...]
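As a follow-up to the "tsp" attribute shown in the unclass() output above, the time information of a "ts" object can also be queried through accessor functions in the stats package. The sketch below additionally uses cycle() to get a per-month statistic without converting to a data frame; monthly.mean is a hypothetical name chosen for illustration.

```r
data(co2)

tsp(co2)         # start time, end time, and frequency (the attr(,"tsp") values)
frequency(co2)   # number of observations per year
start(co2)       # year and month of the first observation
end(co2)         # year and month of the last observation

# cycle() gives each observation's position within the year
# (1 = Jan, ..., 12 = Dec), so values can be grouped by month directly.
monthly.mean <- tapply(co2, cycle(co2), mean)   # hypothetical name
names(monthly.mean) <- month.abb
monthly.mean
```

This gives one number per month, similar in spirit to what summary() reports column by column for a year-by-month data frame, while keeping co2 as a time series.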