On Tue, Sep 13, 2011 at 2:07 PM, Rich Shepard <rshepard at appl-ecosys.com> wrote:
>   I have read ?zoo but am not sure how to relate the parameters (x,
> order.by, frequency, and style) to my data.frame. The structure of the
> data.frame is
>
> 'data.frame':   11169 obs. of  4 variables:
>  $ stream  : Factor w/ 37 levels "Burns","CIL",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ sampdate: Date, format: "1987-07-23" "1987-09-17" ...
>  $ param   : Factor w/ 8 levels "As","Ca","Cl",..: 1 1 1 1 1 1 1 1 1 1 ...
>  $ quant   : num  0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0.01 0 ...
>
>   The numeric column ('x' in zoo, I believe) is associated with the unique
> combination of param, sampdate, and stream in each row. For example:
>
> tail(streamdata)
>        stream   sampdate param   quant
> 11164 Winters 2010-06-30   SO4 120.000
> 11165 Winters 2010-06-30    Zn   0.010
> 11166 Winters 2011-06-06    As   0.005
> 11167 Winters 2011-06-06    Cl   5.000
> 11168 Winters 2011-06-06   SO4 150.000
> 11169 Winters 2011-06-06    Zn   0.010
>
>   I'm in the early exploratory stage of understanding these data, but want
> to produce time series plots and analyses by stream and param using zoo
> objects since the sampdate varies by both stream and chemical.
>
>   I assume that order.by, the index, is sampdate. The frequency option is
> FALSE because these samples are not temporally regular. I've no idea what to
> do with the style option, if anything.
>
>   Most of the examples I see on using R (including in the lattice book I'm
> now reading) have one or more numeric columns in the data.frame associated
> with a single factor. I have a single numeric column associated with two
> factors and a date.
>
>   If there are other documents or books I should read to learn how to
> effectively use the zoo package for my project (in addition to zoo.pdf that
> lists the methods and is quite obtuse to me), please point me to them. I
> would greatly appreciate any and all help in getting up to speed with zoo.
>
As described in ?zoo, a zoo object is a numeric matrix, numeric vector
or factor together with an ordered time index whose values are unique.
It's not clear that this is what you have; however, if we can assume
that for each value of param we have a unique set of dates, then quant
could form a multivariate zoo series with a Date index and one column
per param. We used text = Lines in read.zoo below to keep the example
self-contained, but in reality the first argument to read.zoo would be
something like "myfile.dat" to refer to the file holding the data. The
"NULL" entries in the colClasses argument of read.zoo cause the
respective columns (here the row names and stream) to be ignored, so
split = 2 refers to param, the second of the remaining columns.

Lines <- "stream sampdate param quant
11164 Winters 2010-06-30 SO4 120.000
11165 Winters 2010-06-30 Zn 0.010
11166 Winters 2011-06-06 As 0.005
11167 Winters 2011-06-06 Cl 5.000
11168 Winters 2011-06-06 SO4 150.000
11169 Winters 2011-06-06 Zn 0.010"

library(zoo)
packageVersion("zoo") # should be >= 1.7-4
z <- read.zoo(text = Lines, skip = 1, split = 2,
              colClasses = c("NULL", "NULL", NA, NA, NA))

which gives

> z
              As Cl SO4   Zn
2010-06-30    NA NA 120 0.01
2011-06-06 0.005  5 150 0.01

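The assumption above (a unique set of dates for each value of param)
can be checked before calling read.zoo. Here is a minimal sketch in
base R. The small data frame is made up in the layout of streamdata,
and the repeated As row is invented purely to show what a violation
looks like; it is not taken from the real data.

```r
# Hypothetical sample in the same layout as streamdata; the repeated
# Winters / 2011-06-06 / As row is invented for illustration only.
streamdata <- data.frame(
  stream   = c("Winters", "Winters", "Winters", "Winters", "Winters"),
  sampdate = as.Date(c("2010-06-30", "2010-06-30", "2011-06-06",
                       "2011-06-06", "2011-06-06")),
  param    = c("SO4", "Zn", "As", "As", "Cl"),
  quant    = c(120, 0.01, 0.005, 0.006, 5)
)

# Each (stream, sampdate, param) combination should occur only once.
key  <- streamdata[c("stream", "sampdate", "param")]
dups <- streamdata[duplicated(key) | duplicated(key, fromLast = TRUE), ]

nrow(dups)  # 0 would mean no repeated combinations
dups        # otherwise, the rows that need cleaning or aggregating
```

If dups is not empty, the aggregate argument of read.zoo (for example
aggregate = mean) is one possible way to collapse the repeated rows,
though whether averaging is appropriate depends on the data.
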
Read over ?zoo and ?read.zoo and also the 5 vignettes. The zoo-read
vignette is entirely about read.zoo. If you really do want to keep
all that info, you might want to use a data frame instead, or possibly
several zoo objects.
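
One way the "several zoo objects" idea might look is a list with one
wide zoo object per stream. This is only a sketch: it assumes the data
are already in a data frame laid out like streamdata, the Burns rows
and their values are made up for illustration, and by_stream is just a
name chosen here.

```r
library(zoo)

# Hypothetical data in the layout of streamdata. The Winters rows repeat
# the tail() output quoted above; the Burns rows are invented.
streamdata <- data.frame(
  stream   = c(rep("Winters", 6), rep("Burns", 4)),
  sampdate = as.Date(c("2010-06-30", "2010-06-30", "2011-06-06",
                       "2011-06-06", "2011-06-06", "2011-06-06",
                       "2010-07-01", "2010-07-01", "2011-06-07",
                       "2011-06-07")),
  param    = c("SO4", "Zn", "As", "Cl", "SO4", "Zn",
               "As", "Cl", "As", "Cl"),
  quant    = c(120, 0.01, 0.005, 5, 150, 0.01,
               0.004, 3, 0.006, 4)
)

# Split the rows by stream, then let read.zoo spread param into columns.
by_stream <- lapply(split(streamdata, streamdata$stream, drop = TRUE),
                    function(d) {
  d <- droplevels(d)  # drop unused factor levels if columns are factors
  # Keep index, split column and value column, in that order;
  # FUN = as.Date makes the class of the index explicit.
  read.zoo(d[c("sampdate", "param", "quant")], split = 2, FUN = as.Date)
})

names(by_stream)    # one element per stream
by_stream$Winters   # a zoo series with one column per param
```

Each element can then be handled like z above, for example with
plot(by_stream$Winters).
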
--
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com