Dear R-experts,
Sorry if I've overlooked a simple solution here. I have calculated the
proportion of observations that meet a criterion, applied to five years
of data. How can I break down this proportion statistic for each year?
For example (data in zoo format):
           open high  low close hc  lc
2004-12-29 4135 4135 4106  4116  8 -21
2004-12-30 4120 4131 4115  4119 15  -1
2004-12-31 4123 4124 4114  4117  5  -5
2005-01-04 4106 4137 4103  4137 20 -14
2005-01-06 4085 4110 4085  4096 10 -15
2005-01-10 4133 4148 4122  4139 15 -11
2005-01-11 4142 4158 4127  4130 19 -12
2005-01-12 4113 4138 4112  4127 18   8
Statistic of interest is proportion of times that sign of "hc" is positive
and sign of "lc" is negative on any given day. Looking to return something
like:
Yr        Prop
2004    1.0
2005    0.8
Along these lines, if I have datasets A and B, where B is a subset of A, can 
I use the number of matching dates to calculate the yearly proportions in 
question?
Thanks,
Alfonso Sammassimo
Melbourne Australia

Here is one way to break it down by year:

library(zoo)

x <- " open high low close hc lc
2004-12-29 4135 4135 4106 4116 8 -21
2004-12-30 4120 4131 4115 4119 15 -1
2004-12-31 4123 4124 4114 4117 5 -5
2005-01-04 4106 4137 4103 4137 20 -14
2005-01-06 4085 4110 4085 4096 10 -15
2005-01-10 4133 4148 4122 4139 15 -11
2005-01-11 4142 4158 4127 4130 19 -12
2005-01-12 4113 4138 4112 4127 18 8"

# the header has one fewer field than the data rows, so read.table
# uses the first column (the dates) as row names
xIn <- read.table(textConnection(x), header=TRUE)
x.zoo <- zoo(xIn, as.POSIXct(row.names(xIn)))

# split by year, then take the share of days with hc > 0 and lc < 0
sapply(split(x.zoo, format(index(x.zoo), "%Y")), function(.year){
    sum(.year[,'hc'] > 0 & .year[,'lc'] < 0) / nrow(.year)
})

2004 2005
 1.0  0.8

Jim Holtman
Cincinnati, OH

What is the problem you are trying to solve?

Here are a couple of solutions:
1. Using the zoo package
First add Date to the header so there
are the same number of column headers as columns and
then read in using read.zoo.  Then aggregate over years
using mean.  For more on zoo try library(zoo); vignette("zoo")
and for more on dates see the R News 4/1 help desk article.
# added Date to the header
Lines <- "Date open  high   low    close  hc  lc
2004-12-29 4135 4135 4106  4116  8 -21
2004-12-30 4120 4131 4115  4119 15  -1
2004-12-31 4123 4124 4114  4117  5  -5
2005-01-04 4106 4137 4103  4137 20 -14
2005-01-06 4085 4110 4085  4096 10 -15
2005-01-10 4133 4148 4122  4139 15 -11
2005-01-11 4142 4158 4127  4130 19 -12
2005-01-12 4113 4138 4112  4127  18  8
"
library(zoo)
# z <- read.zoo("myfile.dat", header = TRUE)
z <- read.zoo(textConnection(Lines), header = TRUE)
aggregate(z[,"hc"] > 0 & z[,"lc"] < 0, function(x) format(x, "%Y"), mean)
2. Using data frames and tapply
Read in as a data frame, calculate year and tapply the mean
by year:
# Lines is from above
# dat <- read.table("myfile.dat", header = TRUE)
dat <- read.table(textConnection(Lines), header = TRUE)
year <- as.numeric(format(as.Date(dat$Date), "%Y"))
tapply(dat$hc > 0 & dat$lc < 0, year, mean)
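
On the second part of the question (B a subset of A): one possible approach, sketched here under assumptions, is to flag each date in A that also appears in B and then average that flag by year. The data frames A and B and the names in_B, year and prop are illustrative only and do not come from the thread; the values are the sample rows from the question, and B is built here by applying the hc/lc criterion to A.

```r
# Hypothetical setup: A holds every day, B only the days meeting the criterion.
A <- data.frame(
  Date = as.Date(c("2004-12-29", "2004-12-30", "2004-12-31", "2005-01-04",
                   "2005-01-06", "2005-01-10", "2005-01-11", "2005-01-12")),
  hc = c(8, 15, 5, 20, 10, 15, 19, 18),
  lc = c(-21, -1, -5, -14, -15, -11, -12, 8)
)
B <- A[A$hc > 0 & A$lc < 0, ]

# TRUE for each date in A that also occurs in B
in_B <- A$Date %in% B$Date

# Year of each date in A, as a character label
year <- format(A$Date, "%Y")

# The mean of a logical vector is the proportion of TRUE values
prop <- tapply(in_B, year, mean)
prop
```

If A and B are zoo objects rather than data frames, the same flag could presumably be built with index(A) %in% index(B), with the rest unchanged.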