thr3ads.net - R help - [R] Filtering data [Nov 2001]

Matt Pocernich

2001-Nov-07 03:39 UTC

[R] Filtering data

Hello,

I am having difficulty filtering data.  I am working with flow data
collected at a stream gage.  For each record, I have a date and flow
value.  I have filtered this data to only include days when flow values
exceed a given threshold.

Here is my problem.  Within this subset of data, I often have several
consecutive days above the threshold.  From this group of days, I wish to
select the record (both date and flow) containing the maximum flow.  If an
exceedance is isolated (the preceding and succeeding days are below the
threshold) I also wish to select that record.

For example from the data set

Day     Flow

1       10
4       13
5       20
6       15
9       13

I would like the 1st, 3rd and 5th records selected.

Any ideas on how I would write such an algorithm would be appreciated.

Thanks,

Matt Pocernich

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Joerg Maeder

2001-Nov-07 09:58 UTC

[R] Filtering data

Here is a way to do it:

# your data (days have to be sorted!)
da <- cbind(c(1,4,5,6,9), c(10,13,20,15,13))
# make day groups: a new group starts wherever the gap to the previous day exceeds 1
gr <- cumsum(c(TRUE, diff(da[,1]) > 1))
# find the index of the (first) maximum within each group
mi <- tapply(da[,2], gr, which.max)
# add them to the start offset of each group
mi <- c(0, cumsum(tapply(da[,2], gr, length)))[1:length(mi)] + mi
# output
da[mi,]
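
To make the grouping step easier to follow, here is a minimal sketch that prints the intermediate values for the example data. It assumes the days are sorted and unique; the names `day`, `flow`, `pos`, `sizes`, `offset` and `pick` are illustrative only and do not come from the code above.

```r
day  <- c(1, 4, 5, 6, 9)
flow <- c(10, 13, 20, 15, 13)

# A new group starts at the first record and wherever the gap exceeds one day
gr <- cumsum(c(TRUE, diff(day) > 1))
print(gr)    # 1 2 2 2 3

# Position of the first maximum inside each group
pos <- tapply(flow, gr, which.max)

# Number of rows that precede each group
sizes  <- tapply(flow, gr, length)
offset <- c(0, cumsum(sizes))[seq_along(sizes)]

# Row numbers of the group maxima in the original data
pick <- as.vector(offset + pos)
print(pick)  # 1 3 5

print(cbind(Day = day, Flow = flow)[pick, , drop = FALSE])
```

Using `which.max` keeps one row per group even when the maximum flow occurs on more than one day; it returns the first such day.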

Matt Pocernich wrote:
>
> Hello,
> 
> I am having difficulty filtering data.  I am working with flow data
> collected at a stream gage.  For each record, I have a date and flow
> value.  I have filtered this data to only include days when flow values
> exceed a given threshold.
> 
> Here is my problem.  Within this subset of data, I often have several
> consecutive days above the threshold.  From this group of days, I wish to
> select the record (both date and flow) containing the maximum flow.  If an
> exceedance is isolated (the preceding and succeeding days are below the
> threshold) I also wish to select that record.
> 
> For example from the data set
> 
> Day     Flow
> 
> 1       10
> 4       13
> 5       20
> 6       15
> 9       13
> 
> I would like the 1st, 3rd and 5th records selected.
> 
> Any ideas on how I would write such an algorithm would be appreciated.
> 
> Thanks,
> 
> Matt Pocernich
> 
--
    Joerg Maeder             IACETH              INSTITUTE
   PhD Student                              FOR ATMOSPHERIC 
  Phone: +41 1 633 36 25                 AND CLIMATE SCIENCE
 Fax: +41 1 633 10 58                  ETH ZÜRICH Switzerland

John Fox

2001-Nov-07 12:46 UTC

[R] Filtering data

At 08:39 PM 11/6/2001 -0700, Matt Pocernich wrote:
>I am having difficulty filtering data.  I am working with flow data
>collected at a stream gage.  For each record, I have a date and flow
>value.  I have filtered this data to only include days when flow values
>exceed a given threshold.
>
>Here is my problem.  Within this subset of data, I often have several
>consecutive days above the threshold.  From this group of days, I wish to
>select the record (both date and flow) containing the maximum flow.  If an
>exceedance is isolated (the preceding and succeeding days are below the
>threshold) I also wish to select that record.
>
>For example from the data set
>
>Day     Flow
>
>1       10
>4       13
>5       20
>6       15
>9       13
>
>I would like the 1st, 3rd and 5th records selected.
>
>Any ideas on how I would write such an algorithm would be appreciated.

Dear Matt,

Here's a function that does what you want with loops. Perhaps someone else 
will produce a more elegant solution:

     > select.rows <- function(data) {
     +     indices <- data[,1]
     +     values <- data[,2]
     +     n <- length(indices)
     +     if (n == 0) stop('no data')
     +     if (n == 1) return(data)
     +     selection <- rep(0, n)  # so as not to grow the selection vector
     +     current <- 1
     +     number <- 1
     +     for (i in 2:n){
     +         if (indices[i] == 1 + indices[i - 1]){
     +             if (values[i] > values[current]) current <- i
     +             }
     +         else {
     +             selection[number] <- current
     +             number <- number + 1
     +             current <- i
     +             }
     +         }
     +     selection[number] <- current
     +     data[selection,]
     +     }
     >
     > data <- matrix(c(1,4,5,6,9, 10,13,20,15,13), 5, 2)
     > colnames(data) <- c('Day', 'Flow')
     > select.rows(data)
         Day Flow
     [1,]   1   10
     [2,]   5   20
     [3,]   9   13


I hope that this isn't too inefficient.
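
For comparison, here is a loop-free sketch of the same selection. It assumes, as above, that column 1 holds sorted days and column 2 the flows. The function name `select_peaks` and the variable `flows` are hypothetical and are not part of the function shown above.

```r
# Hypothetical vectorized variant: keep the row with the largest flow
# in each run of consecutive days.
select_peaks <- function(data) {
  day  <- data[, 1]
  flow <- data[, 2]
  if (length(day) == 0) return(data[0, , drop = FALSE])

  # Start a new run whenever a day does not directly follow the previous one
  gr <- cumsum(c(TRUE, diff(day) != 1))

  # Row numbers belonging to each run
  rows <- split(seq_along(day), gr)

  # Within each run, the row number of the first maximum flow
  keep <- vapply(rows, function(r) r[which.max(flow[r])], integer(1))

  data[unname(keep), , drop = FALSE]
}

flows <- matrix(c(1, 4, 5, 6, 9, 10, 13, 20, 15, 13), 5, 2)
colnames(flows) <- c("Day", "Flow")
print(select_peaks(flows))
```

Like the loop, this keeps the first day of a run when the maximum is tied. `drop = FALSE` keeps the result a matrix even when only one row is selected.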

John
-----------------------------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario, Canada L8S 4M4
email: jfox at mcmaster.ca
phone: 905-525-9140x23604
web: socsci.mcmaster.ca/jfox
-----------------------------------------------------

