Dear All,
Consider a simple example

a<-c(1,4,3,0,4,5,6,9,3,4)
b<-c(0,4,5)
c<-c(5,4,0)

I would like to be able to tell whether a sequence is contained (the order of the elements does matter) in another one e.g. in the example above, b is a subsequence of a, whereas c is not. Since the order matters, I cannot treat the sequences above as sets (also, elements are repeated).
Does anyone know a smart way of achieving that?
Many thanks

Lorenzo
Convert to strings and use grep functions. Using c as a variable name is a bad idea (it is also the name of a base function), so the third vector is called d here.

d <- c(5,4,0)  # the original poster's c, renamed
a <- paste(a, collapse="")
b <- paste(b, collapse="")
d <- paste(d, collapse="")
grepl(b,a)
grepl(d,a)

Nikhil Kaza
Asst. Professor,
City and Regional Planning
University of North Carolina
nikhil.list at gmail.com

On Sep 21, 2010, at 6:31 AM, Lorenzo Isella wrote:
> a<-c(1,4,3,0,4,5,6,9,3,4)
> b<-c(0,4,5)
> c<-c(5,4,0)
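One caveat with an empty separator, shown as a small sketch: once the elements are glued together with nothing between them, the element boundaries are lost, so vectors with multi-digit values can give a false match. The vectors x and y below are made up purely for illustration.

```r
# Hypothetical vectors, chosen only to show the ambiguity.
x <- c(1, 23, 4)
y <- c(12, 3)

sx <- paste(x, collapse = "")   # "1234"
sy <- paste(y, collapse = "")   # "123"

# y is not a contiguous run of x, yet the glued strings match.
no_sep <- grepl(sy, sx, fixed = TRUE)

# With a separator the element boundaries survive, so this case no longer matches.
with_sep <- grepl(paste(y, collapse = "#"), paste(x, collapse = "#"), fixed = TRUE)

print(c(no_sep = no_sep, with_sep = with_sep))
```

For single-digit data such as the example in the question, the empty separator gives the expected answers.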
This function might be helpful:
bleh <- function(a, b) {
  where <- list()
  matches <- 0
  first <- which(a == b[1])
  for (i in first) {
    seq.to.match <- seq(i, length = length(b))
    if (identical(a[seq.to.match], b)) {
      matches <- matches + 1
      where[[matches]] <- seq.to.match
    }
  }
  return(where)
}
a<-c(3,4,3,0,4,5,6,9,3,4)
b<-c(0,4,5)
c<-c(5,4,0)
d<-c(3,4)
bleh(a, b)
bleh(a, c)
bleh(a, d)
Cheers,
Gustavo.
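As a variation on the function above, here is a minimal sketch for the case where only a yes/no answer is needed. The name contains_run is hypothetical, and the sketch assumes neither vector contains NA. It compares with == rather than identical(), so an integer pattern such as c(0L, 4L, 5L) still matches a double vector, which identical() would reject.

```r
# Hypothetical helper: TRUE if `pattern` occurs as a contiguous run in `x`.
# Assumes no NA values in either vector.
contains_run <- function(x, pattern) {
  n <- length(pattern)
  if (n == 0L) return(TRUE)               # assumption: an empty pattern always matches
  if (n > length(x)) return(FALSE)
  starts <- seq_len(length(x) - n + 1L)   # only windows that fit inside x
  any(vapply(starts,
             function(i) all(x[i:(i + n - 1L)] == pattern),
             logical(1)))
}

a <- c(3, 4, 3, 0, 4, 5, 6, 9, 3, 4)
contains_run(a, c(0, 4, 5))  # TRUE
contains_run(a, c(5, 4, 0))  # FALSE
contains_run(a, c(3, 4))     # TRUE
```

Restricting the start positions to windows that fit inside x avoids indexing past the end of the vector.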
On Tue, Sep 21, 2010 at 11:31 AM, Lorenzo Isella
<lorenzo.isella at gmail.com> wrote:
> Dear All,
> Consider a simple example
>
> a<-c(1,4,3,0,4,5,6,9,3,4)
> b<-c(0,4,5)
> c<-c(5,4,0)
>
> I would like to be able to tell whether a sequence is contained (the order
> of the elements does matter) in another one e.g. in the example above, b is
> a subsequence of a, whereas c is not. Since the order matters, I cannot
> treat the sequences above as sets (also, elements are repeated).
> Does anyone know a smart way of achieving that?
> Many thanks
>
> Lorenzo
On Sep 21, 2010, at 6:31 AM, Lorenzo Isella wrote:
> Dear All,
> Consider a simple example
>
> a<-c(1,4,3,0,4,5,6,9,3,4)
> b<-c(0,4,5)
> c<-c(5,4,0)
>
> I would like to be able to tell whether a sequence is contained (the
> order of the elements does matter) in another one e.g. in the
> example above, b is a subsequence of a, whereas c is not. Since the
> order matters, I cannot treat the sequences above as sets (also,
> elements are repeated).
> Does anyone know a smart way of achieving that?

> grep(paste(c, collapse="#"), paste(a, collapse="#"))
integer(0)
> grep(paste(b, collapse="#"), paste(a, collapse="#"))
[1] 1

Looking at that output, I am wondering if you might also need to put markers at the ends of the arguments.

> grep(paste("#",b,"#", collapse="#"), paste("#",a,"#", collapse="#"))
[1] 1
# To prevent a match like c(1,2,3) with c(101,2,303).

There is also a Biostrings package in the Bioconductor repository that provides more extensive string matching facilities.

--
David Winsemius, MD
West Hartford, CT
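The end-marker idea can be wrapped in a small function, sketched here under a few assumptions: the name has_run is hypothetical, "#" is an arbitrary separator assumed never to appear in the printed form of the elements, and the values are assumed to print unambiguously (whole numbers, as in the example). Passing fixed = TRUE makes grepl compare the strings literally instead of interpreting the pattern as a regular expression.

```r
# Hypothetical helper wrapping the delimiter idea from the reply above.
# `sep` is assumed not to occur in the character form of any element.
has_run <- function(x, pattern, sep = "#") {
  # Put the separator between elements and at both ends: "#0#4#5#"
  wrap <- function(v) paste0(sep, paste(v, collapse = sep), sep)
  # fixed = TRUE: literal comparison, no regex interpretation
  grepl(wrap(pattern), wrap(x), fixed = TRUE)
}

a <- c(1, 4, 3, 0, 4, 5, 6, 9, 3, 4)
has_run(a, c(0, 4, 5))               # TRUE
has_run(a, c(5, 4, 0))               # FALSE
has_run(c(101, 2, 303), c(1, 2, 3))  # FALSE, thanks to the end markers
```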