Dear R users,

I have this data frame,

            y samp
8  0.03060419    X
18 0.06120838    Y
10 0.23588374    X
3  0.32809965    X
1  0.36007100    X
7  0.36730571    X
20 0.47176748    Y
13 0.65619929    Y
11 0.72014201    Y
17 0.73461142    Y
6  0.76221313    X
2  0.77005691    X
4  0.92477243    X
9  0.93837591    X
5  0.98883581    X
16 1.52442626    Y
12 1.54011381    Y
14 1.84954487    Y
19 1.87675183    Y
15 1.97767162    Y

and I am trying to find the number of X's that occur before ith Y occurs.
For example, there is 1 X before the first Y, so I get 1. There are 4 X's
before the second Y, so I get 4, there is no X between second and third Y,
so I get 0 and so on. Any hint to at least help me to start this will be
appreciated. Thanks a lot!
Hi,

On Wed, Nov 2, 2011 at 12:54 PM, Sl K <s.karmv at gmail.com> wrote:
> Dear R users,
>
> I have this data frame,
> [data frame snipped]
>
> and I am trying to find the number of X's that occur before ith Y occurs.
> For example, there is 1 X before the first Y, so I get 1. There are 4 X's
> before the second Y, so I get 4, there is no X between second and third Y,
> so I get 0 and so on. Any hint to at least help me to start this will be
> appreciated. Thanks a lot!

Using dput() to provide reproducible data would be nice, but failing
that here's a simple example with sample data:

> testdata <- c("x", "y", "x", "x", "x", "y", "x", "x", "x", "x", "x", "y", "y")
> rle(testdata)
Run Length Encoding
  lengths: int [1:6] 1 1 3 1 5 2
  values : chr [1:6] "x" "y" "x" "y" "x" "y"

You can use the values component of the list returned by rle to subset
the lengths component of the list to get only the x values if that's
what you need to end up with.

> rle(testdata)$lengths[rle(testdata)$values == "x"]
[1] 1 3 5

--
Sarah Goslee
http://www.functionaldiversity.org
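A minimal sketch of how the rle() idea could be carried over to the sequence from the original post so that every Y gets a count, including a 0 for a Y that directly follows another Y. It assumes samp contains only "X" and "Y", so the run just before a run of Y's is always a run of X's. The names r, is_y, prev_len and counts are illustrative and do not come from the thread.

```r
# The samp column from the original post, typed in by hand
samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
          "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")

r <- rle(samp)
is_y <- r$values == "Y"

# Length of the run just before each run (0 for the very first run).
# With only two symbols, the run before a Y run is always an X run.
prev_len <- c(0, head(r$lengths, -1))

# The first Y of each Y run gets the preceding X count; later Y's get 0
counts <- unlist(Map(function(n_x, n_y) c(n_x, rep(0, n_y - 1)),
                     prev_len[is_y], r$lengths[is_y]))
counts
# [1] 1 4 0 0 0 5 0 0 0 0
```

If the sequence happened to start with a Y, prev_len would be 0 for that run, so the first count would be 0.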
Is the following what you want? It should give the number of "X"s
immediately preceding each "Y" after the first one (diff() drops the
count that belongs to the first "Y").

> samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
+           "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")
> diff((seq_along(samp) - cumsum(samp == "Y"))[samp == "Y"])
[1] 4 0 0 0 5 0 0 0 0

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of Sl K
> Sent: Wednesday, November 02, 2011 9:55 AM
> To: r-help at r-project.org
> Subject: [R] how to count number of occurrences
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
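To get a count for the first Y as well, one option is to pad the running X count with a leading 0 before differencing. This is only a sketch of a variation on the one-liner above, not something posted in the thread; x_before and per_y are illustrative names.

```r
samp <- c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
          "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y")

# Running number of X's, read off at each Y position
x_before <- cumsum(samp == "X")[samp == "Y"]

# Differencing against a leading 0 keeps the count before the first Y
per_y <- diff(c(0, x_before))
per_y
# [1] 1 4 0 0 0 5 0 0 0 0
```

Here cumsum(samp == "X") is the same quantity as seq_along(samp) - cumsum(samp == "Y"), since every element is either an X or a Y.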
On Nov 2, 2011, at 12:54 PM, Sl K wrote:

> Dear R users,
>
> I have this data frame,
> [data frame snipped]

dat$nXs <- cumsum(dat$samp == "X")   # running count of X's
dat$nYs <- cumsum(dat$samp == "Y")   # running count of Y's
dat
#           y samp nXs nYs
8  0.03060419    X   1   0
18 0.06120838    Y   1   1
10 0.23588374    X   2   1
3  0.32809965    X   3   1
1  0.36007100    X   4   1
7  0.36730571    X   5   1
20 0.47176748    Y   5   2
13 0.65619929    Y   5   3
11 0.72014201    Y   5   4
17 0.73461142    Y   5   5
6  0.76221313    X   6   5
2  0.77005691    X   7   5
4  0.92477243    X   8   5
9  0.93837591    X   9   5
5  0.98883581    X  10   5
16 1.52442626    Y  10   6
12 1.54011381    Y  10   7
14 1.84954487    Y  10   8
19 1.87675183    Y  10   9
15 1.97767162    Y  10  10

Counting cumulatively, I find that there are 5 X's before the second Y
(the 1 before the first Y plus the 4 between the first and second Y).

# which(...)[1] picks the row of the mth Y itself; without the [1] the
# X rows that follow the mth Y would be returned as well.
> nXbefore_mthY <- function(m) dat[which(dat$nYs == m)[1], "nXs"]
> nXbefore_mthY(2)
[1] 5

> and I am trying to find the number of X's that occur before ith Y
> occurs.
> For example, there is 1 X before the first Y, so I get 1. There are
> 4 X's before the second Y, so I get 4, there is no X between second
> and third Y, so I get 0 and so on. Any hint to at least help me to
> start this will be appreciated. Thanks a lot!

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
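The running Y count can also be turned into the per-Y numbers the original post asks for (1, 4, 0, ...). The sketch below is an assumption-laden illustration rather than anything posted in the thread: dat is a hand-typed reconstruction of the posted data frame with the original row names omitted, and next_y and per_y are made-up names. Each X is assigned to the next Y after it, whose index is nYs + 1, and tabulate() counts how many X's land on each Y.

```r
# Hand-typed reconstruction of the posted data (row names omitted)
dat <- data.frame(
  y = c(0.03060419, 0.06120838, 0.23588374, 0.32809965, 0.36007100,
        0.36730571, 0.47176748, 0.65619929, 0.72014201, 0.73461142,
        0.76221313, 0.77005691, 0.92477243, 0.93837591, 0.98883581,
        1.52442626, 1.54011381, 1.84954487, 1.87675183, 1.97767162),
  samp = c("X", "Y", "X", "X", "X", "X", "Y", "Y", "Y", "Y",
           "X", "X", "X", "X", "X", "Y", "Y", "Y", "Y", "Y"),
  stringsAsFactors = FALSE
)

dat$nYs <- cumsum(dat$samp == "Y")   # running count of Y's

# Index of the next Y for every X row
next_y <- dat$nYs[dat$samp == "X"] + 1

# One bin per Y; a Y with no X directly before it gets 0
per_y <- tabulate(next_y, nbins = sum(dat$samp == "Y"))
per_y
# [1] 1 4 0 0 0 5 0 0 0 0
```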