Hi everyone, 
I am trying to generate a conditional dummy variable "X" with the following
rule:
 set X=1 if Y=1 in either of the two years prior to an NA, i.e. within a
 three-year pattern such as [0,0,NA].
For example, if the pattern for Y is 0,0,NA, then X=0 for both of the two
years prior to the NA. If the pattern for Y is 0,1,NA or 1,0,NA, then X=1 in
the year where Y=1. To be clear, if the pattern is 1,1,NA, then X=1 only in
the first of those two years; it should count once (X=1), not twice.
The code that I have now is not complete, and I would appreciate some advice.
This is the code:
dat2 <- dat1 %>% 
  group_by(country) %>% 
  group_by(grp = cumsum(is.na(lag(Y))), add = TRUE) %>% 
  mutate(first_year_at_1 = match(1, Y) * any(is.na(Y)) * any(tail(Y, 3) == 1L), 
         X = {x <- integer(length(Y)) ; x[first_year_at_1] <- 1L ; x}) %>%
  ungroup()
It doesn't really generate what I described above. Any help here would be much
appreciated.
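To make the grouping step in the code above easier to follow, here is a minimal sketch of what cumsum(is.na(lag(Y))) produces. It uses a hypothetical six-value Y series and a base R stand-in for dplyr::lag(), so the names Y_lag, grp and first_one are illustrative only.

```r
# Toy Y series (hypothetical): two runs, each ending in an NA.
Y <- c(0L, 1L, NA, 1L, 1L, NA)

# Base R stand-in for dplyr::lag(Y): shift by one position, padding with NA.
Y_lag <- c(NA, head(Y, -1))

# A new group starts on the first row and on every row that follows an NA.
grp <- cumsum(is.na(Y_lag))
grp
# [1] 1 1 1 2 2 2

# Position of the first 1 within each group, as match(1, Y) sees it per group.
first_one <- tapply(Y, grp, function(y) match(1L, y))
```

In this toy series each group ends with its NA, and match() picks the first Y=1 inside the group no matter how far it is from that NA. That may be one reason the result differs from the two-year rule described above.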
Below you can see my sample data with the desired outcome "X" dummy in it.
Thank you! 
> dput(data)
structure(list(year = c(1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 
1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 
2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 
1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 
2002L, 2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 
2011L, 1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 
1998L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 
2007L, 2008L, 2009L, 2010L, 2011L, 1990L, 1991L, 1992L, 1993L, 
1994L, 1995L, 1996L, 1997L, 1998L, 1999L, 2000L, 2001L, 2002L, 
2003L, 2004L, 2005L, 2006L, 2007L, 2008L, 2009L, 2010L, 2011L, 
1990L, 1991L, 1992L, 1993L, 1994L, 1995L, 1996L, 1997L, 1998L, 
1999L, 1999L, 2000L, 2001L, 2002L, 2003L, 2004L, 2005L, 2006L, 
2007L, 2008L, 2009L, 2010L, 2011L), country = structure(c(1L, 
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L), .Label =
c("Canada",
"Cuba", "Dominican Republic", "Haiti",
"Jamaica"), class = "factor"),
    Y = c(1L, NA, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L, 
    1L, NA, 1L, NA, 1L, NA, 1L, NA, NA, 1L, 1L, NA, NA, 1L, NA, 
    1L, NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 
    NA, 1L, NA, 1L, 0L, 0L, 0L, 1L, NA, 0L, 1L, 0L, 0L, 0L, 0L, 
    0L, 1L, NA, 0L, 1L, 1L, NA, 0L, 1L, NA, 1L, NA, 1L, NA, 1L, 
    NA, 1L, NA, 1L, 1L, 1L, 1L, NA, 1L, NA, 1L, NA, 1L, NA, 1L, 
    0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 1L, 1L, NA, 0L, 1L, 1L, 1L, 
    NA, 1L, NA, 0L, 1L, 1L, NA), X = c(1L, 0L, 0L, 1L, 0L, 0L, 
    1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 
    0L, 1L, 0L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 
    0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 1L, 
    0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 0L, 
    1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 1L, 0L, 0L, 
    1L, 0L, 1L, 0L, 1L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 
    1L, 0L, 0L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 0L)), .Names =
c("year",
"country", "Y", "X"), class =
"data.frame", row.names = c(NA,
-110L))
I was not able to decipher your meaning, and your failure to garner a response
so far may mean that others have had similar difficulties. You might therefore
try providing a **small reproducible** example of X's and Y's that shows what
you have and what you want to end up with, to clarify your meaning. Or continue
to wait for someone smarter to respond.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)
On Mon, May 28, 2018 at 3:43 AM, Faradj Koliev <faradj.g at gmail.com> wrote:
> Hi everyone,
>
> I am trying to generate a conditional dummy variable "X" with the
> following rules
> [...]
Hi Faradj,
What a problem! I think I have worked it out, but only because the
result is the one you said you wanted.
# the sample data frame is named fkdf
Y2Xby3<-function(x) {
 nrows<-dim(x)[1]
 X<-rep(0,nrows)
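 # seq_len() gives an empty loop when a country has fewer than 3 rows;
 # 1:(nrows-2) would count down to 0 and fail on x$Y[0]
 for(i in seq_len(max(nrows-2,0))) {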
  if(!is.na(x$Y[i])) {
   if(x$Y[i] == 1 && any(is.na(x$Y[(i+1):(i+2)]))) X[i]<-1
   if(i > 1) {
    if(X[i-1] == 1) X[i]<-0
   }
  }
  else {
   if(!is.na(x$Y[i+1])) {
    if(x$Y[i+1] == 1 && is.na(x$Y[i+2]) && X[i] == 0)
     X[i+1]<-1
   }
  }
 }
 return(X)
}
countries<-as.character(unique(fkdf$country))
X1<-NULL
for(country in countries)
 X1<-c(X1,Y2Xby3(fkdf[fkdf$country == country,]))
X1
  [1] 1 0 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0
 [38] 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 0
 [75] 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0
> fkdf$X
  [1] 1 0 0 1 0 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 0 1 0 0 0 1 0 1 0 1 0 0 0 1 0 0
 [38] 1 0 1 0 1 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 1 0 1 0 1 0 1 0
 [75] 1 0 0 0 1 0 0 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 0 1 0 0
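Rather than comparing the two printouts by eye, a check along the following lines might be used. This is only a sketch: it copies the first 21 values (Canada) from the output above into two hypothetical vectors, X1_canada and X_canada, so that it runs on its own. One thing to watch is that the function builds X with rep(0, nrows), which is double, while the X column in the dput() output is integer.

```r
# First 21 values (Canada), copied from the printed X1 and fkdf$X above.
# X1 is built from rep(0, nrows), so it is double; the dput() X is integer.
X1_canada <- c(1, 0, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0, 0, 1, 0, 1, 0, 1, 0)
X_canada  <- c(1L, 0L, 0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L,
               0L, 1L, 0L, 0L, 1L, 0L, 1L, 0L, 1L, 0L)

# Value-wise comparison: TRUE when every position agrees.
same_values <- all(X1_canada == X_canada)

# identical() also compares storage type, so it is FALSE here
# even though the values match.
same_object <- identical(X1_canada, X_canada)

# Row positions that disagree (empty when the vectors match).
mismatches <- which(X1_canada != X_canada)
```

On the full data the same idea would be all(X1 == fkdf$X), with which(X1 != fkdf$X) pointing to any rows that need a closer look.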
Jim
Dear Jim,
Wow! It worked! Thanks a lot.
I did as you suggested and it worked well with the real data, although it gave
me this error: Error in if (!is.na(x$Y[i])) { : argument is of length zero. For
some reason, X1 ended up with fewer observations than there are in the data.
But it's not a big deal: I identified the cases responsible and simply deleted
them from the data (they were countries that appear only twice in the data,
e.g. USSR, Yugoslavia, etc.).
Best, 
Faradj 
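The error reported above is what a two-row country would produce with the posted loop. With nrows = 2, 1:(nrows-2) is 1:0, i.e. c(1, 0), so the loop reaches i = 0 and x$Y[0] has length zero. A possible way to keep such countries instead of deleting them is sketched below, under two assumptions that are not from the thread: the loop bound is changed to seq_len(), and the per-country results are reassembled with split() and unsplit(). The data frame toy and its values are hypothetical.

```r
# Jim's function from the thread, with one assumed change: seq_len() replaces
# 1:(nrows-2), so the loop is skipped when a country has fewer than 3 rows.
Y2Xby3 <- function(x) {
  nrows <- nrow(x)
  X <- rep(0, nrows)
  for (i in seq_len(max(nrows - 2, 0))) {
    if (!is.na(x$Y[i])) {
      if (x$Y[i] == 1 && any(is.na(x$Y[(i + 1):(i + 2)]))) X[i] <- 1
      if (i > 1 && X[i - 1] == 1) X[i] <- 0
    } else if (!is.na(x$Y[i + 1])) {
      if (x$Y[i + 1] == 1 && is.na(x$Y[i + 2]) && X[i] == 0) X[i + 1] <- 1
    }
  }
  X
}

# Hypothetical toy data: "A" has only two rows, "B" has five.
toy <- data.frame(
  country = c("A", "A", "B", "B", "B", "B", "B"),
  Y       = c(1L, NA, 1L, 1L, NA, 0L, 0L),
  stringsAsFactors = FALSE
)

# Apply the function per country, then restore the original row order.
per_country <- lapply(split(toy, toy$country), Y2Xby3)
toy$X1 <- unsplit(per_country, toy$country)
toy$X1
# [1] 0 0 1 0 0 0 0
```

In this sketch a country with fewer than three rows simply gets X = 0 in every year. Whether that is the right value for such countries is a judgment call, and dropping them, as was done above, remains a reasonable alternative.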
> On 29 May 2018, at 02:15, Jim Lemon <drjimlemon at gmail.com> wrote:
>
> Hi Faradj,
> What a problem! I think I have worked it out, but only because the
> result is the one you said you wanted.
> [...]