Hi R users,

I have a test data frame ("file1", shown below) for which I am trying to create a flag for the first and last ID record (equivalent to the SAS first.id and last.id variables).

Dump of file1:

> file1
   id rx week dv1
1   1  1    1   1
2   1  1    2   1
3   1  1    3   2
4   2  1    1   3
5   2  1    2   4
6   2  1    3   1
7   3  1    1   2
8   3  1    2   3
9   3  1    3   4
10  4  1    1   2
11  4  1    2   6
12  4  1    3   5
13  5  2    1   7
14  5  2    2   8
15  5  2    3   5
16  6  2    1   2
17  6  2    2   4
18  6  2    3   6
19  7  2    1   7
20  7  2    2   8
21  8  2    1   9
22  9  2    1   4
23  9  2    2   5

I have written code that correctly assigns the first.id and last.id variables:

require(Hmisc) # for Lag()
# ascending order to define first dot
file1 <- file1[order(file1$id, file1$week), ]
file1$first.id <- (Lag(file1$id) != file1$id)
file1$first.id[1] <- TRUE # force NA to TRUE

# descending order to define last dot
file1 <- file1[order(-file1$id, -file1$week), ]
file1$last.id <- (Lag(file1$id) != file1$id)
file1$last.id[1] <- TRUE # force NA to TRUE

# resort to original order
file1 <- file1[order(file1$id, file1$week), ]

I am now trying to get the above code to work as a function, and am clearly doing something wrong:

> first.last <- function (df, idvar, sortvars1, sortvars2)
+ {
+ #sort in ascending order to define first dot
+ df<- df[order(sortvars1),]
+ df$first.idvar <- (Lag(df$idvar) != df$idvar)
+ #force first record NA to TRUE
+ df$first.idvar[1]<-TRUE
+
+ #sort in descending order to define last dot
+ df<- df[order(-sortvars2),]
+ df$last.idvar <- (Lag(df$idvar) != df$idvar)
+ #force last record NA to TRUE
+ df$last.idvar[1]<-TRUE
+
+ #resort to original order
+ df<- df[order(sortvars1),]
+ }
>

Function call:

> first.last(df=file1, idvar=file1$id, sortvars1=c(file1$id,file1$week), sortvars2=c(-file1$id,-file1$week))

R Error:

Error in as.vector(x, mode) : invalid argument 'mode'
>

I am not sure about the passing of the sort strings. Perhaps this is where things are off. Any help greatly appreciated.

Thanks,

Gerard

This function should do it for you:

> file1 <- read.table(textConnection(" id rx week dv1
+ 1 1 1 1 1
+ 2 1 1 2 1
+ 3 1 1 3 2
+ 4 2 1 1 3
+ 5 2 1 2 4
+ 6 2 1 3 1
+ 7 3 1 1 2
+ 8 3 1 2 3
+ 9 3 1 3 4
+ 10 4 1 1 2
+ 11 4 1 2 6
+ 12 4 1 3 5
+ 13 5 2 1 7
+ 14 5 2 2 8
+ 15 5 2 3 5
+ 16 6 2 1 2
+ 17 6 2 2 4
+ 18 6 2 3 6
+ 19 7 2 1 7
+ 20 7 2 2 8
+ 21 8 2 1 9
+ 22 9 2 1 4
+ 23 9 2 2 5"), header=TRUE)
>
> mark.function <-
+ function(df){
+     df <- df[order(df$id, df$week),]
+     # create 'diff' of 'id' to determine where the breaks are
+     breaks <- diff(df$id)
+     # the first entry will be TRUE, and then every occurrence of non-zero in breaks
+     df$first.id <- c(TRUE, breaks != 0)
+     # the last entry is TRUE and every non-zero breaks
+     df$last.id <- c(breaks != 0, TRUE)
+     df
+ }
>
> mark.function(file1)
   id rx week dv1 first.id last.id
1   1  1    1   1     TRUE   FALSE
2   1  1    2   1    FALSE   FALSE
3   1  1    3   2    FALSE    TRUE
4   2  1    1   3     TRUE   FALSE
5   2  1    2   4    FALSE   FALSE
6   2  1    3   1    FALSE    TRUE
7   3  1    1   2     TRUE   FALSE
8   3  1    2   3    FALSE   FALSE
9   3  1    3   4    FALSE    TRUE
10  4  1    1   2     TRUE   FALSE
11  4  1    2   6    FALSE   FALSE
12  4  1    3   5    FALSE    TRUE
13  5  2    1   7     TRUE   FALSE
14  5  2    2   8    FALSE   FALSE
15  5  2    3   5    FALSE    TRUE
16  6  2    1   2     TRUE   FALSE
17  6  2    2   4    FALSE   FALSE
18  6  2    3   6    FALSE    TRUE
19  7  2    1   7     TRUE   FALSE
20  7  2    2   8    FALSE    TRUE
21  8  2    1   9     TRUE    TRUE
22  9  2    1   4     TRUE   FALSE
23  9  2    2   5    FALSE    TRUE
>

On 9/7/07, Gerard Smits <g_smits at verizon.net> wrote:
> Hi R users,
>
> I have a test dataframe ("file1," shown below) for which I am trying
> to create a flag for the first and last ID record (equivalent to SAS
> first.id and last.id variables.
>
> Thanks,
>
> Gerard
>
> ______________________________________________
> R-help at stat.math.ethz.ch mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?
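
On the original question of passing the id and sort variables into a function: in first.last(), df$idvar looks for a column literally named "idvar" rather than the column the argument refers to, and c(file1$id, file1$week) concatenates two columns into one long vector instead of giving order() two separate sort keys. The following is only a sketch of one possible generalization, assuming the column names are passed as character strings. The name first_last and its arguments are hypothetical and do not come from the thread.

```r
# Hypothetical helper (not from the thread): flag the first and last row
# of each id, with the column names passed as character strings.
# Assumes df has at least one row and that sortvars starts with idvar.
first_last <- function(df, idvar, sortvars = idvar) {
  # do.call() hands each sort column to order() as a separate key;
  # unname() keeps column names from being read as order() arguments.
  df <- df[do.call(order, unname(as.list(df[sortvars]))), , drop = FALSE]
  # df[[idvar]] looks the column up by the string stored in idvar
  id <- df[[idvar]]
  n <- length(id)
  # TRUE wherever the id changes between consecutive rows
  changed <- id[-1] != id[-n]
  df[[paste0("first.", idvar)]] <- c(TRUE, changed)
  df[[paste0("last.", idvar)]] <- c(changed, TRUE)
  df
}

# Rows for ids 7 to 9 of file1, deliberately shuffled
demo <- data.frame(id = c(9, 7, 8, 7, 9), rx = 2,
                   week = c(2, 1, 1, 2, 1), dv1 = c(5, 7, 9, 8, 4))
res <- first_last(demo, "id", c("id", "week"))
res$first.id  # TRUE FALSE TRUE TRUE FALSE
res$last.id   # FALSE TRUE TRUE FALSE TRUE
```

The flag columns are named after idvar (here first.id and last.id), which mirrors the SAS naming the original post asks for.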

A slightly easier way to construct first and last, if the vector x is sorted (as is assumed in SAS), is:

  first <- !duplicated(x)
  last <- !duplicated(x, fromLast = TRUE)

where the fromLast= argument was added in R 2.6.0.
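
As a small illustration of the duplicated() approach, the sketch below uses the ids from the last five rows of file1 (ids 7, 8 and 9). The vector y is an assumed toy example added to show why the sortedness condition matters: duplicated() compares against every earlier value, while a diff()-based test only compares neighbouring rows.

```r
# ids for rows 19-23 of file1, already sorted
x <- c(7, 7, 8, 9, 9)
first <- !duplicated(x)                  # TRUE FALSE TRUE TRUE FALSE
last  <- !duplicated(x, fromLast = TRUE) # FALSE TRUE TRUE FALSE TRUE

# Toy unsorted vector (assumed, not from the thread): id 1 reappears
y <- c(1, 2, 1)
first_dup  <- !duplicated(y)        # TRUE TRUE FALSE: third row is a repeat
first_diff <- c(TRUE, diff(y) != 0) # TRUE TRUE TRUE: every row starts a run
```

On sorted input the two approaches agree, so the choice mainly matters when the data may contain the same id in non-adjacent rows.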