AC Del Re
2011-Sep-25  20:22 UTC
[R] selecting first row of a variable with long-format data
Hi,
I am trying to select the first row of a variable with data in long-format, e.g.,
# sample data
id <- c(1,1,1,2,2)
value <- c(5,6,7,4,5)
dat <- data.frame(id, value)
dat
How can I select/subset the first 'value' for each unique 'id'?
Thanks,
AC
R. Michael Weylandt
2011-Sep-25  20:48 UTC
[R] selecting first row of a variable with long-format data
dat <- data.frame(id = c(1,1,1,2,2), value = c(5,6,7,4,5), value2 = c(1,4,3,3,4))
with(dat, dat[!duplicated(id), ])
Michael
On Sun, Sep 25, 2011 at 4:43 PM, AC Del Re <acdelre@gmail.com> wrote:
> Great, but how can I then retain the other variable (or variables, if >1)
> associated with those retained values?
> Thank you.
>
> On Sun, Sep 25, 2011 at 1:27 PM, R. Michael Weylandt <michael.weylandt@gmail.com> wrote:
>
>> How about something like
>>
>> with(dat, value[!duplicated(id)])
>>
>> ?
>>
>> Michael Weylandt
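For anyone wondering why the second version keeps the extra columns: the logical vector from !duplicated() is used as a row index on the whole data frame rather than on a single column. A minimal sketch with the same data (the names keep and first_rows are only illustrative, not from the thread):

```r
# Sample data from the thread, including the extra column value2
dat <- data.frame(id = c(1, 1, 1, 2, 2),
                  value = c(5, 6, 7, 4, 5),
                  value2 = c(1, 4, 3, 3, 4))

# duplicated() flags every repeat of an id after its first appearance,
# so negating it marks the first row of each id
keep <- !duplicated(dat$id)

# Using the logical vector as a row index keeps every column
first_rows <- dat[keep, ]
print(first_rows)
```

This relies on the rows already being in the order you care about; if they are not, sort the data frame first.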
R. Michael Weylandt
2011-Sep-25  22:24 UTC
[R] selecting first row of a variable with long-format data
I think that's in general a trickier question. If you are absolutely certain your ids are in blocks and that there are at least k rows for each id, it would perhaps work to adjust the indices manually, something like:
# UNTESTED
with(dat, dat[which(!duplicated(id)) + k - 1, ])
# where k is the whatever-eth element you want to look at
If you aren't guaranteed a block structure, you might have to use some sort of "rinse and repeat" code (i.e., get the first occurrences, throw them out, get the new first occurrences = the old 2nd occurrences, etc.). If you aren't certain that you have at least k occurrences, you'll want to handle that as well. There might be a smarter way, but I'm not seeing one offhand.
One nice solution if you just want the last row is to use the optional fromLast = TRUE argument of duplicated().
Michael Weylandt
On Sun, Sep 25, 2011 at 6:08 PM, AC Del Re <acdelre@gmail.com> wrote:
> Perfect. Thank you, Michael. Out of curiosity, how would the code differ if
> you were interested in selecting the second row (or third, etc.) per id?
>
> Thanks again for your help.
>
> AC
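One possible alternative to the "rinse and repeat" idea, offered only as a sketch and not as the approach suggested in the thread: number the rows within each id using ave(), then pick the k-th. This does not assume the ids come in blocks, and ids with fewer than k rows simply drop out. The data below are made up so that the ids are interleaved, and the names occurrence, kth_rows and last_rows are illustrative.

```r
# Made-up data with interleaved ids (no block structure)
dat <- data.frame(id = c(1, 2, 1, 2, 1),
                  value = c(5, 4, 6, 5, 7))

# Number the rows within each id in order of appearance
occurrence <- ave(seq_along(dat$id), dat$id, FUN = seq_along)

# Keep the k-th row of each id; ids with fewer than k rows return nothing
k <- 2
kth_rows <- dat[occurrence == k, ]
print(kth_rows)

# Last row of each id, using the fromLast argument mentioned above
last_rows <- dat[!duplicated(dat$id, fromLast = TRUE), ]
print(last_rows)
```

Setting k <- 1 gives the same rows as the !duplicated(id) approach.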
Dennis Murphy
2011-Sep-26  02:10 UTC
[R] selecting first row of a variable with long-format data
Hi:
The head() function is helpful here:
(i) plyr::ddply():
library('plyr')
ddply(dat, .(id), function(d) head(d, 1))
  id value
1  1     5
2  2     4
(ii) aggregate():
aggregate(value ~ id, data = dat, FUN = function(x) head(x, 1))
  id value
1  1     5
2  2     4
The formula interface to aggregate() requires R 2.11.0 or later.
Dennis
On Sun, Sep 25, 2011 at 1:22 PM, AC Del Re <delre at wisc.edu> wrote:
> Hi,
>
> I am trying to select the first row of a variable with data in long-format,
> e.g.,
>
> # sample data
> id <- c(1,1,1,2,2)
> value <- c(5,6,7,4,5)
> dat <- data.frame(id, value)
> dat
>
> How can I select/subset the first 'value' for each unique 'id'?
>
> Thanks,
>
> AC
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
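Regarding the earlier follow-up about keeping the other variables: the formula value ~ id only returns value. A hedged sketch of one way to carry the other columns through aggregate() is to put a dot on the left-hand side, which takes the first element of each remaining column separately. The value2 column is the one from Michael's example; the name first_all is illustrative.

```r
# Sample data from the thread, including the extra column value2
dat <- data.frame(id = c(1, 1, 1, 2, 2),
                  value = c(5, 6, 7, 4, 5),
                  value2 = c(1, 4, 3, 3, 4))

# The dot stands for every column not named on the right-hand side,
# so head(x, 1) is applied to value and value2 within each id
first_all <- aggregate(. ~ id, data = dat, FUN = function(x) head(x, 1))
print(first_all)
```

Note that the formula interface drops rows containing NA before aggregating by default, so with missing data the result may differ from the row-indexing approach.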
AC Del Re
2011-Sep-26  03:03 UTC
[R] selecting first row of a variable with long-format data
Great. Then, if one is interested in selecting the second row of a variable (from a unique id), something like this should work:
aggregate(value ~ id, data = dat, FUN = function(x) head(x, 2)[2])
Thanks, Michael and Dennis!
AC
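A small sketch of how this behaves when an id has only one row, which ties back to Michael's caution about having at least k occurrences. The id 3 row is made up for illustration, as are the names second and second_complete.

```r
# Thread data plus a made-up id (3) that has only one row
dat <- data.frame(id = c(1, 1, 1, 2, 2, 3),
                  value = c(5, 6, 7, 4, 5, 9))

# head(x, 2)[2] is NA when a group has fewer than two values
second <- aggregate(value ~ id, data = dat, FUN = function(x) head(x, 2)[2])
print(second)

# Drop the ids that have no second row
second_complete <- second[!is.na(second$value), ]
print(second_complete)
```

Whether to keep the NA rows or drop them depends on what the downstream analysis expects.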