thr3ads.net - R help - [R] selecting first row of a variable with long-format data [Sep 2011]


AC Del Re

2011-Sep-25 20:22 UTC

[R] selecting first row of a variable with long-format data

Hi,

I am trying to select the first row of a variable with data in long-format,
e.g.,

# sample data
id <- c(1,1,1,2,2)
value <- c(5,6,7,4,5)
dat <- data.frame(id, value)
dat

How can I select/subset the first 'value'  for each unique 'id'?

Thanks,

AC


R. Michael Weylandt

2011-Sep-25 20:48 UTC

[R] selecting first row of a variable with long-format data

dat <- data.frame(id = c(1,1,1,2,2), value = c(5,6,7,4,5),
                  value2 = c(1,4,3,3,4))
with(dat, dat[!duplicated(id),])

Michael
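
To make the indexing above easier to follow, here is a small sketch of what duplicated() returns and how negating it keeps whole rows. The intermediate names dup and first_rows are illustrative only and are not from the thread.

```r
# Sample data with a second value column, as in the reply above
dat <- data.frame(id = c(1, 1, 1, 2, 2),
                  value = c(5, 6, 7, 4, 5),
                  value2 = c(1, 4, 3, 3, 4))

# duplicated() is TRUE for every repeat of an id after its first appearance
dup <- duplicated(dat$id)

# Negating it keeps the first row of each id, with all columns intact
first_rows <- dat[!dup, ]
print(first_rows)
```

Because the logical vector indexes rows of the data frame rather than a single column, every other variable comes along without extra work.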

On Sun, Sep 25, 2011 at 4:43 PM, AC Del Re <acdelre@gmail.com> wrote:
> Great but how can I then retain the other variable (or variables if >1)
> value associated with those retained values?
> thank you.
>
> On Sun, Sep 25, 2011 at 1:27 PM, R. Michael Weylandt <
> michael.weylandt@gmail.com> wrote:
>
>> How about something like
>>
>> with(dat, value[!duplicated(id)])
>>
>> ?
>>
>> Michael Weylandt

R. Michael Weylandt

2011-Sep-25 22:24 UTC

[R] selecting first row of a variable with long-format data

I think that's in general a trickier question:

If you are absolutely certain your ids are in contiguous blocks and that each
id has at least k rows, it would perhaps work to adjust the indices manually,
something like

# UNTESTED
# k is the position within each id that you want (k = 2 for the second row, etc.)
with(dat, dat[which(!duplicated(id)) + k - 1, ])

If you aren't guaranteed a block structure, you might have to use some sort
of "rinse and repeat" code (i.e., get the first occurrences, throw them out,
get the new first occurrences = the old 2nd occurrences, etc.). If you aren't
certain that you have at least k occurrences, you'll want to handle that as
well. There might be a smarter way, but I'm not seeing one offhand.

One nice solution if you just want the last is to use the optional
fromLast = TRUE argument of duplicated(), i.e. duplicated(id, fromLast = TRUE).

Michael Weylandt
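
For ids that are not stored in contiguous blocks, one possible approach (not proposed in the thread, so treat it as a sketch) is to number the rows within each id using base R's ave() and then pick the k-th. The names occurrence, kth_rows and last_rows are illustrative. Ids with fewer than k rows simply drop out of the result.

```r
# Ids deliberately interleaved, so the "block" assumption does not hold
dat <- data.frame(id = c(1, 2, 1, 2, 1),
                  value = c(5, 4, 6, 5, 7))
k <- 2

# ave() applies seq_along() within each id, giving a running count per id
occurrence <- ave(seq_along(dat$id), dat$id, FUN = seq_along)

# Rows that are the k-th occurrence of their id
kth_rows <- dat[occurrence == k, ]
print(kth_rows)

# Last row per id, using the fromLast argument of duplicated()
last_rows <- dat[!duplicated(dat$id, fromLast = TRUE), ]
print(last_rows)
```

The running count avoids the repeated "get the first, throw it out" passes, at the cost of one extra integer vector the length of the data.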


On Sun, Sep 25, 2011 at 6:08 PM, AC Del Re <acdelre@gmail.com> wrote:
> perfect. Thank you, Michael. Out of curiosity, how would the code differ if
> you were interested in selecting the second row (or third, etc) per id?
>
> Thanks again for your help.
>
> AC

Dennis Murphy

2011-Sep-26 02:10 UTC

[R] selecting first row of a variable with long-format data

Hi:

The head() function is helpful here:

(i) plyr::ddply()

library('plyr')
ddply(dat, .(id), function(d) head(d, 1))
  id value
1  1     5
2  2     4

(ii) aggregate():
aggregate(value ~ id, data = dat, FUN = function(x) head(x, 1))
  id value
1  1     5
2  2     4

The formula method of aggregate() requires R 2.11.0 or later.

Dennis
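
Note that the formula value ~ id returns only the id and value columns. If the data has further columns that should be retained (as asked earlier in the thread), two base R variants that seem to work are sketched below. The value2 column is taken from the earlier reply, and the names first_rows and agg are illustrative.

```r
dat <- data.frame(id = c(1, 1, 1, 2, 2),
                  value = c(5, 6, 7, 4, 5),
                  value2 = c(1, 4, 3, 3, 4))

# Split into one data frame per id, take the first row of each, and re-stack
first_rows <- do.call(rbind, lapply(split(dat, dat$id), head, n = 1))
print(first_rows)

# The "." on the left-hand side applies FUN to every non-grouping column
agg <- aggregate(. ~ id, data = dat, FUN = function(x) head(x, 1))
print(agg)
```

The aggregate() variant treats each column separately, which gives the same answer as taking whole rows only because the first element of every column belongs to the same row. Also be aware that the formula method drops rows containing NA by default (na.action = na.omit).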


AC Del Re

2011-Sep-26 03:03 UTC

[R] selecting first row of a variable with long-format data

Great. Then, if one is interested in selecting the second row of a variable
(from a unique id), something like this should work:

aggregate(value ~ id, data = dat, FUN = function(x) head(x, 2)[2])


Thanks, Michael and Dennis!


AC
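
One caveat with the second-row approach: head(x, 2)[2] is the same as x[2], and for an id with only one row it yields NA rather than dropping the id. A small sketch, with an extra single-row id 3 added purely for illustration (it is not in the original sample data):

```r
# id 3 has only one row, so it has no second value
dat <- data.frame(id = c(1, 1, 1, 2, 2, 3),
                  value = c(5, 6, 7, 4, 5, 9))

second <- aggregate(value ~ id, data = dat, FUN = function(x) head(x, 2)[2])
print(second)

# Indexing directly gives the same result
second_direct <- aggregate(value ~ id, data = dat, FUN = function(x) x[2])

# Drop ids that had no second row, if that is the desired behaviour
second_complete <- second[!is.na(second$value), ]
print(second_complete)
```

Whether to keep the NA rows or filter them out depends on what the downstream analysis expects.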


