thr3ads.net - R help - [R] Selecting the first row based on a factor [Apr 2010]

Sam Albers

2010-Apr-02 18:28 UTC

[R] Selecting the first row based on a factor

Hello there,

I have a situation where I would like to select the first row of a
particular factor for a data frame (data example below). So that is, I would
like to select the first entry when the factor1 =A and then the first row
when factor1=B etc. I have thousands of entries so I need some general way
of doing this. I have a minimal example that should illustrate what I am
trying to do. I am using R version 2.9.2, ESS version 5.4 and Ubuntu 9.04.

Thanks so much in advance!

Sam

#Minimal example

x <- rnorm(100)
y <- rnorm(100)
xy <- data.frame(x,y)
xy$factor1 <- c("A", "B","C","D")
xy$factor2 <- c("a","b")
# This simply orders the data to look more like the actual data I am working with
xy <- xy[order(xy$factor1),]

# I am trying to use this approach, but I am not sure that I am selecting the
# correct row, and the output "temp" is a total mess.
temp <- with(xy, unlist(lapply(split(xy, list(factor1=factor1,
factor2=factor2)), function(x) x[1,])))

               x            y               factor1 factor2
1    0.700042585 -2.481633101       A       a   # I would like to select this row
5    1.402677849 -0.691143942       A       a
9    0.188287765 -1.723823157       A       a
13   0.714946028  0.715361315       A       a
17   0.690177271 -0.112394002       A       a
21   0.333101579 -0.316285321       A       a
25   0.439505793 -3.356415326       A       a
89  -1.001153334 -0.739440288       A       a
93   0.135509539  0.949943380       A       a
97  -1.730936150  0.356133105       A       a
2   -0.399355582 -0.843874548       B       b   # Then I would like to select this row, etc.
6    1.285958969  0.958501988       B       b
10   0.495795836 -0.805012667       B       b
14   0.512486789 -0.968247016       B       b
18  -1.189627025  0.455278250       B       b

-- 
*****************************************************
Sam Albers
Geography Program
University of Northern British Columbia
3333 University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*****************************************************

Erik Iverson

2010-Apr-02 18:35 UTC

Hello,

Sam Albers wrote:
> Hello there,
> 
> I have a situation where I would like to select the first row of a
> particular factor for a data frame (data example below). So that is, I would
> like to select the first entry when the factor1 =A and then the first row
> when factor1=B etc. I have thousands of entries so I need some general way
> of doing this. I have a minimal example that should illustrate what I am
> trying to do. I am using R version 2.9.2, ESS version 5.4 and Ubuntu 9.04.
> 
> Thanks so much in advance!
> 
> Sam
> 
> #Minimal example
> 
> x <- rnorm(100)
> y <- rnorm(100)
> xy <- data.frame(x,y)
> xy$factor1 <- c("A", "B","C","D")
> xy$factor2 <- c("a","b")
> xy <- xy[order(xy$factor1),]  #This simply orders the data to look more like
> the actual data I am working with

Does

xy[!duplicated(xy$factor1),]

do what you want?
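
For anyone trying this out, here is a minimal sketch of that suggestion applied to data shaped like the example in the question. The seed, the explicit rep() calls, and the names first_by_f1, first_by_both and first_split are assumptions added for illustration and are not from the original posts. The last line shows one possible way to tidy the split/lapply attempt: binding the one-row data frames with rbind instead of flattening them with unlist().

```r
# Rebuild data resembling the original example.
# The seed is an assumption, added only so the result is reproducible.
set.seed(1)
xy <- data.frame(x = rnorm(100), y = rnorm(100))
xy$factor1 <- rep(c("A", "B", "C", "D"), length.out = nrow(xy))
xy$factor2 <- rep(c("a", "b"), length.out = nrow(xy))
xy <- xy[order(xy$factor1), ]

# duplicated() flags every occurrence of a value after its first one,
# so negating it keeps the first row seen for each level of factor1.
first_by_f1 <- xy[!duplicated(xy$factor1), ]

# The same idea for each factor1/factor2 combination:
# duplicated() also accepts a data frame of key columns.
first_by_both <- xy[!duplicated(xy[c("factor1", "factor2")]), ]

# split/lapply variant: rbind the one-row pieces back into a data frame.
first_split <- do.call(rbind, lapply(split(xy, xy$factor1), function(d) d[1, ]))

print(first_by_f1)
```

Note that "first" here means first in the current row order of xy, so any sorting that matters should be done before the selection.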
