Hello there,
I would like to select the first row for each level of a particular factor
in a data frame (example data below). That is, I would like to select the
first row where factor1 == "A", then the first row where factor1 == "B", and
so on. I have thousands of rows, so I need a general way of doing this. The
minimal example below should illustrate what I am trying to do. I am using
R version 2.9.2, ESS version 5.4 and Ubuntu 9.04.
Thanks so much in advance!
Sam
#Minimal example
x <- rnorm(100)
y <- rnorm(100)
xy <- data.frame(x,y)
xy$factor1 <- c("A", "B","C","D")
xy$factor2 <- c("a","b")
# This simply orders the data to look more like the actual data I am
# working with
xy <- xy[order(xy$factor1),]
# I am trying to use this approach, but I am not sure that I am selecting the
# correct row, and the output "temp" is a total mess.
temp <- with(xy, unlist(lapply(split(xy, list(factor1=factor1,
factor2=factor2)), function(x) x[1,])))
              x            y factor1 factor2
1   0.700042585 -2.481633101       A       a  # I would like to select this row
5   1.402677849 -0.691143942       A       a
9   0.188287765 -1.723823157       A       a
13  0.714946028  0.715361315       A       a
17  0.690177271 -0.112394002       A       a
21  0.333101579 -0.316285321       A       a
25  0.439505793 -3.356415326       A       a
89 -1.001153334 -0.739440288       A       a
93  0.135509539  0.949943380       A       a
97 -1.730936150  0.356133105       A       a
2  -0.399355582 -0.843874548       B       b  # Then I would like to select this row, etc.
6   1.285958969  0.958501988       B       b
10  0.495795836 -0.805012667       B       b
14  0.512486789 -0.968247016       B       b
18 -1.189627025  0.455278250       B       b
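
For what it is worth, here is a rough sketch of one possible way to get
those rows, assuming "first" means first in the current row order of the
data frame. It relies on duplicated(), which is FALSE the first time a value
appears. The set.seed() call and the names first_rows, first_combo, pieces
and first_split are only for illustration and are not part of my real data.
The split() variant shows why "temp" above may be a mess: unlist() flattens
every selected row into one long vector, whereas do.call(rbind, ...) stacks
the one-row pieces back into a data frame. Splitting on factor1 and factor2
together also creates empty combinations (such as A with b), and x[1,] on
an empty piece returns a row of NAs.

```r
# Rebuild the example data (seed added here only for reproducibility)
set.seed(1)
xy <- data.frame(x = rnorm(100), y = rnorm(100))
xy$factor1 <- c("A", "B", "C", "D")
xy$factor2 <- c("a", "b")
xy <- xy[order(xy$factor1), ]

# duplicated() is FALSE the first time each value appears, so negating it
# keeps the first row for every level of factor1
first_rows <- xy[!duplicated(xy$factor1), ]

# Same idea for each factor1/factor2 combination that occurs in the data
first_combo <- xy[!duplicated(xy[c("factor1", "factor2")]), ]

# split()-based variant: stack the one-row pieces with rbind
# instead of flattening them with unlist()
pieces <- lapply(split(xy, xy$factor1), function(d) d[1, ])
first_split <- do.call(rbind, pieces)

print(first_rows)
```

Both first_rows and first_split should hold one row per level of factor1.
The duplicated() version avoids splitting the data at all, which may matter
with thousands of rows.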
--
*****************************************************
Sam Albers
Geography Program
University of Northern British Columbia
3333 University Way
Prince George, British Columbia
Canada, V2N 4Z9
phone: 250 960-6777
*****************************************************