Dear Community,

I have a data set with two columns, bird number and mass. Individual birds were captured 1-13 times and weighed each time. I would like to remove those individuals that were captured only once, so that I can assess mass variability per bird. I've tried many approaches with no success. Can anyone recommend a way to remove individuals that were captured only once?

Thanks,
Ray
> -----Original Message-----
> From: r-help-bounces at r-project.org On Behalf Of Raymond Danner
> Sent: Monday, September 28, 2009 11:03 AM
> To: r-help at r-project.org
> Subject: [R] Remove single entries
>
> Can anyone recommend a way to remove individuals that were captured only once?

We need a reproducible example. My guess is that you might use some combination of indexing and the table() function, but if you can give us code that shows a basic example of your setup, that would be great.
On 9/28/2009 12:03 PM, Raymond Danner wrote:
> Can anyone recommend a way to remove individuals that were captured only once?

How about something like this?

  DF <- data.frame(BIRD = rep(1:10, c(1,1,2,10,5,6,7,1,8,9)),
                   MASS = rnorm(50,50,10))

  DF$NOBS <- with(DF, ave(MASS, BIRD, FUN=length))

  subset(DF, NOBS > 1)

--
Chuck Cleland, Ph.D.
NDRI, Inc. (www.ndri.org)
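For anyone following along, here is a minimal sketch of what the ave() step above is doing. It reuses the same toy data. The set.seed() call and the object name `multi` are additions made here for illustration and are not part of the original suggestion.

```r
# Same toy data as in the suggestion above; set.seed() is an addition here
# so that the random masses are reproducible.
set.seed(1)
DF <- data.frame(BIRD = rep(1:10, c(1, 1, 2, 10, 5, 6, 7, 1, 8, 9)),
                 MASS = rnorm(50, 50, 10))

# ave() returns a vector as long as MASS. Each element holds the result of
# FUN for that row's BIRD group, which here is the number of captures.
DF$NOBS <- ave(DF$MASS, DF$BIRD, FUN = length)

# Keep only the birds that were weighed more than once.
multi <- subset(DF, NOBS > 1)

# Birds 1, 2 and 8 have one capture each, so 3 of the 50 rows are dropped.
nrow(multi)                 # 47
sort(unique(multi$BIRD))    # 3 4 5 6 7 9 10
```

Because NOBS is stored as a column, the same data frame can later be filtered at a different threshold (say, NOBS > 2) without recomputing the counts.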
On Mon, Sep 28, 2009 at 5:03 PM, Raymond Danner <rdanner at vt.edu> wrote:
> Can anyone recommend a way to remove individuals that were captured only once?

Approach this one step at a time. My sample data is:

  > wts
    bird mass
  1    1  2.3
  2    1  3.2
  3    1  2.1
  4    2  1.2
  5    3  5.4
  6    3  4.5
  7    3  4.4
  8    4  3.2

How many times was each bird measured? Use table():

  > table(wts$bird)

  1 2 3 4
  3 1 3 1

The names of the table are the bird numbers (as character strings), and row.names() returns them for a one-dimensional table, so we want the names where the count is greater than one:

  > row.names(table(wts$bird))[table(wts$bird)>1]
  [1] "1" "3"

[This calls 'table' twice, so you might want to save the table to a new object.]

Now we want all the rows of our original dataframe where the bird number is in that set, so we select rows using %in%:

  > wts[wts$bird %in% row.names(table(wts$bird))[table(wts$bird)>1],]
    bird mass
  1    1  2.3
  2    1  3.2
  3    1  2.1
  5    3  5.4
  6    3  4.5
  7    3  4.4

Looks a bit messy, I'm not pleased with myself... Must be a better way...

Aha! A table-free way of finding the birds that were measured more than once is:

  > unique(wts$bird[duplicated(wts$bird)])
  [1] 1 3

So you could do:

  > wts[wts$bird %in% unique(wts$bird[duplicated(wts$bird)]),]
    bird mass
  1    1  2.3
  2    1  3.2
  3    1  2.1
  5    3  5.4
  6    3  4.5
  7    3  4.4

which looks a bit neater! You might want to unravel unique(wts$bird[duplicated(wts$bird)]) to see what the various bits do. And read the help pages.

TMTOWTDI, as they say.

Barry
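The two routes above can be put side by side in one runnable script. This is only a sketch. The data frame is retyped from the printed sample, and the object names (`counts`, `keep_tab`, `repeat_birds`, `keep_dup`) are made up here. The last line uses tapply() with sd() as one possible way to get at the per-bird mass variability asked about in the original question. That step is an assumption about what "variability" should mean, not something proposed in the thread.

```r
# Sample data retyped from the printed wts above.
wts <- data.frame(bird = c(1, 1, 1, 2, 3, 3, 3, 4),
                  mass = c(2.3, 3.2, 2.1, 1.2, 5.4, 4.5, 4.4, 3.2))

# table() route, saving the table once instead of calling table() twice.
counts <- table(wts$bird)
keep_tab <- wts[wts$bird %in% names(counts)[counts > 1], ]

# duplicated() route: a bird number that shows up a second time
# belongs to a bird that was captured more than once.
repeat_birds <- unique(wts$bird[duplicated(wts$bird)])
keep_dup <- wts[wts$bird %in% repeat_birds, ]

# Both routes should select the same rows (1, 2, 3, 5, 6, 7).
identical(keep_tab, keep_dup)

# Per-bird standard deviation of mass for the remaining birds.
tapply(keep_dup$mass, keep_dup$bird, sd)
```

Here names() is used in place of row.names(). For a one-dimensional table both return the bird numbers as character strings, and %in% compares them with the numeric bird column after coercion.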