thr3ads.net - R help - [R] Select rows with distinct values in a column and other conditions [Jan 2011]

sarah bauduin

2011-Jan-31 14:22 UTC

[R] Select rows with distinct values in a column and other conditions

My data frame looks like:

   SightingID PA1 PA2 PlotID InOverlap Area
1        2001   1 -99    392         Y 0.2
2        2002   1 -99    388         Y 0.25
3        2008   1  NA    104         N 0.34
4        2010   1  NA     71         N 0.18
5        2012   1  NA     61         N 0.16
6        2013   1  NA     61         N 0.22
7        2014   1  NA     62         N 0.25
8        2015   1  NA     63         N 0.19
9        2016   1  NA     63         N 0.3
10       2017   1  NA     63         N 0.25
11       2018   1  NA     63         N 0.26
12       2019   1  NA     63         N 0.26
13       2020   1  NA     64         N 0.33
14       2021   1  NA     64         N 0.42
15       2022   1  NA     85         N 0.08
16       2023   0   1     95         Y 0.11
17       2024   1  NA     93         N 0.23
18       2025   1  NA    106         N 0.4
19       2026   1  NA    134         N 0.28

The only unique values in the data frame are the SightingID values. I would
like to obtain a new data frame with unique PlotID values, based on several
conditions:

- return the row if there is only one SightingID for the PlotID
- if there are several SightingID for the same PlotID value:
    - first select the SightingID for which PA1=0; if there are several
      SightingID with PA1=0 for the same PlotID, select the one with the
      highest value in Area; if several of those share the highest value
      for Area, select one SightingID at random
    - otherwise, select the SightingID for which PA1 is not equal to 0
      based on the highest value in Area (and at random if there are
      several with the highest value in Area)

I have no idea how to do that. Can someone help me, please?

Sarah
	[[alternative HTML version deleted]]

Joshua Wiley

2011-Jan-31 16:15 UTC

Dear Sarah,

What you will need is a series of logical conditions.  ?Logic or ?"|"
should pull up the documentation on the logical operators available to
use.   Because this list does not accept HTML emails (see the posting
guide), your data frame did not come through in any coherent form.
Can you resend the data using the following procedure:

Suppose your data frame is named "dfrm" (but substitute its actual
name).  At the R console, type:

dput(dfrm)

This prints how R "sees" the data to the console. Copy that whole
jumble of text from R, paste it into your email, and send it to us.
This is one of the easiest ways for us to read in small amounts of
data, and it should be easy for you to provide too.
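
The dput() procedure described above can be sketched as follows. This is only an illustration: the data frame here is a hypothetical stand-in built from the first three rows of the posted table, and the capture.output()/parse() round trip merely simulates what a reader does when pasting the printed text back into R.

```r
# Hypothetical stand-in for the poster's data; only the first three rows.
dfrm <- data.frame(SightingID = c(2001, 2002, 2008),
                   PA1 = c(1, 1, 1),
                   PA2 = c(-99, -99, NA),
                   PlotID = c(392, 388, 104))

# Print a plain-text R expression that recreates dfrm.
dput(dfrm)

# A reader pastes that expression back into R to rebuild the object.
# Here the round trip is simulated by capturing and re-evaluating the text.
txt <- paste(capture.output(dput(dfrm)), collapse = "\n")
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, dfrm)
```

Because the output is ordinary text, it survives plain-text email where a formatted table does not.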

Cheers,

Josh



-- 
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/

sarah bauduin

2011-Jan-31 16:54 UTC

My data frame looks like this one:

SightingID <- c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013)
PA1 <- c(0,1,0,0,1,1,1,1,0,0,-99,1,1)
PA2 <- c(1,NA,1,1,NA,-99,-99,NA,1,1,1,NA,NA)
PlotID <- c(1,1,2,2,2,3,3,3,4,4,4,4,5)
Area <- c(0.2,0.3,0.25,0.2,0.3,0.4,0.3,0.35,0.4,0.4,0.5,0.3,0.2)
DF <- cbind(SightingID,PA1,PA2,PlotID,Area)

There are several SightingID for the same PlotID value, and I need to select
only one SightingID for each PlotID value. The SightingID selected for the
PlotID value needs to be:

- the one with PA1=0
- if there are several SightingID with PA1=0, the one with the highest Area
  value
- if there are several SightingID with PA1=0 and the same highest Area value,
  one of them chosen at random
- if for one PlotID value there is no SightingID with PA1=0, the one with the
  highest Area value (chosen at random if there are several with the same
  highest Area value)

I would like to have this kind of result:

SightingID2 <- c(2001,2003,2006,2009,2013)
PA12 <- c(0,0,1,0,1)
PA22 <- c(1,1,-99,1,NA)
PlotID2 <- c(1,2,3,4,5)
Area2 <- c(0.2,0.25,0.4,0.4,0.2)
DF2 <- cbind(SightingID2,PA12,PA22,PlotID2,Area2)

Can someone help me? Thanks,

Sarah
	[[alternative HTML version deleted]]

Ista Zahn

2011-Jan-31 22:08 UTC

Hi Sarah,
Here is how I would do it. Not elegant, but fairly transparent, and it
seems to give the desired result.



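DF <- as.data.frame(DF)

# Pick one row per plot: prefer PA1 == 0, then the largest Area,
# then break any remaining tie at random.
pick.value <- function(x) {
    if (0 %in% x$PA1) {
        x <- x[x$PA1 %in% 0, ]
    }
    x <- x[which(x$Area == max(x$Area, na.rm = TRUE)), ]
    S <- x[sample(seq_len(nrow(x)), 1), ]
    return(as.matrix(S)[1, , drop = TRUE])
}

# Loop over each distinct plot once, filling one result row per plot.
plots <- unique(DF$PlotID)
DF2 <- matrix(NA, nrow = length(plots), ncol = ncol(DF),
              dimnames = list(NULL, names(DF)))
for (i in seq_along(plots)) {
    DF2[i, ] <- pick.value(DF[DF$PlotID == plots[i], ])
}

DF2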
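
The same selection rules can also be sketched without an explicit loop. This is an alternative to the code above, not the approach posted in the thread: it sorts the rows so that the preferred row of each plot comes first, then keeps the first row per PlotID. The `tie` vector is a hypothetical helper introduced only to break exact ties at random, and `set.seed()` is there only to make the example repeatable.

```r
set.seed(1)  # only so the random tie-break is repeatable

DF <- data.frame(
    SightingID = 2001:2013,
    PA1    = c(0, 1, 0, 0, 1, 1, 1, 1, 0, 0, -99, 1, 1),
    PA2    = c(1, NA, 1, 1, NA, -99, -99, NA, 1, 1, 1, NA, NA),
    PlotID = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5),
    Area   = c(0.2, 0.3, 0.25, 0.2, 0.3, 0.4, 0.3, 0.35, 0.4, 0.4, 0.5, 0.3, 0.2)
)

# Hypothetical helper: a random key used only to break exact ties.
tie <- runif(nrow(DF))

# Sort by plot, then PA1 == 0 first, then largest Area, then the random key.
ord <- order(DF$PlotID, DF$PA1 != 0, -DF$Area, tie)
sorted <- DF[ord, ]

# Keep the first (most preferred) row of each plot.
DF2 <- sorted[!duplicated(sorted$PlotID), ]
DF2
```

For PlotID 4 the two rows with PA1=0 share the highest Area, so either SightingID 2009 or 2010 may be returned, depending on the random key.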


Best,
Ista



-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org

sarah bauduin

2011-Feb-25 21:34 UTC

[R] Error loop: missing value where TRUE/FALSE needed

Dear R-help community,


I often get the error "missing value where TRUE/FALSE needed" when
I run loops such as this one:

for (i in 1:nrow(survey)) {
    if (survey[i,7] == survey[i,3]) {
        survey[i,14] <- survey[i,6]
    } else if (survey[i,11] == 0 && survey[i,12] == 0) {
        survey[i,14] <- survey[i,4]
    }
}

Can someone explain what I am doing wrong? I don't see how this loop differs
from other loops that work.

Thanks a lot,

Sarah
	[[alternative HTML version deleted]]
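
The thread does not show the `survey` data, so the cause cannot be confirmed here. A common way to get this error is an NA in one of the compared cells: a comparison involving NA returns NA, and if() cannot branch on NA. A minimal sketch, assuming that is what happens and using a hypothetical two-column stand-in for `survey`:

```r
# Hypothetical stand-in for two columns of the poster's data frame.
survey <- data.frame(a = c(1, NA, 3), b = c(1, 2, 4))

# Comparing with NA yields NA, not TRUE or FALSE.
survey$a == survey$b

# if () cannot branch on NA; tryCatch captures the resulting error message.
msg <- tryCatch(
    if (survey$a[2] == survey$b[2]) "equal" else "different",
    error = function(e) conditionMessage(e)
)
msg

# One possible guard: isTRUE() treats an NA comparison as "not equal".
matches <- logical(nrow(survey))
for (i in seq_len(nrow(survey))) {
    matches[i] <- isTRUE(survey$a[i] == survey$b[i])
}
matches
```

Whether treating NA as "not equal" is the right choice depends on what the missing values mean in the data; checking the columns with is.na() first is a reasonable starting point.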
