Dear list-members,

I have the following problem: I have a vector (countrydiff) of length 72 and another vector (long_df$country_name) that is about 12000 long. What I want is this: if the factor level (or string) in long_df$country_name appears in countrydiff, then long_df$povdat should equal 1; if it does not appear in countrydiff, then long_df$povdat should equal 0. I have tried different combinations and read around. To my mind the following code should do it, but it doesn't:

    long_df$povdat <- ifelse(long_df$country_name == countrydiff, 1, 0)

    long_df$povdat <- ifelse(long_df$country_name %in% countrydiff, 1, 0)

Additional information: the factor vector countrydiff contains unique country names (Albania, Zimbabwe, etc.), whereas long_df$country_name also contains country names, but not unique ones, since the data are in long form. There are around 200 unique names in long_df$country_name.

Any suggestions?
Thanks in advance.

Best
Adel

--
View this message in context: http://r.789695.n4.nabble.com/ifelse-statement-with-two-vectors-of-different-length-tp4682401.html
Sent from the R help mailing list archive at Nabble.com.
Hi,

Please show a reproducible example.

    countrydiff <- c("Albania", "Algeria", "Belarus", "Canada", "Germany")
    long_df <- data.frame(country_name = c("Algeria", "Guyana", "Hungary", "Algeria", "Canada",
                                           "Iran", "Iran", "Norway", "Uruguay", "Zimbabwe"))

    ifelse(long_df$country_name %in% countrydiff, 1, 0)
    # [1] 1 0 0 1 1 0 0 0 0 0

    # or
    1 * (long_df$country_name %in% countrydiff)
    # [1] 1 0 0 1 1 0 0 0 0 0

A.K.
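For readers wondering why the `==` version in the question cannot work, here is a minimal sketch. It assumes plain character vectors (toy data, not the poster's actual vectors); with two factors whose level sets differ, `==` would typically stop with an error instead. The names `by_position` and `by_membership` are illustrative only.

```r
# Toy data (illustrative, not the poster's actual vectors)
countrydiff <- c("Albania", "Algeria", "Belarus", "Canada", "Germany")
country_name <- c("Algeria", "Guyana", "Hungary", "Algeria", "Canada",
                  "Iran", "Iran", "Norway", "Uruguay", "Zimbabwe")

# `==` compares position by position and recycles the shorter vector:
# element i of country_name is compared with element ((i - 1) %% 5) + 1
# of countrydiff, not with the whole set of names.
by_position <- country_name == countrydiff

# `%in%` asks, for each element of country_name, whether it occurs
# anywhere in countrydiff, which is the question actually being asked.
by_membership <- country_name %in% countrydiff

print(by_position)    # all FALSE: no name happens to line up by position
print(by_membership)
```

With lengths such as 12000 and 72, where the longer length is not a multiple of the shorter, R would also warn about the recycling, which is a useful hint that `==` is the wrong tool here.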
Sarah Goslee
2013-Dec-18 15:04 UTC
[R] ifelse statement with two vectors of different length
Hi,

Suggestion 1: read
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
and bookmark it for future reference.

Suggestion 2:

    set.seed(123)
    countrydiff <- letters[1:5]
    long_df <- data.frame(country_name = sample(letters[1:8], 20, replace=TRUE))
    long_df$povdat <- as.numeric(long_df$country_name %in% countrydiff)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
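The thread shows three ways of turning the logical result of `%in%` into a 0/1 indicator: `ifelse(..., 1, 0)`, `1 * (...)` and `as.numeric(...)`. The sketch below, on toy data with illustrative variable names, checks that they agree, so the choice between them is mostly a matter of taste.

```r
set.seed(123)
countrydiff <- letters[1:5]
country_name <- sample(letters[1:8], 20, replace = TRUE)

# Logical membership vector, one entry per element of country_name
is_member <- country_name %in% countrydiff

via_ifelse     <- ifelse(is_member, 1, 0)
via_multiply   <- 1 * is_member
via_as_numeric <- as.numeric(is_member)

# All three are plain double vectors holding the same 0/1 values
all_same <- identical(via_ifelse, via_multiply) &&
  identical(via_multiply, via_as_numeric)
print(all_same)

# as.integer() gives an integer indicator instead, if that is preferred
as_int <- as.integer(is_member)
```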
Hi Adel,

If the problem is the spacing, then

    library(stringr)
    1 * (long_df$country_name %in% str_trim(countrydiff))
    # [1] 1 0 0 1 1 0 0 0 0 0

A.K.

Dear Arun

Thanks for your reply. It made me realize that the problem was not in the code but in the levels() of the factors: some countries had extra spacing, which made the ifelse() call not work. So if I modify your code (adding spaces to countrydiff), it looks something like this:

    countrydiff <- c("Albania    ", "Algeria    ", "Belarus    ", "Canada   ", "Germany   ")
    long_df <- data.frame(country_name = c("Algeria", "Guyana", "Hungary", "Algeria", "Canada",
                                           "Iran", "Iran", "Norway", "Uruguay", "Zimbabwe"))

I had to use gsub() to fix this first. Interestingly, setdiff(), which I used before coming to the ifelse statement, did not react to the spacing difference, and therefore I did not notice it in the first place:

    # no reaction from R on spacing diff.
    setdiff(countrydiff, long_df$country_name)

Nevertheless, thanks again for being helpful!
Adel
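A short sketch of how the stray whitespace can be spotted and removed in base R, on the toy data from this thread. It assumes `trimws()` is available (base R 3.2.0 or later, so newer than this thread); `gsub("^\\s+|\\s+$", "", x)` is a common equivalent on older versions. Note that `setdiff()` matches exactly, just like `%in%`: it raises no warning, but it does return every padded name as unmatched, which is probably the "no reaction" described above. Variable names here are illustrative.

```r
countrydiff <- c("Albania    ", "Algeria    ", "Belarus    ", "Canada   ", "Germany   ")
country_name <- c("Algeria", "Guyana", "Hungary", "Algeria", "Canada",
                  "Iran", "Iran", "Norway", "Uruguay", "Zimbabwe")

# setdiff() is silent, but all five padded names come back as "not found"
unmatched <- setdiff(countrydiff, country_name)
print(unmatched)

# Flag entries that change when leading/trailing whitespace is stripped
padded <- countrydiff != trimws(countrydiff)
print(padded)

# Trim once, then build the indicator as before
countrydiff_clean <- trimws(countrydiff)
povdat <- as.numeric(country_name %in% countrydiff_clean)
print(povdat)
```

Printing the suspect values with `print()` (which shows the quotes) or comparing `nchar()` before and after trimming makes this kind of padding visible, whereas `cat()` or a data viewer tends to hide it.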