Dear list-members,

I have the following problem: I have a vector (countrydiff) of length 72 and another vector (long_df$country_name) which is about 12000 long. Basically, what I want to do is this: if the factor level (or string name) in long_df$country_name appears in countrydiff, then long_df$povdat should be equal to 1; if it does not appear in countrydiff, then long_df$povdat should be equal to 0. I have tried different combinations and read some. The following code should in my mind do it, but it doesn't:

long_df$povdat<-ifelse(long_df$country_name == countrydiff, 1, 0)

long_df$povdat<-ifelse(long_df$country_name %in% countrydiff, 1, 0)

Additional information: the factor vector countrydiff contains unique country names (Albania, Zimbabwe etc.), whereas long_df$country_name also contains country names, albeit not unique ones, since it is in long form. There are around 200 unique names in long_df$country_name.

Any suggestions?
Thanks in advance.

Best
Adel

--
View this message in context: http://r.789695.n4.nabble.com/ifelse-statement-with-two-vectors-of-different-length-tp4682401.html
Sent from the R help mailing list archive at Nabble.com.
Hi,
Please show a reproducible example.
countrydiff <- c("Albania", "Algeria",
"Belarus", "Canada", "Germany")
long_df <- data.frame(country_name = c("Algeria",
"Guyana", "Hungary", "Algeria",
"Canada", "Iran", "Iran",
"Norway","Uruguay", "Zimbabwe") )
ifelse(long_df$country_name %in% countrydiff,1,0)
# [1] 1 0 0 1 1 0 0 0 0 0
#or
1*(long_df$country_name %in% countrydiff)
# [1] 1 0 0 1 1 0 0 0 0 0
A.K.
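A short sketch of why the `==` version from the question is likely to misbehave while `%in%` works, reusing the toy data above as plain character vectors (the names eq_flag and in_flag are made up for illustration). `==` compares position by position and recycles the shorter vector, whereas `%in%` tests each element for membership in the whole set.

```r
countrydiff <- c("Albania", "Algeria", "Belarus", "Canada", "Germany")
country_name <- c("Algeria", "Guyana", "Hungary", "Algeria", "Canada",
                  "Iran", "Iran", "Norway", "Uruguay", "Zimbabwe")

# `==` lines the two vectors up element by element; countrydiff is
# recycled to length 10, so "Algeria" is compared with "Albania", etc.
eq_flag <- as.integer(country_name == countrydiff)

# `%in%` asks, for each country_name, whether it occurs anywhere in countrydiff
in_flag <- as.integer(country_name %in% countrydiff)

eq_flag
# [1] 0 0 0 0 0 0 0 0 0 0
in_flag
# [1] 1 0 0 1 1 0 0 0 0 0
```

With lengths such as 12000 and 72, where the longer length is not a multiple of the shorter one, `==` would also be expected to emit a recycling warning.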
Sarah Goslee
2013-Dec-18 15:04 UTC
[R] ifelse statement with two vectors of different length
Hi,

Suggestion 1: read
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
and bookmark it for future reference.

Suggestion 2:

set.seed(123)
countrydiff <- letters[1:5]
long_df <- data.frame(country_name = sample(letters[1:8], 20, replace=TRUE))
long_df$povdat <- as.numeric(long_df$country_name %in% countrydiff)

Sarah

--
Sarah Goslee
http://www.functionaldiversity.org
Hi Adel,
If the problem is the spacing, then
library(stringr)
1*(long_df$country_name %in% str_trim(countrydiff))
# [1] 1 0 0 1 1 0 0 0 0 0
A.K.
Dear Arun
Thanks for your reply, it made me realize that the problem was
not in the code but in the levels() of the factors. Some countries had
some extra spacing which made the ifelse() function not work. So if I
modify your code (added space to countrydiff), it will then look
something like this:
countrydiff <- c("Albania    ", "Algeria    ",
"Belarus    ", "Canada   ", "Germany   ")
long_df <- data.frame(country_name = c("Algeria",
"Guyana",
"Hungary", "Algeria", "Canada", "Iran",
"Iran", "Norway","Uruguay",
"Zimbabwe") )
I had to use gsub() to fix this first.
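The exact gsub() call is not shown in the thread, so the following is only a sketch of one way it could look. The regular expression and the name countrydiff_clean are assumptions for illustration, not the poster's actual code.

```r
countrydiff <- c("Albania    ", "Algeria    ",
                 "Belarus    ", "Canada   ", "Germany   ")
long_df <- data.frame(country_name = c("Algeria", "Guyana", "Hungary",
                                       "Algeria", "Canada", "Iran", "Iran",
                                       "Norway", "Uruguay", "Zimbabwe"))

# Strip leading and trailing whitespace (assumed pattern)
countrydiff_clean <- gsub("^\\s+|\\s+$", "", countrydiff)

# Flag rows whose country appears in the cleaned vector
long_df$povdat <- as.numeric(long_df$country_name %in% countrydiff_clean)
long_df$povdat
# [1] 1 0 0 1 1 0 0 0 0 0
```

Newer versions of R also provide trimws() in base, which should give the same result for ordinary spaces.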
Interestingly, setdiff(), which I used before coming to the ifelse
statement, did not react to the spacing difference, and therefore I did
not notice this in the first place:
# no reaction from R to the spacing difference
setdiff(countrydiff, long_df$country_name)
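One way to check how setdiff() treats padded names is to run it on both the padded and the trimmed vector and compare the results. This is a sketch on the toy data only; the names padded, diff_padded and diff_trimmed are made up for illustration. On this toy data the two results differ, so it may be worth running the same comparison on the real vectors.

```r
padded <- c("Albania    ", "Algeria    ",
            "Belarus    ", "Canada   ", "Germany   ")
country_name <- c("Algeria", "Guyana", "Hungary", "Algeria", "Canada",
                  "Iran", "Iran", "Norway", "Uruguay", "Zimbabwe")

# setdiff() matches strings exactly, so no padded name is found in country_name
diff_padded <- setdiff(padded, country_name)

# After removing the trailing whitespace, only the genuinely missing names remain
diff_trimmed <- setdiff(gsub("\\s+$", "", padded), country_name)

length(diff_padded)
# [1] 5
diff_trimmed
# [1] "Albania" "Belarus" "Germany"
```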
Nevertheless, thanks again for being helpful!
Adel
______________________________________________
R-help at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.