Avi Gross
2022-Jan-28 18:30 UTC
[R] Error in if (fraction <= 1) { : missing value where TRUE/FALSE needed
Javed,
Your explanation allows many other ways to look at the problem.
Some of them skip steps and get to the point faster. Of course, I do not know
what exactly you mean by the "fairness object" other than guessing it
does an evaluation of what you supply and lets you know if it is fair.
For something categorical like gender it used to be easy to use the table()
function to show how many of each category you have. Of course, it now seems
that old assumptions about two genders are being replaced by additional choices
so it may literally be nonbinary.
Your code looked for 'T13' which gives no clue about purpose. Here is an
example where I coded the words "male" and "female" in a
small sample for illustration. You can leave the data as is and have it
automatically count or take percentages and then extract whatever you want and
use it to make decisions.
The darn HTML stripper this list uses makes showing code hard, so I have to
disperse it with extra spacing.
Here is some data:
gender <- c("male", "female", "female",
"male", "female", "female", "female")
I made it lopsided and you can see the counts easily enough with:
    tab.cnt <- table(gender)
The output is:
> tab.cnt
gender
female   male
     5      2
You can of course get percentages using the table object:
    tab.prcnt <- prop.table(tab.cnt)
The output is:
> tab.prcnt
gender
   female      male
0.7142857 0.2857143
You can, of course, multiply the above by a hundred and use round() to trim it
to fewer digits, but you can also extract the numbers to do things like a
comparison. Consider deciding that more than 60% females is too much:
    if (tab.prcnt[["female"]] > 0.6) print("too many women")
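As a minimal sketch of the rounding and extraction just described (the 60% cutoff is only the illustrative threshold from above, and names such as tab.pct and female.share are made up for this example):

```r
# Same small illustrative sample as above.
gender <- c("male", "female", "female",
            "male", "female", "female", "female")

# Proportions per category, then percentages rounded to one decimal place.
tab.prcnt <- prop.table(table(gender))
tab.pct <- round(100 * tab.prcnt, 1)

# Extract a single number by name and compare it to an arbitrary cutoff.
female.share <- tab.prcnt[["female"]]
too.many <- female.share > 0.6

print(tab.pct)
print(too.many)
```

Keeping the result of the comparison in a variable, rather than printing inside if(), makes it easy to reuse in later decisions.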
Your criteria may of course be more complicated, but the thing I am teaching is
that there are built-in methods that may be used as you get to know not only the
language but techniques that work well with it. Your need may work well with
your technique of converting your data representation from one form to a numeric
form. Realistically, many might simply use another built-in feature called
factors. Converting my data to a factor does this:
> fact <- factor(gender)
> fact
[1] male   female female male   female female female
Levels: female male
> as.numeric(fact)
[1] 2 1 1 2 1 1 1
The default is to use integers starting with 1 but you can change that in many
ways, or in the above, simply subtract 1 to get what you want. To get the
proportion of men in the above, you can do something like this:
> mean(as.numeric(fact) - 1)
[1] 0.2857143
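If it matters which category gets which integer, one option is to list the levels explicitly; another is to skip the codes and take the mean of a logical comparison. A small sketch using the same made-up sample (fact2, codes2 and prop.male are names invented here):

```r
gender <- c("male", "female", "female",
            "male", "female", "female", "female")

# By default the levels are sorted alphabetically: female = 1, male = 2.
fact <- factor(gender)

# Listing the levels explicitly changes which integer each category gets.
fact2 <- factor(gender, levels = c("male", "female"))
codes2 <- as.integer(fact2)   # male = 1, female = 2

# The proportion of men without touching the codes at all:
# the comparison gives TRUE/FALSE, and mean() counts each TRUE as 1.
prop.male <- mean(fact == "male")

print(codes2)
print(prop.male)
```

The comparison form does not depend on the order of the levels, so it keeps working if a category is added later.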
You may get lots of advice on many methods and ways to do things but pick what
fits your situation and sometimes you can try to change the situation. For some
purposes, categorical data needs to be transformed for proper use in something
like machine learning algorithms but sometimes it can be left alone as shown
above and the statistics can be worked with.
From: javed khan <javedbtk111 at gmail.com>
To: Avi Gross <avigross at verizon.net>
Cc: r-help at r-project.org <r-help at r-project.org>
Sent: Fri, Jan 28, 2022 8:34 am
Subject: Re: Error in if (fraction <= 1) { : missing value where TRUE/FALSE
needed
Avi Gross, thanks for your reply.
I have no interest in using the zero and one in my code; true/false is
also OK because I don't have to do any arithmetic with it.
I just want to pass a protected variable and one of its (privileged) values to
the fairness object to see if the model has any bias towards the unprivileged
values of the protected variable.
You can consider my protected variable as Sex and its values as male and
female. I want the fairness object to see if there is any bias towards the
female group, which could be considered an unprivileged group.
Thanks
On Thursday, January 27, 2022, Avi Gross via R-help <r-help at
r-project.org> wrote:
Javed,
You may misunderstand something here.
Forget ifelse(), which does all kinds of things (which you can see by just typing
"ifelse" and a carriage return or ENTER).
Your initial goal should be kept in mind. You want to create a data structure,
in this case a vector, that is the same length as another vector called
test$operator in which you mark whether the corresponding element was exactly
"T13" or not.
There is nothing fundamentally wrong with your approach albeit it is overkill in
this case. As has been pointed out, SKIPPING ifelse() entirely, you can get a
vector of Logicals (TRUE or FALSE) by a simple command like this:
    result <- test$operator == 'T13'
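To see what that comparison produces, here is a sketch with a made-up stand-in for your data. The data frame and its values are invented for illustration; only the column name operator and the value 'T13' come from your message.

```r
# Hypothetical stand-in for the real data.
test <- data.frame(operator = c("T13", "T07", "T13", "T21"))

# Element-wise comparison: one TRUE/FALSE per row.
result <- test$operator == 'T13'

# sum() counts the TRUE values and which() gives their positions.
n.match <- sum(result)
where <- which(result)

print(result)
print(n.match)
print(where)
```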
For many purposes, that is all you need. TRUE and FALSE are also sometimes
mapped into 1 and 0 for various purposes, so you can convert them into integers
or general numerics if that is needed. Consider the following code that checks
the integers from 1 to 7 to see if they are even (as in divisible by 2):
> result <- 1:7 %% 2 == 0
> result
[1] FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE
> as.integer(result)
[1] 0 1 0 1 0 1 0
> as.numeric(result)
[1] 0 1 0 1 0 1 0
> result <- as.integer(1:7 %% 2 == 0)
> result
[1] 0 1 0 1 0 1 0
If for some reason the choice of 1 and 0 is the opposite of what you need, you
can invert them several ways with the simplest being:
    as.integer(1:7 %% 2 != 0)
or
    as.integer(!(1:7 %% 2 == 0))
The first negates the comparison and the second just flips every FALSE and TRUE
to the other.
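A quick check of three ways to flip the 0/1 coding of the even-number example (a sketch only; the variable names are arbitrary):

```r
even <- 1:7 %% 2 == 0                 # TRUE where the number is even

flip.a <- as.integer(1:7 %% 2 != 0)   # negate the comparison itself
flip.b <- as.integer(!even)           # flip each TRUE/FALSE afterwards
flip.c <- 1L - as.integer(even)       # arithmetic on the 0/1 values

all.same <- identical(flip.a, flip.b) && identical(flip.a, flip.c)

print(flip.a)
print(all.same)
```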
Why are we talking about this? For many more interesting cases, ifelse() is
great as you can replace one or both of the choices with anything. A very common
case is replacing one choice with itself and changing the other, or nesting the
comparisons in a sort of simulated tree as in
    ifelse(some_condition,
           ifelse(second_condition, result1, result2),
           ifelse(third_condition, result3, result4))
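A concrete version of both patterns, with made-up data and labels chosen purely for illustration:

```r
x <- c(-5, -1, 3, 12, NA)

# Keep one choice as the value itself and change only the other:
# negative numbers become 0, everything else is left alone.
clipped <- ifelse(x < 0, 0, x)

# Nested comparisons acting as a small two-level decision tree.
label <- ifelse(x < 0,
                ifelse(x < -3, "very negative", "slightly negative"),
                ifelse(x < 10, "small", "large"))

print(clipped)
print(label)
```

Note that a missing value in the condition simply produces NA in the result, since ifelse() works element by element.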
But you seem to want the simplest return of two values that also happen to be
the underlying equivalent of TRUE and FALSE in many languages. In R,
anything that evaluates to zero (or the Boolean value FALSE) tends to be treated
as FALSE, and anything else like a 1 or 666 is treated as TRUE, as shown below:
> if (TRUE) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (1) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (666) print("TRUE") else print("FALSE")
[1] "TRUE"
> if (FALSE) print("TRUE") else print("FALSE")
[1] "FALSE"
> if (0) print("TRUE") else print("FALSE")
[1] "FALSE"
This is why you are being told that for many purposes, the Boolean vector may
work fine. But if you really want or need zero and one, that is a trivial
transformation as shown. Feel free to use ifelse() and then figure out what went
wrong with your code, but also to try the simpler version and see if the problem
goes away.
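For reference, the message in the subject line is what R reports when the condition given to if() is NA. Whether that is what happened inside the function you called is not known from this thread, so the following is only a sketch of how such an error can arise and one way to guard against it; the variable fraction is made up to echo the message.

```r
fraction <- NA_real_   # e.g. the result of arithmetic on missing data

# Comparing NA with a number gives NA, not TRUE or FALSE.
cond <- fraction <= 1

# if (cond) stops with an error because cond is NA.
# tryCatch() captures the error message here instead of halting the script.
msg <- tryCatch({
  if (cond) "small" else "large"
}, error = function(e) conditionMessage(e))

# isTRUE() is one way to guard: it returns FALSE for NA.
safe <- if (isTRUE(cond)) "small" else "not known to be small"

print(msg)
print(safe)
```

Checking the input with anyNA() before passing it on is another way to find out whether missing values are involved.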
Avi
-----Original Message-----
From: javed khan <javedbtk111 at gmail.com>
To: Bert Gunter <bgunter.4567 at gmail.com>
Cc: R-help <r-help at r-project.org>
Sent: Thu, Jan 27, 2022 1:15 pm
Subject: Re: [R] Error in if (fraction <= 1) { : missing value where
TRUE/FALSE needed
Thank you Bert Gunter
Do you mean I should do something like this:
prot <- as.numeric(ifelse(test$operator == 'T13', 1, 0))