thr3ads.net - R help - [R] Product of 1 - probabilities [May 2009]

If this information is useful, please help other people find it:
Share via:

Mark Bilton

2009-May-21 14:15 UTC

[R] Product of 1 - probabilities

I am having a slight problem with probabilities.

To calculate the final probability of an event p(F), we can take the product of
the chance that each independent event that makes p(F) will NOT occur.
So...
p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )

If the chance of an event within the product occurring remains the same, we can
therefore raise this probability to a power of the number of times that event
occurs.
e.g. rolling a dice p(A) = 1/6 of getting a 1...
p(F) = 1 - (1- (1/6))^z 
p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 1 'at least
once' in z number of rolls.

So then to R...

if p(A) = 0.01; z = 4; p(F) = 0.039

obviously p(F) > p(A)

however the problem arises when we use very small numbers e.g. p(B) = 1 * 10^-30
R understands this value
However when you have 1-p(B) you get something very close to 1 as you
expect...but R seems to think it is 1.
So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power anything equals 1 so
p(F) = 0 and not just close to zero BUT zero.
It doesn't matter therefore if z = 1*10^1000, the answer is still zero !!

Obviously therefore now p(F) < p(B)

Is there any solution to my problem, e.g.
- is it a problem with the sum (-) ? ie could I change the number of bits the
number understands (however it seems strange that it can hold it as a value
close to 0 but not close to 1)
-or should I maybe use a function to calculate the exact answer ?
-or something else

Any help greatly appreciated
Mark
-



  

_________________________________________________________________


	[[alternative HTML version deleted]]

Gabor Grothendieck

2009-May-21 14:45 UTC

head link

[R] Product of 1 - probabilities

There are several arbitrary precision packages available: gmp (an
interface to the GNU multi-precision library on CRAN) and
bc (an R interface to the bc arbitrary precision calculator):
http://r-bc.googlecode.com

There are also packages providing R interfaces to two computer
algebra systems and they both support not only arbitrary
precision but also exact calculation:

http://rsympy.googlecode.com
http://ryacas.googlecode.com
> library(bc)
> 1 - (1-10^-75)^10
[1] 0> bc("1 - (1-10^-75)^10")[1]
".0000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000"


On Thu, May 21, 2009 at 10:15 AM, Mark Bilton <mcbilton at hotmail.com>
wrote:>
> I am having a slight problem with probabilities.
>
> To calculate the final probability of an event p(F), we can take the
product of the chance that each independent event that makes p(F) will NOT
occur.
> So...
> p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )
>
> If the chance of an event within the product occurring remains the same, we
can therefore raise this probability to a power of the number of times that
event occurs.
> e.g. rolling a dice p(A) = 1/6 of getting a 1...
> p(F) = 1 - (1- (1/6))^z
> p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 1 'at
least once' in z number of rolls.
>
> So then to R...
>
> if p(A) = 0.01; z = 4; p(F) = 0.039
>
> obviously p(F) > p(A)
>
> however the problem arises when we use very small numbers e.g. p(B) = 1 *
10^-30
> R understands this value
> However when you have 1-p(B) you get something very close to 1 as you
expect...but R seems to think it is 1.
> So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power anything equals
1 so p(F) = 0 and not just close to zero BUT zero.
> It doesn't matter therefore if z = 1*10^1000, the answer is still zero
!!
>
> Obviously therefore now p(F) < p(B)
>
> Is there any solution to my problem, e.g.
> - is it a problem with the sum (-) ? ie could I change the number of bits
the number understands (however it seems strange that it can hold it as a value
close to 0 but not close to 1)
> -or should I maybe use a function to calculate the exact answer ?
> -or something else
>
> Any help greatly appreciated
> Mark
> -
>
>
>
>
>
> _________________________________________________________________
>
>
> ? ? ? ?[[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

(Ted Harding)

2009-May-21 14:59 UTC

head link

[R] Product of 1 - probabilities

On 21-May-09 14:15:08, Mark Bilton wrote:> I am having a slight problem with probabilities.
> 
> To calculate the final probability of an event p(F), we can take the
> product of the chance that each independent event that makes p(F) will
> NOT occur.
> So...
> p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )
> 
> If the chance of an event within the product occurring remains the
> same, we can therefore raise this probability to a power of the number
> of times that event occurs.
> e.g. rolling a dice p(A) = 1/6 of getting a 1...
> p(F) = 1 - (1- (1/6))^z 
> p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 1 'at
> least once' in z number of rolls.
> 
> So then to R...
> 
> if p(A) = 0.01; z = 4; p(F) = 0.039
> 
> obviously p(F) > p(A)
> 
> however the problem arises when we use very small numbers e.g. p(B) = 1
> * 10^-30
> R understands this value
> However when you have 1-p(B) you get something very close to 1 as you
> expect...but R seems to think it is 1.
> So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power anything
> equals 1 so p(F) = 0 and not just close to zero BUT zero.
> It doesn't matter therefore if z = 1*10^1000, the answer is still zero
> !!
> 
> Obviously therefore now p(F) < p(B)
> 
> Is there any solution to my problem, e.g.
> - is it a problem with the sum (-) ? ie could I change the number of
> bits the number understands (however it seems strange that it can hold
> it as a value close to 0 but not close to 1)
> -or should I maybe use a function to calculate the exact answer ?
> -or something else
> 
> Any help greatly appreciated
> Mark
The problem is not with sum(), but with the fact that R adopts the
exact value 1 for (1-x) if (on my machine) x < 10^(-16). This is to
do with the number of binary digits used to store a number. Your
10^(-30) is much smaller than that!

  x <- 10^(-30)
  x
# [1] 1e-30
  y <- (1-x) ; 1-y
# [1] 0
  x <- 10^(-15)
  y <- (1-x) ; 1-y
# [1] 9.992007e-16
  x <- 10^(-16)
  y <- (1-x) ; 1-y
# [1] 1.110223e-16
  x <- 10^(-17)
  y <- (1-x) ; 1-y
# [1] 0

Read ?.Machine and, in particular, about:
  double.eps: the smallest positive floating-point number 'x'
    such that '1 + x != 1'.  It equals 'base^ulp.digits' if
    either 'base' is 2 or 'rounding' is 0;  otherwise, it is
    '(base^ulp.digits) / 2'.  Normally '2.220446e-16'.

There is no cure for this in R, but you could try to work round it
(depending on the details of your problem) by using an approximation
to the value of the expression.

For instance, if x is very small (say 10^(-20)), and n is not very
large, then to first order

  (1 - x)^n = 1 - n*x

or, if X is a vector of very small numbers,

  prod((1 - X)) = 1 - sum(X)

Either of these may still result in 1 if what is being subtracted
is less than .Machine$double.eps.

However, in the particular type of application you describe,

  1 - (1 - x)^n = n*x to first order
  1 - prod((1 - X)) = sum(X) to first order

You will just have to work out what will give you a sensible answer.
Basically, you are trying to operate beyond the limits of precision
of R, and so will need to re-cast your question in alternative terms
which will yield an adequately precise result.

Hoping this helps,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 21-May-09                                       Time: 15:59:47
------------------------------ XFMail ------------------------------

Duncan Murdoch

2009-May-21 15:19 UTC

head link

[R] Product of 1 - probabilities

On 5/21/2009 10:15 AM, Mark Bilton wrote:> I am having a slight problem with probabilities.
> 
> To calculate the final probability of an event p(F), we can take the
product of the chance that each independent event that makes p(F) will NOT
occur.
> So...
> p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )
> 
> If the chance of an event within the product occurring remains the same, we
can therefore raise this probability to a power of the number of times that
event occurs.
> e.g. rolling a dice p(A) = 1/6 of getting a 1...
> p(F) = 1 - (1- (1/6))^z 
> p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 1 'at
least once' in z number of rolls.
> 
> So then to R...
> 
> if p(A) = 0.01; z = 4; p(F) = 0.039
> 
> obviously p(F) > p(A)
> 
> however the problem arises when we use very small numbers e.g. p(B) = 1 *
10^-30
> R understands this value
> However when you have 1-p(B) you get something very close to 1 as you
expect...but R seems to think it is 1.
> So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power anything equals
1 so p(F) = 0 and not just close to zero BUT zero.
> It doesn't matter therefore if z = 1*10^1000, the answer is still zero
!!
> 
> Obviously therefore now p(F) < p(B)
> 
> Is there any solution to my problem, e.g.
> - is it a problem with the sum (-) ? ie could I change the number of bits
the number understands (however it seems strange that it can hold it as a value
close to 0 but not close to 1)
> -or should I maybe use a function to calculate the exact answer ?
The problem is that R maintains about 15-16 decimal places of accuracy, 
and to that accuracy, 1- 1.e-30 is equal to 1.  You need to find a 
different formula, e.g.

(1-x)^z = exp(z * log(1-x)) = exp(z * log1p(-x))

So if you have z = 1.e20, and p(A) = 1.e-30, then

1 - (1-1.e-30)^1.e20 can be calculated in R as

1 - exp(1.e20 * log1p(-1.e-30))

which works out to 1.e-10.

Duncan Murdoch
> -or something else
> 
> Any help greatly appreciated
> Mark
> -
> 
> 
> 
>   
> 
> _________________________________________________________________
> 
> 
> 	[[alternative HTML version deleted]]
> 
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

William Dunlap

2009-May-21 15:23 UTC

head link

[R] Product of 1 - probabilities

> -----Original Message-----
> From: r-help-bounces at r-project.org 
> [mailto:r-help-bounces at r-project.org] On Behalf Of Mark Bilton
> Sent: Thursday, May 21, 2009 7:15 AM
> To: r-help at r-project.org
> Subject: [R] Product of 1 - probabilities
> 
> 
> I am having a slight problem with probabilities.
> 
> To calculate the final probability of an event p(F), we can 
> take the product of the chance that each independent event 
> that makes p(F) will NOT occur.
> So...
> p(F) = 1- ( (1-p(A)) * (1-p(B)) * (1-p(C))...(1-p(x)) )
> 
> If the chance of an event within the product occurring 
> remains the same, we can therefore raise this probability to 
> a power of the number of times that event occurs.
> e.g. rolling a dice p(A) = 1/6 of getting a 1...
> p(F) = 1 - (1- (1/6))^z 
> p(F) = 1 - (1-p(A))^z tells us the probabiltity of rolling a 
> 1 'at least once' in z number of rolls.
> 
> So then to R...
> 
> if p(A) = 0.01; z = 4; p(F) = 0.039
> 
> obviously p(F) > p(A)
> 
> however the problem arises when we use very small numbers 
> e.g. p(B) = 1 * 10^-30
> R understands this value
> However when you have 1-p(B) you get something very close to 
> 1 as you expect...but R seems to think it is 1.
> So when I calculate p(F) = 1 - (1-p(B))^z = 1 to the power 
> anything equals 1 so p(F) = 0 and not just close to zero BUT zero.
> It doesn't matter therefore if z = 1*10^1000, the answer is 
> still zero !!
> 
> Obviously therefore now p(F) < p(B)
> 
> Is there any solution to my problem, e.g.
> - is it a problem with the sum (-) ? ie could I change the 
> number of bits the number understands (however it seems 
> strange that it can hold it as a value close to 0 but not close to 1)
> -or should I maybe use a function to calculate the exact answer ?
> -or something else
> 
> Any help greatly appreciated
> Mark
R uses double precision arithmetic, which gives you 52 binary
digits of precision (about 17 decimal digits), so 1.0-10.0^-30==1.0
even though 10.0^-30 is greater than 0.0 (you need to get down
to 10^-323 to make it 0.0 because there are 11 binary digits for
the exponent used to scale the value).

To work around this you can work on a log scale.  The R functions
log1p(x) and expm1(x) compute the much better approximations
to the real functions log(1+x) and exp(x)-1, respectively, than do
the R expressions log(1+x) and exp(x)-1 when x is very close to 0.
E.g., 
    > pB <- 1e-30 # 1*10^-30
    > exp(100 * log1p(-pB))
    [1] 1
    > exp(100 * log1p(-pB))-1
    [1] 0
    > expm1(100 * log1p(-pB))
    [1] -1e-28

You have another problem: you want z=10^1000, which is infinite
in double precision arithmetic.  You can take logs again to deal
with this or do some more analysis.

Also, if you are computing your basic pB and pF from R's p<dist>
family of functions, they all have arguments to let you get the upper
tail probabilities and output on a log scale.

Bill Dunlap
TIBCO Software Inc - Spotfire Division
wdunlap tibco.com

Maybe Matching Threads

Search for more maybe matching threads

R help - May 2009 - Product of 1 - probabilities

[R] Product of 1 - probabilities

[R] Product of 1 - probabilities

[R] Product of 1 - probabilities

[R] Product of 1 - probabilities

[R] Product of 1 - probabilities

Maybe Matching Threads