We obtained some disturbing results from convolve() (inaccuracies and negative probabilities). We'll try to make the context clear in as few lines as possible... Our function panjer() (code below) basically computes recursively the probability mass function of a compound Poisson distribution. When the Poisson parameter lambda is very large, the starting value of the recursive scheme --- the mass at 0 --- is 0 and the recursion fails. One way to solve this problem is to divide lambda by 2^n, apply the panjer() function and then convolve the result with itself n times. We applied the panjer() function with a lambda such that the mass at 0 is just larger than .Machine$double.xmin. We thus know that once this pmf is convoluted with itself, the first probabilities will be 0 (for the computer). Here are the two issues we have with convolve(): 1. The probabilities we know should be 0 are rather in the vicinity of 1E-19, as if convolve() could not "go lower". Using a hand made convolution function (not given here), we obtained the correct values. When probabilities get around 1E-12, results from convolve() and our home made function are essentially identical. 2. We obtained negative probabilities. More accurately, the same example returns negative probabilities under Windows, but not under Linux. We also obtained negative probabilities for another example under Linux, though. We understand that convolve() computes the convolutions using fft(), but we are not familiar enough with the latter to assess if the above issues are some sort of bugs or normal behavior. In the latter case, is there is any workaround? Any help/comments/ideas appreciated. === EXAMPLES ==panjer <- function(fx, lambda) { fx0 <- fx[1] fx <- fx[-1] r <- length(fx) cumul <- fs <- exp(-lambda * (1 - fx0)) repeat { x <- length(fs) m <- min(x, r) fs <- c(fs, sum((lambda * 1:m / x) * head(fx, m) * rev(tail(fs, m)))) if ((1-1E-8) < (cumul <- cumul + tail(fs, 1))) break } fs } ### Linux example ###> R.version_ platform i386-pc-linux-gnu arch i386 os linux-gnu system i386, linux-gnu status major 2 minor 1.0 year 2005 month 04 day 18 language R> fs <- panjer(rep(0.2, 5), 600) # compute pmf > fs[1:10] # small values[1] 3.456597e-209 4.147916e-207 2.530229e-205 1.045690e-203 [5] 3.292657e-202 8.422924e-201 1.822768e-199 3.431181e-198 [9] 5.733481e-197 8.637025e-196> fsc <- convolve(fs, rev(fs), type="o") # convolution > sum(fsc < 0) # no negatives under Linux[1] 0> fsc[1:10] # these should be 0[1] 4.819870e-19 4.135471e-19 4.511455e-19 3.994485e-19 3.820177e-19 [6] 4.738272e-19 3.885447e-19 3.167399e-19 3.695636e-19 3.701792e-19> sum(fsc) # no impact on the sum[1] 1 ### Windows example ###> R.version_ platform i386-pc-mingw32 arch i386 os mingw32 system i386, mingw32 status major 2 minor 1.1 year 2005 month 06 day 20 language R> fs <- panjer(rep(0.2, 5), 600) # compute pmf > fs[1:10] # small values[1] 3.456597e-209 4.147916e-207 2.530229e-205 1.045690e-203 [5] 3.292657e-202 8.422924e-201 1.822768e-199 3.431181e-198 [9] 5.733481e-197 8.637025e-196> fsc <- convolve(fs, rev(fs), type="o") # convolution > sum(fsc < 0) # negatives under Windows[1] 39> fsc[1:10] # these should be 0[1] 4.049552e-19 3.345584e-19 4.126913e-19 3.351211e-19 2.758626e-19 [6] 3.530111e-19 2.735041e-19 2.376711e-19 2.591287e-19 3.196405e-19> sum(fsc) # no impact on the sum[1] 1 -- Vincent Goulet, Associate Professor ??cole d'actuariat Universit?? Laval, Qu??bec Vincent.Goulet at act.ulaval.ca http://vgoulet.act.ulaval.ca
On 7/20/2005 12:52 PM, Vincent Goulet wrote:> We obtained some disturbing results from convolve() (inaccuracies and negative > probabilities). We'll try to make the context clear in as few lines as > possible... > > Our function panjer() (code below) basically computes recursively the > probability mass function of a compound Poisson distribution. When the > Poisson parameter lambda is very large, the starting value of the recursive > scheme --- the mass at 0 --- is 0 and the recursion fails. One way to solve > this problem is to divide lambda by 2^n, apply the panjer() function and then > convolve the result with itself n times. > > We applied the panjer() function with a lambda such that the mass at 0 is just > larger than .Machine$double.xmin. We thus know that once this pmf is > convoluted with itself, the first probabilities will be 0 (for the computer). > > Here are the two issues we have with convolve(): > > 1. The probabilities we know should be 0 are rather in the vicinity of 1E-19, > as if convolve() could not "go lower". Using a hand made convolution function > (not given here), we obtained the correct values. When probabilities get > around 1E-12, results from convolve() and our home made function are > essentially identical. > > 2. We obtained negative probabilities. More accurately, the same example > returns negative probabilities under Windows, but not under Linux. We also > obtained negative probabilities for another example under Linux, though. > > We understand that convolve() computes the convolutions using fft(), but we > are not familiar enough with the latter to assess if the above issues are > some sort of bugs or normal behavior. In the latter case, is there is any > workaround?Rounding will depend on the hardware and the compiler, so this might be normal behaviour. It's a little disturbing, but you shouldn't expect calculations to be accurate to more digits than your platform supports. Convolving using an FFT is essentially rotating the vectors in a high dimensional space, multiplying terms, and then rotating back. Unless those rotations are unrealistically accurate very small numbers won't necessarily show up as zeros. I'd suggest re-thinking your panjer function to work in log probabilities instead of probabilities, so that you can handle a larger dynamic range before you run into underflow problems, and you don't need to use convolve at all. Convert them back to probabilities at the very end. Duncan Murdoch
Check out sum.exact in the caTools package. On 7/20/05, Vincent Goulet <vincent.goulet at act.ulaval.ca> wrote:> We obtained some disturbing results from convolve() (inaccuracies and negative > probabilities). We'll try to make the context clear in as few lines as > possible... > > Our function panjer() (code below) basically computes recursively the > probability mass function of a compound Poisson distribution. When the > Poisson parameter lambda is very large, the starting value of the recursive > scheme --- the mass at 0 --- is 0 and the recursion fails. One way to solve > this problem is to divide lambda by 2^n, apply the panjer() function and then > convolve the result with itself n times. > > We applied the panjer() function with a lambda such that the mass at 0 is just > larger than .Machine$double.xmin. We thus know that once this pmf is > convoluted with itself, the first probabilities will be 0 (for the computer). > > Here are the two issues we have with convolve(): > > 1. The probabilities we know should be 0 are rather in the vicinity of 1E-19, > as if convolve() could not "go lower". Using a hand made convolution function > (not given here), we obtained the correct values. When probabilities get > around 1E-12, results from convolve() and our home made function are > essentially identical. > > 2. We obtained negative probabilities. More accurately, the same example > returns negative probabilities under Windows, but not under Linux. We also > obtained negative probabilities for another example under Linux, though. > > We understand that convolve() computes the convolutions using fft(), but we > are not familiar enough with the latter to assess if the above issues are > some sort of bugs or normal behavior. In the latter case, is there is any > workaround? > > Any help/comments/ideas appreciated. > > === EXAMPLES ==> panjer <- function(fx, lambda) > { > fx0 <- fx[1] > fx <- fx[-1] > r <- length(fx) > cumul <- fs <- exp(-lambda * (1 - fx0)) > repeat > { > x <- length(fs) > m <- min(x, r) > fs <- c(fs, sum((lambda * 1:m / x) * head(fx, m) * rev(tail(fs, m)))) > if ((1-1E-8) < (cumul <- cumul + tail(fs, 1))) > break > } > fs > } > > ### Linux example ### > > R.version > _ > platform i386-pc-linux-gnu > arch i386 > os linux-gnu > system i386, linux-gnu > status > major 2 > minor 1.0 > year 2005 > month 04 > day 18 > language R > > fs <- panjer(rep(0.2, 5), 600) # compute pmf > > fs[1:10] # small values > [1] 3.456597e-209 4.147916e-207 2.530229e-205 1.045690e-203 > [5] 3.292657e-202 8.422924e-201 1.822768e-199 3.431181e-198 > [9] 5.733481e-197 8.637025e-196 > > fsc <- convolve(fs, rev(fs), type="o") # convolution > > sum(fsc < 0) # no negatives under Linux > [1] 0 > > fsc[1:10] # these should be 0 > [1] 4.819870e-19 4.135471e-19 4.511455e-19 3.994485e-19 3.820177e-19 > [6] 4.738272e-19 3.885447e-19 3.167399e-19 3.695636e-19 3.701792e-19 > > sum(fsc) # no impact on the sum > [1] 1 > > ### Windows example ### > > R.version > _ > platform i386-pc-mingw32 > arch i386 > os mingw32 > system i386, mingw32 > status > major 2 > minor 1.1 > year 2005 > month 06 > day 20 > language R > > fs <- panjer(rep(0.2, 5), 600) # compute pmf > > fs[1:10] # small values > [1] 3.456597e-209 4.147916e-207 2.530229e-205 1.045690e-203 > [5] 3.292657e-202 8.422924e-201 1.822768e-199 3.431181e-198 > [9] 5.733481e-197 8.637025e-196 > > fsc <- convolve(fs, rev(fs), type="o") # convolution > > sum(fsc < 0) # negatives under Windows > [1] 39 > > fsc[1:10] # these should be 0 > [1] 4.049552e-19 3.345584e-19 4.126913e-19 3.351211e-19 2.758626e-19 > [6] 3.530111e-19 2.735041e-19 2.376711e-19 2.591287e-19 3.196405e-19 > > sum(fsc) # no impact on the sum > [1] 1 > > -- > Vincent Goulet, Associate Professor > ??cole d'actuariat > Universit?? Laval, Qu??bec > Vincent.Goulet at act.ulaval.ca http://vgoulet.act.ulaval.ca > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html >