Your function "jjj" is not vectorized. Try this: jjj <- function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value) plot(jjj, 0, 5) It should work. Ravi. ---------------------------------------------------------------------------- ------- Ravi Varadhan, Ph.D. Assistant Professor, The Center on Aging and Health Division of Geriatric Medicine and Gerontology Johns Hopkins University Ph: (410) 502-2619 Fax: (410) 614-9625 Email: rvaradhan at jhmi.edu Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html ---------------------------------------------------------------------------- -------- -----Original Message----- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of theo borm Sent: Friday, February 23, 2007 1:43 PM To: r-help at stat.math.ethz.ch Subject: [R] using "integrate" in a function definition Dear list members, I'm quite new to R, and though I tried to find the answer to my probably very basic question through the available resources (website, mailing list archives, docs, google), I've not found it. If I try to use the "integrate" function from within my own functions, my functions seem to misbehave in some contexts. The following example is a bit silly, but illustrates what's happening: The following behaves as expected: > jjj<-function(www) {2*integrate(dnorm,0,www)$value} > kkk<-function(www) {2*(pnorm(www)-0.5)} > jjj(2) [1] 0.9544997 > kkk(2) [1] 0.9544997 However, if I do: > plot(jjj,0,5) Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ whereas > plot(kkk,0,5) produces a nice plot. > xxx<-seq(0:5) > yyy<-jjj(xxx) > zzz<-kkk(xxx) produces no errors, but: > yyy [1] 0.6826895 > zzz [1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000 Why is this? Is this some R language feature that I've completely missed? Ultimately my problem is that I have a mathematical function describing a distribution, and I want to use this in a Kolmogorov-Smirnov test where I need a cumulative distribution function. If I use the following (synthetic) dataset with ks.test with either the "jjj" or "kkk" function, I get: > ppp<-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853, 0.72023272,0.40245392,0.16002948,0.24203950,0.44029851, 0.34830446,1.66967152,1.71609574,1.17540830,0.22306379, 0.64382628,0.37382795,0.84361547,1.92138362,0.02844235, 0.11238473,0.21237557,1.01058073,2.33108243,1.36034473, 1.88951679,0.18230647,0.96571916,0.91239760,2.05574766, 0.33681130,2.76006257,0.83952491,1.24038831,1.46199671, 0.24694163,0.01565159,0.94816108,1.04642673,0.36556444, 2.20760578,1.59518590,0.83951030,0.07113652,0.97422886, 0.59835793,0.08383890,1.09180853,0.43508943,0.35368637, 0.75805978,0.12790161,1.56798563,1.68669770,0.56168021) > ks.test(ppp,kkk) One-sample Kolmogorov-Smirnov test data: ppp D = 0.1013, p-value = 0.5895 alternative hypothesis: two.sided [ which seems correct as the samples come from abs(rnorm()) ] > ks.test(ppp,jjj) One-sample Kolmogorov-Smirnov test data: ppp D = 0.9875, p-value < 2.2e-16 alternative hypothesis: two.sided [ which is *incorrect* ] Note: This example is artificial as I have used a function for which I know the integral; my real problem involves a distribution that I can only integrate numerically. How would I go about to solve this problem? With kind regards, Theo. ______________________________________________ R-help at stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
theo borm wrote:> >> jjj<-function(www) {2*integrate(dnorm,0,www)$value} >> kkk<-function(www) {2*(pnorm(www)-0.5)}>> xxx<-seq(0:5) >> yyy<-jjj(xxx) >> zzz<-kkk(xxx) > > produces no errors, but: >> yyy > [1] 0.6826895 >> zzz > [1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000 > > Why is this? Is this some R language feature that I've completely missed? >Yes. Some functions work on vectors (and matrices), so when you give a vector, it returns a vector. This is true for most common functions (sin, cos), arithmetic operations (with the caveat that different dimensions for the arguments may cause unexpected outcomes) and some internal functions (dnorm, pnorm). So, if you write sin(0:10) or dnorm((-3):3), you get a vector. Some other functions don't, and this is the case with integrate. For example: fff <- function(x) x integrate(fff, 0, 1) # ok integrate(fff, 0, 1:5) # will integrate from 0 to 1 and ignore 2:5 'plot' will probably fall into some code that uses this vector-in-vector-out hypothesis, and then fail when the size of x differs from the size of y. Alberto Monteiro PS: fff <- function(x) 1 integrate(fff, 0, 1) # error. why?
Dear list members, I'm quite new to R, and though I tried to find the answer to my probably very basic question through the available resources (website, mailing list archives, docs, google), I've not found it. If I try to use the "integrate" function from within my own functions, my functions seem to misbehave in some contexts. The following example is a bit silly, but illustrates what's happening: The following behaves as expected: > jjj<-function(www) {2*integrate(dnorm,0,www)$value} > kkk<-function(www) {2*(pnorm(www)-0.5)} > jjj(2) [1] 0.9544997 > kkk(2) [1] 0.9544997 However, if I do: > plot(jjj,0,5) Error in xy.coords(x, y, xlabel, ylabel, log) : 'x' and 'y' lengths differ whereas > plot(kkk,0,5) produces a nice plot. > xxx<-seq(0:5) > yyy<-jjj(xxx) > zzz<-kkk(xxx) produces no errors, but: > yyy [1] 0.6826895 > zzz [1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000 Why is this? Is this some R language feature that I've completely missed? Ultimately my problem is that I have a mathematical function describing a distribution, and I want to use this in a Kolmogorov-Smirnov test where I need a cumulative distribution function. If I use the following (synthetic) dataset with ks.test with either the "jjj" or "kkk" function, I get: > ppp<-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853, 0.72023272,0.40245392,0.16002948,0.24203950,0.44029851, 0.34830446,1.66967152,1.71609574,1.17540830,0.22306379, 0.64382628,0.37382795,0.84361547,1.92138362,0.02844235, 0.11238473,0.21237557,1.01058073,2.33108243,1.36034473, 1.88951679,0.18230647,0.96571916,0.91239760,2.05574766, 0.33681130,2.76006257,0.83952491,1.24038831,1.46199671, 0.24694163,0.01565159,0.94816108,1.04642673,0.36556444, 2.20760578,1.59518590,0.83951030,0.07113652,0.97422886, 0.59835793,0.08383890,1.09180853,0.43508943,0.35368637, 0.75805978,0.12790161,1.56798563,1.68669770,0.56168021) > ks.test(ppp,kkk) One-sample Kolmogorov-Smirnov test data: ppp D = 0.1013, p-value = 0.5895 alternative hypothesis: two.sided [ which seems correct as the samples come from abs(rnorm()) ] > ks.test(ppp,jjj) One-sample Kolmogorov-Smirnov test data: ppp D = 0.9875, p-value < 2.2e-16 alternative hypothesis: two.sided [ which is *incorrect* ] Note: This example is artificial as I have used a function for which I know the integral; my real problem involves a distribution that I can only integrate numerically. How would I go about to solve this problem? With kind regards, Theo.
Ravi Varadhan wrote:>Your function "jjj" is not vectorized. > >Try this: > >jjj <- function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value) >plot(jjj, 0, 5) > >It should work. > >Yes it does. Thanks! Thinking of it, it now starts to make some sort of sense that integrate should return a "scalar"; the result is really a list of which I only used the "value" component. And now I also understand the "x,q: vector of quantiles." phrases in the documentation of dnorm. Now if I do something silly: > jjj<-function(www) {2*integrate(dnorm,0,www)$value+sin(www)-sin(www)} then > jjj(1:5) [1] 0.6826895 0.6826895 0.6826895 0.6826895 0.6826895 Evidently the "inherritance" of "being vectorized" (which was why my kkk function worked) could lead to some unexpected (and probably hard to debug) results :-/ Thanks. with kind regards, Theo.
Hi, Many thanks for the explanation. Alberto Monteiro wrote:> >PS: fff <- function(x) 1 >integrate(fff, 0, 1) # error. why? >Guess: because integrate itself expects a "vectorized function" ? > fff(1:5) [1] 1 > ggg<-function(x) { sapply(x, function(x)1) } > ggg(1:5) [1] 1 1 1 1 1 > integrate(ggg,0,1) 1 with absolute error < 1.1e-14 > hhh<-function(x) 1+0*x > integrate(hhh, 0, 1) 1 with absolute error < 1.1e-14 I sense a certain lack of intuitiveness here :-/ regards, Theo