Your function "jjj" is not vectorized.
Try this:
jjj <- function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value)
plot(jjj, 0, 5)
It should work.
Ravi.
----------------------------------------------------------------------------
-------
Ravi Varadhan, Ph.D.
Assistant Professor, The Center on Aging and Health
Division of Geriatric Medicine and Gerontology
Johns Hopkins University
Ph: (410) 502-2619
Fax: (410) 614-9625
Email: rvaradhan at jhmi.edu
Webpage: http://www.jhsph.edu/agingandhealth/People/Faculty/Varadhan.html
----------------------------------------------------------------------------
--------
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of theo borm
Sent: Friday, February 23, 2007 1:43 PM
To: r-help at stat.math.ethz.ch
Subject: [R] using "integrate" in a function definition
Dear list members,
I'm quite new to R, and though I tried to find the answer to my probably
very basic question through the available resources (website, mailing
list archives, docs, google), I've not found it.
If I try to use the "integrate" function from within my own functions,
my functions seem to misbehave in some contexts. The following example
is a bit silly, but illustrates what's happening:
The following behaves as expected:
> jjj<-function(www) {2*integrate(dnorm,0,www)$value}
> kkk<-function(www) {2*(pnorm(www)-0.5)}
> jjj(2)
[1] 0.9544997
> kkk(2)
[1] 0.9544997
However, if I do:
> plot(jjj,0,5)
Error in xy.coords(x, y, xlabel, ylabel, log) :
'x' and 'y' lengths differ
whereas
> plot(kkk,0,5)
produces a nice plot.
> xxx<-seq(0:5)
> yyy<-jjj(xxx)
> zzz<-kkk(xxx)
produces no errors, but:
> yyy
[1] 0.6826895
> zzz
[1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000
Why is this? Is this some R language feature that I've completely missed?
Ultimately my problem is that I have a mathematical function describing
a distribution, and I want to use this in a Kolmogorov-Smirnov test
where I need a cumulative distribution function. If I use the following
(synthetic) dataset with ks.test with either the "jjj" or
"kkk"
function, I get:
> ppp<-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853,
0.72023272,0.40245392,0.16002948,0.24203950,0.44029851,
0.34830446,1.66967152,1.71609574,1.17540830,0.22306379,
0.64382628,0.37382795,0.84361547,1.92138362,0.02844235,
0.11238473,0.21237557,1.01058073,2.33108243,1.36034473,
1.88951679,0.18230647,0.96571916,0.91239760,2.05574766,
0.33681130,2.76006257,0.83952491,1.24038831,1.46199671,
0.24694163,0.01565159,0.94816108,1.04642673,0.36556444,
2.20760578,1.59518590,0.83951030,0.07113652,0.97422886,
0.59835793,0.08383890,1.09180853,0.43508943,0.35368637,
0.75805978,0.12790161,1.56798563,1.68669770,0.56168021)
> ks.test(ppp,kkk)
One-sample Kolmogorov-Smirnov test
data: ppp
D = 0.1013, p-value = 0.5895
alternative hypothesis: two.sided
[ which seems correct as the samples come from abs(rnorm()) ]
> ks.test(ppp,jjj)
One-sample Kolmogorov-Smirnov test
data: ppp
D = 0.9875, p-value < 2.2e-16
alternative hypothesis: two.sided
[ which is *incorrect* ]
Note: This example is artificial as I have used a function for which I
know the integral; my real problem involves a distribution that I can
only integrate numerically.
How would I go about to solve this problem?
With kind regards,
Theo.
______________________________________________
R-help at stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
theo borm wrote:> >> jjj<-function(www) {2*integrate(dnorm,0,www)$value} >> kkk<-function(www) {2*(pnorm(www)-0.5)}>> xxx<-seq(0:5) >> yyy<-jjj(xxx) >> zzz<-kkk(xxx) > > produces no errors, but: >> yyy > [1] 0.6826895 >> zzz > [1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000 > > Why is this? Is this some R language feature that I've completely missed? >Yes. Some functions work on vectors (and matrices), so when you give a vector, it returns a vector. This is true for most common functions (sin, cos), arithmetic operations (with the caveat that different dimensions for the arguments may cause unexpected outcomes) and some internal functions (dnorm, pnorm). So, if you write sin(0:10) or dnorm((-3):3), you get a vector. Some other functions don't, and this is the case with integrate. For example: fff <- function(x) x integrate(fff, 0, 1) # ok integrate(fff, 0, 1:5) # will integrate from 0 to 1 and ignore 2:5 'plot' will probably fall into some code that uses this vector-in-vector-out hypothesis, and then fail when the size of x differs from the size of y. Alberto Monteiro PS: fff <- function(x) 1 integrate(fff, 0, 1) # error. why?
Dear list members,
I'm quite new to R, and though I tried to find the answer to my probably
very basic question through the available resources (website, mailing
list archives, docs, google), I've not found it.
If I try to use the "integrate" function from within my own functions,
my functions seem to misbehave in some contexts. The following example
is a bit silly, but illustrates what's happening:
The following behaves as expected:
> jjj<-function(www) {2*integrate(dnorm,0,www)$value}
> kkk<-function(www) {2*(pnorm(www)-0.5)}
> jjj(2)
[1] 0.9544997
> kkk(2)
[1] 0.9544997
However, if I do:
> plot(jjj,0,5)
Error in xy.coords(x, y, xlabel, ylabel, log) :
'x' and 'y' lengths differ
whereas
> plot(kkk,0,5)
produces a nice plot.
> xxx<-seq(0:5)
> yyy<-jjj(xxx)
> zzz<-kkk(xxx)
produces no errors, but:
> yyy
[1] 0.6826895
> zzz
[1] 0.6826895 0.9544997 0.9973002 0.9999367 0.9999994 1.0000000
Why is this? Is this some R language feature that I've completely missed?
Ultimately my problem is that I have a mathematical function describing
a distribution, and I want to use this in a Kolmogorov-Smirnov test
where I need a cumulative distribution function. If I use the following
(synthetic) dataset with ks.test with either the "jjj" or
"kkk"
function, I get:
> ppp<-c(1.74865955,1.12220426,0.24760427,0.24351439,0.10870853,
0.72023272,0.40245392,0.16002948,0.24203950,0.44029851,
0.34830446,1.66967152,1.71609574,1.17540830,0.22306379,
0.64382628,0.37382795,0.84361547,1.92138362,0.02844235,
0.11238473,0.21237557,1.01058073,2.33108243,1.36034473,
1.88951679,0.18230647,0.96571916,0.91239760,2.05574766,
0.33681130,2.76006257,0.83952491,1.24038831,1.46199671,
0.24694163,0.01565159,0.94816108,1.04642673,0.36556444,
2.20760578,1.59518590,0.83951030,0.07113652,0.97422886,
0.59835793,0.08383890,1.09180853,0.43508943,0.35368637,
0.75805978,0.12790161,1.56798563,1.68669770,0.56168021)
> ks.test(ppp,kkk)
One-sample Kolmogorov-Smirnov test
data: ppp
D = 0.1013, p-value = 0.5895
alternative hypothesis: two.sided
[ which seems correct as the samples come from abs(rnorm()) ]
> ks.test(ppp,jjj)
One-sample Kolmogorov-Smirnov test
data: ppp
D = 0.9875, p-value < 2.2e-16
alternative hypothesis: two.sided
[ which is *incorrect* ]
Note: This example is artificial as I have used a function for which I
know the integral; my real problem involves a distribution that I can
only integrate numerically.
How would I go about to solve this problem?
With kind regards,
Theo.
Ravi Varadhan wrote:>Your function "jjj" is not vectorized. > >Try this: > >jjj <- function(www) sapply(www, function(x)2*integrate(dnorm,0,x)$value) >plot(jjj, 0, 5) > >It should work. > >Yes it does. Thanks! Thinking of it, it now starts to make some sort of sense that integrate should return a "scalar"; the result is really a list of which I only used the "value" component. And now I also understand the "x,q: vector of quantiles." phrases in the documentation of dnorm. Now if I do something silly: > jjj<-function(www) {2*integrate(dnorm,0,www)$value+sin(www)-sin(www)} then > jjj(1:5) [1] 0.6826895 0.6826895 0.6826895 0.6826895 0.6826895 Evidently the "inherritance" of "being vectorized" (which was why my kkk function worked) could lead to some unexpected (and probably hard to debug) results :-/ Thanks. with kind regards, Theo.
Hi, Many thanks for the explanation. Alberto Monteiro wrote:> >PS: fff <- function(x) 1 >integrate(fff, 0, 1) # error. why? >Guess: because integrate itself expects a "vectorized function" ? > fff(1:5) [1] 1 > ggg<-function(x) { sapply(x, function(x)1) } > ggg(1:5) [1] 1 1 1 1 1 > integrate(ggg,0,1) 1 with absolute error < 1.1e-14 > hhh<-function(x) 1+0*x > integrate(hhh, 0, 1) 1 with absolute error < 1.1e-14 I sense a certain lack of intuitiveness here :-/ regards, Theo