Sorry -- stupid typos in my definition below!
See at ===*** below.

On Tue, 2018-10-23 at 11:41 +0100, Ted Harding wrote:
Before the ticket finally enters the waste bin, I think it is
necessary to explain explicitly what is meant by the "domain"
of a random variable. This is not (though in special cases
could be) the space of possible values of the random variable.

Definition of (real-valued) Random Variable (RV):
Let Z be a probability space, i.e. a set {z} of entities z
on which a probability distribution is defined. The entities z
do not need to be numeric. A real-valued RV X is a function
X:Z --> R defined on Z such that, for any z in Z, X(z) is a
real number. The set Z, in this context, is (by definition)
the *domain* of X, i.e. the space on which X is defined.
It may or may not be (and usually is not) the same as the set
of possible values of X.

Then, given any real value x0, the CDF of X at x0 is Prob[X <= x0].
The distribution function of X does not define the domain of X.

As a simple example: Suppose Q is a cube of side A, consisting of
points z=(u,v,w) with 0 <= u,v,w <= A. Z is the probability space
of points z with a uniform distribution of position within Q.
Define the random variable X:Q --> [0,1] as
===***
  X(u,v,w) = x/A

Wrong! That should have been:

  X(u,v,w) = w/A
===***
Then X is uniformly distributed on [0,1], and the domain of X is Q.
Then for x <= 0 Prob[X <= x] = 0, for 0 <= x <= 1 Prob[X <= x] = x,
for x >= 1 Prob[X <= x] = 1. These define the CDF. The set of possible
values of X is 1-dimensional, and is not the same as the domain of X,
which is 3-dimensional.

Hoping this helps!
Ted.

On Tue, 2018-10-23 at 10:54 +0100, Hamed Ha wrote:
> > Yes, now it makes more sense.
> >
> > Okay, I think that I am convinced and we can close this ticket.
> >
> > Thanks Eric.
> > Regards,
> > Hamed.
> >
> > On Tue, 23 Oct 2018 at 10:42, Eric Berger <ericjberger at gmail.com> wrote:
> >
> > > Hi Hamed,
> > > That reference is sloppy. Try looking at
> > > https://en.wikipedia.org/wiki/Cumulative_distribution_function
> > > and in particular the first example which deals with a Unif[0,1] r.v.
> > >
> > > Best,
> > > Eric
> > >
> > >
> > > On Tue, Oct 23, 2018 at 12:35 PM Hamed Ha <hamedhaseli at gmail.com> wrote:
> > >
> > >> Hi Eric,
> > >>
> > >> Thank you for your reply.
> > >>
> > >> I should say that your justification makes sense to me. However, I
> > >> doubt that the CDF is defined by Pr(X <= x) for all x; that is, the
> > >> domain of the RV is totally ignored in the definition.
> > >>
> > >> This creates a conflict between the formula and the theoretical
> > >> definition.
> > >>
> > >> Please see page 115 in
> > >>
> > >> https://books.google.co.uk/books?id=FEE8D1tRl30C&printsec=frontcover&dq=statistical+distribution&hl=en&sa=X&ved=0ahUKEwjp3PGZmJzeAhUQqxoKHV7OBJgQ6AEIKTAA#v=onepage&q=uniform&f=false
> > >>
> > >> Thanks.
> > >> Hamed.
> > >>
> > >>
> > >> On Tue, 23 Oct 2018 at 10:21, Eric Berger <ericjberger at gmail.com> wrote:
> > >>
> > >>> Hi Hamed,
> > >>> I disagree with your criticism.
> > >>> For a random variable X
> > >>>     X: D --> R
> > >>> its CDF F is defined by
> > >>>     F: R --> [0,1]
> > >>>     F(z) = Prob(X <= z)
> > >>>
> > >>> The fact that you wrote a convenient formula for the CDF
> > >>>     F(z) = (z-a)/(b-a)    a <= z <= b
> > >>> in a particular range for z is your decision, and as you noted this
> > >>> formula will give the wrong value for z outside the interval [a,b].
> > >>> But the problem lies in your formula, not the definition of the CDF
> > >>> which would be, in your case:
> > >>>
> > >>>     F(z) = 0              if z <= a
> > >>>          = (z-a)/(b-a)    if a <= z <= b
> > >>>          = 1              if b <= z
> > >>>
> > >>> HTH,
> > >>> Eric
> > >>>
> > >>>
> > >>> On Tue, Oct 23, 2018 at 12:05 PM Hamed Ha <hamedhaseli at gmail.com> wrote:
> > >>>
> > >>>> Hi All,
> > >>>>
> > >>>> I recently discovered an interesting issue with the punif() function.
> > >>>> Let X ~ Uniform[a,b]; then the CDF is defined by
> > >>>> F(x) = (x-a)/(b-a) for (a <= x <= b).
> > >>>> The important fact here is the domain of the random variable X.
> > >>>> Having said that, R returns a CDF value for any value in the real
> > >>>> domain.
> > >>>>
> > >>>> I understand that one can justify this by extending the domain of X
> > >>>> and assigning zero probabilities to the values outside the domain.
> > >>>> However, theoretically, it is not correct to return a value for the
> > >>>> CDF outside the domain. So I propose a patch to the R function
> > >>>> punif() to return an error in these situations.
> > >>>>
> > >>>> Example:
> > >>>> > punif(10^10)
> > >>>> [1] 1
> > >>>>
> > >>>>
> > >>>> Regards,
> > >>>> Hamed.
> > >>>>
> > >>>> ______________________________________________
> > >>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> > >>>> https://stat.ethz.ch/mailman/listinfo/r-help
> > >>>> PLEASE do read the posting guide
> > >>>> http://www.R-project.org/posting-guide.html
> > >>>> and provide commented, minimal, self-contained, reproducible code.
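
For anyone who wants to check the piecewise definition against R itself, here is a minimal sketch. It writes out the three-piece CDF of Uniform[a,b] by hand and compares it with punif() at points inside and outside [a,b]. The helper name cdf_unif and the chosen test points are made up for illustration; they are not part of R or of the messages above.

```r
# Three-piece CDF of Uniform[a, b], written out by hand.
# 'cdf_unif' is a hypothetical helper name, not an R function.
cdf_unif <- function(z, a = 0, b = 1) {
  ifelse(z < a, 0, ifelse(z > b, 1, (z - a) / (b - a)))
}

# Points below, inside and far above [0, 1].
z <- c(-5, 0, 0.25, 1, 2, 10^10)
by_hand  <- cdf_unif(z)
built_in <- punif(z)                 # defaults: min = 0, max = 1
print(cbind(z, by_hand, built_in))

# Same comparison for a different interval, here [2, 5].
z2 <- c(1, 2, 3.5, 5, 7)
by_hand2  <- cdf_unif(z2, a = 2, b = 5)
built_in2 <- punif(z2, min = 2, max = 5)
print(cbind(z2, by_hand2, built_in2))
```

The two columns should agree everywhere, which is just the statement that punif() evaluates Prob(X <= z) for every real z rather than only the middle piece of the formula.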
Hi Ted,

Thanks for the explanation.

I am largely convinced by Eric's answer and yours, but I still have
some lingering confusion, which is surely because I have forgotten
some fundamentals of probability.

In your cube example, the cumulative probability of reaching a point
outside the cube (u or v or w > A) is 1; however, the bigger cube does not
exists (because the Q is the reference space). In other words, I feel that
we extend the space to accommodate any cube of any size! It looks a bit
weird to me!


Hamed.

On Tue, 23 Oct 2018 at 11:52, Ted Harding <ted.harding at wlandres.net> wrote:

> Sorry -- stupid typos in my definition below!
> [...]
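
The cube example can also be tried out by simulation. The sketch below is only illustrative: it assumes A = 3 and 100000 simulated points (both arbitrary choices), draws points uniformly in Q, and looks at X = w/A. All variable names are made up for this example.

```r
set.seed(1)
A <- 3        # side of the cube Q (arbitrary choice)
n <- 1e5      # number of simulated points (arbitrary choice)

# Points z = (u, v, w) uniformly distributed in Q = [0, A]^3.
u <- runif(n, 0, A)
v <- runif(n, 0, A)
w <- runif(n, 0, A)

# The random variable X(u, v, w) = w / A takes values in [0, 1].
X <- w / A

# Fraction of simulated points with some coordinate beyond A.
frac_outside <- mean(u > A | v > A | w > A)

# Empirical Prob[X <= x] at a few x, including x outside [0, 1].
x <- c(-1, 0.3, 0.7, 2)
emp_cdf <- sapply(x, function(xi) mean(X <= xi))

print(frac_outside)
print(rbind(x, emp_cdf, punif = punif(x)))
```

The points live in three dimensions while X is a single number, and asking for Prob[X <= 2] is a perfectly good question about X even though no simulated point has a coordinate beyond A.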
Well, as a final (I hope!) clarification:

It is not the case that "the bigger cube does not exists
(because the Q is the reference space)". It does exist!
Simply, the probability of the random point being in the bigger
cube, and NOT in the cube Q, is 0.

Hence "the cumulative probability of reaching a point outside the
cube (u or v or w > A) is 1" is badly phrased. The "cumulative
probability" is not the probability of *reaching* a point, but of
being (in the case of a real random variable) less than or equal
to the given value.

If Prob[X <= x1] = 1, then Prob[X > x1] = 0.
Hence if x0 is the minimum value such that Prob[X <= x0] = 1,
then X "can reach" x0. But for any x1 > x0, Prob[x0 < X <= x1] = 0.
Therefore, since X cannot be greater than x0, X *cannot reach* x1!

Best wishes,
Ted.

On Tue, 2018-10-23 at 12:06 +0100, Hamed Ha wrote:
> Hi Ted,
> [...]
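
The last point can be checked numerically for Uniform[0,1], where x0 = 1. This is a rough sketch only; the value of x1, the seed and the sample size are arbitrary, and the variable names are made up for illustration.

```r
x0 <- 1          # smallest x with Prob[X <= x] = 1 for Uniform[0, 1]
x1 <- 10^10      # any value beyond x0

# Prob[x0 < X <= x1] as a difference of CDF values.
p_between <- punif(x1) - punif(x0)

# Prob[X > x1], using the upper tail directly.
p_above <- punif(x1, lower.tail = FALSE)

print(c(p_between = p_between, p_above = p_above))

# Simulated draws never land beyond x0 either.
set.seed(42)
draws <- runif(1e5)
n_beyond <- sum(draws > x0)
print(n_beyond)
```

So punif(10^10) returning 1 does not say that X can be as large as 10^10; it says that all of the probability lies at or below that value.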