thr3ads.net - R help - [R] How to determine whether a value belong to a cumulative distribution? [May 2020]

Luigi Marongiu

2020-May-10 08:17 UTC

[R] How to determine whether a value belong to a cumulative distribution?

Hello,
I am trying to translate a mathematical formula into R. The formula (or
rather a set of formulas) is meant to determine the first outlier in a
sequence of measurements. To do this, a parameter r is calculated; this is
essentially the ratio between the variance of the value x and the sum of
the variances of the x-1 elements of the series. x follows a certain
distribution (namely, sigmoid), whereas r follows a cumulative empirical
one.
The text says:
"Each r is distributed as t under the model. Therefore, we can test the
hypothesis whether a single observation deviates from the model by
comparing r with the t distribution, where F(·) is the cumulative
distribution function of the t distribution:
                                P-value = 2 * [1 - F(1 - |r|)]
"
I generated a cumulative function with
```
cum_fun = ecdf(abs(x[1:n]))
```
which gives me:
```
> n=3
> cum_fun
Empirical CDF
Call: ecdf(abs(x[1:n]))
 x[1:3] = 5.5568, 6.5737, 7.2471
```
But now how can I determine if x belongs to the distribution?
If I do, as in the formula:
```
> p = 2 * (1-cum_fun)
Error in 1 - cum_fun : non-numeric argument to binary operator
```
Can I get a p-value associated with this association?
Thank you

-- 
Best regards,
Luigi

Ivan Krylov

2020-May-10 12:02 UTC

[R] How to determine whether a value belong to a cumulative distribution?

On Sun, 10 May 2020 10:17:47 +0200
Luigi Marongiu <marongiu.luigi at gmail.com> wrote:
>If I do, as in the formula:
>```
>> p = 2 * (1-cum_fun)  
>Error in 1 - cum_fun : non-numeric argument to binary operator
>```
The ecdf function returns another function that calculates the ECDF
value for an arbitrary input. For example,

e <- ecdf(1:10)
e
# Empirical CDF
# Call: ecdf(1:10)
#  x[1:10] =      1,      2,      3,  ...,      9,     10
e(c(-1, 5, 100)) # call the returned value as a function
# [1] 0.0 0.5 1.0

If you want to see the empirical distribution function values for the
points of the dataset itself, call the function returned by ecdf with
the same data again:

x <- 1:10
ecdf(x)(x)
# [1] 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1.0
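
As a cross-check on what the returned function computes: the ECDF at a point q is the share of observations less than or equal to q. A minimal sketch, using made-up data and a hypothetical helper named `manual_ecdf` (neither comes from the thread):

```r
# Made-up sample with a tie, for illustration only
x <- c(3, 1, 2, 2)

# Hypothetical helper: proportion of observations <= each query point
manual_ecdf <- function(q, data) {
  vapply(q, function(v) mean(data <= v), numeric(1))
}

q <- c(0, 1, 2, 2.5, 3, 10)
by_hand  <- manual_ecdf(q, x)   # computed from the definition
built_in <- ecdf(x)(q)          # computed by the function ecdf() returns

# The two should agree at every query point
all.equal(by_hand, built_in)
```

The helper is only there to make the definition visible; in practice the function returned by `ecdf` is the one to use.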

If you want to calculate the CDF for a given value of 1 - |r|, pass
this value as an argument to the function returned by ecdf:

cum_fun <- ecdf(abs(x[1:n]))
p <- 2 * (1 - cum_fun(1 - abs(r)))
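
To make the difference between the ECDF object and its values concrete, here is a small sketch. The three data values are the ones printed in the question; the query points in `q` are made up for illustration.

```r
# First three values as printed in the question; n matches the question
x <- c(5.5568, 6.5737, 7.2471)
n <- 3

cum_fun <- ecdf(abs(x[1:n]))

# cum_fun is a function object, so arithmetic on it directly fails
is.function(cum_fun)
failed <- inherits(try(1 - cum_fun, silent = TRUE), "try-error")

# Evaluate it at query points first (q is made up for illustration)
q <- c(5, 6, 7, 8)
F_q <- cum_fun(q)        # 0, 1/3, 2/3, 1
expr <- 2 * (1 - F_q)    # same arithmetic as in the formula, now on numbers
```

Note that with these data any argument of the form 1 - |r| is at most 1, which lies below every observation, so the ECDF evaluates to 0 there and the expression to 2. That may be a hint that the ECDF of the measurements is not the F the text has in mind.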

On the other hand, given the quotes from the text, I think that you
might need to use the theoretical t distribution function (available as
`pt` in R; `dt` is the density) in the formula instead of the ECDF:

df <- ... # degrees of freedom for Student t distribution
p <- 2 * (1 - pt(1 - abs(r), df))

I am not sure about that, though.
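
For reference, in R `pt` is the cumulative distribution function of Student's t and `dt` is its density. The sketch below shows the conventional two-sided p-value for a t statistic, which uses |r| itself as the argument; whether that matches the formula in the quoted text (which is written with 1 - |r|) is an assumption. The values of `r` and `df` are made up.

```r
# Hypothetical inputs, not taken from the thread
r  <- 2.5   # test statistic
df <- 10    # degrees of freedom

dens <- dt(r, df)   # density at r (not a probability)
cdf  <- pt(r, df)   # P(T <= r), the cumulative distribution function

# Conventional two-sided p-value for a t statistic
p_two_sided <- 2 * (1 - pt(abs(r), df))

# Equivalent forms that rely on symmetry or the upper tail directly;
# these tend to be more accurate when the p-value is very small
p_lower <- 2 * pt(-abs(r), df)
p_upper <- 2 * pt(abs(r), df, lower.tail = FALSE)
```

All three p-value expressions should agree up to floating-point error, and each lies between 0 and 1.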

-- 
Best regards,
Ivan