Hi:
On Sat, Sep 25, 2010 at 3:47 AM, Lorenzo Isella
<lorenzo.isella@gmail.com> wrote:
> Dear All,
> Suppose you are given two distributions (or better: two equally-sized lists
> of data); how can you evaluate the difference between them?
> I need something like an overlap measure of the two (let us say 0 == no
> overlap and 1== complete overlap). I should add that there is a 1-1
> correspondence of the data in the two distributions (they are ordered lists
> and e.g. the third element in the first distribution "matches" the third
> element in the second distribution).
>
To visualize the two distributions, you could do an empirical Q-Q plot
(qqplot(x, y)); if the distributions are identical, the points should lie on
the 45 degree line through the origin. A location shift shows up as a line
parallel to the 45 degree line, and a difference in scale as a slope
different from 1.
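A minimal sketch of the Q-Q plot idea, in case it helps. The vectors x and y
here are simulated stand-ins for your two lists (an assumption; substitute
your own data), and the lm() fit is only a rough numeric read-out of the
shift and slope, not a formal estimate:

```r
# Simulated stand-ins for the two equally sized lists (assumption).
set.seed(1)
x <- rnorm(200)
y <- rnorm(200, mean = 0.5, sd = 2)   # shifted and rescaled relative to x

# Empirical Q-Q plot: sorted values of x against sorted values of y.
qq <- qqplot(x, y, xlab = "quantiles of x", ylab = "quantiles of y")
abline(0, 1, lty = 2)                 # 45 degree reference line

# Rough read-out: intercept ~ location shift, slope ~ ratio of scales.
fit <- coef(lm(qq$y ~ qq$x))
print(fit)
```

With data like these, the points should fall near a straight line that is
steeper than the dashed reference line and offset from it.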
There are many tests of equality of two distributions, the simplest one
being the two-sample Kolmogorov-Smirnov test. It is sensitive to differences
in both location and scale of the two EDFs (empirical distribution
functions). An alternative is the two-sample Cramér-von Mises test. I have
no doubt there are others...
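For the Kolmogorov-Smirnov test, something along these lines should work.
Again x and y are simulated placeholders (an assumption), and the second half
just recomputes the statistic D by hand, as the largest vertical gap between
the two EDFs, to show what ks.test() is measuring:

```r
# Simulated placeholders for the two samples (assumption).
set.seed(1)
x <- rnorm(200)
y <- rnorm(200, mean = 0.5, sd = 2)

# Two-sample Kolmogorov-Smirnov test from the stats package.
ks <- ks.test(x, y)
print(ks)

# The statistic D is the maximum absolute difference between the two
# empirical distribution functions, evaluated at the pooled data points.
Fx <- ecdf(x)
Fy <- ecdf(y)
pooled <- sort(c(x, y))
D <- max(abs(Fx(pooled) - Fy(pooled)))
print(D)
```

D lies between 0 (identical EDFs) and 1 (no overlap between the samples at
all). Note that the test treats the two samples as independent, so it makes
no use of the element-by-element matching you mention.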
HTH,
Dennis
> The two distributions are not analytical (I mean they do not belong to any
> well known family).
> I wonder if mutual information could be what I am looking for.
> Cheers
>
> Lorenzo
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>