Leeds, Mark (IED)
2007-May-21 03:03 UTC
[R] Sample correlation coefficient question NOT R question
This is a statistics question, not an R question. When calculating the
sample correlation coefficient cor(x_t,y_t) between, say,
two variables x_t and y_t, t=1,...,n (one can assume that the
variables are indexed by time, but I don't think this really matters
for the question), does someone know where I can find any piece of
literature that says that each (x_j,y_j) pair has
to be independent of the other (x_i,y_i) pairs (j not equal to i)
in order for the calculation to have any reasonable meaning? This
makes perfect sense to me, but I need it in official writing so I can show
it to someone else, because I don't know how to explain it.
Obviously, there may be some way to calculate the correlation
coefficient when the (x_t,y_t) pairs aren't independent (maybe?), but
I am referring to the very standard correlation calculation (Pearson,
for example, or any other standard one).
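For reference, the standard Pearson calculation referred to above is usually written as follows. The symbols \bar{x} and \bar{y} (the sample means) and r are notation introduced here for illustration and do not appear in the original post.

```latex
r = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(y_t - \bar{y})}
         {\sqrt{\sum_{t=1}^{n} (x_t - \bar{x})^2}\,
          \sqrt{\sum_{t=1}^{n} (y_t - \bar{y})^2}},
\qquad
\bar{x} = \frac{1}{n}\sum_{t=1}^{n} x_t,
\quad
\bar{y} = \frac{1}{n}\sum_{t=1}^{n} y_t
```

The formula itself can be evaluated for any n pairs. Independence between pairs only enters when r is interpreted as an estimate of a population correlation.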
Thanks for any suggestions/references/insights etc.
Bruce Willy
2007-May-21 11:26 UTC
[R] Sample correlation coefficient question NOT R question
Hmm, I'm not too sure,
but the correlation coefficient must not be confused with its empirical counterpart.
The law of large numbers "tells" you, roughly, that you can approximate a
mean by its empirical counterpart when the variables are independent and
identically distributed.
If they are not independent, the empirical counterpart may be a poor
approximation.
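A small simulation may make this point concrete. The sketch below is only an illustration under assumptions that are not in the thread: it uses independent random walks as one example of non-independent observations, and the function names, sample size, and seed are all hypothetical choices. In both scenarios x and y are generated independently of each other, so any large sample correlation is an artifact of the dependence over time.

```python
import math
import random


def pearson(x, y):
    """Standard sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def iid_series(n, rng):
    """n independent standard normal draws."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(n, rng):
    """Cumulative sum of independent normal steps (dependent over time)."""
    total, path = 0.0, []
    for _ in range(n):
        total += rng.gauss(0.0, 1.0)
        path.append(total)
    return path


def spread_of_r(make_series, n=200, reps=500, seed=1):
    """Standard deviation of the sample correlation across simulated datasets.

    x and y are always generated independently of each other.
    """
    rng = random.Random(seed)
    rs = [pearson(make_series(n, rng), make_series(n, rng)) for _ in range(reps)]
    mean_r = sum(rs) / reps
    return math.sqrt(sum((r - mean_r) ** 2 for r in rs) / (reps - 1))


sd_iid = spread_of_r(iid_series)
sd_walk = spread_of_r(random_walk)
print("sd of r, independent pairs:", round(sd_iid, 3))
print("sd of r, random-walk pairs:", round(sd_walk, 3))
```

With independent pairs the sample correlations should cluster tightly around zero, while with the random walks they should scatter much more widely, which is one way to see why the standard calculation can mislead when the pairs are dependent.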
Also, I have learned that it is always better to use Spearman's rho or the
other rank-based coefficient rather than Pearson's correlation coefficient, which is
only the best in the normal setting. There is a book by Lehmann about this
("Nonparametrics").
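To show what "based on ranks" means in practice, here is a minimal sketch of Spearman's rho computed as the Pearson correlation of the ranks. The helper names and the example data are hypothetical, and ties are handled with average ranks, which is one common convention rather than the only one. Note that using ranks does not by itself deal with dependence between the pairs.

```python
import math


def pearson(x, y):
    """Standard sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative data: y increases with x, but not linearly.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [math.exp(v) for v in x]

print("Pearson: ", pearson(x, y))
print("Spearman:", spearman(x, y))
```

Because y is a strictly increasing function of x, the ranks agree exactly and Spearman's rho is 1, while Pearson's coefficient is below 1 because the relationship is not linear.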