Leeds, Mark (IED)
2007-May-21 03:03 UTC
[R] Sample correlation coefficient question NOT R question
This is a statistics question, not an R question. When calculating the sample correlation coefficient cor(x_t, y_t) between, say, two variables x_t and y_t, t = 1, ..., n (one can assume that the variables are indexed by time, but I don't think this really matters for the question), does someone know where I can find any piece of literature that says that each (x_j, y_j) pair has to be independent of the other (x_i, y_i) pairs (j not equal to i) in order for the calculation to have any reasonable meaning? This makes perfect sense to me, but I need it in official writing so I can show it to someone else, because I don't know how to explain it. Obviously, there may be some way to calculate the correlation coefficient when the (x_t, y_t) pairs aren't independent (maybe?), but I am referring to the very standard correlation calculation (Pearson, for example, or any other standard one). Thanks for any suggestions/references/insights etc.
Bruce Willy
2007-May-21 11:26 UTC
[R] Sample correlation coefficient question NOT R question
Hmm, I'm not too sure, but the correlation coefficient must not be confused with its empirical counterpart. The law of large numbers "tells" you, roughly, that you can approximate a mean by its empirical counterpart when the variables are independent and identically distributed. If they are not independent, the empirical counterpart could be a poor approximation. I have also learned that it's always better to use Spearman's rho or the other one based on ranks, rather than Pearson's correlation coefficient, which is only the best in the normal setting. There is a book by Lehmann about this ("Nonparametrics").
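To make the point about dependence concrete, here is a small simulation sketch. It is not from the original posts: the function names, the sample size, the number of replications and the choice of a random walk as the dependent case are all assumptions picked for illustration. In both cases the two series are generated independently of each other, so the population correlation between x_t and y_t is zero; the only difference is whether successive pairs are independent over t. Python's standard library is used to keep the example self-contained, and the same experiment could be run with R's cor().

```python
import random
import statistics


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def white_noise(n, rng):
    """n independent standard normal draws (observations independent over t)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def random_walk(n, rng):
    """Cumulative sum of standard normal steps (observations dependent over t)."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path


def spread_of_r(make_series, n=200, reps=500, seed=1):
    """Standard deviation of the sample correlation across repeated simulations.

    Each replication draws two series that are independent of each other,
    so a large spread reflects dependence over time, not a real relationship.
    """
    rng = random.Random(seed)
    rs = [pearson(make_series(n, rng), make_series(n, rng)) for _ in range(reps)]
    return statistics.pstdev(rs)


iid_spread = spread_of_r(white_noise)
walk_spread = spread_of_r(random_walk)
print(f"sd of r, independent pairs: {iid_spread:.3f}")
print(f"sd of r, random walks:      {walk_spread:.3f}")
```

With independent pairs, the sample correlations should cluster tightly around zero (a spread of roughly 1/sqrt(n)). With the random walks, they should scatter much more widely even though the two series are unrelated, which is the sense in which the empirical counterpart can be a poor approximation under dependence.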