Leeds, Mark (IED)
2006-Nov-15 16:32 UTC
[R] Correlations not in sync with r2 from regression
I have two variables: minutereturnsa, which can be thought of as my
independent variable, and minutereturnsb, which can be thought of as my
dependent variable. When I run correlations on the two variables, I get
values between -.15 and -.19, depending on which of the three methods
I use.

Then, when I do a regression, I get an R-squared of .004, which is more
in line with my intuition. In other words, I think the cor function is
doing something very different from what a regression calculates. In
fact, when I use my full data set (not included here), I get
correlations at the level of -0.97, which is extremely unreasonable
given the two variables. So is cor calculating something other than
what I am used to? Or maybe the scale of my variables is too small and
this is causing something to go wrong?

Has anyone else had this experience, where the value from cor is
extremely high and yet the x-y plot of the data and the regression
itself do not reflect this? All the code is below the two data sets in
case anyone wants to run it to see better what I mean. It's probably
just a misunderstanding on my part of what the cor function is actually
doing. Thanks a lot.
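
On the scale question: Pearson correlation is unchanged when either variable is multiplied by a positive constant, so very small return magnitudes should not, by themselves, distort cor(). A minimal sketch on simulated data (x, y, the seed, and the 1e4 factor are made up for illustration and are not taken from the post):

```r
# Simulated stand-ins for two series of tiny per-minute returns.
# These are NOT the poster's data; every value here is hypothetical.
set.seed(1)
x <- rnorm(100, sd = 1e-4)
y <- -0.2 * x + rnorm(100, sd = 1e-4)

# Correlation on the raw scale and after rescaling both series.
r_small <- cor(x, y)
r_big   <- cor(1e4 * x, 1e4 * y)

# The two values agree up to floating-point tolerance.
print(all.equal(r_small, r_big))
```

The same holds for the R-squared of a simple regression, so rescaling is unlikely to explain a gap between the two numbers.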
minutereturnsa<-c(-2.36264407318387e-05, -0.000114546483004574,
0.000480296012887571,
-3.4370702667097e-05, -1.75028713567116e-05, -4.48227082969765e-05,
2.90329205787643e-05, 0.000305825164510942, -5.03948020931233e-05,
-0.000132337254829196, 0.000257366609910825, -0.000143416497692783,
7.75575880389567e-05, -0.000390396184700492, 5.8463592766067e-05,
-0.000166182789493874, 2.60897827946138e-05, -9.68285203182262e-05,
0.000306300707090479, -0.000212593666131689, -2.05973305682505e-05,
-0.000892262006425781, -7.65296399478643e-05, 9.29686476904834e-05,
0.000400462742000229, -5.68981524482481e-05, 8.75374496889236e-06,
-0.000325482754985451, -0.00026561900794686, 6.6048490682924e-05,
5.22638320941127e-05, -9.67113649918971e-05, 2.9231547120645e-05,
-5.85151902221526e-06, -6.50840736984293e-05, 3.34530686902923e-05,
9.92970200188736e-05, 4.57716366808469e-05, 3.98160204646558e-05,
-6.00275310205234e-05, -0.000182345705006526, -8.2237112582817e-05,
8.49939625151563e-05, -3.54341054409346e-05, -0.000100100119395208,
-0.000480874457688962, -5.96482127885878e-05, -1.57319826001867e-05,
-0.000144679725631036, -0.000135114371429879, -4.7961402529495e-06,
-0.000169457716609145, 3.99966377280236e-05, -5.124402520984e-05,
0.000328975614625193, -0.00025080739315797, -0.000573125487459691,
-7.7472898995623e-05, 3.44346931751005e-05, -3.06202186477478e-05,
0.000370039560674940, 7.8350343693856e-05, -3.16439540668512e-05,
-0.000160178561449342, -0.000396591758462961, -0.000210859243796158,
0.000388855276420408, -0.000179700371241154, 4.16133481957459e-05,
1.41192273312996e-05, 7.08468899466297e-05, -1.52706151546056e-06,
3.67659444577839e-05, -0.000234283509586319, 0.000137243567309930,
-5.20968533468391e-05, -0.000134271000559849, 0.00015686727434705,
-1.20143299762177e-05, 0.000101875337767510, 3.65842929905824e-06,
4.69929868991414e-05, 5.7532628616741e-05, -4.97275753463811e-06,
-0.000170415848516292, 3.72182099566132e-05, -3.63157298233219e-05,
6.5485377211516e-05, 1.70517614943577e-05, -0.00044266660425496,
0.000117663889794173, -0.000156675474467072, -9.45186652945296e-06,
0.000228093804488516, -0.000183465434343333, 8.7116036074697e-05,
-0.000105286831582063, 3.77385685590426e-05, 0.000229830364734340,
7.83212236301623e-05)
minutereturnsb<-c(8.4645336092315e-05, 0.000342545113518611,
0.000619138432391253,
2.39224895137724e-05, -0.000105430737374235, -4.64094877949961e-05,
0.000232692295488945, 0.000170343242044346, -0.000278101637244177,
-2.39061360733928e-05, -0.000308615318143524, 0.00088876749136002,
8.32424077339411e-05, -0.00055346752090557, 8.45076374274e-05,
-0.00054551438959205, 0.000259390072846699, 0.000144324272249641,
0.000180899432469239, 0.000288664295990948, 9.95638801164489e-05,
0.00097571036608901, -4.60863646098986e-05, -0.00047897519405371,
5.14178471453519e-05, 2.4281687986516e-05, 0.00036058584985188,
0.000134937727959361, 0.000103280488242596, 2.88151877292364e-05,
-0.000310955520890666, 0.000106216873712484, 5.52294891624783e-05,
5.71829552473702e-05, 0, -4.24836809553852e-06, -0.000112243149114732,
-0.000225054512974054, -0.000127369605538163, -4.718171803475e-06,
0.000575331719490535, 0.000414750691947852, -2.48462938587934e-05,
-0.000508280783423132, 0.000246095358950704, -0.000407474448815393,
0.000288693409606466, -7.07108562671976e-05, -0.000794312866452707,
-0.000106260214363552, 0.00028805175686486, 7.4386971687268e-05,
-0.000298442739602223, 0.000194096767056173, -0.000298344525328176,
0.000220745065718120, 0.000709521706545146, -0.000217011729104044,
-1.82252827203300e-05, -0.000385731348574225, 0.000332978442621368,
-8.95786863042147e-05, 0.000104547275809885, -0.000295648677070659,
-9.6397420641381e-06, 0.000224064465289331, -0.000739358267871637,
6.56478242859748e-05, 0.000815010958709728, -0.000107869047081266,
-0.000304049331305123, 5.20322126815742e-05, -5.32009410303402e-05,
-0.000167938227511044, -9.21891011866904e-05, -0.000266690878483189,
3.64952404163787e-05, -4.46965899776330e-05, 4.25776509977993e-05,
-7.80437967993208e-05, -2.12827088637013e-05, -7.09320471283803e-05,
0.000148863850106373, -6.3806029690916e-05, 4.25559078296445e-05,
-9.58063435172463e-05, 5.3209320157066e-05, -0.000546163217071793,
-0.000351060356512889, 0.000162064277177798, -0.000384674967309095,
-0.000199174847720585, -0.000216944813291597, 0.000101829147203247,
0.000105031807862588, -1.94459082898391e-05, 0.000271353813222852,
-2.75323237151071e-05, 0.000581315536959615, -0.000348709449612628)
#--------------------------------------------------------------------------------
# CALCULATE VARIOUS CORRELATIONS
corpearson <- cor(minutereturnsa, minutereturnsb, method = "pearson",
                  use = "pairwise.complete.obs")
print(corpearson)
corkendall <- cor(minutereturnsa, minutereturnsb, method = "kendall",
                  use = "pairwise.complete.obs")
print(corkendall)
corspearman <- cor(minutereturnsa, minutereturnsb, method = "spearman",
                   use = "pairwise.complete.obs")
print(corspearman)
# EXCLUDE POSSIBLE NAs
# DO A REGRESSION
# PLOT FITTED VERSUS ACTUAL
options(na.action = na.exclude)
returnreg <- lm(minutereturnsb ~ minutereturnsa)
regsumm <- summary(returnreg)
print(regsumm)
plot(minutereturnsa, minutereturnsb)
lines(minutereturnsa, fitted(returnreg))
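
When cor() and lm() appear to disagree, one thing worth checking is whether both are working from the same rows. A minimal sketch, assuming simulated data with a few NAs injected (the vectors xs and ys and the NA positions are hypothetical, not the series above): with use = "pairwise.complete.obs" in cor() and na.action = na.exclude in lm(), both drop the incomplete pairs, so the squared Pearson correlation and the regression R-squared should match.

```r
# Simulated data with missing values; all values are hypothetical.
set.seed(2)
xs <- rnorm(50)
ys <- 0.3 * xs + rnorm(50)
xs[c(3, 17)] <- NA   # inject NAs into the predictor
ys[c(8, 17)] <- NA   # inject NAs into the response

# cor() on the pairs where both values are present.
r_pairwise <- cor(xs, ys, method = "pearson", use = "pairwise.complete.obs")

# lm() with incomplete rows excluded from the fit.
fit_na <- lm(ys ~ xs, na.action = na.exclude)
r2_fit <- summary(fit_na)$r.squared

# Number of rows each calculation actually used.
n_complete <- sum(complete.cases(xs, ys))
n_fit      <- nobs(fit_na)

print(c(r_squared_from_cor = r_pairwise^2, r_squared_from_lm = r2_fit))
print(c(n_complete = n_complete, n_fit = n_fit))
```

If the row counts differ in a real analysis, the two statistics are being computed on different subsets and need not agree.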
Leeds, Mark (IED)
2006-Nov-15 16:54 UTC
[R] Correlations not in sync with r2 from regression
Oops, I forgot to square! Thanks, Chuck. I would have spent the rest of
the day and then some trying to figure that one out!

-----Original Message-----
From: Chuck Cleland [mailto:ccleland at optonline.net]
Sent: Wednesday, November 15, 2006 11:46 AM
To: Leeds, Mark (IED)
Subject: Re: [R] Correlations not in sync with r2 from regression

Mark:

cor() and R^2 are in sync for me on the data you provided:

> summary(lm(minutereturnsa ~ minutereturnsb))$r.squared
[1] 0.03640094
> summary(lm(minutereturnsa ~ minutereturnsb))$r.squared^.5
[1] 0.1907903
> cor(minutereturnsa, minutereturnsb)
[1] -0.1907903
> cor(minutereturnsa, minutereturnsb)^2
[1] 0.03640094

I suspect you are doing different things with missing values in cor()
versus lm().

hope this helps,

Chuck

--
Chuck Cleland, Ph.D.
NDRI, Inc.
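
The resolution in this thread comes down to comparing like with like: for a regression with a single predictor, R-squared equals the square of the Pearson correlation, whichever variable is treated as the response, and the sign of the correlation matches the sign of the slope. A minimal sketch on simulated data (a, b, and the seed are hypothetical stand-ins, not the series posted above):

```r
# Simulated stand-ins for the two return series; values are hypothetical.
set.seed(42)
a <- rnorm(100)
b <- -0.2 * a + rnorm(100)

# Pearson correlation (the default method of cor()).
r <- cor(a, b)

# Simple regressions in both directions.
fit_ba <- lm(b ~ a)
fit_ab <- lm(a ~ b)
r2_ba <- summary(fit_ba)$r.squared
r2_ab <- summary(fit_ab)$r.squared

# Recover the signed correlation from R-squared and the slope's sign.
r_from_fit <- sign(coef(fit_ba)[["a"]]) * sqrt(r2_ba)

print(c(r = r, r_squared = r^2, r2_ba = r2_ba, r2_ab = r2_ab,
        r_from_fit = r_from_fit))
```

This identity applies to the Pearson coefficient only; the Kendall and Spearman values are rank-based and need not square to the regression R-squared.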