Leeds, Mark (IED)
2006-Nov-15 16:32 UTC
[R] Correlations not in sync with r2 from regression
I have two variables: minutereturnsa, which can be thought of as my independent variable, and minutereturnsb, which can be thought of as my dependent variable. When I run correlations on the two variables, I get values between -.15 and -.19, depending on which of the three methods I use.

Then, when I do a regression, I get an R-squared of .004, which is more in line with my intuition. In other words, I think the cor function is doing something very different from what a regression calculates. In fact, when I use my full data set (not included here), I get correlations at the level of -0.97, which is extremely unreasonable given the two variables. So I think cor is calculating something other than what I am used to? Or maybe the scale of my variables is too small, and this could be causing something to go wrong?

Has anyone else had this experience, where the value from cor is extremely high and yet the x-y plot of the data and the regression itself do not reflect it? All the code is below the two data sets in case anyone wants to run it to see better what I mean. It's probably just a lack of understanding on my part of what the cor function is actually doing. Thanks a lot.

minutereturnsa<-c(-2.36264407318387e-05, -0.000114546483004574, 0.000480296012887571, -3.4370702667097e-05, -1.75028713567116e-05, -4.48227082969765e-05, 2.90329205787643e-05, 0.000305825164510942, -5.03948020931233e-05, -0.000132337254829196, 0.000257366609910825, -0.000143416497692783, 7.75575880389567e-05, -0.000390396184700492, 5.8463592766067e-05, -0.000166182789493874, 2.60897827946138e-05, -9.68285203182262e-05, 0.000306300707090479, -0.000212593666131689, -2.05973305682505e-05, -0.000892262006425781, -7.65296399478643e-05, 9.29686476904834e-05, 0.000400462742000229, -5.68981524482481e-05, 8.75374496889236e-06, -0.000325482754985451, -0.00026561900794686, 6.6048490682924e-05, 5.22638320941127e-05, -9.67113649918971e-05, 2.9231547120645e-05, -5.85151902221526e-06, -6.50840736984293e-05, 3.34530686902923e-05, 9.92970200188736e-05, 4.57716366808469e-05, 3.98160204646558e-05, -6.00275310205234e-05, -0.000182345705006526, -8.2237112582817e-05, 8.49939625151563e-05, -3.54341054409346e-05, -0.000100100119395208, -0.000480874457688962, -5.96482127885878e-05, -1.57319826001867e-05, -0.000144679725631036, -0.000135114371429879, -4.7961402529495e-06, -0.000169457716609145, 3.99966377280236e-05, -5.124402520984e-05, 0.000328975614625193, -0.00025080739315797, -0.000573125487459691, -7.7472898995623e-05, 3.44346931751005e-05, -3.06202186477478e-05, 0.000370039560674940, 7.8350343693856e-05, -3.16439540668512e-05, -0.000160178561449342, -0.000396591758462961, -0.000210859243796158, 0.000388855276420408, -0.000179700371241154, 4.16133481957459e-05, 1.41192273312996e-05, 7.08468899466297e-05, -1.52706151546056e-06, 3.67659444577839e-05, -0.000234283509586319, 0.000137243567309930, -5.20968533468391e-05, -0.000134271000559849, 0.00015686727434705, -1.20143299762177e-05, 0.000101875337767510, 3.65842929905824e-06, 4.69929868991414e-05, 5.7532628616741e-05, -4.97275753463811e-06, -0.000170415848516292, 3.72182099566132e-05, -3.63157298233219e-05, 6.5485377211516e-05, 
1.70517614943577e-05, -0.00044266660425496, 0.000117663889794173, -0.000156675474467072, -9.45186652945296e-06, 0.000228093804488516, -0.000183465434343333, 8.7116036074697e-05, -0.000105286831582063, 3.77385685590426e-05, 0.000229830364734340, 7.83212236301623e-05)

minutereturnsb<-c(8.4645336092315e-05, 0.000342545113518611, 0.000619138432391253, 2.39224895137724e-05, -0.000105430737374235, -4.64094877949961e-05, 0.000232692295488945, 0.000170343242044346, -0.000278101637244177, -2.39061360733928e-05, -0.000308615318143524, 0.00088876749136002, 8.32424077339411e-05, -0.00055346752090557, 8.45076374274e-05, -0.00054551438959205, 0.000259390072846699, 0.000144324272249641, 0.000180899432469239, 0.000288664295990948, 9.95638801164489e-05, 0.00097571036608901, -4.60863646098986e-05, -0.00047897519405371, 5.14178471453519e-05, 2.4281687986516e-05, 0.00036058584985188, 0.000134937727959361, 0.000103280488242596, 2.88151877292364e-05, -0.000310955520890666, 0.000106216873712484, 5.52294891624783e-05, 5.71829552473702e-05, 0, -4.24836809553852e-06, -0.000112243149114732, -0.000225054512974054, -0.000127369605538163, -4.718171803475e-06, 0.000575331719490535, 0.000414750691947852, -2.48462938587934e-05, -0.000508280783423132, 0.000246095358950704, -0.000407474448815393, 0.000288693409606466, -7.07108562671976e-05, -0.000794312866452707, -0.000106260214363552, 0.00028805175686486, 7.4386971687268e-05, -0.000298442739602223, 0.000194096767056173, -0.000298344525328176, 0.000220745065718120, 0.000709521706545146, -0.000217011729104044, -1.82252827203300e-05, -0.000385731348574225, 0.000332978442621368, -8.95786863042147e-05, 0.000104547275809885, -0.000295648677070659, -9.6397420641381e-06, 0.000224064465289331, -0.000739358267871637, 6.56478242859748e-05, 0.000815010958709728, -0.000107869047081266, -0.000304049331305123, 5.20322126815742e-05, -5.32009410303402e-05, -0.000167938227511044, -9.21891011866904e-05, -0.000266690878483189, 3.64952404163787e-05,
-4.46965899776330e-05, 4.25776509977993e-05, -7.80437967993208e-05, -2.12827088637013e-05, -7.09320471283803e-05, 0.000148863850106373, -6.3806029690916e-05, 4.25559078296445e-05, -9.58063435172463e-05, 5.3209320157066e-05, -0.000546163217071793, -0.000351060356512889, 0.000162064277177798, -0.000384674967309095, -0.000199174847720585, -0.000216944813291597, 0.000101829147203247, 0.000105031807862588, -1.94459082898391e-05, 0.000271353813222852, -2.75323237151071e-05, 0.000581315536959615, -0.000348709449612628)

#--------------------------------------------------------------------------------

# CALCULATE VARIOUS CORRELATIONS

corpearson<-cor(minutereturnsa,minutereturnsb,method="pearson",use="pairwise.complete.obs")
print(corpearson)

corkendall<-cor(minutereturnsa,minutereturnsb,method="kendall",use="pairwise.complete.obs")
print(corkendall)

corspearman<-cor(minutereturnsa,minutereturnsb,method="spearman",use="pairwise.complete.obs")
print(corspearman)

# EXCLUDE POSSIBLE NAs
# DO A REGRESSION
# PLOT FITTED VERSUS ACTUAL

options(na.action=na.exclude)
returnreg<-lm(minutereturnsb ~ minutereturnsa)
regsumm<-summary(returnreg)
print(regsumm)

plot(minutereturnsa,minutereturnsb)
lines(minutereturnsa,fitted(returnreg))

--------------------------------------------------------
This is not an offer (or solicitation of an offer) to buy/se...{{dropped}}
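
On the question of whether the small scale of the returns could be throwing cor off: a quick way to check is to rescale both series and compare. The sketch below uses simulated returns, not the posted data. The names a, b, raw and scaled and the simulation parameters are made up for illustration. All three coefficients should be unaffected by multiplying the series by a positive constant, so the magnitude of the values on its own would not be expected to change the result.

```r
# Hypothetical returns on roughly the same scale as the posted series
set.seed(1)
a <- rnorm(100, sd = 2e-4)
b <- -0.2 * a + rnorm(100, sd = 3e-4)

methods <- c("pearson", "kendall", "spearman")

# Correlations on the original scale and after multiplying both series by 10,000
raw    <- sapply(methods, function(m) cor(a, b, method = m))
scaled <- sapply(methods, function(m) cor(1e4 * a, 1e4 * b, method = m))

print(rbind(raw, scaled))

# All three coefficients are unchanged by positive rescaling
stopifnot(isTRUE(all.equal(raw, scaled)))
```

Kendall and Spearman depend only on ranks, and Pearson divides out the standard deviations, which is why the two rows agree.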
Leeds, Mark (IED)
2006-Nov-15 16:54 UTC
[R] Correlations not in sync with r2 from regression
Oops, I forgot to square!!! Thanks, Chuck. I would have spent the rest of the day and then some trying to figure that one out!

-----Original Message-----
From: Chuck Cleland [mailto:ccleland at optonline.net]
Sent: Wednesday, November 15, 2006 11:46 AM
To: Leeds, Mark (IED)
Subject: Re: [R] Correlations not in sync with r2 from regression

Mark:

cor() and R^2 are in sync for me on the data you provided:

> summary(lm(minutereturnsa ~ minutereturnsb))$r.squared
[1] 0.03640094
> summary(lm(minutereturnsa ~ minutereturnsb))$r.squared^.5
[1] 0.1907903
> cor(minutereturnsa, minutereturnsb)
[1] -0.1907903
> cor(minutereturnsa, minutereturnsb)^2
[1] 0.03640094

I suspect you are doing different things with missing values in cor() versus lm().

hope this helps,

Chuck

--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
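
One way to follow up on the missing-value suspicion is to confirm that cor() and lm() are working from the same rows. The sketch below is only an illustration on simulated data with a few NAs planted by hand. The vectors x and y, the NA positions and the other variable names are assumptions, not part of the original post. It compares the squared pairwise-complete correlation with the regression R-squared and checks how many observations lm() actually used.

```r
# Hypothetical data with a few missing values (not the poster's full data set)
set.seed(42)
x <- rnorm(200, sd = 2e-4)
y <- 0.3 * x + rnorm(200, sd = 3e-4)
x[c(5, 17)] <- NA
y[c(17, 60, 120)] <- NA

# cor() with pairwise deletion uses only rows where both x and y are present
r <- cor(x, y, use = "pairwise.complete.obs")

# lm() with na.exclude also fits on complete rows only
fit <- lm(y ~ x, na.action = na.exclude)
r2  <- summary(fit)$r.squared

n_complete <- sum(complete.cases(x, y))
print(c(n_complete = n_complete, n_used_by_lm = nobs(fit)))
print(c(r_squared = r^2, lm_r_squared = r2))

stopifnot(nobs(fit) == n_complete)
stopifnot(isTRUE(all.equal(r^2, r2)))

# na.exclude pads fitted values back to the original length
stopifnot(length(fitted(fit)) == length(x))
```

If the two numbers disagree on real data, comparing nobs(fit) with the count of complete cases is a reasonable first place to look, since it would show whether the two calls dropped different rows.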