thr3ads.net - R help - [R] How to Get Categorical Correlation Coefficient [Oct 2006]

Kum-Hoe Hwang

2006-Oct-12 08:08 UTC

[R] How to Get Categorical Correlation Coefficient

Howdy Gurus !

I get different correlation results from the same data. The
string variable "corridor1" holds the same categories that the
numeric variable "corridor2" expresses as numbers.
--------------------------------------------------------------------------
> levels(corridor1)
[1] "A"   "B"   "C"   "D"   "E"   "F"
> levels(as.factor(corridor2))
[1] "0" "1" "2" "3" "4"
--------------------------------------------------------------------------
I get the following correlation results using the cor() function.
--------------------------------------------------------------------------
> cor(jh1_1, as.factor(corridor1))
[1] 0.01528538
> cor(jh1_1, as.factor(corridor2))
[1] -0.4972571
--------------------------------------------------------------------------
I do not know why the above correlation coefficients, computed from the
same data, are different.
They are 0.015 from as.factor(corridor1) and -0.497 from as.factor(corridor2).
The string variable "corridor1" holds the same categorical data as the
variable corridor2.
The only difference is that "A" is replaced with "0", "B" with "1",
"C" with "2", and so on.

Could you tell me why they are different, and which correlation
coefficient is correct?

Thanks in advance,

-- 
Kum-Hoe Hwang, Ph.D.
Phone : 82-31-250-3516
Email : phdhwang at gmail.com

Peter Dalgaard

2006-Oct-12 08:25 UTC

[R] How to Get Categorical Correlation Coefficient

"Kum-Hoe Hwang" <phdhwang at gmail.com> writes:
> Howdy Gurus !
>
> I get different correlation results from the same data. The
> string variable "corridor1" holds the same categories that the
> numeric variable "corridor2" expresses as numbers.
> --------------------------------------------------------------------------
> > levels(corridor1)
> [1] "A"   "B"   "C"   "D"   "E"   "F"
> > levels(as.factor(corridor2))
> [1] "0" "1" "2" "3" "4"
> --------------------------------------------------------------------------
> I get the following correlation results using the cor() function.
> --------------------------------------------------------------------------
> > cor(jh1_1, as.factor(corridor1))
> [1] 0.01528538
> > cor(jh1_1, as.factor(corridor2))
> [1] -0.4972571
> --------------------------------------------------------------------------
> I do not know why the above correlation coefficients, computed from the
> same data, are different.
> They are 0.015 from as.factor(corridor1) and -0.497 from as.factor(corridor2).
> The string variable "corridor1" holds the same categorical data as the
> variable corridor2.
> The only difference is that "A" is replaced with "0", "B" with "1",
> "C" with "2", and so on.
>
> Could you tell me why they are different, and which correlation
> coefficient is correct?

One thing that strikes me is that corridor1 has 6 levels and corridor2
has 5...
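
To make the point about the level counts concrete, here is a minimal sketch with made-up data (the real corridor1, corridor2 and jh1_1 are not shown in the thread, so every value below is hypothetical). If the recoding really were "A" -> 0, "B" -> 1, and so on, both factors would have six levels and identical integer codes, and the two correlations would agree. As far as I know, recent versions of R also refuse factors in cor(), so the codes are taken explicitly.

```r
# Hypothetical data; the thread does not show the real values.
corridor1 <- c("A", "C", "B", "F", "D", "E", "A", "B")
jh1_1     <- c(2.1, 3.4, 2.9, 6.0, 4.2, 5.1, 1.8, 3.0)

# Faithful recoding as described in the question: "A" -> 0, "B" -> 1, ...
corridor2 <- match(corridor1, LETTERS) - 1

# Integer codes behind each factor (1-based, in level order).
code1 <- as.integer(as.factor(corridor1))
code2 <- as.integer(as.factor(corridor2))

# Same number of levels and the same codes, hence the same correlation.
print(nlevels(as.factor(corridor1)))
print(nlevels(as.factor(corridor2)))
r1 <- cor(jh1_1, code1)
r2 <- cor(jh1_1, code2)
print(c(r1, r2))
```

With a one-to-one recoding the two numbers printed at the end are equal, so 6 levels against 5 suggests the two variables do not actually hold the same data.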

In general, correlations are not expected to work on factors, so I'd be
explicit about taking as.numeric(). A glance at
table(corridor1, corridor2) should be informative too, as would a
summary(as.numeric(as.factor(corridor1)) - as.numeric(as.factor(corridor2)))
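
A rough sketch of these checks, again on invented data: here I assume, purely for illustration, that "E" and "F" were both coded as 4 in corridor2, which is one way to end up with 6 levels against 5. The actual cause in the poster's data is not known from the thread.

```r
# Hypothetical data: "E" and "F" are both coded 4 (an assumed mismatch).
corridor1 <- factor(c("A", "B", "C", "D", "E", "F", "A", "C"))
corridor2 <- c(0, 1, 2, 3, 4, 4, 0, 2)
jh1_1     <- c(10, 8, 7, 5, 3, 9, 11, 6)

# Cross-tabulation: a one-to-one recoding has exactly one non-zero
# cell in every row and every column.
print(table(corridor1, corridor2))

# Explicit integer codes, then their difference; any non-zero value
# marks an observation where the two codings disagree.
code1 <- as.numeric(as.factor(corridor1))
code2 <- as.numeric(as.factor(corridor2))
code_diff <- code1 - code2
print(summary(code_diff))

# The correlations differ because the codes differ.
r1 <- cor(jh1_1, code1)
r2 <- cor(jh1_1, code2)
print(c(r1, r2))
```

Even when the codes agree, a Pearson correlation on them treats the categories as equally spaced numbers in level order, so it is only meaningful if the categories are genuinely ordered.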

-- 
   O__  ---- Peter Dalgaard             Øster Farimagsgade 5, Entr.B
  c/ /'_ --- Dept. of Biostatistics     PO Box 2099, 1014 Cph. K
 (*) \(*) -- University of Copenhagen   Denmark          Ph:  (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)                  FAX: (+45) 35327907
