thr3ads.net - R help - [R] relationship between two discrete variables [Nov 2003]

Paul Sorenson

2003-Nov-10 23:10 UTC

[R] relationship between two discrete variables

I want to investigate possible relationships between two discrete variables.  I
have tried a few things but figured you guys might be able to point me at some
purpose built functions.

Our scientists score results of tests which are performed in, let's say, 8
positions.  The scores are assigned a value of 1, 2, 3 or 4.  I want to know if
there is a correlation between the test results and the position.  The
scientists have a feeling that position 1 does not score as high as the others.

Not all 8 positions are always used, so the frequency of all test results can be
substantially biased towards the first position.  Here is an example dataset
(not very biased) resulting from table(result, position):

     1  2  3  4  5  6  7  8
  0  3  3  2  2  0  3  3  0
  1 11  4  6  7  7  3  3  5
  2 38 37 32 38 31 21 23 27
  3 51 66 54 66 57 37 58 56
  4  3  1  3  0  1  0  1  1

Because the test results are highly quantized, the boxplots I tried all looked
pretty much the same.

The bias means that stacked barplots aren't that useful for visualising the
data.  With a bit of data processing I guess I could normalise the total
frequencies of each test position.
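
The normalisation mentioned above could be sketched in R roughly as follows.
This is only an illustration: the counts are retyped from the table in this
post, and the object names (counts, props) are made up for the example.
prop.table() with margin = 2 rescales each position's column to sum to 1, so
positions with different totals can be compared in a stacked barplot.

```r
# Counts retyped from the table(result, position) output above.
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = 0:4, position = 1:8))

# Proportion of each score within each position (columns sum to 1).
props <- prop.table(counts, margin = 2)
print(round(props, 3))

# Stacked barplot of the within-position proportions.
barplot(props, legend.text = TRUE,
        xlab = "position", ylab = "proportion of results")
```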

I also tried a correlation between the two variables.  The answer is non-zero
but I am not sure that any relationship between the two variables would be
monotonic (BTW cor() gives me the correlation coefficient; how do I get the
"confidence" of the coefficient?)
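
On the "confidence" question: cor.test() in the stats package reports a
p-value and, for the Pearson coefficient, a confidence interval.  A minimal
sketch, assuming the table above is expanded back to one row per test (the
names counts, long and obs are invented here).  With this many ties a
rank-based measure such as Kendall's tau may be easier to defend than
Pearson's r, and neither will pick up a non-monotonic pattern.

```r
# Counts retyped from the table(result, position) output above.
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = 0:4, position = 1:8))

# Expand the table to one row per observed test.
long <- as.data.frame(as.table(counts), stringsAsFactors = FALSE)
obs  <- long[rep(seq_len(nrow(long)), long$Freq), ]
result   <- as.numeric(obs$result)
position <- as.numeric(obs$position)

# Pearson correlation with a confidence interval and p-value.
pearson <- cor.test(result, position)
print(pearson$estimate)
print(pearson$conf.int)

# Kendall's tau; exact = FALSE because the data are heavily tied.
kendall <- cor.test(result, position, method = "kendall", exact = FALSE)
print(kendall$estimate)
print(kendall$p.value)
```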

Maybe I am overlooking the obvious, like just averaging the scores.
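
Averaging the scores per position can be done directly from the table.  A
rough sketch, treating the row labels as numeric scores (an assumption; the
names counts, scores and mean_score are made up here):

```r
# Counts retyped from the table(result, position) output above.
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = 0:4, position = 1:8))

# Row labels as numeric scores; each row of counts is weighted by its score.
scores <- as.numeric(rownames(counts))
mean_score <- colSums(counts * scores) / colSums(counts)
print(round(mean_score, 3))
```

A mean treats the gaps between scores as equal, which may or may not suit an
ordinal scale.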

cheers

Thomas W Blackwell

2003-Nov-11 03:48 UTC

[R] relationship between two discrete variables

Paul  -

This situation seems like an obvious candidate for a log-linear model.
See the book MASS for details.  They're beyond the scope of this list.
Or try  help.search("log-linear").
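
As a starting point, the independence log-linear model for a two-way table
can be fitted in base R as a Poisson glm().  This is only a sketch, not
necessarily the model the MASS book would recommend for these data; the
counts are retyped from Paul's table and the names counts, long and indep are
invented for the example.  Several cells are small, so the chi-squared
approximation should be read with some caution; chisq.test() with a simulated
p-value is shown as a cross-check.

```r
# Counts retyped from the table(result, position) output in the question.
counts <- matrix(
  c( 3,  3,  2,  2,  0,  3,  3,  0,
    11,  4,  6,  7,  7,  3,  3,  5,
    38, 37, 32, 38, 31, 21, 23, 27,
    51, 66, 54, 66, 57, 37, 58, 56,
     3,  1,  3,  0,  1,  0,  1,  1),
  nrow = 5, byrow = TRUE,
  dimnames = list(result = 0:4, position = 1:8))

# One row per cell; result and position become factors.
long <- as.data.frame(as.table(counts))

# Independence model: no result-by-position interaction.
indep <- glm(Freq ~ result + position, family = poisson, data = long)

# Residual deviance is the likelihood-ratio statistic for independence.
g2 <- deviance(indep)
df <- df.residual(indep)
print(c(G2 = g2, df = df, p = pchisq(g2, df, lower.tail = FALSE)))

# Pearson residuals arranged like the table, to see which cells depart most.
res <- matrix(residuals(indep, type = "pearson"), nrow = 5,
              dimnames = dimnames(counts))
print(round(res, 2))

# Cross-check with a Monte Carlo p-value, given the small cells.
set.seed(1)
print(chisq.test(counts, simulate.p.value = TRUE, B = 2000))
```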

(and ... can you find a way to break lines when sending your email ?)

-  tom blackwell  -  u michigan medical school  -  ann arbor  -

Simon Fear

2003-Nov-11 09:44 UTC

[R] relationship between two discrete variables

Don't use correlation for discrete variables; the 
correlation coefficient does not vary freely between -1 
and 1, it is tightly constrained by the joint probabilities of 
nonzero outcomes in a rather counterintuitive way.  
Please ask me off list if you don't follow this.
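
One way to see the constraint, in the simplest case of two 0/1 variables
(this is an illustration added for concreteness, not necessarily the argument
intended above): if P(X=1) = p and P(Y=1) = q, the joint probability
P(X=1, Y=1) cannot exceed min(p, q), which caps the correlation below 1
unless p = q.  The function name max_cor is made up for this sketch.

```r
# Largest attainable Pearson correlation between two 0/1 variables
# with marginal probabilities p and q.
max_cor <- function(p, q) {
  p11 <- min(p, q)  # upper bound on P(X = 1, Y = 1)
  (p11 - p * q) / sqrt(p * (1 - p) * q * (1 - q))
}

print(max_cor(0.5, 0.5))  # equal marginals: the bound is 1
print(max_cor(0.1, 0.5))  # unequal marginals: the bound is about 1/3
```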

While I'm off-topic, following Tom B's comment, the only
way I know to break lines in emails using Explorer is to
actually type returns, which is what I always do here, and it
is a pain in the ****. If anyone knows how to overcome this
I'd be very grateful. [I don't have a choice about using
Explorer]
 
Simon Fear 
Senior Statistician 
Syne qua non Ltd 
Tel: +44 (0) 1379 644449 
Fax: +44 (0) 1379 644445 
email: Simon.Fear at synequanon.com 
web: http://www.synequanon.com 
  

Paul Sorenson

2003-Nov-18 03:54 UTC

[R] RE: relationship between two discrete variables

Further to my queries re relating discrete variables I have had a couple of
tips on things I could try.  This has led me to attempt a "marginal
homogeneity" test
(http://ourworld.compuserve.com/homepages/jsuebersax/margin.htm).

    o  Does anyone have an opinion on whether this approach would be
appropriate?

    o Does R have some built in help to do this?  I found a reference to
the McNemar test but not to the Stuart-Maxwell test.
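
For reference, the Stuart-Maxwell statistic is short enough to write by hand.
A rough sketch, assuming a square k x k table of paired ratings (the same
items classified twice); the function name stuart_maxwell and the 3 x 3
counts below are invented for illustration and are not the result-by-position
data from this thread.  For a 2 x 2 table the statistic reduces to McNemar's
test without continuity correction, which gives a quick check against
mcnemar.test().

```r
# Stuart-Maxwell test of marginal homogeneity for a square table.
stuart_maxwell <- function(tab) {
  k <- nrow(tab)
  stopifnot(k == ncol(tab), k >= 2)
  keep <- seq_len(k - 1)                      # drop one redundant category
  d <- (rowSums(tab) - colSums(tab))[keep]    # differences in marginal totals
  S <- -(tab + t(tab))                        # off-diagonal covariance terms
  diag(S) <- rowSums(tab) + colSums(tab) - 2 * diag(tab)
  S <- S[keep, keep, drop = FALSE]
  stat <- as.numeric(t(d) %*% solve(S, d))
  c(statistic = stat, df = k - 1,
    p.value = pchisq(stat, k - 1, lower.tail = FALSE))
}

# Hypothetical paired ratings (rows: first rating, columns: second rating).
paired <- matrix(c(20,  5,  3,
                    8, 30,  6,
                    2,  4, 22), nrow = 3, byrow = TRUE)
print(stuart_maxwell(paired))

# Check against McNemar's test on a hypothetical 2 x 2 table.
two_by_two <- matrix(c(10, 5, 3, 12), nrow = 2, byrow = TRUE)
print(stuart_maxwell(two_by_two))
print(mcnemar.test(two_by_two, correct = FALSE))
```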

cheers

