thr3ads.net - R help - [R] linear models and colinear variables... [Jun 2004]

Peter Gaffney

2004-Jun-30 23:32 UTC

[R] linear models and colinear variables...

Hi!

I'm having some issues on both conceptual and
technical levels for selecting the right combination
of variables for this model I'm working on. The basic,
all inclusive form looks like

lm(mic ~ B * D * S * U * V * ICU)

Where mic, U, V, and ICU are numeric values and B D
and S are factors with about 16, 16 and 2 levels
respectively. In short, there's a ton of actual
explanatory variables that look something like this:

Bstaph.aureus:Dvan:Sr:U:ICU

There are a good number of hits but there's also a
staggering number of complete misses, due to a
combination of scarce data in that particular niche and
actual lack of deviation from the categorical mean.
My suspicion is that there's a large degree of
colinearity in some of these variables that serves to
reduce the total effect of either of a nearly colinear
pair to an insignificant level; my hope is that
removing one of a mostly colinear group would allow
the other variables' possibly significant effects to
be measured.

Question 1) Is this legitimate at all? Can I do
regression using the entire data set over only
selected factors while ignoring others?
(Admittedly I only just got my Bachelor's in math; the
gaps in my knowledge here are profound and
aggravating.)

Question 2) How do I go about selecting possible
colinear explanatory variables?
I had originally thought I'd just make a matrix of
coefficients of colinearity for each pair of variables
and iteratively re-run the model until I got the
results I wanted, but I can't really figure out how to
do this.  In addition, I'm not sure how to do this in
the model syntax once I've actually decided on some
variables to exclude.
For instance, supposing I wanted to run the model as
above without the variable
Bstaph.aureus:Dvan:Sr:U:ICU.  What I tried was

lm(mic ~ B * D * S * U * V * ICU -
Bstaph.aureus:Dvan:Sr:U:ICU).

Obviously this doesn't work because the variable name
Bstaph.aureus:Dvan:Sr:U:ICU hasn't been recognized
yet.  How do I do this?  My best guess so far is to
build and define each of the variables like
Bstaph.aureus:Dvan:Sr:U:ICU by hand with some
imperative/iterative style programming using some kind
of string generation system.  This sounds like a royal
pain, and is something I'd rather avoid doing if at
all possible.

Any suggestions? :-D

-petertgaffney

Jonathan Baron

2004-Jun-30 23:47 UTC

[R] linear models and colinear variables...

On 06/30/04 16:32, Peter Gaffney wrote:
>Hi!
>
>I'm having some issues on both conceptual and
>technical levels for selecting the right combination
>of variables for this model I'm working on. The basic,
>all inclusive form looks like
>
>lm(mic ~ B * D * S * U * V * ICU)
When you do this, you are including all the interaction terms.
In a model formula, * expands to the main effects plus every
interaction among them (B * D means B + D + B:D), whereas + gives
main effects only.  That might make sense under some circumstances,
for example if you are just trying to get the best model and you
plan to eliminate higher-order interactions that are not
significant, but usually it does more to obscure the interesting
effects than to display them.
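To see how quickly * multiplies terms, here is a small sketch that only
expands the formulas and counts terms. No data are needed, and the helper
name n_terms is made up for this illustration. The counts are terms, not
coefficients: with 16-level factors each interaction term expands into
many coefficients.

```r
# Three candidate formulas built from the variables named in the post.
full <- mic ~ B * D * S * U * V * ICU       # main effects and all interactions
main <- mic ~ B + D + S + U + V + ICU       # main effects only
two  <- mic ~ (B + D + S + U + V + ICU)^2   # main effects plus two-way interactions

# terms() expands a formula symbolically, so no data frame is required.
# n_terms is a hypothetical helper, not part of base R.
n_terms <- function(f) length(attr(terms(f), "term.labels"))

n_terms(full)  # 63 terms (2^6 - 1)
n_terms(main)  # 6 terms
n_terms(two)   # 21 terms (6 main effects + 15 pairs)
```

The ^2 form is one way to cap the interaction order rather than choosing
between everything and nothing.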
>My suspicion is that there's a large degree of
>colinearity in some of these variables that serves to
>reduce the total effect of either of a nearly colinear
>pair to an insignificant level; my hope is that
>removing one of a mostly colinear group would allow
>the other variables' possibly significant effects to
>be measured.
There may be colinearity, but the most likely problem is that you
are including too many interactions, at too high a level.
Inclusion of nonsignificant interaction terms often turns
significant main effects into nonsignificant effects.
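One way to check whether the interactions earn their place is to compare
nested fits. The sketch below uses simulated data; the data frame d, its
factor levels, and the effect sizes are all invented for illustration and
are not the poster's data.

```r
set.seed(1)
n <- 200

# Simulated stand-in data: one 3-level factor, one 2-level factor, one numeric.
d <- data.frame(
  B = factor(sample(c("b1", "b2", "b3"), n, replace = TRUE)),
  S = factor(sample(c("r", "s"), n, replace = TRUE)),
  U = rnorm(n)
)
# The response depends on main effects only (by construction).
d$mic <- 1 + 0.5 * d$U + ifelse(d$S == "r", 1, 0) + rnorm(n)

fit_main <- lm(mic ~ B + S + U, data = d)   # main effects only
fit_full <- lm(mic ~ B * S * U, data = d)   # all interactions

# Remove one whole term (the three-way interaction) from the full model.
fit_no3 <- update(fit_full, . ~ . - B:S:U)

# F tests for the nested sequence of models.
anova(fit_main, fit_no3, fit_full)
```

Note that the formula's minus operator removes whole terms such as B:S:U;
it does not take the names of individual coefficients.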
>Question 1) Is this legitimate at all? Can I do
>regression using the entire data set over only
>selected factors while ignoring others?
>(Admittedly I only just got my Bachelor's in math; the
>gaps in my knowledge here are profound and
>aggravating.)
If you select predictors on the basis of which ones are
significant, then the final significance levels don't mean much,
usually.  Remember, about 1 out of 20 predictors will be
significant at the .05 level even if you are using random numbers.
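The 1-in-20 point can be checked with a short simulation. This is a
sketch with arbitrary settings (n, p, reps and the helper count_sig are
all assumptions for illustration): the response and every predictor are
pure noise, so any "significant" coefficient is a false positive.

```r
set.seed(42)
n    <- 100   # observations per simulated data set
p    <- 20    # pure-noise predictors per fit
reps <- 200   # number of simulated data sets

# Fit y ~ x on unrelated random numbers and count p-values below 0.05.
# count_sig is a hypothetical helper defined only for this sketch.
count_sig <- function() {
  x <- matrix(rnorm(n * p), n, p)
  y <- rnorm(n)
  pvals <- summary(lm(y ~ x))$coefficients[-1, 4]  # drop the intercept row
  sum(pvals < 0.05)
}

hits <- replicate(reps, count_sig())

# Share of noise predictors flagged as significant; should be close to 0.05.
mean(hits) / p
```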
>Question 2) How do I go about selecting possible
>colinear explanatory variables?
If there is colinearity, then what to do about it depends on the
substance of the questions you are asking.  Some options are to
combine variables, do some sort of factor analysis and use
factors rather than variables as predictors, use the most
meaningful of the variables that are colinear, or just live with
it, if the substantive issues rule out the other options.  (I'm
sure there are other solutions that others might point out.)
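For the numeric predictors, a pairwise correlation matrix is a simple
first look, and principal components are one way to replace colinear
variables with combined scores. The sketch below swaps in principal
components (prcomp) for the factor analysis mentioned above, and it uses
simulated values of U, V and ICU in which V is deliberately made nearly
colinear with U.

```r
set.seed(7)
n <- 150

# Simulated numeric predictors; V is built to be nearly colinear with U.
U   <- rnorm(n)
V   <- U + rnorm(n, sd = 0.1)
ICU <- rnorm(n)
num <- data.frame(U, V, ICU)

# Pairwise correlations: values near +1 or -1 flag colinear pairs.
round(cor(num), 2)

# Principal components on the standardized variables.
pc <- prcomp(num, scale. = TRUE)
summary(pc)   # proportion of variance carried by each component

# Component scores that could stand in for U, V and ICU as predictors.
scores <- as.data.frame(pc$x[, 1:2])
head(scores)
```

A correlation matrix only detects pairwise colinearity among numeric
variables; it says nothing about the factors B, D and S or about
near-dependence among three or more variables.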
>I had originally thought I'd just make a matrix of
>coefficients of colinearity for each pair of variables
>and iteratively re-run the model until I got the
>results I wanted, but I can't really figure out how to
>do this.  In addition, I'm not sure how to do this in
>the model syntax once I've actually decided on some
>variables to exclude.
>For instance, supposing I wanted to run the model as
>above without the variable
>Bstaph.aureus:Dvan:Sr:U:ICU.  What I tried was
>
>lm(mic ~ B * D * S * U * V * ICU -
>Bstaph.aureus:Dvan:Sr:U:ICU).
>
>Obviously this doesn't work because the variable name
>Bstaph.aureus:Dvan:Sr:U:ICU hasn't been recognized
>yet.  How do I do this?  My best guess so far is to
Not clear what you mean here.
>build and define each of the variables like
>Bstaph.aureus:Dvan:Sr:U:ICU by hand with some
>imperative/iterative style programming using some kind
>of string generation system.  This sounds like a royal
>pain, and is something I'd rather avoid doing if at
>all possible.
>
>Any suggestions? :-D
>
>-petertgaffney
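If the goal really is to drop a single coefficient such as
Bstaph.aureus:Dvan:Sr:U:ICU, one possible route is to build the design
matrix with model.matrix(), delete the unwanted column by name, and fit
on the remaining columns. Names like that are column names of the design
matrix, not variables, which is presumably why the formula did not accept
them. The sketch uses a reduced, simulated data set (the data frame d,
its levels and the name unwanted are assumptions), and dropping one
column of an interaction while keeping the rest makes the fit harder to
interpret.

```r
set.seed(3)
n <- 120

# Simulated stand-in data; "s" is set as the reference level of S so that
# the generated column is named "Sr", as in the post.
d <- data.frame(
  B = factor(sample(c("e.coli", "staph.aureus"), n, replace = TRUE)),
  S = factor(sample(c("r", "s"), n, replace = TRUE), levels = c("s", "r")),
  U = rnorm(n)
)
d$mic <- rnorm(n)

# Build the full design matrix and inspect the generated column names.
X <- model.matrix(mic ~ B * S * U, data = d)
colnames(X)

# Remove one column by name (a hypothetical choice for illustration).
unwanted <- "Bstaph.aureus:Sr:U"
X2 <- X[, colnames(X) != unwanted]

# X2 already holds the intercept column, so suppress lm's own intercept.
fit <- lm(d$mic ~ X2 - 1)
summary(fit)
```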
Jon
-- 
Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page:            http://www.sas.upenn.edu/~baron
R search page:        http://finzi.psych.upenn.edu/
