Stathis Kamperis
2013-Aug-08 10:43 UTC
[R] Varying statistical significance in estimates of linear model
Hi everyone,

I have a response variable 'y' and several predictor variables 'x_i'. I start with a linear model:

    m1 <- lm(y ~ x1); summary(m1)

and I get a statistically significant estimate for 'x1'. Then I extend the model:

    m2 <- lm(y ~ x1 + x2); summary(m2)

At that point the estimate for x1 may become non-significant while the estimate for x2 becomes significant.

As I add more predictor variables (or interaction terms), the set of estimates that come out statistically significant keeps changing: sometimes x1, x2 and x6 are significant, at other times x2, x4 and x5 are.

It seems to me that I could tweak the model (by adding/removing predictor variables or "suitable" interaction terms) so as to "prove" whatever I'd like to prove.

What is the proper methodology here? What do you do in such cases? I can provide the data if anyone would like to have a look at them.

Best regards,
Stathis Kamperis
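For illustration, here is a minimal, self-contained sketch (simulated data, not the data set mentioned above) of how this behaviour can arise when two predictors are correlated:

    ## Simulated example: y depends only on x2, but x1 is strongly
    ## correlated with x2, so x1 looks significant on its own.
    set.seed(42)
    n  <- 100
    x2 <- rnorm(n)
    x1 <- x2 + rnorm(n, sd = 0.3)   # x1 carries almost the same information as x2
    y  <- 2 * x2 + rnorm(n)         # the true model involves x2 only

    m1 <- lm(y ~ x1);      summary(m1)  # x1 appears highly significant
    m2 <- lm(y ~ x1 + x2); summary(m2)  # x1 typically loses significance once x2 enters
    cor(x1, x2)                          # large correlation between the predictors

Running both summaries shows the pattern described above: a predictor that is significant in the single-variable model can lose its significance once a correlated predictor is added.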
ONKELINX, Thierry
2013-Aug-08 13:25 UTC
[R] Varying statistical significance in estimates of linear model
Dear Stathis,

I recommend that you try to get some advice from a local statistician or read an introductory book on statistics. This kind of question is beyond the scope of a mailing list.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25, 1070 Anderlecht, Belgium
Thierry.Onkelinx at inbo.be
www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner

The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey
Bert Gunter
2013-Aug-08 13:29 UTC
[R] Varying statistical significance in estimates of linear model
Stathis:

1. This has nothing to do with R. Post on a statistics list, like stats.stackexchange.com.

2. Read a basic regression/linear models text. You need to educate yourself.

-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
Stathis Kamperis
2013-Aug-09 20:20 UTC
[R] Varying statistical significance in estimates of linear model
For archiving reasons:

1. "Practical Regression and Anova using R" by Faraway
2. Possible reason: multicollinearity in the predictor variables.

Thanks everybody!
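As a concrete way to check for that, one possible sketch (using the add-on package 'car', which is one common option, and the simulated x1/x2/y example from the first post):

    ## Diagnose multicollinearity among the predictors.
    ## Requires: install.packages("car")
    library(car)

    cor(cbind(x1, x2))      # pairwise correlations between predictors
    vif(lm(y ~ x1 + x2))    # variance inflation factors; values much larger
                            # than about 5-10 usually signal collinearity

Faraway's text (item 1 above) discusses collinearity diagnostics of this kind in more detail.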