My question is about the "Signif. codes" and the p-value, specifically, the output when I run summary(nameofregression.lm). So you get this little key:

Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

And on a regression I ran, next to the intercept data, I get '***':

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  7.95652    0.59993  13.262   <2e-16 ***
day.f2      -0.04348    0.84843  -0.051    0.959
day.f3      -0.13043    0.84843  -0.154    0.878
day.f4      -0.21739    0.84843  -0.256    0.798
day.f5       0.02174    0.84843   0.026    0.980
day.f6      -0.15217    0.84843  -0.179    0.858
day.f7       0.14986    0.84390   0.178    0.859

Does this mean that these numbers have a 0% chance of being wrong? Is there a way to change this to the .05 level of significance?

Thanks,
John
There will always be uncertainty in your estimates, so you don't have a 0 percent chance of being wrong. But remember that the '***' is on your intercept; your regressors are not significant. As for the .05 level: if a coefficient is significant at .000000000000001 (or something like that), that is also less than .05, so it's not unethical to say p < .05. It sounds like you need to understand the regression model a little better, though.

Joe King
206-913-2912
jp at joepking.com
"Never throughout history has a man who lived a life of ease left a name worth remembering." --Theodore Roosevelt

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of John Paul Telthorst
Sent: Sunday, December 20, 2009 10:13 PM
To: r-help at r-project.org
Subject: [R] Signif. codes

> My question is about the "Signif. codes" and the p-value [...]
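
The point that the stars are only a display of the p-values can be checked directly in R. The following is a minimal sketch, not the poster's actual analysis: the data frame `dat`, the response `y`, and the simulated values are assumptions made up for illustration, with only the factor name `day.f` borrowed from the output above. It pulls the p-values out of the coefficient table, compares them with a significance level of your own choosing, and prints the summary without stars.

```r
# Simulated stand-in for the poster's data (an assumption): seven days,
# and a response that does not really depend on the day.
set.seed(1)
dat <- data.frame(
  day.f = factor(rep(1:7, each = 20)),
  y     = rnorm(140, mean = 8, sd = 4)
)
fit <- lm(y ~ day.f, data = dat)

# The coefficient table is a plain numeric matrix; the last column holds
# the p-values that the stars are derived from.
coefs <- coef(summary(fit))
pvals <- coefs[, "Pr(>|t|)"]

# Compare every p-value with the significance level you care about.
alpha <- 0.05
significant <- pvals < alpha
print(data.frame(estimate = coefs[, "Estimate"], p.value = pvals, significant))

# The stars are only a printing option and can be switched off.
print(summary(fit), signif.stars = FALSE)
```

Nothing about the model changes when the stars are hidden or when `alpha` is changed; only the cut-off used to label a coefficient as significant moves.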
No, the p-value is the probability of getting an estimate at least this far from zero purely by chance, that is, if the true coefficient were zero. So a p-value of .9997 means there is a .9997 probability of seeing an estimate at least that far from zero by chance alone; that is well above .05, so the coefficient is not significant. This is a very simplistic view, and you should study the regression model better.

Joe King

From: John Paul Telthorst [mailto:jptelthorst@gmail.com]
Sent: Sunday, December 20, 2009 10:36 PM
To: Joe King
Subject: Re: [R] Signif. codes

Thanks for the reply, I definitely do need to understand the regression model better. I got a p-value of .9997, so that would be > .05? I guess I'm confused about the significance part you talked about.

John

On Mon, Dec 21, 2009 at 12:27 AM, Joe King <jp@joepking.com> wrote:

> There will always be uncertainty in your estimates [...]

--
John Telthorst, MHRIR
University of Illinois Alumnus
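
To make the meaning of Pr(>|t|) concrete, here is a small sketch on simulated data. The variables `x`, `y`, `x_sim`, and `y_sim` and all simulated values are assumptions for illustration, not the poster's data. It recomputes a p-value by hand from the t value and the residual degrees of freedom, and then repeats the experiment many times with a true slope of zero to show how often "significant" results turn up by chance.

```r
set.seed(42)
x <- rnorm(50)
y <- 2 + rnorm(50)   # y does not depend on x, so the true slope is 0
fit <- lm(y ~ x)

coefs  <- coef(summary(fit))
t_stat <- coefs["x", "t value"]
df     <- df.residual(fit)

# Two-sided p-value: the probability of a |t| at least this large
# when the true slope is zero.
p_by_hand <- 2 * pt(-abs(t_stat), df)
all.equal(p_by_hand, coefs["x", "Pr(>|t|)"])

# Repeating the experiment: with a true slope of 0, p-values fall below
# 0.05 in roughly 5% of samples.
p_null <- replicate(2000, {
  x_sim <- rnorm(50)
  y_sim <- 2 + rnorm(50)
  coef(summary(lm(y_sim ~ x_sim)))["x_sim", "Pr(>|t|)"]
})
mean(p_null < 0.05)
```

A large p-value such as .9997 therefore says only that the estimate is entirely consistent with a true coefficient of zero; it is not evidence that the coefficient is exactly zero.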
No, it does not mean that the numbers have zero chance of being wrong. The extent to which the estimate can be "wrong" (which is a very bad and imprecise expression) is indicated by the standard error. The p-value close to zero means that, if the intercept of the underlying population from which your sample was drawn were zero, an estimate this far from zero would almost never occur (p < 2e-16). In other words, the intercept is significantly different from zero at any conventional level, including .05.

Remember that your data are assumed to be a random sample drawn from an underlying, larger population. Thus, the sample can never PERFECTLY represent the underlying population (only the underlying population itself can). However, the regression model gives you an estimate of the data-generating process in the underlying population (i.e., it gives you estimates of the true coefficients together with their sampling uncertainty, provided that the assumptions for OLS regression are met).

So, given your data, a true intercept of zero in the underlying population is all but ruled out. Another way to look at this: if the population had a zero intercept, it would be extremely unlikely (next to impossible) to draw a random sample with an intercept like yours. By the OLS assumptions, the estimated intercept is distributed normally around the true intercept across repeated samples, with a standard deviation that is estimated by the standard error of the intercept.

Daniel

-------------------------
cuncta stricte discussurus
-------------------------

-----Original Message-----
From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of John Paul Telthorst
Sent: Monday, December 21, 2009 1:13 AM
To: r-help at r-project.org
Subject: [R] Signif. codes

> My question is about the "Signif. codes" and the p-value [...]
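
The role of the standard error can be illustrated with the intercept row from the original post. This is only a rough sketch: the estimate and standard error are copied from the posted table, but the residual degrees of freedom are not shown in the thread, so the interval below uses a normal quantile as a stand-in for the t quantile (an assumption).

```r
# Values copied from the (Intercept) row of the posted coefficient table.
estimate  <- 7.95652
std_error <- 0.59993

# The t value is simply the estimate divided by its standard error.
t_value <- estimate / std_error
round(t_value, 3)   # 13.262, matching the printed t value

# Approximate 95% interval. The normal quantile is an assumption standing in
# for the t quantile, because the residual degrees of freedom are not shown.
z  <- qnorm(0.975)
ci <- c(lower = estimate - z * std_error, upper = estimate + z * std_error)
ci
```

With the fitted model object at hand, confint(nameofregression.lm) returns the corresponding t-based intervals for all coefficients. An interval that stays well away from zero is another way of seeing why the intercept earns its '***'.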