thr3ads.net - R help - [R] Insurance data in library(MASS) [Feb 2009]

choonhong ang

2009-Feb-23 17:05 UTC

[R] Insurance data in library(MASS)

I have used the Insurance data from the MASS library in R and I have 2 questions.
I use the following:
> library(MASS)
> data(Insurance)
> m1 = glm(Claims ~ District + Group + Age + offset(log(Holders)),
+          data = Insurance, family = poisson)
> summary(m1)
Call:
glm(formula = Claims ~ District + Group + Age + offset(log(Holders)),
    family = poisson, data = Insurance)
Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.46558  -0.50802  -0.03198   0.55555   1.94026
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept) -1.810508   0.032972 -54.910  < 2e-16 ***
District2    0.025868   0.043016   0.601 0.547597
District3    0.038524   0.050512   0.763 0.445657
District4    0.234205   0.061673   3.798 0.000146 ***
Group.L      0.429708   0.049459   8.688  < 2e-16 ***
Group.Q      0.004632   0.041988   0.110 0.912150
Group.C     -0.029294   0.033069  -0.886 0.375696
Age.L       -0.394432   0.049404  -7.984 1.42e-15 ***
Age.Q       -0.000355   0.048918  -0.007 0.994210
Age.C       -0.016737   0.048478  -0.345 0.729910
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Dispersion parameter for poisson family taken to be 1)
    Null deviance: 236.26  on 63  degrees of freedom
Residual deviance:  51.42  on 54  degrees of freedom
AIC: 388.74
(1) In the result above, what are Group.L, Group.Q, Group.C, Age.L, Age.Q and
Age.C?

(2) When I copy the Insurance data to a CSV file (as shown in the
attachment) and run the same procedure, the result is different from
the one above. Why?

Greg Snow

2009-Feb-23 17:41 UTC

[R] Insurance data in library(MASS)

In the Insurance dataset both Age and Group are ordered factors, so the default
encoding for them is orthogonal polynomial contrasts (assuming that the user has
not changed the default contrasts option).  In your output, .L marks the line for
the "Linear" piece of the encoding, i.e. the linear contrast across the groups,
.Q is the "Quadratic" piece/contrast and .C is the "Cubic" one.  If you don't
understand what is meant by linear/quadratic/cubic, then do some background
reading on orthogonal polynomials.
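
To see what these contrasts look like, here is a small sketch in R. It assumes MASS is installed and that the default contrasts option has not been changed; `cp` is just an illustrative name, not something from the original post.

```r
library(MASS)   # provides the Insurance data frame

# Group and Age are stored as ordered factors; District is a plain factor.
print(sapply(Insurance[c("District", "Group", "Age")], is.ordered))

# The second entry is the contrast function applied to ordered factors;
# it is "contr.poly" unless the default has been changed.
print(getOption("contrasts"))

# Orthogonal polynomial contrasts for a four-level ordered factor:
# the columns .L, .Q and .C are the linear, quadratic and cubic pieces.
cp <- contr.poly(4)
print(round(cp, 4))

# The columns are orthonormal, so crossprod(cp) is the 3 x 3 identity matrix.
print(round(crossprod(cp), 10))
```

Each of Group.L, Group.Q and Group.C in the summary is the coefficient on one of these three columns, and likewise for Age.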

If you read the data in yourself from a .csv file, then Age and Group will not
be ordered factors unless you specifically convert them to be.  Therefore the
default encoding will be something other than orthogonal polynomials and the
individual coefficients will be different (though the overall fit of the model
will be the same, as long as the same variables are treated as factors).
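
This can be checked with a round trip through a temporary file. It is only a sketch: the temporary file stands in for the attachment, which is not available here, and `csv_path`, `ins_csv` and `m2` are names made up for the illustration. Note that District comes back from the CSV as a number, so it is turned into a factor before fitting; otherwise the model really would be a different one.

```r
library(MASS)

# Fit on the packaged data, where Group and Age are ordered factors.
m1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = Insurance, family = poisson)

# Write the data to a temporary CSV file and read it back in.
csv_path <- tempfile(fileext = ".csv")
write.csv(Insurance, csv_path, row.names = FALSE)
ins_csv <- read.csv(csv_path, stringsAsFactors = TRUE)

# District is read back as an integer column; make it a factor again.
ins_csv$District <- factor(ins_csv$District)

# Group and Age are now plain (unordered) factors, so they get the
# default contrasts for unordered factors instead of contr.poly.
m2 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = ins_csv, family = poisson)

# The coefficient names (and values) differ between the two fits ...
print(names(coef(m1)))
print(names(coef(m2)))

# ... but the deviance and the fitted values agree.
print(all.equal(deviance(m1), deviance(m2)))
print(all.equal(unname(fitted(m1)), unname(fitted(m2))))
```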

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.snow at imail.org
801.408.8111


Prof Brian Ripley

2009-Feb-23 17:47 UTC

[R] Insurance data in library(MASS)

You are asking about support software for a book, and the book 
contains the answers ....  And it should be given due credit.

On Mon, 23 Feb 2009, choonhong ang wrote:
> (1) In the result above, what is Group.L, Group.Q, Group.C, Age.L, Age.Q,
> Age.C ?

See the book, ca. p.146.

> (2) When I copy the Insurance data in csv format (as shown in the
> attachement) and run the same procedure the result shown is different from
> above result, why ?

Who knows? You did not deign to tell us what you did with the CSV
file nor the results you got.  Most likely you did not get the factor
levels and classes the same as the help file describes.  Hint: Group
and Age are ordered factors.
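
One way to get a CSV copy back to the classes and levels the help file describes is sketched below. It assumes the CSV has the same columns as the packaged data; the temporary file and the names `ins_csv`, `m_orig` and `m_csv` are illustrative and not from the original post. The levels are taken from the packaged Insurance data so that their order matches.

```r
library(MASS)

# Stand-in for the poster's CSV file: write the packaged data out and read it back.
csv_path <- tempfile(fileext = ".csv")
write.csv(Insurance, csv_path, row.names = FALSE)
ins_csv <- read.csv(csv_path, stringsAsFactors = FALSE)

# After reading, District is integer and Group/Age are character.
print(sapply(ins_csv, class))

# Restore the factor classes, with levels in the same order as in MASS.
ins_csv$District <- factor(ins_csv$District, levels = levels(Insurance$District))
ins_csv$Group <- factor(ins_csv$Group, levels = levels(Insurance$Group), ordered = TRUE)
ins_csv$Age <- factor(ins_csv$Age, levels = levels(Insurance$Age), ordered = TRUE)

m_orig <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
              data = Insurance, family = poisson)
m_csv <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
             data = ins_csv, family = poisson)

# With the classes restored, the coefficients match the original fit.
print(all.equal(coef(m_orig), coef(m_csv)))
```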

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

choonhong ang

2009-Feb-24 19:33 UTC

[R] Insurance data in library(MASS)

Hi,

In the result shown, District 1 is used as the base category.  How can I
make District 4 the base category instead?
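
One common way to change the reference level of an unordered factor is `relevel()`. A minimal sketch, assuming District is still the plain factor shipped with MASS; the copy `ins4` and the model name `m4` are illustrative names only.

```r
library(MASS)

# Original fit, with District 1 as the base category.
m1 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = Insurance, family = poisson)

# Work on a copy and move level "4" to the front of District's levels.
ins4 <- Insurance
ins4$District <- relevel(ins4$District, ref = "4")
print(levels(ins4$District))

# Refit; the District coefficients are now differences from District 4.
m4 <- glm(Claims ~ District + Group + Age + offset(log(Holders)),
          data = ins4, family = poisson)
print(coef(summary(m4))[1:4, ])

# District1 in the new fit is the negative of District4 in the old one.
print(c(new_District1 = unname(coef(m4)["District1"]),
        old_District4 = unname(coef(m1)["District4"])))
```

Only the parameterisation changes: the deviance and fitted values stay the same as in the original fit.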

