Hi, do you guys know which function in R handles multiple regression on
categorical predictor data? I.e., 'lm' is used to handle continuous
predictor data.

thanks,
karena
(Ted Harding)
2010-Sep-08 22:33 UTC
[R] regression function for categorical predictor data
On 08-Sep-10 21:11:27, karena wrote:
> Hi, do you guys know which function in R handles multiple regression
> on categorical predictor data? I.e., 'lm' is used to handle continuous
> predictor data.
>
> thanks,
> karena

Karena, lm() also handles categorical data, provided these are presented
as factors. For example:

set.seed(12345)
X <- 0.05*(-20:20)                           # Continuous predictor
F <- as.factor(c(rep("A",21), rep("B",20)))  # 21 obs at level "A", 20 at level "B"
Y <- 0.5*X + c(0.25*rnorm(21), 0.25*rnorm(20) + 2.0)
## Y increases linearly with X (coeff = 0.5)
## Y at level "B" is 2.0 higher than at level "A"
## "Error" term has SD = 0.25
plot(X, Y)
summary(lm(Y ~ X + F))

# Call: lm(formula = Y ~ X + F)
#
# Residuals:
#      Min       1Q   Median       3Q      Max
# -0.56511 -0.15807 -0.00034  0.16484  0.44048
#
# Coefficients:
#             Estimate Std. Error t value Pr(>|t|)
# (Intercept)  0.09561    0.08869   1.078    0.288
# X            0.63621    0.13671   4.654 3.89e-05 ***
# FB           1.93821    0.16181  11.978 1.80e-14 ***
# ---
# Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#
# Residual standard error: 0.2589 on 38 degrees of freedom
# Multiple R-squared: 0.965,   Adjusted R-squared: 0.9631
# F-statistic: 523.4 on 2 and 38 DF,  p-value: < 2.2e-16

The reported estimate for FB gives the change in level resulting from a
change from "A" to "B" in F.

Hoping this helps,
Ted.
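To see explicitly how lm() codes the factor, here is a small sketch (not
part of Ted's reply; it just reuses his simulated X, F and Y). Treatment
(dummy) coding is the default, and relevel() changes the reference level:

## "A" is the reference level, so the design matrix gains a single
## 0/1 dummy column named FB.
head(model.matrix(~ X + F))
contrasts(F)                 # the 0/1 coding used for F

## To report effects relative to level "B" instead, re-level the factor
## before fitting; the fitted values are unchanged, only the
## parameterisation differs.
F2 <- relevel(F, ref = "B")
summary(lm(Y ~ X + F2))      # coefficient "F2A" is minus the FB estimate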
glm() is another choice. Using glm(), your response variable can be a
discrete random variable; however, you need to specify the distribution
in the argument: family = "distribution name".

Using Ted's simulated data and glm(), you get the same result as that
produced by lm():

> summary(glm(Y ~ X + F, family = "gaussian"))

Call: glm(formula = Y ~ X + F, family = "gaussian")

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-0.53796  -0.16201  -0.08087   0.15080   0.47363

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.03723    0.08457   0.440 0.662267
X            0.51009    0.13036   3.913 0.000365 ***
FB           1.82578    0.15429  11.833  2.6e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

(Dispersion parameter for gaussian family taken to be 0.06096497)

    Null deviance: 59.7558  on 40  degrees of freedom
Residual deviance:  2.3167  on 38  degrees of freedom
AIC: 6.5418

Number of Fisher Scoring iterations: 2
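To illustrate the family argument with a genuinely discrete response,
here is a small made-up sketch (not from the thread; grp and z are
invented names) of a logistic regression with glm():

## Made-up data: a binary response and a two-level factor predictor.
set.seed(1)
grp <- factor(rep(c("A", "B"), each = 30))
z   <- rbinom(60, size = 1, prob = ifelse(grp == "A", 0.3, 0.7))

## family = binomial gives logistic regression; the "grpB" coefficient
## is the log-odds difference between levels "B" and "A".
summary(glm(z ~ grp, family = binomial))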
Sorry, the result is not the same, since our datasets are different.
I also ran lm() on the dataset that was used in glm(); the results are
exactly the same:

> summary(lm(Y ~ X + F))

Call: lm(formula = Y ~ X + F)

Residuals:
     Min       1Q   Median       3Q      Max
-0.53796 -0.16201 -0.08087  0.15080  0.47363

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.03723    0.08457   0.440 0.662267
X            0.51009    0.13036   3.913 0.000365 ***
FB           1.82578    0.15429  11.833  2.6e-14 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2469 on 38 degrees of freedom
Multiple R-squared: 0.9612,  Adjusted R-squared: 0.9592
F-statistic: 471.1 on 2 and 38 DF,  p-value: < 2.2e-16

==============
The dataset is given below:

> cbind(Y,X,F)
                Y     X F
 [1,] -0.28473266 -1.00 1
 [2,] -0.59041310 -0.95 1
 [3,] -0.50431754 -0.90 1
 [4,] -0.60095969 -0.85 1
 [5,] -0.45849905 -0.80 1
 [6,] -0.48287208 -0.75 1
 [7,] -0.49598666 -0.70 1
 [8,] -0.08746758 -0.65 1
 [9,] -0.18665177 -0.60 1
[10,] -0.01007210 -0.55 1
[11,] -0.45765308 -0.50 1
[12,] -0.27318684 -0.45 1
[13,]  0.07638855 -0.40 1
[14,]  0.27043727 -0.35 1
[15,]  0.26926216 -0.30 1
[16,] -0.43047783 -0.25 1
[17,]  0.40884468 -0.20 1
[18,] -0.14638563 -0.15 1
[19,] -0.31374179 -0.10 1
[20,] -0.15028159 -0.05 1
[21,] -0.12540519  0.00 1
[22,]  1.58015611  0.05 2
[23,]  1.68200774  0.10 2
[24,]  2.02821901  0.15 2
[25,]  2.02359285  0.20 2
[26,]  2.14133171  0.25 2
[27,]  2.06931685  0.30 2
[28,]  2.05561726  0.35 2
[29,]  2.35720999  0.40 2
[30,]  1.96134404  0.45 2
[31,]  2.26144356  0.50 2
[32,]  2.24454620  0.55 2
[33,]  2.55707426  0.60 2
[34,]  2.18732022  0.65 2
[35,]  1.90950697  0.70 2
[36,]  2.10371010  0.75 2
[37,]  2.18266009  0.80 2
[38,]  2.18490441  0.85 2
[39,]  2.45248295  0.90 2
[40,]  2.79851838  0.95 2
[41,]  1.83514341  1.00 2
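If you want to check programmatically that lm() and glm(..., family =
"gaussian") agree on the same data, something like the sketch below
works (fit_lm and fit_glm are just illustrative names; note that F is
still a factor in the session, even though cbind() printed it above as
its integer codes 1/2):

fit_lm  <- lm(Y ~ X + F)
fit_glm <- glm(Y ~ X + F, family = "gaussian")
all.equal(coef(fit_lm), coef(fit_glm))   # TRUE: identical point estimates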