boylangc at earthlink.net
2011-Jun-14 17:04 UTC
[R] Using MLE Method to Estimate Regression Coefficients
Good Afternoon,

I am relatively new to R and have been trying to figure out how to estimate regression coefficients using the MLE method. Some background: I am trying to examine scenarios in which certain estimators might be preferred to others, starting with MLE. I understand that MLE will (should) produce the same results as Ordinary Least Squares if the assumption of normality holds.

That said, in the following code (my apologies up front for any lack of elegance) I use the data from the printing press study (commonly used in the QE and stats literature) to develop first- and second-order models using OLS. Then, using some code I found online, I tried to use MLE to do the same thing. However, I cannot get it to work, as I get an error in my attempt to use the optim function. I have been studying the optim function in R; I have also explored the use of MLE in the R documentation via the stats4, MASS, and a few other packages, but to little avail.

My questions are as follows:

1) Is there a particular error in the MLE code below that I am just not seeing?
2) Is there a simpler, more direct, or otherwise better way to approach it?
3) Which package should I use for MLE regression?

Sorry for the length, and thanks in advance for any assistance someone might have; I know your time is valuable. I have pasted my code below but have also attached it as a .txt file.

v/r,
Greg
Doctoral Student,
Dept. of Industrial Eng, Clemson University

R-Code

# upload printing press data
# 27 design points with 3 obs each for 81 total points
press
   first second third  obs
1     -1     -1    -1   34
2      0     -1    -1  115
3      1     -1    -1  192
4     -1      0    -1   82
5      0      0    -1   44
6      1      0    -1  322
7     -1      1    -1  141
8      0      1    -1  259
9      1      1    -1  290
10    -1     -1     0   81
11     0     -1     0   90
12     1     -1     0  319
13    -1      0     0  180
14     0      0     0  372
15     1      0     0  541
16    -1      1     0  288
17     0      1     0  432
18     1      1     0  713
19    -1     -1     1  364
20     0     -1     1  232
21     1     -1     1  408
22    -1      0     1  182
23     0      0     1  507
24     1      0     1  846
25    -1      1     1  236
26     0      1     1  660
27     1      1     1  878
28    -1     -1    -1   10
29     0     -1    -1  116
30     1     -1    -1  186
31    -1      0    -1   88
32     0      0    -1  178
33     1      0    -1  350
34    -1      1    -1  110
35     0      1    -1  251
36     1      1    -1  280
37    -1     -1     0   81
38     0     -1     0  122
39     1     -1     0  376
40    -1      0     0  180
41     0      0     0  372
42     1      0     0  568
43    -1      1     0  192
44     0      1     0  336
45     1      1     0  725
46    -1     -1     1   99
47     0     -1     1  221
48     1     -1     1  415
49    -1      0     1  233
50     0      0     1  515
51     1      0     1  535
52    -1      1     1  126
53     0      1     1  440
54     1      1     1  991
55    -1     -1    -1   28
56     0     -1    -1  130
57     1     -1    -1  263
58    -1      0    -1   88
59     0      0    -1  188
60     1      0    -1  350
61    -1      1    -1   86
62     0      1    -1  259
63     1      1    -1  245
64    -1     -1     0   81
65     0     -1     0   93
66     1     -1     0  376
67    -1      0     0  154
68     0      0     0  372
69     1      0     0  396
70    -1      1     0  312
71     0      1     0  513
72     1      1     0  754
73    -1     -1     1  199
74     0     -1     1  266
75     1     -1     1  443
76    -1      0     1  182
77     0      0     1  434
78     1      0     1  640
79    -1      1     1  168
80     0      1     1  403
81     1      1     1 1161

# define the response and desired predictor variables (first order only)
y <- press$obs
x1 <- press$first
x2 <- press$second
x3 <- press$third

# develop estimators using least squares regression
reg1 <- lm(y ~ x1 + x2 + x3, data = press)
summary(reg1)

Call:
lm(formula = y ~ x1 + x2 + x3, data = press)

Residuals:
    Min      1Q  Median      3Q     Max
-252.56  -83.24   -5.20   57.33  428.44

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   314.67      12.74  24.696  < 2e-16 ***
x1            177.00      15.61  11.342  < 2e-16 ***
x2            109.43      15.61   7.012 7.88e-10 ***
x3            131.46      15.61   8.424 1.55e-12 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' '
1

Residual standard error: 114.7 on 77 degrees of freedom
Multiple R-squared: 0.7637,     Adjusted R-squared: 0.7544
F-statistic: 82.93 on 3 and 77 DF,  p-value: < 2.2e-16

# Now let's fit the Normal Maximum Likelihood model.
# First, put the data into matrices for the MLE procedure
# I will try the first-order model first to see if I can get it to work...
x <- cbind(1, x1, x2, x3)
y <- as.vector(y)
ones <- x[,1]

# Calculate K and n for later use
K <- ncol(x)
K
[1] 4
K1 <- K + 1
n <- nrow(x)
n
[1] 81

# Define the function to be optimized
llik.regress <- function(par, X, Y) {
  X <- as.matrix(x)
  Y <- as.vector(y)
  xbeta <- X %*% par[1:K]
  Sig <- par[K1:K1]
  sum(-(1/2)*log(2*pi) - (1/2)*log(Sig^2) - (1/(2*Sig^2))*(Y - xbeta)^2)
}
llik.regress

# Now let's use the above function to estimate the model.
model <- optim(c(1,1,1,1), llik.regress, method="BFGS", control=list(fnscale=-1),
               hessian=TRUE)
Error in optim(c(1, 1, 1, 1), llik.regress, method = "BFGS", control = list(fnscale = -1),  :
  initial value in 'vmmin' is not finite
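[Editorial sketch, not part of the original post: the poster's expectation that MLE should reproduce OLS under normality can be checked directly, since for the normal linear model the MLE of beta has the closed form (X'X)^(-1) X'y, identical to OLS; only the variance estimate differs (MLE divides the RSS by n, lm by n - p). This assumes the press data and reg1 fit from the code above.]

```r
# Closed-form normal MLE for beta, compared with lm().
# Assumes 'press' and 'reg1' exist as defined in the post above.
X <- cbind(1, press$first, press$second, press$third)
y <- press$obs
n <- nrow(X)
p <- ncol(X)

beta.hat  <- solve(t(X) %*% X, t(X) %*% y)  # normal equations: same as OLS
rss       <- sum((y - X %*% beta.hat)^2)
sigma2.ml <- rss / n                        # MLE of sigma^2 (divides by n)
sigma2.ls <- rss / (n - p)                  # lm's residual variance estimate

all.equal(as.vector(beta.hat), as.vector(coef(reg1)))  # should be TRUE
```

So any correctly coded normal MLE for this model must return the same coefficients as reg1; a disagreement signals a bug in the likelihood code rather than a genuine difference between the estimators.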
Prof. John C Nash
2011-Jun-15 14:13 UTC
[R] Using MLE Method to Estimate Regression Coefficients
The error message puts it quite clearly -- the initial parameters 1,1,1,1 are inadmissible for your function. You need to have valid initial parameters for the variable metric method (method "BFGS"). This is one of the main problems users have with any optimization method. It is ALWAYS a good idea to actually evaluate your function outside of the optimizer, i.e., simply put in the initial parameters and find out what function value you get.

It should also be noted (as the package optimx does) that the VM- and CG-based methods really don't do very well without analytic gradients.

JN

On 06/15/2011 06:00 AM, r-help-request at r-project.org wrote:
> Date: Tue, 14 Jun 2011 13:04:55 -0400 (GMT-04:00)
> From: boylangc at earthlink.net
> Subject: [R] Using MLE Method to Estimate Regression Coefficients
[snip]
> # Now let's use the above function to estimate the model.
> model <- optim(c(1,1,1,1), llik.regress, method="BFGS", control=list(fnscale=-1),
>                hessian=TRUE)
> Error in optim(c(1, 1, 1, 1), llik.regress, method = "BFGS", control = list(fnscale = -1),  :
>   initial value in 'vmmin' is not finite
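[Editorial sketch illustrating the advice above; this is not Prof. Nash's code. The likelihood has K + 1 = 5 parameters (four betas plus Sig), but optim() was given only four starting values, so par[K1] is NA inside the function and the starting function value is not finite. Evaluating the function at the start values first, as suggested, makes this visible. Assumes the poster's x, y, K, and K1 objects; the version below also actually uses its X and Y arguments instead of the global x and y.]

```r
# Same log-likelihood, but using the X and Y arguments that are passed in.
llik.regress <- function(par, X, Y) {
  xbeta <- X %*% par[1:K]   # par[1:4] are the regression coefficients
  Sig   <- par[K1]          # par[5] is the error standard deviation
  sum(-(1/2)*log(2*pi) - (1/2)*log(Sig^2) - (1/(2*Sig^2))*(Y - xbeta)^2)
}

# Check the function at the start values before calling optim:
llik.regress(c(1, 1, 1, 1), X = x, Y = y)       # NA: par[5] does not exist
llik.regress(c(1, 1, 1, 1, 100), X = x, Y = y)  # finite, so optim can start

model <- optim(c(1, 1, 1, 1, 100), llik.regress, X = x, Y = y,
               method = "BFGS", control = list(fnscale = -1), hessian = TRUE)
model$par[1:K]  # should agree (approximately) with coef(reg1) from lm()
```

The fifth starting value (100 here) is just a rough guess at the residual standard deviation; any positive value of plausible scale should do, though a wildly small one can still make the first steps badly scaled.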