Hi all,

I am trying to estimate a simple logit model.
By MLE, I am maximizing the log likelihood with optim().
The thing is, each observation has a different set of choice options, so I need a loop inside the objective function, which I think slows down the optimization.

The data are constructed so that each row represents the characteristics of one alternative, and CS is a variable that identifies the choice situation (say, 1 ~ number of observations).
cum_count is the "cumulative" count of available alternatives over the choice situations, so cum_count[i] is the last row belonging to choice situation i.
So I am maximizing the sum of log[exp(U(chosen)) / sum(exp(U(all alternatives)))].

When I have 6 or 7 predictors, the running time is about 10 minutes, and it slows down sharply as I add more predictors (more thetas to estimate).
I want to know if there is a way I can improve the running time.
Below is my code.

    simple_logit = function(theta){
      # one realized probability per choice situation
      realized_prob = rep(0, max(data$CS))
      # utility of every alternative; columns 4:35 hold the predictors
      theta_multiple = as.matrix(data[,4:35]) %*% as.matrix(theta)
      # the numerator uses the first row of each block, i.e. the chosen
      # alternative is taken to be listed first in its choice situation
      realized_prob[1] = exp(theta_multiple[1]) / sum(exp(theta_multiple[1:cum_count[1]]))
      for (i in 2:length(realized_prob)){
        realized_prob[i] = exp(theta_multiple[cum_count[(i-1)]+1]) /
          sum(exp(theta_multiple[((cum_count[(i-1)]+1):cum_count[i])]))
      }
      # negative log likelihood, because optim() minimizes
      -sum(log(realized_prob))
    }

    initial = rep(0,32)
    out33 = optim(initial, simple_logit, method="BFGS", hessian=TRUE)

Many thanks in advance!!!
Thank you, but I'd prefer a hand-written likelihood function, since it allows a more flexible model specification. Later on, I may have random parameters.
So I want to know if there is a more efficient way to code it so that I can speed it up.

> Date: Fri, 30 Oct 2009 16:10:29 -0600
> To: bbom419@hotmail.com
> CC: r-help@r-project.org
> Subject: Re: [R] Efficient way to code using optim()
> From: GPetris@uark.edu
>
> Unless this is a homework problem, you would be much better off using
> glm().
>
> Giovanni

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
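On the request for a faster hand-written objective: one possible sketch, not tested on the real data, is to drop the loop and let rowsum() add up exp(U) within each choice situation. The toy data below (n_alt, cs, X, first_row, theta0) are made up for illustration and stand in for the poster's data[, 4:35] and cum_count. Like the posted code, the sketch assumes the chosen alternative is the first row of each choice situation. Building the predictor matrix once, outside the objective, also avoids repeating the as.matrix() conversion on every call.

```r
# Hypothetical toy data standing in for the poster's `data` and `cum_count`.
set.seed(1)
n_alt <- c(3, 2, 4, 3)                        # alternatives per choice situation
cs <- rep(seq_along(n_alt), n_alt)            # choice-situation id of every row
cum_count <- cumsum(n_alt)                    # last row of each choice situation
first_row <- c(1, head(cum_count, -1) + 1)    # chosen alternative assumed listed first
X <- matrix(rnorm(length(cs) * 2), ncol = 2)  # two made-up predictors, built once

# Loop version, following the structure of the posted function.
nll_loop <- function(theta) {
  u <- drop(X %*% theta)
  p <- numeric(length(cum_count))
  for (i in seq_along(cum_count)) {
    rows <- first_row[i]:cum_count[i]
    p[i] <- exp(u[first_row[i]]) / sum(exp(u[rows]))
  }
  -sum(log(p))
}

# Vectorized version: rowsum() sums exp(u) within each choice situation,
# returning one row per situation in sorted order of the ids.
nll_vec <- function(theta) {
  u <- drop(X %*% theta)
  denom <- rowsum(exp(u), cs)
  -sum(u[first_row] - log(denom))
}

theta0 <- c(0.3, -0.2)
all.equal(nll_loop(theta0), nll_vec(theta0))
```

The two functions compute the same negative log likelihood, so nll_vec could be handed to optim() in place of the loop version. If utilities get large, subtracting the within-situation maximum before exp() would guard against overflow; that step is left out here to keep the sketch short.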
Unless this is a homework problem, you would be much better off using glm().

Giovanni

> Date: Fri, 30 Oct 2009 12:23:45 -0700
> From: parkbomee <bbom419 at hotmail.com>
> Sender: r-help-bounces at r-project.org
>
> Hi all,
>
> I am trying to estimate a simple logit model.
> By using MLE, I am maximizing the log likelihood, with optim().
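A further point on the running time growing with the number of predictors: when optim() is called with method="BFGS" and no gr argument, it approximates the gradient by finite differences, so each gradient costs extra evaluations of the objective for every parameter. Supplying an analytic gradient avoids that. The following is only a sketch under the same assumption as the posted code (the chosen alternative is the first row of each choice situation); the simulated data and the names nll, nll_grad, first_row and x_chosen_sum are hypothetical.

```r
# Hypothetical toy data; in the thread this would be data[, 4:35] and cum_count.
set.seed(42)
n_alt <- sample(2:5, 200, replace = TRUE)     # alternatives per choice situation
cs <- rep(seq_along(n_alt), n_alt)            # choice-situation id of every row
cum_count <- cumsum(n_alt)
first_row <- c(1, head(cum_count, -1) + 1)    # chosen alternative assumed listed first
X <- matrix(rnorm(length(cs) * 3), ncol = 3)  # three made-up predictors
x_chosen_sum <- colSums(X[first_row, , drop = FALSE])  # constant, computed once

# Negative log likelihood, vectorized with rowsum().
nll <- function(theta) {
  u <- drop(X %*% theta)
  denom <- rowsum(exp(u), cs)
  -sum(u[first_row] - log(denom))
}

# Gradient of the negative log likelihood:
# (sum over all rows of p * x) minus (sum of x over the chosen rows).
nll_grad <- function(theta) {
  eu <- exp(drop(X %*% theta))
  p <- eu / rowsum(eu, cs)[cs]   # choice probability of every row
  drop(crossprod(X, p)) - x_chosen_sum
}

fit_num <- optim(rep(0, 3), nll, method = "BFGS")                 # numerical gradient
fit_ana <- optim(rep(0, 3), nll, gr = nll_grad, method = "BFGS")  # analytic gradient
rbind(numerical = fit_num$par, analytic = fit_ana$par)
```

Both calls should reach essentially the same estimates; the difference is in how many times the objective has to be evaluated, which can be inspected in fit_num$counts and fit_ana$counts. With hessian=TRUE, optim() also uses gr, when supplied, to compute the numerically differentiated Hessian.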