Not really a direct answer on your question, but:> system.time(replicate(10000,apply(as.matrix(theta), 1, rasch, b_vector)))
user system elapsed
4.51 0.03 4.55
> system.time(replicate(10000,theta%*%t(b_vector)))
user system elapsed
0.25 0.00 0.25
It does make a difference on large datasets...
Cheers
Joris
On Wed, Jun 2, 2010 at 4:44 PM, Doran, Harold <HDoran@air.org> wrote:
> I have a function that I am currently using very inefficiently. The
> following are needed to illustrate the problem:
>
> set.seed(12345)
> dat <- matrix(sample(c(0,1), 110, replace = TRUE), nrow = 11, ncol=10)
> mis <- sample(1:110, 5)
> dat[mis] <- NA
> theta <- rnorm(11)
> b_vector <- runif(10, -4,4)
> empty <- which(is.na(t(dat)))
>
> So, I have a matrix (dat) with some values within the matrix missing. In my
> real world problem, the matrix is huge, and most values are missing. The
> function in question is called derivs() and is below. But, let me step
> through the inefficient portions.
>
> First, I create a matrix of some predicted probabilities as:
>
> rasch <- function(theta,b) 1/ (1 + exp(b-theta))
> mat <- apply(as.matrix(theta), 1, rasch, b_vector)
>
> However, I only need those predicted probabilities in places where the data
> are not missing. So, the next step in the function is
>
> mat[empty] <- NA
>
> which manually places NAs in places where the data are missing (notice the
> matrix 'mat' is the transpose of the data matrix and so I get the
empty
> positions from the transpose of dat).
>
> Afterwards, the function computes the gradient and hessians needed to
> complete the MLE estimation.
>
> All of this works in the sense that it yields the correct answers for my
> problem. But, the glaring problem is that I create predicted probabilities
> for every cell in 'mat' when in many cases they are not needed. I
end up
> replacing those values with NAs. In my real world problem, this is horribly
> inefficient and slow.
>
> My question is then is there a way to use apply such that is computes the
> necessary predicted probabilities only when the data are not missing to
> yield the matrix 'mat'. My desired end result is the matrix
'mat' created
> after the manually placing the NAs in the appropriate cells.
>
> Thanks
> Harold
>
>
> derivs <- function(dat, b_vector, theta){
> mat <- apply(as.matrix(theta), 1, rasch,
> b_vector)
> mat[empty] <- NA
> gradient <- -(colSums(dat, na.rm = TRUE)
-
> rowSums(mat, na.rm = TRUE))
> hessian <- -(rowSums(mat * (1-mat),
na.rm > TRUE))
> list('gradient' = gradient,
'hessian' > hessian)
> }
>
>
>
> > sessionInfo()
> R version 2.10.1 (2009-12-14)
> i386-pc-mingw32
>
> locale:
> [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United
> States.1252 LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C LC_TIME=English_United
> States.1252
>
> attached base packages:
> [1] stats graphics grDevices utils datasets methods base
>
> loaded via a namespace (and not attached):
> [1] tools_2.10.1
> >
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Joris Meys
Statistical Consultant
Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control
Coupure Links 653
B-9000 Gent
tel : +32 9 264 59 87
Joris.Meys@Ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
[[alternative HTML version deleted]]