On 2007-12-21, Louis Martin <louismartinbis at yahoo.fr> wrote:
> Hi,
>
> I am running the following loop, but it takes hours to run as n is big. Is
> there any way "apply" can be used? Thanks.
> ### Start
> nclass <- dim(data)[[2]] - 1
> z <- matrix(0, ncol = nclass, nrow = nclass)
> n <- dim(data)[[1]]
> x <- c(1:nclass)
> # loop starts
> for(loop in 1:n) { # looping over rows in data
> r <- data[loop, 1:nclass] # vector of length(nclass)
> classified <- x[r == max(r)] # index of rows == max(r)
>
> truth <- data[loop, nclass + 1] # next column, single value
> z[classified, truth] <- z[classified, truth] + 1 # increment the values of
> }
> # loop ends
>
Off the top, data is a bad name for your dataframe, as it shares its
name with the standard function data(). Also, including some actual
data would make this easier to work with. I think you're using
dim(data)[[1]] to get the number of rows of data? That can be more
clearly expressed as nrow(data), and likewise dim(data)[[2]] is
ncol(data).
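To illustrate the naming and nrow()/ncol() points, here is a minimal sketch. The dataframe dat and its values are made up for illustration and are not from the original post:

```r
# Hypothetical stand-in for the poster's dataframe (made-up values):
# two class-score columns followed by a 'truth' column
dat <- data.frame(a = c(0.1, 0.5), b = c(0.7, 0.3), truth = c(2, 1))

n      <- nrow(dat)      # same value as dim(dat)[[1]]
nclass <- ncol(dat) - 1  # same value as dim(dat)[[2]] - 1
```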
Anyways, this might be helpful:
add.mat <- apply(data[, 1:nclass], MARGIN = 1,
                 FUN = function(x) ifelse(x == max(x), 1, 0))
# add.mat is nclass x n: column i flags the maximal class(es) of row i
for(i in 1:n)
  z[, data[i, ncol(data)]] <- z[, data[i, ncol(data)]] + add.mat[, i]
There's still a loop, but it might not be needed depending on what
values 'truth' holds (for example, if they are integer class indices
from 1 to nclass). Most of the calculations are shifted into the
apply() call, so the one-line loop should run at least a little faster
than what you started with.
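
If 'truth' does hold integer class labels from 1 to nclass (an assumption, since the original post doesn't say), the remaining loop could probably be dropped by turning both the row maxima and the truth column into 0/1 matrices and multiplying them. A rough sketch with made-up data; the names dat, ind and truth.mat are hypothetical:

```r
# Made-up example: 3 class-score columns plus a 'truth' column
scores <- matrix(c(0.1, 0.7, 0.2,
                   0.5, 0.3, 0.2,
                   0.2, 0.2, 0.6,
                   0.4, 0.4, 0.2), ncol = 3, byrow = TRUE)
dat <- data.frame(scores, truth = c(2, 1, 3, 1))

nclass <- ncol(dat) - 1
scores <- as.matrix(dat[, 1:nclass])

# n x nclass indicator: 1 where a score equals its row maximum (ties kept)
ind <- (scores == apply(scores, 1, max)) * 1

# n x nclass indicator: 1 in the column named by 'truth'
# (assumes truth values are integers in 1..nclass)
truth.mat <- outer(dat[, nclass + 1], 1:nclass, "==") * 1

# z[c, t] counts rows where class c was maximal and the truth was t
z <- crossprod(ind, truth.mat)
```

As in the original loop, tied maxima each get a count, so the last row above increments both z[1, 1] and z[2, 1].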
HTH,
Tyler