Hi Everyone,

I have a question about for loops. If you have something like:

    f <- function(x) {
        y <- rep(NA, 10)
        for (i in 1:10) {
            if (i > 3) {
                if (is.na(y[i-3]) == FALSE) {
                    # some calculation F which depends on one or more of the
                    # previously generated values in the series
                    y[i] <- y[i-1] + x[i]
                } else {
                    y[i] <- x[i]
                }
            }
        }
        y
    }

e.g.

    > f(c(1,2,3,4,5,6,7,8,9,10,11,12))
    [1] NA NA NA  4  5  6 13 21 30 40

is there a faster way to process this than with a 'for' loop? I have looked at lapply as well, but I have read that lapply is no faster than a for loop, and for my particular application it is easier to use a for loop. Also, I have seen 'rle', which I think may help me, but I am not sure as I have only just come across it. Any ideas?

Many thanks

Tom

-- 
Dr. Thomas McCallum
Systems Architect, Level E Limited
ETTC, The King's Buildings
Mayfield Road, Edinburgh EH9 3JL, UK
http://www.levelelimited.com
Email: tom at levelelimited.com
Tom,

The *apply functions generally speed up calculations dramatically, but only if you perform a repetitive operation on a vector, list, or matrix that does NOT require accessing elements of that variable other than the one at the current *apply index. In your case this means that none of the *apply functions will speed up your calculation (unless you significantly rethink the code).

At the same time, you can speed up your code by orders of magnitude by using C functions for "complex" vector indexing operations. If you need instructions, I can send you a very nice "Step-by-step guide for using C/C++ in R", which goes beyond the "Writing R Extensions" document.

Otherwise, such questions should be posted to R-help, not R-devel; please post accordingly.

Best regards,
Oleg

Tom McCallum wrote:
> is there a faster way to process this than with a 'for' loop? I have
> looked at lapply as well but I have read that lapply is no faster than a
> for loop and for my particular application it is easier to use a for loop.

-- 
Dr Oleg Sklyar * EBI/EMBL, Cambridge CB10 1SD, England * +44-1223-494466
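To make the distinction above concrete, here is a minimal sketch. It is not from the original post: the names `x`, `squares`, `running` and `running_reduce` are made up for illustration, and it says nothing about speed. An element-wise operation maps directly onto sapply, whereas a recurrence in which each value depends on the previous one needs an explicit loop or a helper that carries the previous result forward, such as base R's Reduce() with accumulate = TRUE.

```r
x <- c(1, 2, 3, 4, 5)

# Element-wise: each result depends only on the current element,
# so sapply can be applied directly.
squares <- sapply(x, function(v) v^2)

# Recurrence: running[i] depends on running[i - 1], so an explicit
# loop is the straightforward way to write it.
running <- numeric(length(x))
running[1] <- x[1]
for (i in 2:length(x)) {
    running[i] <- running[i - 1] + x[i]
}

# Reduce() with accumulate = TRUE passes the previous result to the
# next call, which expresses the same recurrence without indexing.
running_reduce <- Reduce(function(prev, cur) prev + cur, x, accumulate = TRUE)

print(squares)
print(running)
print(isTRUE(all.equal(running, running_reduce)))
```

Reduce() is shown here only as another way to express the dependency on the previous value; whether it is any faster than the loop is not something this sketch measures.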
On Tue, Jan 30, 2007 at 12:15:29PM +0000, Oleg Sklyar wrote:
> magnitude using c-functions for "complex" vector indexing operations. If
> you need instructions, I can send you a very nice "Step-by-step guide
> for using C/C++ in R" which goes beyond "Writing R Extensions" document.

Hi Oleg,

Can you please post this guide online? I think that many people would be interested in reading it, incl. me.

Tamas
On Tuesday 30 January 2007 15:46, Tamas K Papp wrote:
> Hi Oleg,
>
> Can you please post this guide online? I think that many people would
> be interested in reading it, incl. me.

Me too.

Thanks,

R.

-- 
Ramón Díaz-Uriarte
Centro Nacional de Investigaciones Oncológicas (CNIO)
(Spanish National Cancer Center)
Melchor Fernández Almagro, 3
28029 Madrid (Spain)
http://ligarto.org/rdiaz
Tom McCallum wrote:
> is there a faster way to process this than with a 'for' loop? I have
> looked at lapply as well but I have read that lapply is no faster than a
> for loop and for my particular application it is easier to use a for loop.
> Also I have seen 'rle' which I think may help me but am not sure as I have
> only just come across it, any ideas?

Hi Tom,

In the general case, you need a loop in order to propagate calculations and their results across a vector.

In _your_ particular case, however, it seems that all you are doing is a cumulative sum on x (at least this is what happens for i >= 6). So you could do:

    f2 <- function(x) {
        offset <- 3
        start_propagate_at <- 6
        y_length <- 10
        init_range <- (offset+1):start_propagate_at
        y <- rep(NA, offset)
        y[init_range] <- x[init_range]
        y[start_propagate_at:y_length] <- cumsum(x[start_propagate_at:y_length])
        y
    }

and it will return the same thing as your function 'f' (at least when 'x' doesn't contain NAs), but it's not faster :-/

IMO, using sapply for propagating calculations across a vector is not appropriate because:

(1) It requires special care. For example, this:

    > x <- 1:10
    > sapply(2:length(x), function(i) {x[i] <- x[i-1]+x[i]})

doesn't work because the 'x' symbol on the left side of the <- in the anonymous function doesn't refer to the 'x' symbol defined in the global environment (the assignment only modifies a local copy of 'x' inside the function). So you need to use tricks like this:

    > sapply(2:length(x), function(i) {x[i] <- x[i-1]+x[i]; assign("x", x, envir=.GlobalEnv); x[i]})

(2) Because of this kind of trick, it is _very_ slow (about 10 times slower or more than a 'for' loop).

Cheers,
H.
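A quick way to check the two claims in the reply above is to run them. The sketch below restates 'f' and 'f2' from this thread so that it is self-contained (with `!is.na(...)` in place of `is.na(...) == FALSE`); the names `x2` and `pairwise` are made up for illustration, and this is a correctness check only, not a benchmark.

```r
# Loop version from the original question (restated).
f <- function(x) {
    y <- rep(NA, 10)
    for (i in 1:10) {
        if (i > 3) {
            if (!is.na(y[i - 3])) {
                y[i] <- y[i - 1] + x[i]
            } else {
                y[i] <- x[i]
            }
        }
    }
    y
}

# cumsum-based version from the reply (restated).
f2 <- function(x) {
    offset <- 3
    start_propagate_at <- 6
    y_length <- 10
    init_range <- (offset + 1):start_propagate_at
    y <- rep(NA, offset)
    y[init_range] <- x[init_range]
    y[start_propagate_at:y_length] <- cumsum(x[start_propagate_at:y_length])
    y
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
print(f(x))
print(isTRUE(all.equal(f(x), f2(x))))

# The assignment inside the anonymous function changes only a local copy,
# so every call sees the original x2 and returns x2[i - 1] + x2[i]
# (pairwise sums, not a running total).
x2 <- 1:10
pairwise <- sapply(2:length(x2), function(i) { x2[i] <- x2[i - 1] + x2[i] })
print(pairwise)
print(identical(x2, 1:10))  # the global x2 is left untouched
```

On the example input from the question the two functions should agree, and the last two lines show why the first sapply call "doesn't work": nothing is propagated from one call to the next.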