Hello Coming from matlab background, I could use some help on this one. I need to generate a data set based on this equation X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a N(0,0,001) random variable I need say 100 values. How do I do this? Thanks
It may be possible to do this without a loop but I haven't found a way ###Generate an array of 100 N(0,1) RVs z<-rnorm(100) ###Build the array to store output x<-vector(length=100) ###Create initial value x[1]<-z[1] ###Loop though building series for(i in 2:100){ x[i]<-3.8*x[i-1]*(1-x[i-1])+z[i] } As I understand it you take an overhead hit when using loops, so they are best avoided. HTH Phineas http://www.phineas.pwp.blueyonder.co.uk/ -----Original Message----- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch]On Behalf Of Fred J. Sent: Sunday, March 07, 2004 7:55 PM To: r help Subject: [R] applying data generating function Hello Coming from matlab background, I could use some help on this one. I need to generate a data set based on this equation X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a N(0,0,001) random variable I need say 100 values. How do I do this? Thanks ______________________________________________ R-help at stat.math.ethz.ch mailing list https://www.stat.math.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
I'd use a for loop: set.seed(123) e <- 0.001*rnorm(100) x <- rep(0, 100) for(i in 2:100) x[i] <- (3.8*x[i-1]*(1-x[i-1])+e[i]) plot(x, type="l") plot(x[-100], x[-1]) "R" is great for standard vector and matrix operations, but recursions are not so easy. hope this helps. spencer graves Fred J. wrote:>Hello > >Coming from matlab background, I could use some help >on this one. > >I need to generate a data set based on this equation >X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a >N(0,0,001) random variable >I need say 100 values. > >How do I do this? > >Thanks > >______________________________________________ >R-help at stat.math.ethz.ch mailing list >https://www.stat.math.ethz.ch/mailman/listinfo/r-help >PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html > >
Fred J. wrote:>I need to generate a data set based on this equation >X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a >N(0,0,001) random variable >I need say 100 values. > >How do I do this? > > > >I assume X(t) and x(t) are the same (?). f<-function (x) { 3.8*x*(1-x) + rnorm(1,0,.001) } v=c() x=.1 # starting point for (i in 1:100) { x=f(x); v=append(v,x) } There may be smarter ways... Christophe Pallier
Regarding your comment on speed varying when replicating the runs, try running gc() first. --- Date: Sun, 07 Mar 2004 17:56:46 -0800 From: Spencer Graves <spencer.graves at pdf.com> To: Peter Dalgaard <p.dalgaard at biostat.ku.dk> Cc: Fred J. <phddas at yahoo.com>,r-help <r-help at stat.math.ethz.ch> Subject: Re: [R] applying data generating function Peter's enumeration of alternatives inspired me to compare compute times for N = 10^(2:5), with the following results: *** R 1.8.1 under Windows 2000, IBM Thinkpad T30: 10 100 1000 10000 1e+05 for loop 0 0.01 0.09 1.27 192.05 gen e + for loop 0 0.00 0.03 0.22 2.58 create storage + for loop 0 0.01 0.05 0.34 3.45 sapply 0 0.00 0.04 0.28 3.82 replicate 0 0.01 0.05 0.29 4.02 I repeated this with the "for loop" both first and last. The times tended to decline on replication, with the "for loop" time for N = 1e5 = 182.02, 126.04 (with the "for loop" last), 130.30 ("for loop" last), and 118.64 ("for loop" first again). Conclusions: (1) Apparently, in some cases, R picks up speed upon replication (2) The first 3 times for the "for loop" with N = 1e5 made me wonder if there was an order effect, with the "for loop" being longer in the first position. However, the last run with the "for loop" again first had the shortest time of 118.64, contradicting that hypothesis. By comparison, I also tried this under S-Plus 6.2: *** S-Plus 6.2, Windows 2000, IBM Thinkpad T30 ("for loop" first): 10 100 1000 10000 100000 for loop 0.01 0.05 0.331 3.976 273.073 gen e + for loop 0.00 0.04 0.320 3.154 29.112 create storage + for loop 0.01 0.03 0.231 2.113 22.242 sapply 0.00 0.04 0.380 4.757 23.003 The script I used appears below. As Peter said, "the only really crucial [issue] is to avoid the inefficient append by preallocating" the vectors to be generated. Moreover, this is only an issue for long loop, with a threshold of between 1e4 and 1e5 in this example. For shorter loops, the programmers' time is far more valuable. Enjoy. spencer graves #################### N.gen <- c(10, 100, 1000, 10000, 1e5) mtds <- c("for loop", "gen e + for loop", "create storage + for loop", "sapply", "replicate") m <- length(N.gen) ellapsed.time <- array(NA, dim=c(m, length(mtds))) dimnames(ellapsed.time) <- list(N.gen, mtds) for(iN in 1:m){ cat("\n", iN, "") N <- N.gen[iN] #for loop set.seed(123) start.time <- proc.time() f<-function (x.) { 3.8*x.*(1-x.) + rnorm(1,0,.001) } v=c() x=.1 # starting point for (i in 1:N) { x=f(x); v=append(v,x) } ellapsed.time[iN, "for loop"] <- (proc.time()-start.time)[3] cat(mtds[1], "") #gen e + for loop set.seed(123) start.time <- proc.time() e <- 0.001*rnorm(N) X <- rep(0.1, N+1) for(i in 2:(N+1)) X[i] <- (3.8*X[i-1]*(1-X[i-1])+e[i-1]) ellapsed.time[iN, "gen e + for loop"] <- (proc.time()-start.time)[3] cat(mtds[2], "") #create storage + for loop set.seed(123) start.time <- proc.time() V <- numeric(N) xv <- .1 ; for (i in 1:N) { xv <- f(xv); V[i] <- xv } ellapsed.time[iN, "create storage + for loop"] <- (proc.time()-start.time)[3] cat(mtds[3], "") #sapply set.seed(123) start.time <- proc.time() xa <- .1 ; va <- sapply(1:N, function(i) xa <<- f(xa)) ellapsed.time[iN, "sapply"] <- (proc.time()-start.time)[3] cat(mtds[4], "") if(!is.null(version$language)){ #replicate set.seed(123) start.time <- proc.time() z <- .1 ; vr <- replicate(N, z <<- f(z)) ellapsed.time[iN, "replicate"] <- (proc.time()-start.time)[3] cat(mtds[5], "") } } t(ellapsed.time) ############################# Peter Dalgaard wrote:>Christophe Pallier <pallier at lscp.ehess.fr> writes: > > > >>Fred J. wrote: >> >> >> >>>I need to generate a data set based on this equation >>>X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a >>>N(0,0,001) random variable >>>I need say 100 values. >>> >>>How do I do this? >>> >>> >>I assume X(t) and x(t) are the same (?). >> >>f<-function (x) { 3.8*x*(1-x) + rnorm(1,0,.001) } >>v=c() >>x=.1 # starting point >>for (i in 1:100) { x=f(x); v=append(v,x) } >> >>There may be smarter ways... >> >> > >Yes, but the only really crucial one is to avoid the inefficient append by >preallocating the v: > >v <- numeric(100) >x <- .1 ; for (i in 1:100) { x <- f(x); v[i] <- x } > >apart from that you can use implicit loops: > >x <- .1 ; v <- sapply(1:100, function(i) x <<- f(x)) > >or > >z <- .1 ; v <- replicate(100, z <<- f(z)) > >(You cannot use x there because of a variable capture issue which is a >bit of a bug. I intend to fix it for 1.9.0.) > > >
Its possible that there was a garbage collection at the beginning or maybe this suggestion does not apply, given the precautions you took. As far as I know, all you can do is try it and see if it gives more consistent results. --- Date: Sun, 07 Mar 2004 20:15:41 -0800 From: Spencer Graves <spencer.graves at pdf.com> To: <ggrothendieck at myway.com> Cc: <p.dalgaard at biostat.ku.dk>, <phddas at yahoo.com>, <r-help at stat.math.ethz.ch> Subject: Re: [R] applying data generating function Hi, Gabor: Thanks for the "garbage collection" suggestion. In this case, I can't imagine how it would change the results: I developed the script in an S-Plus script window, then copied it into an R session that had recently just been started. Moreover, the times generally declined upon replication. Do you think the time might INCREASE after "gc"? Best Wishes, spencer graves Gabor Grothendieck wrote:>Regarding your comment on speed varying when replicating the >runs, try running gc() first. > >--- >Date: Sun, 07 Mar 2004 17:56:46 -0800 >From: Spencer Graves <spencer.graves at pdf.com> >To: Peter Dalgaard <p.dalgaard at biostat.ku.dk> >Cc: Fred J. <phddas at yahoo.com>,r-help <r-help at stat.math.ethz.ch> >Subject: Re: [R] applying data generating function > > >Peter's enumeration of alternatives inspired me to compare compute >times for N = 10^(2:5), with the following results: > >*** R 1.8.1 under Windows 2000, IBM Thinkpad T30: >10 100 1000 10000 1e+05 >for loop 0 0.01 0.09 1.27 192.05 >gen e + for loop 0 0.00 0.03 0.22 2.58 >create storage + for loop 0 0.01 0.05 0.34 3.45 >sapply 0 0.00 0.04 0.28 3.82 >replicate 0 0.01 0.05 0.29 4.02 > >I repeated this with the "for loop" both first and last. The >times tended to decline on replication, with the "for loop" time for N = >1e5 = 182.02, 126.04 (with the "for loop" last), 130.30 ("for loop" >last), and 118.64 ("for loop" first again). > >Conclusions: > >(1) Apparently, in some cases, R picks up speed upon replication > >(2) The first 3 times for the "for loop" with N = 1e5 made me >wonder if there was an order effect, with the "for loop" being longer in >the first position. However, the last run with the "for loop" again >first had the shortest time of 118.64, contradicting that hypothesis. > >By comparison, I also tried this under S-Plus 6.2: > >*** S-Plus 6.2, Windows 2000, IBM Thinkpad T30 ("for loop" first): >10 100 1000 10000 100000 >for loop 0.01 0.05 0.331 3.976 273.073 >gen e + for loop 0.00 0.04 0.320 3.154 29.112 >create storage + for loop 0.01 0.03 0.231 2.113 22.242 >sapply 0.00 0.04 0.380 4.757 23.003 > >The script I used appears below. As Peter said, "the only really >crucial [issue] is to avoid the inefficient append by preallocating" the >vectors to be generated. Moreover, this is only an issue for long loop, >with a threshold of between 1e4 and 1e5 in this example. For shorter >loops, the programmers' time is far more valuable. > >Enjoy. spencer graves >#################### > > >N.gen <- c(10, 100, 1000, 10000, 1e5) >mtds <- c("for loop", "gen e + for loop", "create storage + for loop", >"sapply", "replicate") >m <- length(N.gen) >ellapsed.time <- array(NA, dim=c(m, length(mtds))) >dimnames(ellapsed.time) <- list(N.gen, mtds) > >for(iN in 1:m){ >cat("\n", iN, "") >N <- N.gen[iN] >#for loop >set.seed(123) >start.time <- proc.time() >f<-function (x.) { 3.8*x.*(1-x.) + rnorm(1,0,.001) } >v=c() >x=.1 # starting point >for (i in 1:N) { x=f(x); v=append(v,x) } >ellapsed.time[iN, "for loop"] <- (proc.time()-start.time)[3] >cat(mtds[1], "") > >#gen e + for loop >set.seed(123) >start.time <- proc.time() >e <- 0.001*rnorm(N) >X <- rep(0.1, N+1) >for(i in 2:(N+1)) >X[i] <- (3.8*X[i-1]*(1-X[i-1])+e[i-1]) >ellapsed.time[iN, "gen e + for loop"] <- (proc.time()-start.time)[3] >cat(mtds[2], "") > >#create storage + for loop >set.seed(123) >start.time <- proc.time() >V <- numeric(N) >xv <- .1 ; for (i in 1:N) { xv <- f(xv); V[i] <- xv } >ellapsed.time[iN, "create storage + for loop"] <- >(proc.time()-start.time)[3] >cat(mtds[3], "") > >#sapply >set.seed(123) >start.time <- proc.time() >xa <- .1 ; va <- sapply(1:N, function(i) xa <<- f(xa)) >ellapsed.time[iN, "sapply"] <- (proc.time()-start.time)[3] >cat(mtds[4], "") > >if(!is.null(version$language)){ >#replicate >set.seed(123) >start.time <- proc.time() >z <- .1 ; vr <- replicate(N, z <<- f(z)) >ellapsed.time[iN, "replicate"] <- (proc.time()-start.time)[3] >cat(mtds[5], "") >} > >} > >t(ellapsed.time) >############################# >Peter Dalgaard wrote: > > > >>Christophe Pallier <pallier at lscp.ehess.fr> writes: >> >> >> >> >> >>>Fred J. wrote: >>> >>> >>> >>> >>> >>>>I need to generate a data set based on this equation >>>>X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a >>>>N(0,0,001) random variable >>>>I need say 100 values. >>>> >>>>How do I do this? >>>> >>>> >>>> >>>> >>>I assume X(t) and x(t) are the same (?). >>> >>>f<-function (x) { 3.8*x*(1-x) + rnorm(1,0,.001) } >>>v=c() >>>x=.1 # starting point >>>for (i in 1:100) { x=f(x); v=append(v,x) } >>> >>>There may be smarter ways... >>> >>> >>> >>> >>Yes, but the only really crucial one is to avoid the inefficient append by >>preallocating the v: >> >>v <- numeric(100) >>x <- .1 ; for (i in 1:100) { x <- f(x); v[i] <- x } >> >>apart from that you can use implicit loops: >> >>x <- .1 ; v <- sapply(1:100, function(i) x <<- f(x)) >> >>or >> >>z <- .1 ; v <- replicate(100, z <<- f(z)) >> >>(You cannot use x there because of a variable capture issue which is a >>bit of a bug. I intend to fix it for 1.9.0.) >>
> -----Original Message----- > From: r-help-bounces at stat.math.ethz.ch > [mailto:r-help-bounces at stat.math.ethz.ch] On Behalf Of Spencer Graves > Sent: Monday, March 08, 2004 11:21 AM > To: Prof Brian Ripley > Cc: Fred J.; r-help; Peter Dalgaard > Subject: Re: [R] applying data generating function > > > With "gc()" right before each call to "proc.time", as > Brian Ripley > and Gabor Grothendieck suggested, the times were substantially more > stable. For the for loop, extending the vector with each of 1e5 > iterations, I got 181.25, 181.27, 182.72, 182.44, and 182.56. The > averages of the last 3 of these tests are as follows: > > 10 100 1000 10000 1e+05 > for loop 0 0.01 0.05 1.13 182.14 > gen e + for loop 0 0.00 0.03 0.26 2.58 > create storage + for loop 0 0.00 0.04 0.39 3.94 > sapply 0 0.00 0.03 0.32 4.05 > replicate 0 0.00 0.03 0.31 3.55 > > Without "gc()", I got 192.05, 182.02, 126.04, 130.30, and 118.64 for > extending the vector with each for loop iteration. > > Three more observations about this: > > 1. Without "gc()", the times started higher but declined by > roughly a third. This suggests that R may actually be storing > intermediate "semi-compiled" code in "garbage" and using it when the > situation warrants -- but "gc()" discards it. > > 2. Increasing N from 1e4 to 1e5 increased the time NOT by a > factor of 10 but by a factor of 161 = 182/1.13 when the length of the > vector was extended in each iteration.Growing a vector inside for loop is expensive and inefficient because R needs to allocate a new vector of sufficient length, copy the data over, then de-reference the old vector. (This my understanding of how it works. Hopefully I'm not too far off...) HTH, Andy> 3. The fastest was indeed to generate "e" and allocate all > required storage before entering the loop, but the major > improvement was > due to allocation of storage before initiating the for loop. > However, > with 1000 or fewer iterations, the difference was hardly detectable. > > Best Wishes, > spencer graves > > Prof Brian Ripley wrote: > > >You need to run gc() before running such timings in R, as > the first run > >often has to pay for a level-0 garbage collection. That is > normally the > >cause of (1), although I haven't seen differences as large > as 10 secs (but > >have no idea of the speed of your machine, and have seen 3 secs). > > > >On Sun, 7 Mar 2004, Spencer Graves wrote: > > > > > > > >> Peter's enumeration of alternatives inspired me to > compare compute > >>times for N = 10^(2:5), with the following results: > >> > >>*** R 1.8.1 under Windows 2000, IBM Thinkpad T30: > >> 10 100 1000 10000 1e+05 > >>for loop 0 0.01 0.09 1.27 192.05 > >>gen e + for loop 0 0.00 0.03 0.22 2.58 > >>create storage + for loop 0 0.01 0.05 0.34 3.45 > >>sapply 0 0.00 0.04 0.28 3.82 > >>replicate 0 0.01 0.05 0.29 4.02 > >> > >> I repeated this with the "for loop" both first and last. The > >>times tended to decline on replication, with the "for loop" > time for N = > >>1e5 = 182.02, 126.04 (with the "for loop" last), 130.30 ("for loop" > >>last), and 118.64 ("for loop" first again). > >> > >> Conclusions: > >> > >> (1) Apparently, in some cases, R picks up speed upon > replication > >> > >> (2) The first 3 times for the "for loop" with N = 1e5 made me > >>wonder if there was an order effect, with the "for loop" > being longer in > >>the first position. However, the last run with the "for > loop" again > >>first had the shortest time of 118.64, contradicting that > hypothesis. > >> > >> By comparison, I also tried this under S-Plus 6.2: > >> > >>*** S-Plus 6.2, Windows 2000, IBM Thinkpad T30 ("for loop" first): > >> 10 100 1000 10000 100000 > >> for loop 0.01 0.05 0.331 3.976 273.073 > >> gen e + for loop 0.00 0.04 0.320 3.154 29.112 > >>create storage + for loop 0.01 0.03 0.231 2.113 22.242 > >> sapply 0.00 0.04 0.380 4.757 23.003 > >> > >> The script I used appears below. As Peter said, "the > only really > >>crucial [issue] is to avoid the inefficient append by > preallocating" the > >>vectors to be generated. Moreover, this is only an issue > for long loop, > >>with a threshold of between 1e4 and 1e5 in this example. > For shorter > >>loops, the programmers' time is far more valuable. > >> > >>Enjoy. spencer graves > >>#################### > >> > >> > >>N.gen <- c(10, 100, 1000, 10000, 1e5) > >>mtds <- c("for loop", "gen e + for loop", "create storage + > for loop", > >> "sapply", "replicate") > >>m <- length(N.gen) > >>ellapsed.time <- array(NA, dim=c(m, length(mtds))) > >>dimnames(ellapsed.time) <- list(N.gen, mtds) > >> > >>for(iN in 1:m){ > >> cat("\n", iN, "") > >> N <- N.gen[iN] > >>#for loop > >>set.seed(123) > >>start.time <- proc.time() > >>f<-function (x.) { 3.8*x.*(1-x.) + rnorm(1,0,.001) } > >>v=c() > >>x=.1 # starting point > >>for (i in 1:N) { x=f(x); v=append(v,x) } > >>ellapsed.time[iN, "for loop"] <- (proc.time()-start.time)[3] > >>cat(mtds[1], "") > >> > >>#gen e + for loop > >>set.seed(123) > >>start.time <- proc.time() > >>e <- 0.001*rnorm(N) > >>X <- rep(0.1, N+1) > >>for(i in 2:(N+1)) > >> X[i] <- (3.8*X[i-1]*(1-X[i-1])+e[i-1]) > >>ellapsed.time[iN, "gen e + for loop"] <- (proc.time()-start.time)[3] > >>cat(mtds[2], "") > >> > >>#create storage + for loop > >>set.seed(123) > >>start.time <- proc.time() > >>V <- numeric(N) > >>xv <- .1 ; for (i in 1:N) { xv <- f(xv); V[i] <- xv } > >>ellapsed.time[iN, "create storage + for loop"] <- > >>(proc.time()-start.time)[3] > >>cat(mtds[3], "") > >> > >>#sapply > >>set.seed(123) > >>start.time <- proc.time() > >>xa <- .1 ; va <- sapply(1:N, function(i) xa <<- f(xa)) > >>ellapsed.time[iN, "sapply"] <- (proc.time()-start.time)[3] > >>cat(mtds[4], "") > >> > >>if(!is.null(version$language)){ > >>#replicate > >>set.seed(123) > >>start.time <- proc.time() > >>z <- .1 ; vr <- replicate(N, z <<- f(z)) > >>ellapsed.time[iN, "replicate"] <- (proc.time()-start.time)[3] > >>cat(mtds[5], "") > >>} > >> > >>} > >> > >>t(ellapsed.time) > >>############################# > >>Peter Dalgaard wrote: > >> > >> > >> > >>>Christophe Pallier <pallier at lscp.ehess.fr> writes: > >>> > >>> > >>> > >>> > >>> > >>>>Fred J. wrote: > >>>> > >>>> > >>>> > >>>> > >>>> > >>>>>I need to generate a data set based on this equation > >>>>>X(t) = 3.8x(t-1) (1-x(t-1)) + e(t), where e(t) is a > >>>>>N(0,0,001) random variable > >>>>>I need say 100 values. > >>>>> > >>>>>How do I do this? > >>>>> > >>>>> > >>>>> > >>>>> > >>>>I assume X(t) and x(t) are the same (?). > >>>> > >>>>f<-function (x) { 3.8*x*(1-x) + rnorm(1,0,.001) } > >>>>v=c() > >>>>x=.1 # starting point > >>>>for (i in 1:100) { x=f(x); v=append(v,x) } > >>>> > >>>>There may be smarter ways... > >>>> > >>>> > >>>> > >>>> > >>>Yes, but the only really crucial one is to avoid the > inefficient append by > >>>preallocating the v: > >>> > >>>v <- numeric(100) > >>>x <- .1 ; for (i in 1:100) { x <- f(x); v[i] <- x } > >>> > >>>apart from that you can use implicit loops: > >>> > >>>x <- .1 ; v <- sapply(1:100, function(i) x <<- f(x)) > >>> > >>>or > >>> > >>>z <- .1 ; v <- replicate(100, z <<- f(z)) > >>> > >>>(You cannot use x there because of a variable capture > issue which is a > >>>bit of a bug. I intend to fix it for 1.9.0.) > >>> > >>> > >>> > >>> > >>> > >>______________________________________________ > >>R-help at stat.math.ethz.ch mailing list > >>https://www.stat.math.ethz.ch/mailman/listinfo/r-help > >>PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > >> > >> > >> > >> > > > > > > > > > ______________________________________________ > R-help at stat.math.ethz.ch mailing list > https://www.stat.math.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide! > http://www.R-project.org/posting-guide.html > >------------------------------------------------------------------------------ Notice: This e-mail message, together with any attachments,...{{dropped}}