thr3ads.net - R help - [R] speed issue in simulating a stochastic process [Nov 2014]

If this information is useful, please help other people find it:
Share via:

Matteo Richiardi

2014-Nov-06 14:47 UTC

[R] speed issue in simulating a stochastic process

I wish to simulate the following stochastic process, for i = 1...N
individuals and t=1...T periods:

y_{i,t} = y_0 + lambda Ey_{t-1} + epsilon_{i,t}

where Ey_{t-1} is the average of y over the N individuals computed at time
t-1.

My solution (below) works but is incredibly slow. Is there a faster but
still clear and readable alternative?

Thanks a lot. Matteo

rm(list=ls())
library(plyr)
y0 = 0
lambda = 0.1
N = 20
T = 100
m_e = 0
sd_e = 1

# construct the data frame and initialize y
D = data.frame(
  id = rep(1:N,T),
  t = rep(1:T, each = N),
  y = rep(y0,N*T)
)

# update y
for(t in 2:T){
  ybar.L1 = mean(D[D$t==t-1,"y"])
  for(i in 1:N){
    epsilon = rnorm(1,mean=m_e,sd=sd_e)
    D[D$id==i & D$t==t,]$y = lambda*y0+(1-lambda)*ybar.L1+epsilon
  }
}

ybar <- ddply(D,~t,summarise,mean=mean(y))

plot(ybar, col = "blue", type = "l")

	[[alternative HTML version deleted]]

Thomas Adams

2014-Nov-06 15:13 UTC

head link

[R] speed issue in simulating a stochastic process

Matteo,

I tried your example code using R 3.1.1 on an iMac (24-inch, Early 2009), 3.06
GHz Intel Core 2 Duo, 8 GB 1333 MHz DDR3, NVIDIA GeForce GT 130 512 MB
running Mac OS X 10.10 (Yosemite).

After entering your code, the elapsed time from the time I hit return to
when the graphics appeared was about 2 seconds ? is this about what you are
seeing?

Regards,
Tom



On Thu, Nov 6, 2014 at 7:47 AM, Matteo Richiardi <matteo.richiardi at
gmail.com> wrote:
> I wish to simulate the following stochastic process, for i = 1...N
> individuals and t=1...T periods:
>
> y_{i,t} = y_0 + lambda Ey_{t-1} + epsilon_{i,t}
>
> where Ey_{t-1} is the average of y over the N individuals computed at time
> t-1.
>
> My solution (below) works but is incredibly slow. Is there a faster but
> still clear and readable alternative?
>
> Thanks a lot. Matteo
>
> rm(list=ls())
> library(plyr)
> y0 = 0
> lambda = 0.1
> N = 20
> T = 100
> m_e = 0
> sd_e = 1
>
> # construct the data frame and initialize y
> D = data.frame(
>   id = rep(1:N,T),
>   t = rep(1:T, each = N),
>   y = rep(y0,N*T)
> )
>
> # update y
> for(t in 2:T){
>   ybar.L1 = mean(D[D$t==t-1,"y"])
>   for(i in 1:N){
>     epsilon = rnorm(1,mean=m_e,sd=sd_e)
>     D[D$id==i & D$t==t,]$y = lambda*y0+(1-lambda)*ybar.L1+epsilon
>   }
> }
>
> ybar <- ddply(D,~t,summarise,mean=mean(y))
>
> plot(ybar, col = "blue", type = "l")
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

Thomas Adams

2014-Nov-06 16:42 UTC

head link

[R] speed issue in simulating a stochastic process

Matteo,

Ah ? OK, N=20, I did not catch that. You have nested for loops, which R is
known to be exceedingly slow at handling ? if you can reorganize the code
to eliminate the loops, your performance will increase significantly.

Tom

On Thu, Nov 6, 2014 at 7:47 AM, Matteo Richiardi <matteo.richiardi at
gmail.com> wrote:
> I wish to simulate the following stochastic process, for i = 1...N
> individuals and t=1...T periods:
>
> y_{i,t} = y_0 + lambda Ey_{t-1} + epsilon_{i,t}
>
> where Ey_{t-1} is the average of y over the N individuals computed at time
> t-1.
>
> My solution (below) works but is incredibly slow. Is there a faster but
> still clear and readable alternative?
>
> Thanks a lot. Matteo
>
> rm(list=ls())
> library(plyr)
> y0 = 0
> lambda = 0.1
> N = 20
> T = 100
> m_e = 0
> sd_e = 1
>
> # construct the data frame and initialize y
> D = data.frame(
>   id = rep(1:N,T),
>   t = rep(1:T, each = N),
>   y = rep(y0,N*T)
> )
>
> # update y
> for(t in 2:T){
>   ybar.L1 = mean(D[D$t==t-1,"y"])
>   for(i in 1:N){
>     epsilon = rnorm(1,mean=m_e,sd=sd_e)
>     D[D$id==i & D$t==t,]$y = lambda*y0+(1-lambda)*ybar.L1+epsilon
>   }
> }
>
> ybar <- ddply(D,~t,summarise,mean=mean(y))
>
> plot(ybar, col = "blue", type = "l")
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

William Dunlap

2014-Nov-06 17:46 UTC

head link

[R] speed issue in simulating a stochastic process

I find that representing the simulated data as a T row by N column matrix
allows for a clearer and faster simulation function.  E.g., compare the
output of the following two functions, the first of which uses your code
and the second a matrix representation (which I convert to a data.frame at
the end so I can compare outputs easily).  I timed both of them for T=10^3
times and N=50 individuals; both gave the same results and f1 was 10000
times faster than f0:
  > set.seed(1); t0 <- system.time(s0 <- f0(N=50,T=1000))
  > set.seed(1); t1 <- system.time(s1 <- f1(N=50,T=1000))
  > rbind(t0, t1)
     user.self sys.self elapsed user.child sys.child
  t0    436.87     0.11  438.48         NA        NA
  t1      0.04     0.00    0.04         NA        NA
  > all.equal(s0, s1)
  [1] TRUE

The functions are:

f0 <- function(N = 20, T = 100, lambda = 0.1, m_e = 0, sd_e = 1, y0 = 0)
{
  # construct the data frame and initialize y
  D <- data.frame(
    id = rep(1:N,T),
    t = rep(1:T, each = N),
    y = rep(y0,N*T)
  )

  # update y
  for(t in 2:T){
    ybar.L1 = mean(D[D$t==t-1,"y"])
    for(i in 1:N){
      epsilon = rnorm(1,mean=m_e,sd=sd_e)
      D[D$id==i & D$t==t,]$y = lambda*y0+(1-lambda)*ybar.L1+epsilon
    }
  }
  D
}

f1 <- function(N = 20, T = 100, lambda = 0.1, m_e = 0, sd_e = 1, y0 = 0)
{
  # same process simulated using a matrix representation
  #   The T rows are times, the N columns are individuals
  M <- matrix(y0, nrow=T, ncol=N)
  if (T > 1) for(t in 2:T) {
    ybar.L1 <- mean(M[t-1L,])
    epsilon <- rnorm(N, mean=m_e, sd=sd_e)
    M[t,] <- lambda * y0 + (1-lambda)*ybar.L1 + epsilon
  }
  # convert to the data.frame representation that f0 uses
  tM <- t(M)
  data.frame(id = as.vector(row(tM)), t = as.vector(col(tM)), y as.vector(tM))
}



Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Thu, Nov 6, 2014 at 6:47 AM, Matteo Richiardi <matteo.richiardi at
gmail.com> wrote:
> I wish to simulate the following stochastic process, for i = 1...N
> individuals and t=1...T periods:
>
> y_{i,t} = y_0 + lambda Ey_{t-1} + epsilon_{i,t}
>
> where Ey_{t-1} is the average of y over the N individuals computed at time
> t-1.
>
> My solution (below) works but is incredibly slow. Is there a faster but
> still clear and readable alternative?
>
> Thanks a lot. Matteo
>
> rm(list=ls())
> library(plyr)
> y0 = 0
> lambda = 0.1
> N = 20
> T = 100
> m_e = 0
> sd_e = 1
>
> # construct the data frame and initialize y
> D = data.frame(
>   id = rep(1:N,T),
>   t = rep(1:T, each = N),
>   y = rep(y0,N*T)
> )
>
> # update y
> for(t in 2:T){
>   ybar.L1 = mean(D[D$t==t-1,"y"])
>   for(i in 1:N){
>     epsilon = rnorm(1,mean=m_e,sd=sd_e)
>     D[D$id==i & D$t==t,]$y = lambda*y0+(1-lambda)*ybar.L1+epsilon
>   }
> }
>
> ybar <- ddply(D,~t,summarise,mean=mean(y))
>
> plot(ybar, col = "blue", type = "l")
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
	[[alternative HTML version deleted]]

R help - Nov 2014 - speed issue in simulating a stochastic process

[R] speed issue in simulating a stochastic process

[R] speed issue in simulating a stochastic process

[R] speed issue in simulating a stochastic process

[R] speed issue in simulating a stochastic process