Hi:
On Mon, Dec 13, 2010 at 4:28 PM, chandu
<Chandrasekhar.karnam@gmail.com> wrote:
>
> Dear all,
>
> I am relatively new to R. I would like to know how can we write the
> realizations (for example generated through rnorm or runif) in to a data
> file. It would be very inefficient to first generate values and then write
> them in to file using "write" function. Instead, is there a way to generate
> 1 value at a time and append them in to the file.
>
On the contrary, it is very inefficient in R to generate one value at a time
and then append it to a file. R can do vectorized calculations, so for
generating random data, it takes one line of code; e.g.,
rnorm(1000000, 0, 5)
generates a vector of 1000000 random numbers from a normal distribution with
mean 0 and standard deviation 5. Generating the data and writing it to a
file can also take one line:
write.csv(rnorm(1000000, 0, 5), file = 'myRandomNumbers.csv',
          row.names = FALSE, quote = FALSE)
On my system, it took 4.24 seconds to write 1000000 random numbers to a
file (its size is 18.6 Mb).
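As a quick check of the one-line approach, here is a minimal sketch that
writes a smaller sample to a temporary file and reads it back. The sample
size of 1000, the seed, and the tempfile() path are assumptions chosen for
illustration; they are not part of the original question.

```r
# Illustrative sketch: sample size, seed, and temp file are assumptions.
set.seed(42)
x <- rnorm(1000, 0, 5)

out <- tempfile(fileext = ".csv")
write.csv(x, file = out, row.names = FALSE, quote = FALSE)

# write.csv() stores an unnamed vector as a one-column table headed "x".
y <- read.csv(out)$x

length(y)         # number of values read back
all.equal(x, y)   # compares within the precision used when writing
```

Reading the file back this way is only a sanity check; for the real job the
single write.csv() call is all that is needed.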
Now, let's try your for loop approach, without writing to a file:
# Pre-allocate space for the vector:
u <- vector('numeric', 1000000)
system.time(for(i in seq_along(u)) u[i] <- rnorm(1, 0, 5))
user system elapsed
6.86 0.00 6.88
# Initialize an empty object and populate it one element at a time:
u <- NULL
> system.time(for(i in 1:1000000) u <- c(u, rnorm(1, 0, 5)))
The second one is so inefficient because R vectors cannot grow in place.
Each call to c(u, ...) allocates a new, longer vector and copies every
existing element into it, so the total work grows roughly with the square
of the number of values. As u gets bigger, R spends more and more effort
allocating and copying memory, and the loop slows down precipitously. I got
impatient with waiting, so I interrupted it:
Timing stopped at: 571.83 9.08 585.53
> system.time(for(i in 1:1000) u <- c(u, rnorm(1, 0, 5)))
user system elapsed
2.64 0.08 2.74
> system.time(for(i in 1:10000) u <- c(u, rnorm(1, 0, 5)))
user system elapsed
27.47 0.50 28.08
Multiply the last one (elapsed time is on the far right) by 100 to get a
probable lower bound for how long the full loop of 1000000 takes. There are
more efficient ways to do this (pre-allocating the vector, as in the first
loop, for example), but the point is that one thing you definitely do NOT
want to do in R is to append to an object one value at a time. It wouldn't
be much different if you were writing to an external file.
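If the worry is holding all the values in memory at once, a middle ground is
to generate the values in large blocks and append each block through an open
file connection. This is a minimal sketch, not a tuned solution: the total
count, the block size, and the temporary file are assumptions chosen for
illustration.

```r
# Illustrative sketch: n_total, block, and the output file are assumptions.
n_total <- 100000   # total number of values wanted
block   <- 10000    # values generated per block
out <- tempfile(fileext = ".txt")

con <- file(out, open = "w")   # keep one connection open for all blocks
written <- 0
while (written < n_total) {
  k <- min(block, n_total - written)
  # One vectorized rnorm() call per block, written one value per line.
  writeLines(as.character(rnorm(k, 0, 5)), con)
  written <- written + k
}
close(con)

length(readLines(out))   # number of lines written
```

Each block is still generated with a single vectorized call, so the loop
runs only n_total / block times rather than once per value.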
> The question might be trivial to many experts. I appreciate your
> help.
>
The question is far from trivial, and many people have put in great amounts
of effort to make R efficient. If possible, vectorized operations are a
good way to go because they are generally fast. Having said that, there are
occasions where it is more efficient to perform loops, but for novices who
are used to Fortran/C/Java looping constructs, it is usually the case that
there are very fast ways to do the same thing in R without a loop using
vectorized operations.
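To make the comparison concrete, here is a small sketch that fills a vector
three ways and times each one. The size n = 10000 is an assumption picked so
that the slowest version finishes quickly; timings depend on the machine and
the R version, so none are quoted here.

```r
# Illustrative sketch: n is an assumption, kept small so it runs quickly.
n <- 10000

# 1. Vectorized: a single call generates all n values.
t_vec <- system.time(a <- rnorm(n, 0, 5))

# 2. Pre-allocated: the loop fills a vector that already has length n.
b <- numeric(n)
t_pre <- system.time(for (i in seq_len(n)) b[i] <- rnorm(1, 0, 5))

# 3. Growing: each c() call builds a new, longer vector.
g <- NULL
t_grow <- system.time(for (i in seq_len(n)) g <- c(g, rnorm(1, 0, 5)))

# User, system, and elapsed times side by side.
rbind(vectorized = t_vec, preallocated = t_pre, growing = t_grow)[, 1:3]
```

Increasing n should widen the gap between the growing version and the other
two, since only the growing version copies the whole vector on every pass.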
HTH,
Dennis
>
> Thank you
> --
> View this message in context:
> http://r.789695.n4.nabble.com/writing-sample-values-in-to-a-file-tp3086286p3086286.html
> Sent from the R help mailing list archive at Nabble.com.
>