First, I love R and am grateful to be using this free and extremely high quality software. Recently I have been comparing two algorithms and naturally I programmed in R first. It is so slow that I can almost feel its pain. So I decided to do a comparison with Java. To draw 500,0000 truncated normal, Java program takes 2 second and R takes 72 seconds. Not a computer science major, I find it hard to understand how R can be so slow. R code: normal.truncated <- function(mean, sd, lower, upper) { repeat { deviate = round( rnorm(1, mean, sd) ); if(deviate<=upper && deviate>=lower) break; } temp <- c(deviate+.5, deviate-.5, upper+.5, lower-.5); result <- pnorm(temp, mean, sd); prob <- (result[1] - result[2])/(result[3] - result[4]); return(c(deviate, prob)); } print(date()); n = 500000; result = numeric(n); for(i in 1:n) result[i] = normal.truncated(0,1, -10, 10)[1] print(date()); Java Code: package Mcdonald; import VisualNumerics.math.*; import java.util.*; public final class normal_univariate_truncated { public double harvest, prob; static Random ran_obj = new Random(); public normal_univariate_truncated(double mean, double var, double lower, double upper) { double sd = Math.sqrt(var); lower -= .5; upper += .5; double left = (lower - mean)/sd; double right = (upper - mean)/sd; double prob2 = Statistics.normalCdf(right) - Statistics.normalCdf(left); while(true) { harvest = mean + ran_obj.nextGaussian()*sd; if(harvest>lower && harvest<upper) break; } harvest = Sfun.nearestInteger(harvest); left = (harvest - .5 - mean)/sd; right = (harvest + .5 - mean)/sd; double prob1 = Statistics.normalCdf(right) - Statistics.normalCdf(left); prob = prob1/prob2; } public static void main(String[] useless) { System.out.println(new Date()); int n = 500000; double[] result = new double[n]; for(int i=0; i<n; i++) { normal_univariate_truncated obj = new normal_univariate_truncated(0, 1, -10, 10); result[i] = obj.harvest; } System.out.println(new Date()); } } ====Jason G. Liao, Ph.D. Division of Biometrics University of Medicine and Dentistry of New Jersey 335 George Street, Suite 2200 New Brunswick, NJ 08903-2688 phone (732) 235-8611, fax (732) 235-9777 http://www.geocities.com/jg_liao __________________________________________________ -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
ripley@stats.ox.ac.uk
2002-Aug-06 20:38 UTC
[R] badly written code may be slow (was hard to believe speed difference)
If you write R like C or Java, it will often be slow. R is a vector language, so learn to work with it rather than against it. For a truncated (standard) normal, try qnorm(runif(500000, a, b)) for suitable a and b (pnorm(lower) and pnorm(upper)?) And what are you going to do with 0.5m truncated normals that makes 70s significant? BTW, R has system.time() to time things. On Tue, 6 Aug 2002, Jason Liao wrote:> First, I love R and am grateful to be using this free and extremely > high quality software. > > Recently I have been comparing two algorithms and naturally I > programmed in R first. It is so slow that I can almost feel its pain. > So I decided to do a comparison with Java. To draw 500,0000 truncated > normal, Java program takes 2 second and R takes 72 seconds. Not a > computer science major, I find it hard to understand how R can be so > slow.It's not the language ....> R code: > > normal.truncated <- function(mean, sd, lower, upper) > { > repeat > { > deviate = round( rnorm(1, mean, sd) ); > if(deviate<=upper && deviate>=lower) break; > } > temp <- c(deviate+.5, deviate-.5, upper+.5, lower-.5); > result <- pnorm(temp, mean, sd); > prob <- (result[1] - result[2])/(result[3] - result[4]); > > return(c(deviate, prob)); > } > > print(date()); > > n = 500000; > result = numeric(n); > > for(i in 1:n) result[i] = normal.truncated(0,1, -10, 10)[1] > print(date()); > > Java Code: > > package Mcdonald; > import VisualNumerics.math.*; > import java.util.*; > > public final class normal_univariate_truncated > { > public double harvest, prob; > static Random ran_obj = new Random(); > > public normal_univariate_truncated(double mean, double var, > double lower, double upper) > { > double sd = Math.sqrt(var); > > lower -= .5; > upper += .5; > double left = (lower - mean)/sd; > double right = (upper - mean)/sd; > double prob2 = Statistics.normalCdf(right) - > Statistics.normalCdf(left); > > while(true) > { > harvest = mean + ran_obj.nextGaussian()*sd; > if(harvest>lower && harvest<upper) break; > } > > harvest = Sfun.nearestInteger(harvest); > left = (harvest - .5 - mean)/sd; > right = (harvest + .5 - mean)/sd; > double prob1 = Statistics.normalCdf(right) - > Statistics.normalCdf(left); > > prob = prob1/prob2; > } > > > public static void main(String[] useless) > { > System.out.println(new Date()); > int n = 500000; > double[] result = new double[n]; > for(int i=0; i<n; i++) > { > normal_univariate_truncated obj = new > normal_univariate_truncated(0, 1, -10, 10); > result[i] = obj.harvest; > } > System.out.println(new Date()); > } > > } > > > ====> Jason G. Liao, Ph.D. > Division of Biometrics > University of Medicine and Dentistry of New Jersey > 335 George Street, Suite 2200 > New Brunswick, NJ 08903-2688 > phone (732) 235-8611, fax (732) 235-9777 > http://www.geocities.com/jg_liao > > __________________________________________________ > > > > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._ >-- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Tue, 6 Aug 2002, Jason Liao wrote:> First, I love R and am grateful to be using this free and extremely > high quality software. > > Recently I have been comparing two algorithms and naturally I > programmed in R first. It is so slow that I can almost feel its pain. > So I decided to do a comparison with Java. To draw 500,0000 truncated > normal, Java program takes 2 second and R takes 72 seconds. Not a > computer science major, I find it hard to understand how R can be so > slow. > > R code: > > normal.truncated <- function(mean, sd, lower, upper) > { > repeat > { > deviate = round( rnorm(1, mean, sd) ); > if(deviate<=upper && deviate>=lower) break; > } > temp <- c(deviate+.5, deviate-.5, upper+.5, lower-.5); > result <- pnorm(temp, mean, sd); > prob <- (result[1] - result[2])/(result[3] - result[4]); > > return(c(deviate, prob)); > } > > print(date()); > > n = 500000; > result = numeric(n); > > for(i in 1:n) result[i] = normal.truncated(0,1, -10, 10)[1] > print(date()); >Largely because you aren't using vectorised code. Loops in R are slow (to some extent true with all interpreters). The way around this is to use R's ability to work on vectors or matrices instead of looping over their elements. Eg rtruncnorm<-function(n,mean,sd,lower,upper){ rval<-numeric(n) done<-rep(FALSE,n) while(m<-sum(!done)){ rval[!done] <- round(rnorm(m,mean,sd)) done <- rval>=lower & rval<=upper } prob<-(pnorm(rval+0.5,mean,sd)-pnorm(rval-0.5,mean,sd))/(pnorm(upper+0.5,mean,sd)-pnorm(lower-0.5)) cbind(rval,prob) } rtruncnorm2<-function(n,mean,sd,lower,upper){ uval<-runif(n,pnorm(lower-0.5,mean,sd),pnorm(upper+0.5,mean,sd)) rval<-round(qnorm(uval,mean,sd)) prob<-(pnorm(rval+0.5,mean,sd)-pnorm(rval-0.5,mean,sd))/(pnorm(upper+0.5,mean,sd)-pnorm(lower-0.5)) cbind(rval,prob) } are two improved versions of your code that run more than 20 times faster on my computer. The first uses the same rejection sampling algorithm that you use, but does it for the whole vector at once. The second generates truncated uniforms (which is trivial) and transforms them to Normal. In addition to being a lot faster than your code I think they will handle the case where mean, sd, lower or upper are also vectors. Even faster, if you only want whole numbers and always the same parameters, is to compute the probabilities once and then sample from the resulting discrete distribution. rtruncnorm3<-function(n,mean,sd,lower,upper){ p<-diff(pnorm((lower:(1+upper))-0.5,mean,sd)) sample(lower:upper,n,replace=TRUE,prob=p) } This is about 400 times faster than your version (and the speed advantage would increase if the truncation points were less extreme), apparently faster than the Java program. By the way, there's no point truncating standard Normals to [-10,10]. The probability of getting anything outside [-10,10] in 500000 samples is about 10^-17. Even for [-6, 6] it's less than 0.001. -thomas -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Marc Schwartz
2002-Aug-12 13:41 UTC
[R] Attaching marginal summary plots to the main matrix plot
> Dear helplist, I am trying to visualise a datamatrix together with its > marginal densities. > Suppose dat <- matrix( rnorm(1000), nrow=25 ) > > I can the main visual plot using image, levelplot etc. But how do Iattach> the plots for rows sums and column sums to the top/bottom and righthand> side of the original plot ? > > The only way I can think of is by using par(mfrow=c(2,2)) command butthe> result is not desirable. Is there are useful command to do this ? If > succesful I am thinking of affixing the boxplots by row and columns aswell.> I have gone through the lattice help but am still confused. Thank you.If I correctly understand what you are trying to do, the way to do this would be to use par("fig"), which adjusts the size of the plot region within the current device. par(mfrow) and par(mfcol) result in the device being split up into equally sized plot regions, which does not achieve what I think you want. Below is some code as an example. I am using barplot() for the row and column sum plots in the example. This may or may not be how you want these plotted. I will leave the fine tuning of the plot types, sizes, axis ranges, labels, row/column alignments, etc. for you. This should give you a conceptual feel for how to go about it. HTH. Marc ------------------ #define matrix dat <- matrix(rnorm(1000), nrow = 25) # save the current parameters op <- par(no.readonly = TRUE) # open a new plot window plot.new() # adjust the plot region size for each area # the LL and UR corners of the default region are (0,0) and (1,1) # par("fig") is (x1, x2, y1, y2) # first set the region for the larger plot at the bottom # note that the dimensions will overlap a bit in each region par(fig = c(0.0, 0.75, 0.0, 0.75)) # now draw the larger plot image(dat) # do not overwrite the current plot par(new = TRUE) # now set the second region for the small plot at the top par(fig = c(0.0, 0.75, .65, 1.0)) # plot the colSums barplot(colSums(dat)) # do not overwrite the current plot par(new = TRUE) # now set the third region for the small plot to the right par(fig = c(0.65, 1.0, 0.0, 0.75)) # plot the rowSums barplot(rowSums(dat), horiz = TRUE) # restore the original parameters par(op) -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._