thr3ads.net - R help - [R] case study of efficient R coding [Oct 2002]

If this information is useful, please help other people find it:
Share via:

Jason Liao

2002-Oct-15 16:38 UTC

[R] case study of efficient R coding

The following are four ways I coded for the variance of multinomial
distribution and their CPU time. It is a bit surprising that the 3rd
method beats the 4th one substantially.


 var.multinomial1 = function(n, pi)
+   {
+      p = length(pi);
+      var = array(0, c(p,p));
+      
+      for(i in 1:p)
+        for(j in 1:p)
+        {
+           if(i==j) var[i,j] = n*pi[i]*(1-pi[1])
+   else var[i,j] = - n*pi[i]*pi[j];
+        }
+    }> 
>    var.multinomial2 = function(n, pi)+    {
+      n*( -pi%*%t(pi) + diag(pi) )
+    }>    
>            var.multinomial3 = function(n, pi)+            {
+               var = -pi%*%t(pi);
+               var[row(var)==col(var)] = pi*(1-pi); 
+               n*var;
+            }>    
>    var.multinomial4 = function(n, pi)+            {
+               var = -pi%*%t(pi);
+               var[seq(1, by=(p+1), length=p)] = pi*(1-pi); 
+               n*var;
+            }>    
>    n = 100;
>    pi = exp( rnorm(10));
>    pi = pi/sum(pi);
>    n.simu = 1000;
>    
>    system.time(for(i in 1:n.simu) var = var.multinomial1(n, pi))
[1]  4.50  0.04 99.73    NA    NA>     system.time(for(i in 1:n.simu) var = var.multinomial2(n, pi))
[1]  0.33  0.00 10.74    NA    NA>      system.time(for(i in 1:n.simu) var = var.multinomial3(n, pi))
[1] 0.14 0.00 4.17   NA   NA>       system.time(for(i in 1:n.simu) var = var.multinomial4(n, pi))
[1] 0.21 0.00 6.47   NA   NA>    
>      
>    But first, does anyone know a vectorized way of coding (summing the
outter product of rows)

p = 10;
x = rnorm(p*p);
dim(x) = c(p.p);

var = array(0, c(p,p));
for(i in 1:p) var = var + x[i,]%*% t(x[i,]);

====Jason G. Liao, Ph.D.
Division of Biometrics
University of Medicine and Dentistry of New Jersey
335 George Street, Suite 2200
New Brunswick, NJ 08903-2688
phone (732) 235-8611, fax (732) 235-9777
http://www.geocities.com/jg_liao

__________________________________________________

Faith Hill - Exclusive Performances, Videos & More

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Huntsinger, Reid

2002-Oct-15 18:59 UTC

head link

[R] case study of efficient R coding

t(X)%*%X or better crossprod(X) is the sum of outer products
of the rows of X. The (i,j) entry is the sum of X[k,i]*X[k,j]
over k, i.e., the (i,j) entry of the outer product of X[k, ]
with itself.

Reid Huntsinger

-----Original Message-----
From: Jason Liao [mailto:jg_liao at yahoo.com]
Sent: Tuesday, October 15, 2002 12:38 PM
To: r-help at stat.math.ethz.ch
Subject: [R] case study of efficient R coding


[...]
But first, does anyone know a vectorized way of coding (summing the
outter product of rows)

p = 10;
x = rnorm(p*p);
dim(x) = c(p.p);

var = array(0, c(p,p));
for(i in 1:p) var = var + x[i,]%*% t(x[i,]);

====Jason G. Liao, Ph.D.
Division of Biometrics
University of Medicine and Dentistry of New Jersey
335 George Street, Suite 2200
New Brunswick, NJ 08903-2688
phone (732) 235-8611, fax (732) 235-9777
http://www.geocities.com/jg_liao

__________________________________________________

Faith Hill - Exclusive Performances, Videos & More

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.
-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._.
_._

------------------------------------------------------------------------------
Notice: This e-mail message, together with any attachments, contains information
of Merck & Co., Inc. (Whitehouse Station, New Jersey, USA) that may be
confidential, proprietary copyrighted and/or legally privileged, and is intended
solely for the use of the individual or entity named on this message.  If you
are not the intended recipient, and have received this message in error, please
immediately return this by e-mail and then delete it.

=============================================================================
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Jun Yan

2002-Oct-15 20:35 UTC

head link

[R] case study of efficient R coding

This is an interesting case study. Method 3 and 4 only differ in how the
indexes are obtained. Method 3 uses seq, which uses the primitive function
c. Method 4 evaluates p^2 logical expression in R. When p is small, method
4 can beat method 3. But as p gets bigger, the time usage by method 3 only
slightly increases, while the time usage by method 4 increasely
dramatically.

test <- function(nsim, p) {
  z1<- system.time(for (i in 1:nsim) seq(1, by=(p+1), length=p))
  mat <- matrix(1, p, p)
  z2 <- system.time(for (i in 1:nsim) row(mat) == col(mat))
  rbind(seq=z1, rowcol=z2)
}
> test(1000, 200)       [,1] [,2] [,3] [,4] [,5]
seq    0.21    0 0.21    0    0
rowcol 9.16    0 9.16    0    0> test(1000, 100)       [,1] [,2] [,3] [,4] [,5]
seq    0.17 0.01 0.19    0    0
rowcol 2.06 0.00 2.05    0    0> test(1000, 10)       [,1] [,2] [,3] [,4] [,5]
seq    0.16 0.00 0.16    0    0
rowcol 0.04 0.01 0.05    0    0

On Tue, 15 Oct 2002, Jason Liao wrote:
> The following are four ways I coded for the variance of multinomial
> distribution and their CPU time. It is a bit surprising that the 3rd
> method beats the 4th one substantially.
> 
> 
>  var.multinomial1 = function(n, pi)
> +   {
> +      p = length(pi);
> +      var = array(0, c(p,p));
> +      
> +      for(i in 1:p)
> +        for(j in 1:p)
> +        {
> +           if(i==j) var[i,j] = n*pi[i]*(1-pi[1])
> +   else var[i,j] = - n*pi[i]*pi[j];
> +        }
> +    }
> > 
> >    var.multinomial2 = function(n, pi)
> +    {
> +      n*( -pi%*%t(pi) + diag(pi) )
> +    }
> >    
> >            var.multinomial3 = function(n, pi)
> +            {
> +               var = -pi%*%t(pi);
> +               var[row(var)==col(var)] = pi*(1-pi); 
> +               n*var;
> +            }
> >    
> >    var.multinomial4 = function(n, pi)
> +            {
> +               var = -pi%*%t(pi);
> +               var[seq(1, by=(p+1), length=p)] = pi*(1-pi); 
> +               n*var;
> +            }
> >    
> >    n = 100;
> >    pi = exp( rnorm(10));
> >    pi = pi/sum(pi);
> >    n.simu = 1000;
> >    
> >    system.time(for(i in 1:n.simu) var = var.multinomial1(n, pi))
> [1]  4.50  0.04 99.73    NA    NA
> >     system.time(for(i in 1:n.simu) var = var.multinomial2(n, pi))
> [1]  0.33  0.00 10.74    NA    NA
> >      system.time(for(i in 1:n.simu) var = var.multinomial3(n, pi))
> [1] 0.14 0.00 4.17   NA   NA
> >       system.time(for(i in 1:n.simu) var = var.multinomial4(n, pi))
> [1] 0.21 0.00 6.47   NA   NA
> >    
> >      
> >    
> But first, does anyone know a vectorized way of coding (summing the
> outter product of rows)
> 
> p = 10;
> x = rnorm(p*p);
> dim(x) = c(p.p);
> 
> var = array(0, c(p,p));
> for(i in 1:p) var = var + x[i,]%*% t(x[i,]);
> 
> ====> Jason G. Liao, Ph.D.
> Division of Biometrics
> University of Medicine and Dentistry of New Jersey
> 335 George Street, Suite 2200
> New Brunswick, NJ 08903-2688
> phone (732) 235-8611, fax (732) 235-9777
> http://www.geocities.com/jg_liao
> 
> __________________________________________________
> 
> Faith Hill - Exclusive Performances, Videos & More
> 
>
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
> r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
> Send "info", "help", or "[un]subscribe"
> (in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
>
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

R help - Oct 2002 - case study of efficient R coding

[R] case study of efficient R coding

[R] case study of efficient R coding

[R] case study of efficient R coding