thr3ads.net - R help - [R] colSums in C [Dec 2001]

If this information is useful, please help other people find it:
Share via:

David Brahm

2001-Dec-14 23:25 UTC

[R] colSums in C

Hi, all!

   My project today is to write a speedy colSums(), which is a function
available in S-Plus to add the columns of a matrix.  Here are 4 ways to do it,
with the time it took (elapsed, best of 3 trials) in both R and S-Plus:

m <- matrix(1, 400, 40000)
x1 <- apply(m, 2, sum)                  ## R=16.55   S=52.39
x2 <- as.vector(rep(1,nrow(m)) %*% m)   ## R= 2.39   S= 8.52
x3 <- g.apply.sum(m,2,F)                ## R= 5.72   S=11.25
x4 <- colSums(m,F)                      ##           S= 0.48

Method 1 is the naive "apply", and is significantly faster in R than
S-Plus.
Method 2 is a matrix operation I learned from Bill Venables; also faster in R.
Method 3 is my C code, given below.
Method 4 is the S-Plus built-in function, the bogey to beat!

My question is, can anyone suggest how to speed up my C code to get results
comparable to colSums, or even to method 2?  Is it stupid code, a bad compiler,
non-optimal optimizer options, or ???

Here's my interface function and simple C code (compiled with gcc via R
SHLIB, no Makevars file):

g.apply.sum <- function(x, MARGIN=1, na.rm=T) {
  dx <- dim(x)
  if (na.rm) x[is.na(x)] <- 0
  .C("applysum", as.real(x), dx, as.integer(MARGIN),
z=real(dx[MARGIN]),
     DUP=FALSE)$z
}

void applysum(double *x, long *dx, long *mar, double *z) {
  long i, j;
  if (*mar==1)
    for(i=0; i<dx[0]; i++) for(j=0; j<dx[1]; j++) z[i] += x[i + dx[0]*j];
  else
    for(j=0; j<dx[1]; j++) for(i=0; i<dx[0]; i++) z[j] += x[i + dx[0]*j];
}

Possibly useful audit trail:

agate|Rpkg> R SHLIB g.colSums/src/*.c
/res/local/bin/gcc -I/res/local/lib/R/include  -I/res/local/include   -fPIC  -g
-O2 -c g.colSums/src/applyfilt.c -o g.colSums/src/applyfilt.o
/res/local/bin/gcc -I/res/local/lib/R/include  -I/res/local/include   -fPIC  -g
-O2 -c g.colSums/src/applysum.c -o g.colSums/src/applysum.o
/res/local/bin/gcc -shared  -o g.colSums/src/applyfilt.so
g.colSums/src/applyfilt.o g.colSums/src/applysum.o  -L/res/local/include

Version:
 platform = sparc-sun-solaris2.6
 arch = sparc
 os = solaris2.6
 system = sparc, solaris2.6
 status = 
 major = 1
 minor = 3.1
 year = 2001
 month = 08
 day = 31
 language = R

-- 
                              -- David Brahm (brahm at alum.mit.edu)
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Peter Dalgaard BSA

2001-Dec-15 00:08 UTC

head link

[R] colSums in C

David Brahm  <brahm at alum.mit.edu> writes:
> My question is, can anyone suggest how to speed up my C code to get results
> comparable to colSums, or even to method 2?  Is it stupid code, a bad
compiler,
> non-optimal optimizer options, or ???
Your code is not very efficiently written, but the compiler ought to
be able to catch the common subexpressions etc. On the other hand,
especially on Sparc there are some pretty tricky pipelining and
parallelism issues that can be really hard to get right and can mean
quite a lot. And there are some optimizations that a compiler cannot
really do without making assumptions that might be wrong.
(Prototypically: what if z and x have overlapping storage? *You* know
they don't, but the compiler really can't be sure.)
 > Here's my interface function and simple C code (compiled with gcc via R
> SHLIB, no Makevars file):
> 
> g.apply.sum <- function(x, MARGIN=1, na.rm=T) {
>   dx <- dim(x)
>   if (na.rm) x[is.na(x)] <- 0
>   .C("applysum", as.real(x), dx, as.integer(MARGIN),
z=real(dx[MARGIN]),
>      DUP=FALSE)$z
> }
> 
> void applysum(double *x, long *dx, long *mar, double *z) {
>   long i, j;
>   if (*mar==1)
>     for(i=0; i<dx[0]; i++) for(j=0; j<dx[1]; j++) z[i] += x[i +
dx[0]*j];
>   else
>     for(j=0; j<dx[1]; j++) for(i=0; i<dx[0]; i++) z[j] += x[i +
dx[0]*j];
> }
Hmm. What happens if you rewrite the "else" case as something like

register long i;
register double s, *p; 

p = x;
for(j=0; j<dx[1]; j++) {
        i = dx[0];
        s = 0.;
        while (i--)
                s += *p++;
        z[j] = s;
}          

-- 
   O__  ---- Peter Dalgaard             Blegdamsvej 3  
  c/ /'_ --- Dept. of Biostatistics     2200 Cph. N   
 (*) \(*) -- University of Copenhagen   Denmark      Ph: (+45) 35327918
~~~~~~~~~~ - (p.dalgaard at biostat.ku.dk)             FAX: (+45) 35327907
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

David Brahm

2001-Dec-17 23:57 UTC

head link

[R] colSums in C

I asked how to write speedy C code to implement colSums().  My original version
on a 400x40000 matrix took 5.72s.

Peter Dalgaard <p.dalgaard at biostat.ku.dk> suggested some more efficient
coding,
which sped my example up to 3.90s.  Douglas Bates <bates at stat.wisc.edu>
suggested using .Call() instead of .C, and I was amazed to see the time went
down to 0.69s!  Doug had actually posted his code (a package called
"MatUtils")
to R-help on July 19, 2001.

I've taken Doug's code, added names to the result, and included an na.rm
flag.
Unfortunately, my na.rm option makes it really slow again! (12.15s).  That's
no
faster than pre-processing the matrix with "m[is.na(m)] <- 0".  Can
anyone help
me understand why the ISNA conditional is taking so much time?  The C code is
below.  Thanks!

---- begin "colSums.c" (90% due to Doug Bates) ----
#include <R.h>
#include <Rdefines.h>

SEXP colSums(SEXP m, SEXP NaRm) {
    int i, j, *mdims, n, p, narm;
    double *mm, sum;
    SEXP val, nms;

    if (!isMatrix(m)) error("m must be a matrix");
    mdims = INTEGER(GET_DIM(m));
    narm  = asLogical(NaRm);
    n = mdims[0]; p = mdims[1];
    PROTECT(val  = NEW_NUMERIC(p));
    PROTECT(nms  = GET_COLNAMES(GET_DIMNAMES(m)));
    PROTECT(  m  = AS_NUMERIC(m));
    mm = REAL(m);
    if (narm) for (j = 0; j < p; j++) {
      for (sum = 0., i = 0; i < n; i++) if (!ISNA(mm[i])) sum += mm[i];
      REAL(val)[j] = sum;
      mm += n;
    } else for (j = 0; j < p; j++) {
      for (sum = 0., i = 0; i < n; i++) sum += mm[i];
      REAL(val)[j] = sum;
      mm += n;
    }
    namesgets(val, nms);
    UNPROTECT(3);
    return val;
}
---- end "colSums.c" ----

-- 
                              -- David Brahm (brahm at alum.mit.edu)
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Maybe Matching Threads

Search for more maybe matching threads

R help - Dec 2001 - colSums in C

[R] colSums in C

[R] colSums in C

[R] colSums in C

Maybe Matching Threads