thr3ads.net - R help - [R] is match slow? [Nov 2001]

Agustin Lobo

2001-Nov-20 17:26 UTC

[R] is match slow?

I'm doing

m <- match(matriz, origen, 0)

where matriz is a 270x900 matrix and
origen is an 11675-element vector, and it is taking
a very long time.

Is match a function
implemented in C? If not, would C
code be faster?

Thanks

Agus

Dr. Agustin Lobo
Instituto de Ciencias de la Tierra (CSIC)
Lluis Sole Sabaris s/n
08028 Barcelona SPAIN
tel 34 93409 5410
fax 34 93411 0012
alobo at ija.csic.es


-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To: r-help-request at
stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Thomas Lumley

2001-Nov-20 17:35 UTC

[R] is match slow?

On Tue, 20 Nov 2001, Agustin Lobo wrote:
>
> I'm doing
>
> m <- match(matriz, origen, 0)
>
> where matriz is a 270x900 matrix and
> origen a 11675 elements vector, and is taking
> a very long time.
>
> Is match a function
> implemented in C? If not, would a C
> code be faster?

Well, typing the function name at the R prompt gives

R> match
function (x, table, nomatch = NA, incomparables = FALSE)
{
    if (!is.logical(incomparables) || incomparables)
        .NotYetUsed("incomparables != FALSE")
    .Internal(match(if (is.factor(x)) as.character(x) else x,
        if (is.factor(table)) as.character(table) else table,
        nomatch))
}

showing that it is .Internal and thus in compiled C code. Looking at
src/main/unique.c reveals that it is implemented by sticking `table' in a
hash table and looking up each element of x, which is a pretty good
algorithm for this problem. If the hash function is good it will take
about length(table)+length(x) hash computations, and you won't be able to
beat that easily.
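
The hashing idea described above can be sketched in R itself. This is only an illustration, not the code in src/main/unique.c: the name hash_match is hypothetical, an environment stands in for the hash table, and keys are compared as character strings, so it assumes values that survive as.character() unchanged (for example integers or non-empty strings, with no missing values).

```r
# Sketch of hash-based matching: build the table once, then look up each x.
# hash_match is a hypothetical name; an R environment plays the hash table.
hash_match <- function(x, table, nomatch = 0L) {
  nomatch <- as.integer(nomatch)
  keys <- as.character(table)
  h <- new.env(hash = TRUE)
  # Insert in reverse order so the first occurrence of a value wins,
  # which is the position match() reports for duplicated table entries.
  for (i in rev(seq_along(keys))) {
    assign(keys[i], i, envir = h)
  }
  # One hash lookup per element of x.
  vapply(as.character(x), function(k) {
    if (exists(k, envir = h, inherits = FALSE)) {
      get(k, envir = h, inherits = FALSE)
    } else {
      nomatch
    }
  }, integer(1), USE.NAMES = FALSE)
}

x   <- c(3L, 7L, 42L, 3L)
tab <- c(7L, 3L, 3L, 9L)
print(hash_match(x, tab))     # compare with match(x, tab, 0L)
```

The work is one insertion per element of the table plus one lookup per element of x, which is the length(table)+length(x) cost mentioned above. The built-in match() does the same kind of thing in C and will be far faster than this interpreted loop.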

I don't even find it that slow

> matriz<-matrix(rnorm(270*900),ncol=900)
> origen<-rnorm(11675)
> system.time(match(matriz,origen,0))
[1] 0.27 0.01 0.33 0.00 0.00

or with a lot of matches

> matriz<-matrix(sample(270*900,1:20,TRUE),ncol=900)
> origen<-1:11675
> system.time(match(matriz,origen,0))
[1] 0.01 0.00 0.01 0.00 0.00


	-thomas

Thomas Lumley			Asst. Professor, Biostatistics
tlumley at u.washington.edu	University of Washington, Seattle
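
The timings in this message can be re-run on a current machine with a sketch like the following. The seed and the variable names ending in 2 are additions for illustration, and the second case writes sample() with its arguments in x, size, replace order, which is an assumption about what was intended. Elapsed times will differ from the 2001 figures.

```r
set.seed(1)  # assumed seed, only so the run is reproducible

# Case 1: continuous values, so essentially no matches are expected.
matriz <- matrix(rnorm(270 * 900), ncol = 900)
origen <- rnorm(11675)
t1 <- system.time(m1 <- match(matriz, origen, 0))

# Case 2: every element of the matrix occurs somewhere in the table.
matriz2 <- matrix(sample(1:20, 270 * 900, replace = TRUE), ncol = 900)
origen2 <- 1:11675
t2 <- system.time(m2 <- match(matriz2, origen2, 0))

print(t1)
print(t2)

# match() returns a plain vector; restore the matrix shape if it is needed.
dim(m2) <- dim(matriz2)
```

If both calls finish in a fraction of a second, a slow run on the real data is more likely to come from the data themselves or from memory pressure than from match() being interpreted code.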


Prof Brian Ripley

2001-Nov-20 17:35 UTC

[R] is match slow?

On Tue, 20 Nov 2001, Agustin Lobo wrote:
>
> I'm doing
>
> m <- match(matriz, origen, 0)
>
> where matriz is a 270x900 matrix and
> origen a 11675 elements vector, and is taking
> a very long time.
>
> Is match a function
> implemented in C? If not, would a C

All of R is implemented in C or Fortran, ultimately.  But you could
do

> match
function (x, table, nomatch = NA, incomparables = FALSE)
{
    if (!is.logical(incomparables) || incomparables)
        .NotYetUsed("incomparables != FALSE")
    .Internal(match(if (is.factor(x)) as.character(x) else x,
        if (is.factor(table)) as.character(table) else table,
        nomatch))
}
to see that it is a direct call to an internal function, and they
are in C.
> code be faster?
The internal C code (do_match in src/main/unique.c) uses hashing,
so unless that is not doing a good job on your particular data it ought
to be about as fast as possible.

You could have looked at the source code in the same way I did: that's
the beauty of an open-source system.
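
The check described above, printing a function and looking for .Internal in its body, can also be done programmatically. A minimal sketch follows; the helper name calls_internal is hypothetical, and it only inspects the R-level body, so it says nothing about functions that reach compiled code through .Call or similar interfaces.

```r
# Hypothetical helper: TRUE if f is a primitive or its body calls .Internal.
calls_internal <- function(f) {
  if (is.primitive(f)) {
    return(TRUE)            # primitives have no R-level body; they are C code
  }
  # all.names() lists every symbol in the body, function names included.
  ".Internal" %in% all.names(body(f))
}

print(calls_internal(match))                 # match() hands off to .Internal
print(calls_internal(function(x) x + 1))     # an ordinary closure
```

Typing the function name at the prompt remains the quickest way to see this for a single function; a helper like this is only useful when scanning many functions at once.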

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
