On Thu, 17 Apr 2008, Herve Pages wrote:
> Couldn't is.unsorted() bail out immediately here (after comparing
> the first 2 elements):
>
> > x <- 20000000:1
> > system.time(is.unsorted(x), gcFirst=TRUE)
> user system elapsed
> 0.084 0.040 0.124
>
> > x <- 200000000:1
> > system.time(is.unsorted(x), gcFirst=TRUE)
> user system elapsed
> 0.772 0.440 1.214
The C code does bail out upon seeing the first out- of-order pair, but
before calling the C code, the S code does any(is.na(x)), forcing a
scan of the entire data. If you remove the is.na calls from
is.unsorted's S code you will see the timings improve in your example.
(It looks easy to do the NA checks in the C code.)
is.unsorted.no.nacheck <- function (x, na.rm = FALSE) {
if (is.null(x))
return(FALSE)
if (!is.atomic(x))
return(NA)
.Internal(is.unsorted(x))
}
> x <- 20000000:1
> system.time(is.unsorted(x), gcFirst=TRUE)
user system elapsed
0.356 0.157 0.514
> system.time(is.unsorted.no.nacheck(x), gcFirst=TRUE)
user system elapsed
0 0 0
> revx <- rev(x)
> system.time(is.unsorted(revx), gcFirst=TRUE)
user system elapsed
0.500 0.170 0.672
> system.time(is.unsorted.no.nacheck(revx),gcFirst=TRUE)
user system elapsed
0.131 0.000 0.132
----------------------------------------------------------------------------
Bill Dunlap
Insightful Corporation
bill at insightful dot com
360-428-8146
"All statements in this message represent the opinions of the author and
do
not necessarily reflect Insightful Corporation policy or position."