Sklyar, Oleg (London)
2008-Sep-09 13:29 UTC
[Rd] 'xtfrm' performance (influences 'order' performance) in R devel
Hello everybody,

it looks like the presence of some (I do not know which) S4 methods for a given S4 class degrades the performance of xtfrm (used in 'order' in new R-devel) by a factor of millions. This is for classes that ARE derived from numeric directly and thus should be quite trivial to convert to numeric.

Consider the following example:

setClass("TimeDateBase",
    representation("numeric", mode="character"),
    prototype(mode="posix")
)
setClass("TimeDate",
    representation("TimeDateBase", tzone="character"),
    prototype(tzone="London")
)
x = new("TimeDate", 1220966224 + runif(1e5))

system.time({ z = order(x) })
## > system.time({ z = order(x) })
##    user  system elapsed
##   0.048   0.000   0.048

getClass("TimeDate")
## Class "TimeDate"

## Slots:

## Name:      .Data     tzone      mode
## Class:   numeric character character

## Extends:
## Class "TimeDateBase", directly
## Class "numeric", by class "TimeDateBase", distance 2
## Class "vector", by class "TimeDateBase", distance 3

Now, if I load a library that not only defines these same classes, but also a bunch of methods for them, then I get the following result:

library(AHLCalendar)
x = now() + runif(1e5) ## just random times in POSIXct format
x[1:5]
## TimeDate [posix] object in 'Europe/London' of length 5:
## [1] "2008-09-09 14:19:35.218" "2008-09-09 14:19:35.672"
## [3] "2008-09-09 14:19:35.515" "2008-09-09 14:19:35.721"
## [5] "2008-09-09 14:19:35.657"

> system.time({ z = order(x) })

Enter a frame number, or 0 to exit

 1: system.time({
 2: order(x)
 3: lapply(z, function(x) if (is.object(x)) xtfrm(x) else x)
 4: FUN(X[[1]], ...)
 5: xtfrm(x)
 6: xtfrm.default(x)
 7: as.vector(rank(x, ties.method = "min", na.last = "keep"))
 8: rank(x, ties.method = "min", na.last = "keep")
 9: switch(ties.method, average = , min = , max .Internal(rank(x[!nas], ties.
10: .gt(c(1220966375.21811, 1220966375.67217, 1220966375.51470, 1220966375.7211
11: x[j]
12: x[j]

Selection: 0
Timing stopped at: 47.618 13.791 66.478

At the same time:

system.time({ z = as.numeric(x) }) ## same as x@.Data
##    user  system elapsed
##   0.001   0.000   0.001

The only difference between the two is that I have the following methods defined for TimeDate (full listing below).

Any idea why this could be happening? And yes, it is down to the xtfrm function; 'order' was just the place where the problem occurred. Should the xtfrm function be smarter with respect to classes that are actually derived from 'numeric'?

> showMethods(class="TimeDate")
Function: + (package base)
e1="TimeDate", e2="TimeDate"
e1="TimeDate", e2="numeric"
    (inherited from: e1="TimeDateBase", e2="numeric")

Function: - (package base)
e1="TimeDate", e2="TimeDate"

Function: Time (package AHLCalendar)
x="TimeDate"

Function: TimeDate (package AHLCalendar)
x="TimeDate"

Function: TimeDate<- (package AHLCalendar)
x="TimeSeries", value="TimeDate"

Function: TimeSeries (package AHLCalendar)
x="data.frame", ts="TimeDate"
x="matrix", ts="TimeDate"
x="numeric", ts="TimeDate"

Function: [ (package base)
x="TimeDate", i="POSIXt", j="missing"
x="TimeDate", i="Time", j="missing"
x="TimeDate", i="TimeDate", j="missing"
x="TimeDate", i="integer", j="missing"
    (inherited from: x="TimeDateBase", i="ANY", j="missing")
x="TimeDate", i="logical", j="missing"
    (inherited from: x="TimeDateBase", i="ANY", j="missing")
x="TimeSeries", i="TimeDate", j="missing"
x="TimeSeries", i="TimeDate", j="vector"

Function: [<- (package base)
x="TimeDate", i="ANY", j="ANY", value="ANY"
x="TimeDate", i="ANY", j="ANY", value="numeric"
x="TimeDate", i="missing", j="ANY", value="ANY"
x="TimeDate", i="missing", j="ANY", value="numeric"

Function: add (package AHLCalendar)
x="TimeDate"

Function: addMonths (package AHLCalendar)
x="TimeDate"

Function: addYears (package AHLCalendar)
x="TimeDate"

Function: align (package AHLCalendar)
x="TimeDate", to="character"
x="TimeDate", to="missing"

Function: as.POSIXct (package base)
x="TimeDate"

Function: as.POSIXlt (package base)
x="TimeDate"

Function: coerce (package methods)
from="TimeDate", to="TimeDateBase"

Function: coerce<- (package methods)
from="TimeDate", to="numeric"

Function: dates (package AHLCalendar)
x="TimeDate"

Function: format (package base)
x="TimeDate"

Function: fxFwdDate (package AHLCalendar)
x="TimeDate", country="character"

Function: fxSettleDate (package AHLCalendar)
x="TimeDate", country="character"

Function: holidays (package AHLCalendar)
x="TimeDate"

Function: index (package AHLCalendar)
x="TimeDate", y="POSIXt"
x="TimeDate", y="Time"
x="TimeDate", y="TimeDate"

Function: initialize (package methods)
.Object="TimeDate"
    (inherited from: .Object="ANY")

Function: leapYear (package AHLCalendar)
x="TimeDate"

Function: mday (package AHLCalendar)
x="TimeDate"

Function: mode (package base)
x="TimeDate"
    (inherited from: x="TimeDateBase")

Function: mode<- (package base)
x="TimeDate", value="character"
    (inherited from: x="TimeDateBase", value="character")

Function: month (package AHLCalendar)
x="TimeDate"

Function: pretty (package base)
x="TimeDate"

Function: prettyFormat (package AHLCalendar)
x="TimeDate", munit="character"
x="TimeDate", munit="missing"

Function: print (package base)
x="TimeDate"

Function: show (package methods)
object="TimeDate"
    (inherited from: object="TimeDateBase")

Function: summary (package base)
object="TimeDate"

Function: td2tz (package AHLCalendar)
x="TimeDate"

Function: times (package AHLCalendar)
x="TimeDate"

Function: tojulian (package AHLCalendar)
x="TimeDate"

Function: toposix (package AHLCalendar)
x="TimeDate"

Function: tots (package AHLCalendar)
x="TimeDate"

Function: tzone (package AHLCalendar)
x="TimeDate"

Function: tzone<- (package AHLCalendar)
x="TimeDate"

Function: wday (package AHLCalendar)
x="TimeDate"

Function: yday (package AHLCalendar)
x="TimeDate"

Function: year (package AHLCalendar)
x="TimeDate"

Dr Oleg Sklyar
Research Technologist
AHL / Man Investments Ltd
+44 (0)20 7144 3107
osklyar at maninvestments.com
John Chambers
2008-Sep-09 14:10 UTC
[Rd] 'xtfrm' performance (influences 'order' performance) in R devel
No definitive answers, but here are a few observations.

In the code for order(), I notice that you have dropped into the branch

    if (any(unlist(lapply(z, is.object))))

where the alternative in your case would seem to have been going directly to the internal code.

You can consider a method for xtfrm(), which would help but won't get you completely back to a trivial computation. Alternatively, order() should be eligible for the new mechanism of defining methods for "...".

(Individual existing methods may not be the issue, and one can't infer anything definite from the evidence given, but a plausible culprit is the "[" method. Because [] expressions appear so often, it's always chancy to define a nontrivial method for this function.)

John

Sklyar, Oleg (London) wrote:
> Hello everybody,
>
> it looks like the presence of some (I do not know which) S4 methods for a
> given S4 class degrades the performance of xtfrm (used in 'order' in new
> R-devel) by a factor of millions. This is for classes that ARE derived
> from numeric directly and thus should be quite trivial to convert to
> numeric.
> [...]
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
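
A method for xtfrm(), as suggested in the reply above, might look like the following minimal sketch. It reuses the two class definitions from the original post. The method body (returning the .Data slot as the sort key) is an assumption made for illustration and is not taken from AHLCalendar. The sketch also assumes that the R version in use dispatches S4 methods on xtfrm() when order() calls it.

```r
library(methods)

# Classes exactly as given in the original post.
setClass("TimeDateBase",
         representation("numeric", mode = "character"),
         prototype(mode = "posix"))
setClass("TimeDate",
         representation("TimeDateBase", tzone = "character"),
         prototype(tzone = "London"))

# Hypothetical xtfrm() method: hand back the underlying numeric data,
# so that xtfrm.default() (and with it rank() and x[j]) is not reached.
setMethod("xtfrm", "TimeDate", function(x) x@.Data)

x <- new("TimeDate", 1220966224 + runif(1e5))
system.time(z <- order(x))

# The result agrees with ordering the bare numeric vector.
stopifnot(identical(z, order(x@.Data)))
```

Whether this restores the original timing in the presence of the AHLCalendar methods would need to be checked there; as noted in the reply, it may not get all the way back to a trivial computation.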