This code gives an unexpected result.
library(data.table)
library(lattice)
set.seed(123)
mydt <- data.table(date = seq.Date(as.IDate("2024-01-01"), by = 1,
length.out = 50), xgroup = "A", x = runif(50, 0, 1))
mydt <- rbindlist(list(mydt, data.table(date = mydt$date, xgroup =
"B", x = runif(50, 0, 3))))
mydt[, `:=`(xcumsum = cumsum(x)), by = .(xgroup)]
mydt[, lapply(.SD, sum), by = .(xgroup), .SDcols = c("x")]
# xgroup x
# <char> <num>
#1: A 26.00455
#2: B 71.55405
# For xgroup = "B", the line starts at the sum of all previous x values,
# including those for xgroup = "A".
# The intended result is a separate cumsum(x) for each of groups "A" and "B".
mydt[, xyplot(cumsum(x) ~ date, groups = xgroup, type = c("l",
"g"))]
mydf <- as.data.frame(mydt)
xyplot(cumsum(x) ~ date, groups = xgroup, type = c("l",
"g"), data = mydf)
#Same graph
aggregate(x ~ xgroup, FUN = sum, data = mydf)
# xgroup x
#1 A 26.00455
#2 B 71.55405
# In the graph, group "A" goes up to 26, but group "B" goes up to 26 + 71.55.
# I can get the intended graph this way:
mydt[, xyplot(xcumsum ~ date, groups = xgroup, type = c("l",
"g"),
auto.key = list(columns = 2, space = "bottom"))]
Is this a bug or an incorrect use of the function?
Thanks,
Naresh

On Sat, 28 Sep 2024 13:34:22 +0000,
Naresh Gurbuxani <naresh_gurbuxani at hotmail.com> wrote:

> mydt[, lapply(.SD, sum), by = .(xgroup), .SDcols = c("x")]
> # xgroup x
> # <char> <num>
> #1: A 26.00455
> #2: B 71.55405
>
> #For xgroup = "B", line starts at the sum of all previous x values
> including xgroup = "A"
> #Intended result is to separate cumsum(x) for groups "A" and "B"
> mydt[, xyplot(cumsum(x) ~ date, groups = xgroup, type = c("l", "g"))]

> Is this a bug or incorrect use of function?

Thank you very much for providing a reproducible example!

Lattice first evaluates the formula and only then splits the results
into groups. This is a consequence of the design where grouping is
handled by the panel functions such as panel.superpose.

I think you can obtain the desired result by letting data.table group
the rows:

xyplot(
 x ~ date, groups = xgroup, type = c("l", "g"),
 data = mydt[,
  c(list(date=date), lapply(.SD, cumsum)),
  by = xgroup, .SDcols = 'x'
 ]
)

A more idiomatic solution might be:

mydt[,xsum := cumsum(x), by = xgroup]
xyplot(xsum ~ date, groups = xgroup, type = c("l", "g"), data = mydt)

-- 
Best regards,
Ivan
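
The explanation that the formula is evaluated over all rows before any grouping can be checked without plotting. The following is only a sketch in base R: the data frame df and the variables whole, by_group and offset are names made up for this illustration, and ave() stands in for the grouped cumsum that data.table performs with by = xgroup.

```r
# Illustration only: rebuild data of the same shape as in the question.
set.seed(123)
df <- data.frame(
  xgroup = rep(c("A", "B"), each = 50),
  x      = c(runif(50, 0, 1), runif(50, 0, 3))
)

# What cumsum(x) in the formula computes: one running sum over all rows,
# with no knowledge of xgroup.
whole <- cumsum(df$x)

# What was intended: a running sum that restarts within each group.
by_group <- ave(df$x, df$xgroup, FUN = cumsum)

# Group "A" occupies the first rows, so both versions agree there.
stopifnot(isTRUE(all.equal(whole[df$xgroup == "A"],
                           by_group[df$xgroup == "A"])))

# For group "B" the ungrouped running sum is shifted up by the total of
# group "A", which is why the "B" line starts where the "A" line ends.
offset <- sum(df$x[df$xgroup == "A"])
stopifnot(isTRUE(all.equal(whole[df$xgroup == "B"],
                           by_group[df$xgroup == "B"] + offset)))
```

If both checks pass silently, the offset seen in the plot comes from where cumsum() is evaluated rather than from the plotting itself, which is consistent with computing the grouped column first and plotting it afterwards.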