2024 Sep 28
lattice xyplot with cumsum() function inside
...ate("2024-01-01"), by = 1, length.out = 50), xgroup = "A", x = runif(50, 0, 1)) mydt <- rbindlist(list(mydt, data.table(date = mydt$date, xgroup = "B", x = runif(50, 0, 3)))) mydt[, `:=`(xcumsum = cumsum(x)), by = .(xgroup)] mydt[, lapply(.SD, sum), by = .(xgroup), .SDcols = c("x")] # xgroup x # <char> <num> #1: A 26.00455 #2: B 71.55405 #For xgroup = "B", line starts at the sum of all previous x values including xgroup = "A" #Intended result is to separate cumsum(x) for groups "A" and "...
2002 May 11
deleting invariant rows and cols in a matrix
...stp != 1){ stp.row <- rep(0,nrow(clean)) stp.col <- rep(0,ncol(clean)) # Start with rows for (i in 1:nrow(clean)){ sdrow <- sd(clean[i,]) if (sdrow==0) clean <- clean[i * -1,] if (sdrow==0) stp.row[i] <- 1 } # Next check columns for (j in 1:ncol(clean)){ sdcol <- sd(clean[,j]) if (sdcol==0) clean <- clean[,j * -1] if (sdcol==0) stp.col[j] <- 1 } # Do we need to continue with the process? if (sum(stp.row)==0 && sum(stp.col)==0) stp <- 1 } # Output cleaned data to new dataset name cleaned <<- clean } ---- end R c...
2020 Sep 24
How to use `[` without evaluating the arguments.
...which(colnames(colData) %in% colIDs) lockBinding('colIDs', internals) # Assemble the pseudo row and column names for the LongTable .pasteColons <- function(...) paste(..., collapse=':') rowData[, `:=`(.rownames=mapply(.pasteColons, transpose(.SD))), .SDcols=internals$rowIDs] colData[, `:=`(.colnames=mapply(.pasteColons, transpose(.SD))), .SDcols=internals$colIDs] return(.LongTable(rowData=rowData, colData=colData, assays=assays, metadata=metadata, .intern=internals)) } I have also defined a subset...
2013 Mar 13
loop in a data.table
Hi everyone, I have a data.table called "data" with many columns which I want to group by column1 using data.table, given how fast it is. The problem with looping a data.table is that data.table does not like quotations to define the column names (e.g. "col2" instead of col2). I found a way around which is to use get("col2"), which works fine but the
2012 Sep 14
aggregate() runs out of memory
I have a large data.frame Z (2,424,185,944 bytes, 10,256,441 rows, 17 columns). I want to get the result of table(aggregate(Z$V1, FUN = length, by = list(id=Z$V2))$x). Alas, aggregate has been running for ~30 minutes, RSS is 14G, VIRT is 24.3G, and there is no end in sight. Both V1 and V2 are character vectors (not factors). Is there anything I could do to speed this up? Thanks. -- Sam Steingold
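A minimal sketch of one possible lighter-weight approach to the question above, not a reply from the original thread. Since the aggregate() call only counts rows per id, table() on the id column should give the same per-id counts without building the intermediate data frame, assuming V2 contains no NA values. The small data frame below is a hypothetical stand-in for Z.

```r
# Hypothetical stand-in for Z: two character columns, as described in the post.
Z <- data.frame(
  V1 = c("a", "b", "c", "d", "e", "f"),
  V2 = c("x", "x", "y", "z", "z", "z"),
  stringsAsFactors = FALSE
)

# Approach from the post: count rows per id with aggregate(), then tabulate the counts.
slow <- table(aggregate(Z$V1, FUN = length, by = list(id = Z$V2))$x)

# Alternative: the inner table() counts rows per id directly,
# the outer table() tabulates how many ids have each count.
fast <- table(table(Z$V2))

print(slow)
print(fast)
```

On real data the two results should match whenever V2 has no missing values. Whether this is fast enough for roughly 10 million rows depends on the number of distinct ids and on available memory.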