David McPearson
2020-Apr-17 05:49 UTC
[R] calculate row median of every three columns for a dataframe
Anna wrote:
>
> Hi all,
> I need to calculate a row median for every three columns of a
> dataframe. I made it work using the following script, but not happy
> with the script. Is there a simpler way for doing this?
>

To which Jim L responded:
>
> Hi Anna,
> I can't think of a simple way, but this function may make you happier:
>
> step_median<-function(x,window) {
>  x<-unlist(x)
>  stop<-length(x)-window+1
>  xout<-NA
>  nindx<-1
>  for(i in seq(1,stop,by=window)) {
>   xout[nindx]<-do.call("median",list(x[i:(i+window-1)]))
>   nindx<-nindx+1
>  }
>  return(xout)
> }
> apply(df,1,step_median,3)
>
> This should return a matrix where the columns are the medians
> calculated from blocks of "window" width on each row of "df". As Bert
> noted, you may want to think about a "rolling" median where the
> "windows" overlap. This can be done like so:
>
> library(zoo)
> apply(df,1,rollmedian,3)
>
> Jim

Another approach you might try is multiple calls to sapply/lapply. This won't
rid you of loops, but it will hide them:

# Example data. Some names changed to avoid collisions between
# R functions (collisions are in the gap between the headphones,
# not in R).

dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

# Turn each of the three-column groups into their own element
# in a list. Note: the subsetting (probably) fails with an
# error if ncol(dfr) is not a multiple of 3

dlist <- lapply(seq(1, ncol(dfr), by = 3),
                function(enn) dfr[ , enn + 0:2])

# Then you can use sapply to calculate the row medians for each
# of the elements.

# Both of the following seem to work. I'm not sure which is
# more readable?

sapply(dlist, function(xx) apply(xx, 1, median))

sapply(dlist, apply, 1, median)

# I'm sure the cognoscenti will have a much more elegant way
# of doing this.

Cheers y'all,
DMcP
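The lapply/sapply idea above assumes the number of columns is a multiple of 3. As a rough sketch of one way around that, the columns can be grouped by an integer block index, so that a trailing partial block is simply a shorter group. The helper name block_row_medians and its width argument are made up for illustration and do not come from the thread.

```r
# Hypothetical helper (not from the thread): row medians over consecutive
# blocks of `width` columns. A trailing partial block is kept and its
# median is taken over however many columns it has.
block_row_medians <- function(d, width = 3) {
  m <- as.matrix(d)
  # Block id per column: 0 0 0 1 1 1 ... for width = 3
  grp <- (seq_len(ncol(m)) - 1) %/% width
  # One column of row medians per block; drop = FALSE keeps single-column
  # blocks as matrices so apply() still works on them
  sapply(split(seq_len(ncol(m)), grp),
         function(cols) apply(m[, cols, drop = FALSE], 1, median))
}

dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

block_row_medians(dfr)          # one column per three-column block
block_row_medians(dfr[, 1:5])   # second block only has columns d and e
```

Whether a partial block should be kept, dropped, or treated as an error depends on the data, so this choice is an assumption rather than something the original posts settle.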
PIKAL Petr
2020-Apr-17 06:53 UTC
[R] calculate row median of every three columns for a dataframe
Hi

As usual in R, things can be done in different ways.

idx <- (0:(ncol(dfr)-1))%/%3

aggregate(t(dfr), list(idx), median)
  Group.1 V1 V2 V3
1       0  2  3  4
2       1  4  5  1

The results should be OK, although the structure is different; performance has not been tested.

Cheers
Petr

______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
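For anyone puzzling over why the aggregate() call works, here is a small sketch that spells out the intermediate steps. It reuses the example data from DMcP's message; the group label "block" is only an illustrative name, not something from the original post.

```r
dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

# Integer division maps column positions 0..5 to block ids 0 0 0 1 1 1
idx <- (0:(ncol(dfr) - 1)) %/% 3
idx

# After t(), each original column is a row, so grouping rows by idx and
# taking the median of each resulting column gives, per original row of
# dfr, the median within each three-column block
agg <- aggregate(t(dfr), list(block = idx), median)
agg
```

The result has one row per block and one column (V1, V2, V3) per original row of dfr, which is why its layout is the transpose of what the sapply approach returns.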
Eric Berger
2020-Apr-17 09:36 UTC
[R] calculate row median of every three columns for a dataframe
Some comments on the contributions:

a) for Petr's suggestion, to return the desired structure modify the statement to

t(aggregate(t(dfr), list(idx), median)[,-1])

And, although less readable, this can certainly be put into a one-liner by removing the idx definition:

t(aggregate(t(dfr), list((0:(ncol(dfr)-1))%/%3), median)[,-1])

b) to DMcP: "# I'm sure the cognoscenti will have a much more elegant way"
+1 for elegance (in my view)

c) to Jim: I think your code is instructive. From a style viewpoint I would recommend against naming a local variable 'stop' :-)

Best,
Eric
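As a rough check on the points above, the sketch below rewrites Jim's function with the local variable renamed and then compares the three approaches from the thread on the example data. The name last_start and the use of vapply are illustrative choices of mine, not part of Jim's original code.

```r
# Jim's step_median with the local `stop` renamed to `last_start`
# (a made-up name). This is a style point only: R would still find the
# stop() function when it is called, but the shadowed name is confusing.
step_median <- function(x, window) {
  x <- unlist(x)
  last_start <- length(x) - window + 1
  starts <- seq(1, last_start, by = window)
  # One median per non-overlapping block of `window` values
  vapply(starts, function(i) median(x[i:(i + window - 1)]), numeric(1))
}

dfr <- data.frame(a = c(2,3,4), b = c(3,5,1), c = c(1,3,6),
                  d = c(7,2,1), e = c(2,5,3), f = c(4,5,1))

# apply() returns one column per row of dfr, so transpose to rows x blocks
by_loop <- t(apply(dfr, 1, step_median, 3))

# DMcP's approach, written in one step
by_sapply <- sapply(seq(1, ncol(dfr), by = 3),
                    function(enn) apply(dfr[, enn + 0:2], 1, median))

# Petr's approach with the transpose suggested in a)
by_aggregate <- t(aggregate(t(dfr), list((0:(ncol(dfr) - 1)) %/% 3),
                            median)[, -1])

# Compare values only, since the three results carry different dimnames
all.equal(unname(by_loop), unname(by_sapply))
all.equal(unname(by_loop), unname(by_aggregate))
```

This only shows that the approaches agree on this small example; it says nothing about relative performance, which nobody in the thread measured.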