Hao Cen
2009-Nov-11 14:38 UTC
[R] dividing a matrix by positive sum or negative sum depending on the sign
Hi,

I have a matrix with positive numbers, negative numbers, and NAs. An example of the matrix is as follows:

    -1  -1   2  NA
     3   3  -2  -1
     1   1  NA  -2

I need to compute a scaled version of this matrix. The scaling method is dividing each positive number in each row by the sum of the positive numbers in that row, and dividing each negative number in each row by the sum of the absolute values of the negative numbers in that row.

So the resulting matrix would be

    -1/2  -1/2   2/2    NA
     3/6   3/6  -2/3  -1/3
     1/2   1/2    NA  -2/2

Is there an efficient way to do that in R? One way I am using is

1. rowSums for positive numbers in the matrix
2. rowSums for negative numbers in the matrix
3. sweep(mat, 1, posSumVec, posDivFun)
4. sweep(mat, 1, negSumVec, negDivFun)

    posDivFun = function(x,y) {
      xPosId = x>0 & !is.na(x)
      x[xPosId] = x[xPosId]/y[xPosId]
      return(x)
    }

    negDivFun = function(x,y) {
      xNegId = x<0 & !is.na(x)
      x[xNegId] = -x[xNegId]/y[xNegId]
      return(x)
    }

It is not fast enough though. This scaling is to be applied to large data sets repeatedly. I would like to make it as fast as possible. Any thoughts on improving it would be appreciated.

Thanks

Jeff
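The four steps listed in the question can be sketched as runnable code. The post does not show how posSumVec and negSumVec are built, so the two rowSums() lines below are an assumption. negSumVec is kept negative-valued here, because negDivFun negates x before dividing and only a negative divisor then reproduces the expected -1/2 entries.

```r
mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))

# Steps 1 and 2 (assumed construction, not shown in the post):
# multiplying by a logical mask zeroes the entries of the other sign.
posSumVec <- rowSums(mat * (mat > 0), na.rm = TRUE)  # sum of positives per row
negSumVec <- rowSums(mat * (mat < 0), na.rm = TRUE)  # sum of negatives per row (negative-valued)

# The two helper functions, as given in the post.
# sweep() passes y as an array with the same shape as x.
posDivFun <- function(x, y) {
  xPosId <- x > 0 & !is.na(x)
  x[xPosId] <- x[xPosId] / y[xPosId]
  x
}
negDivFun <- function(x, y) {
  xNegId <- x < 0 & !is.na(x)
  x[xNegId] <- -x[xNegId] / y[xNegId]
  x
}

# Steps 3 and 4: scale the positives first, then the negatives.
scaled <- sweep(mat, 1, posSumVec, posDivFun)
scaled <- sweep(scaled, 1, negSumVec, negDivFun)
scaled
```

The second sweep() is applied to the output of the first, so each cell is rescaled at most once: positives stay positive after step 3 and are skipped by negDivFun.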
Dimitris Rizopoulos
2009-Nov-11 15:36 UTC
[R] dividing a matrix by positive sum or negative sum depending on the sign
One approach is the following:

    mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))

    mat / ave(abs(mat), row(mat), sign(mat), FUN = sum)

I hope it helps.

Best,
Dimitris

-- 
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
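To see why the ave() one-liner works, it may help to split it into its pieces. This is only an unpacking of the expression above; the intermediate name denom is introduced here for illustration and does not appear in the thread.

```r
mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))

# Two grouping variables, each with the same shape as mat.
row(mat)    # the row index of every cell
sign(mat)   # -1, 0, 1 or NA for every cell

# ave() replaces every cell of abs(mat) with the sum over its
# (row, sign) group, i.e. the sum of absolute values of the entries
# in the same row that share the cell's sign.
denom <- ave(abs(mat), row(mat), sign(mat), FUN = sum)

# Dividing by a positive denominator keeps the original sign,
# and NA cells stay NA.
scaled <- mat / denom
scaled
```

Because the denominator is always a sum of absolute values, no separate handling of the negative case is needed.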
David Winsemius
2009-Nov-11 15:57 UTC
[R] dividing a matrix by positive sum or negative sum depending on the sign
On Nov 11, 2009, at 10:36 AM, Dimitris Rizopoulos wrote:

> one approach is the following:
>
> mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))
>
> mat / ave(abs(mat), row(mat), sign(mat), FUN = sum)

Very elegant. My solution was a bit more pedestrian, but may have some speed advantage:

    t( apply(mat, 1, function(x) ifelse( x <0, -x/sum(x[x<0], na.rm=T),
                                          x/sum(x[x>0], na.rm=T) ) ) )

    > system.time(replicate(10000, t( apply(mat, 1, function(x) ifelse( x <0, -x/sum(x[x<0], na.rm=T), x/sum(x[x>0], na.rm=T) ) ) ) ) )
       user  system elapsed
      5.958   0.027   5.977

    > system.time(replicate(10000, mat / ave(abs(mat), row(mat), sign(mat), FUN = sum) ) )
       user  system elapsed
     12.886   0.064  12.886

-- 
David

David Winsemius, MD
Heritage Laboratories
West Hartford, CT
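Both replies call an R-level function once per row or per group. A variant that works on the whole matrix at once with rowSums() is sketched below. It is not from the thread and has not been benchmarked here, so whether it is actually faster on large data is an assumption to check with system.time(). The function name scale_by_sign is hypothetical.

```r
# Hypothetical helper, not part of the thread: scale each cell by the
# sum of absolute values of the same-signed entries in its row.
scale_by_sign <- function(mat) {
  pos <- rowSums(mat * (mat > 0), na.rm = TRUE)    # sum of positives per row
  neg <- -rowSums(mat * (mat < 0), na.rm = TRUE)   # sum of |negatives| per row
  idx <- as.vector(row(mat))                       # row index of every cell
  # Pick the matching row total for every cell; NA cells give an NA denominator.
  # Exact zeros are not covered by the rule in the question; here they are
  # divided by the row's positive total.
  denom <- ifelse(mat < 0, neg[idx], pos[idx])
  mat / denom
}

mat <- rbind(c(-1, -1, 2, NA), c(3, 3, -2, -1), c(1, 1, NA, -2))
scaled <- scale_by_sign(mat)
scaled
```

It can be timed the same way as the two solutions above, for example with system.time(replicate(10000, scale_by_sign(mat))), ideally on a matrix of realistic size rather than the 3 x 4 example.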