Dear R-list, I am in the process of translating a long function written in Matlab into R (mainly because I am a big of fan of R, and folks will not have to pay to use it :). In the translation of this function I got stack because they use spdiags, which, as far as I can tell it is not available in R. I have explored the Matrix package, from which I borrowed some of the functions (e.g., sparseMatrix), but I could not actually find an equivalent to spdiags (please, let me know if it is there somewhere). So, I have written my own spdiags function (below); following also a suggestion in an old, and perhaps unique post, about this issue. It works only for square matrices (that's my need), however I have a couple of issues, mainly related to computational efficiency: 1) if I use it with a sparseMatrix, it throws a tedious warning "<sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient"; can I suppress this warning somehow, this is slowing the computation very radically; 2) I can go around this problem by translating a sparseMatrix back into a logical matrix before I run spdiags on it. However, the loop gets very slow for large matrices (e.g., 2000x2000), which is the kind of matrices I have to handle. If you look in the code, I have placed a system.time() where the code is slowing down, and it takes about: user system elapsed 0.28 0.05 0.33 to complete an iteration...thus, I was wondering whether there is a more efficient way to do what I am doing...also, if you spot other places where the function could be optimized I would be very glad to hear it! thank you very much in advance for your kind help, Best, Moreno ####################################################################### ## it works only for square matrices ## it could work with sparse matrices but it spits a tedious warning ## it is definitely inefficient compared to the original matlab code ## choose below different matrices to test the function. # r = c(2,3,5,5); c = c(2,1,4,5) # A = sparseMatrix(r, c) # A = replicate(1000, rnorm(1000) ) # A = rbind(c(1,2,3),c(2,3,4),c(3,4,5)) spdiags = function(A){ # Find all nonzero diagonals i = rep(seq(1, nrow(A),1),nrow(A)); j = sort(i); d = sort(j-i); # d = d(find(diff([-inf; d]))); ## from Matlab ... # d = c(d[which(diff(d) == 1)], d[length(d)] ) ## this emulate above but needs to stick in last element d = unique(d); ##this should work just fine and it is simpler p = length(d); ##the nr. col of the new matrix m = nrow(A); n = ncol(A); B = matrix(0, nrow = min(c(m,n)), ncol = p); for (k in 1:p){ # print(k) cl = vector(); if (m >= n){ i = max(1, 1+d[k]):min(n, m+d[k]); } else { i = max(1, 1-d[k]):min(m,n-d[k]); } system.time( if (length(i) != 0){ B[i,k] = A[ col(A) == row (A) - d[k]] } ) } return (list( B = B, d = d) ) } -- The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
Moreno I. Coco <M.I.Coco <at> sms.ed.ac.uk> writes: [snip snip snip]> > So, I have written my own spdiags function (below); following > also a suggestion in an old, and perhaps unique post, about > this issue. > > It works only for square matrices (that's my need), however I > have a couple of issues, mainly related to computational > efficiency: > > 1) if I use it with a sparseMatrix, it throws a tedious warning > "<sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient"; > can I suppress this warning somehow, this is slowing the computation > very radically;quick answer: use suppressMessages()> > 2) I can go around this problem by translating a sparseMatrix back > into a logical matrix before I run spdiags on it. However, the loop > gets very slow for large matrices (e.g., 2000x2000), which is the > kind of matrices I have to handle. If you look in the code, > I have placed a system.time() where the code is slowing down, and > it takes about: >I'm not quite sure how to do it, but I think you should look at the ?band function in Matrix. In combination with diag() of a suitably truncated matrix, you should be able to extract bands of sparse matrices efficiently ...
Dear R-list, l have been working on a translation of a matlab library into R, it took me a while but I am almost there to submit it to CRAN...however, for this library to be computationally competitive I need to solve an issue with the home-made R version of the spdiags.m function (an old issue coming back). What I have now working is: # A = rbind(c(1,3,1,0), c(1,1,0,2), c(2,1,0,0) ) # A = replicate(100, rnorm(50)) A = replicate(1000, rnorm(1000)) exdiag <- function(mat, off) {mat[row(mat)+off == col(mat)]} spdiags <- function(A){ require(Matrix) indx = which(A != 0, arr.ind = T) i = indx[,1]; j = indx[,2] d = sort(j-i); d = d[which(diff(c(-Inf, d) ) != 0)] m = nrow(A); n = ncol(A); p = length(d) empty = vector() A = as.matrix(A) ## make A logical, otherwise exdiag throws horrible warnings ## might have to condition the above command on the type of input matrix B = Matrix(0, nrow = min(c(m,n)), ncol = p, sparse = TRUE); system.time( for (k in 1:p){ print(k) if (m >= n){ i = max(c(1, 1 + d[k])):min(c(n, m + d[k]) ) } else { i = max(c(1, 1 - d[k])):min(c(m, n-d[k]) ) } if (length(i) != 0){ B[i, k] = exdiag(A, d[k]) # print(B) # print(i) # print(k) } } ) return (list( B = B, d = d) ) } the advantages are: 1) that it works with matrices of any dimension (previous versions worked only with squared matrices). 2) it does not need other functions, beside exdiag to work. However, I run into serious issues with computational efficiency. As I am extracting non-zero diagonals and filling B in a for loop, the time it takes to complete each iteration highly depends on the dimensions of the matrix. Try the different starting A matrices to test this. Unfortunately, I really need to optimize this aspect, as the people who I am expecting to use this library would have really long time-series, i.e. it would make .m win over .R So, as I am wondering if there is a more efficient solution to do this (perhaps using apply?) that I cannot currently see. Perhaps using sparse matrices could help, but exdiag throws a warning: "<sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient" (noted also in previous responses to this post of mine). Using the function ?band has the drawback of padding the matrix with 0 (except the diagonal considered, e.g., band(A,1,1) ). Often, however, you want to keep the 0s along the diagonal of interest, distinguished from the 0s of the other diagonals. Thus, doing something like: which(band(A,1,1) !0, arr.ind = TRUE) to get the non-zero element would also discard the 0s along the diagonal. The nice getband,R posted by Ben above, makes a combined use of diag and band, together with a system of indeces, and tries to work around the issue of the zero padding. However, I did not manage to use/adapt this function to work with matrices that are not squared. And, as I am not really 100% a programmer, I did not find a solution to it... Moreover, I believe the problem of computational efficiency would still be there, even fixing the dimensionality issue. I really hope somebody out there can help me to figure this out. thanks a lot for your precious time and attention, Moreno -- View this message in context: http://r.789695.n4.nabble.com/writing-spdiags-function-for-R-tp4552626p4674540.html Sent from the R help mailing list archive at Nabble.com.
I am posting here the brilliant solutions, gently provided by Prof JC Nash <nashjc at uottawa.ca>, to me; so that people struggling in the future with the same issue can find a way through. FYI, compared to the original Matlab implementation: 1) "it does not handle the case with more than one input, and 2) (m > n) matrices give the B matrix columns in a different order, but the d vector of indices will also be changed accordingly, so the set of columns is OK, just ordered differently" cif. JN. Copied below the .R version of the spdiags code, a Fortran implementation of it, and an R wrapper to run the .f thanks again JN, your help was really invaluable :) ######################################################################################### R version ######################################################################################### spdiagsj <- function(A) { # A is a matrix m <- dim(A)[[1]] n <- dim(A)[[2]] k <- min(m, n) # length of diagonals Bdata<-NULL # start with nothing in B matrix (as vector) jb<-0 # column index of "last" column saved for B d<-NULL # index vector of diagonals from A # d contains 0 for the principal diagonal, -i for i'th lower # diagonal (prefaced with zeros), +j for j'th upper diagonal # (suffixed by zeros) q<-(m-1)+n # There are m-1 subdiagonals and n-1 superdiagonals + main diagonal if (m > n) { # tall matrix Adata <- as.vector(t(A)) # convert to vector BY ROWS } else { # fat or square matrix (m <= n) Adata <- as.vector(A) # convert to vector BY COLUMNS } # Augment the data with columns of zeros fore and aft Adata<-c(rep(0,(k-1)*k), Adata, rep(0,(k-1)*k)) cat("Augmented Adata with ",length(Adata)," elements:\n") print(Adata) for (i in 1:q) { tv <- c(Adata[[k*(i-1)+1]], rep(0,k-1)) # top element of augmented column # plus enough zeros to pad it out (some zeros may be replaced below) # i.e., start at 1, then m+1, 2*m+1 etc. qx<-min((q-i), (k-1)) # cat(" qx=",qx,"\n") if (qx > 0) { # qx will be 0 when we are at last superdiagonal,i.e., i == q for (j in 1:qx) { # get the rest of the diagonal elements tv[[j+1]] <- Adata[[k*(i+j-1)+j+1]] } } if (any(tv != 0)){ # check for non-zeros, if there are, then save jb<-jb+1 # next column of B d<-c(d,(i-k)) # record the index Bdata<-c(Bdata, tv) # save the diagonal as column of B in vector form } } if (m > n) d <- -d # reset index cat("Bdata:"); print(Bdata) B <- matrix(Bdata, nrow=k, byrow=FALSE) # convert to matrix form result<-list(B=B, d=d) } cat("Matlab example 1\n") dta <- c(0, 5, 0, 10, 0, 0, 0, 0, 6, 0, 11, 0, 3, 0, 0, 7, 0, 12, 1, 4, 0, 0, 8, 0, 0, 2, 5, 0, 0, 9) A1 <- matrix(dta, nrow=5, ncol=6, byrow=TRUE) print(A1) res1<-spdiagsj(A1) print(res1) tmp<-readline("Next") cat("Matlab example 2\n") n<-10 # choose 10 for an example A2<-matrix(rep(0, n*n), nrow=n, ncol=n) for (i in 1:n) { for (j in 1:n) { if (i == j) A2[i, j] <- -2 if ( (i == (j-1)) || (i == (j+1))) A2[i,j] <- 1 } } print(A2) res2<-spdiagsj(A2) print(res2) tmp<-readline("Next") cat("Matlab example 3\n") dta3 <- c(11, 0, 13, 0, 0, 22, 0, 24, 0, 0, 33, 0, 41, 0, 0, 44, 0, 52, 0, 0, 0, 0, 63, 0, 0, 0, 0, 74) A3 <- matrix(dta3, nrow=7, ncol=4, byrow=TRUE) print(A3) res3<-spdiagsj(A3) print(res3) tmp<-readline("Next") cat("try transpose\n") A3T<-t(A3) print(A3T) res3T<-spdiagsj(A3T) print(res3T) tmp<-readline("Next") cat("Example 5B \n") dta5b1<-c(6, 0, 13, 0, 0, 0, 7, 0, 14, 0, 1, 0, 8, 0, 15, 0, 2, 0, 9, 0, 0, 0, 3, 0, 10) A5b1<-matrix(dta5b1, nrow=5, ncol=5, byrow=TRUE) print(A5b1) res5b1<-spdiagsj(A5b1) print(res5b1) ######################################################################################### Fortran version ######################################################################################### subroutine jspd(m, n, k, Adata, jb, Bdata, d, tv, na, nb, nd) C Central part of spdiags for R C m and n are row and column sizes of A (underlying matrix) C jb will be number of returned diagonals C returns jb, Bdata, d integer m, n, na, nb, nd, jb, d(nd) integer i, j, k, kend, q, mn, kk1, js, je, qx double precision Adata(na), Bdata(nb), tv(k) LOGICAL not0 C k = min(m, n) C ?? check if k=1 C Bdata<-NULL # start with nothing in B matrix (as vector) jb = 0 kend=0 C column index of "last" column saved for B C d = NULL # index vector of diagonals from A C # d contains 0 for the principal diagonal, -i for i'th lower C # diagonal (prefaced with zeros), +j for j'th upper diagonal C # (suffixed by zeros) q = (m-1)+n C There are m-1 subdiagonals and n-1 superdiagonals + main diagonal C assume we have already built Adata for tall or fat matrix C # Augment the data with columns of zeros fore and aft mn = m*n C print *,"Original Adata" C print 1000, (Adata(i), i=1,mn) do 10 i=1,mn Bdata(i) = Adata(i) 10 continue kk1 = (k-1)*k do 15 i=1,kk1 Adata(i)=0.0 15 continue js = kk1+1 je = kk1+mn do 20 i=js,je Adata(i) = Bdata(i-js+1) 20 continue js = je+1 je = je+kk1 do 25 i=js,je Adata(i) = 0.0 25 continue C print *, "Augmented Adata with ", je," elements" C print 1000,(Adata(i), i=1,je) 1000 FORMAT(1H ,25f4.0) do 100 i=1,q qx = k*(i-1)+1 C print *,"Top element is no ",qx,"=",Adata(qx) tv(1) = Adata(qx) do 30 j=1,(k-1) tv(j+1) = 0.0 30 continue qx = min(q-i, k-1) C print *,"qx=",qx if (qx .gt. 0) then C qx will be 0 when we are at last superdiagonal,i.e., i == q do 35 j=1,qx tv(j+1) = Adata(k*(i+j-1)+j+1) 35 continue endif not0 = .FALSE. do 40 j=1,k if (tv(j) .ne. 0.0) not0 = .TRUE. 40 continue C if (.NOT. not0) print *," Zero column for i=",i if (not0) then C print *,"Nonzero column for i=",i jb = jb+1 d(jb) = (i-k) C print *, " New jb=",jb," d element =",d(jb) C print 1000,(tv(j), j=1,k) do 45 j=1,k Bdata(kend+j) = tv(j) 45 continue kend = kend + k C save the diagonal as column of B in vector form endif 100 continue C print *,"Bdata ", jb, " columns" C qx = k*jb C print 1000,(Bdata(j), j=1,qx) return end ######################################################################################### R Wrapper ######################################################################################### spDIAGS <- function(A) { # A is a matrix m <- dim(A)[[1]] n <- dim(A)[[2]] k <- min(m, n) # length of diagonals if (m > n) { # tall matrix Adata <- as.vector(t(A)) # convert to vector BY ROWS } else { # fat or square matrix (m <= n) Adata <- as.vector(A) # convert to vector BY COLUMNS } # print(Adata) la<-length(Adata) na<-m*n+2*k*(k-1) Adata <- c(Adata, rep(0,(na-la))) nb<-(m+n)*k nd<-m+n-1 jb<-0 Bdata<-rep(0,nb) d<-rep(0,nd) tv<-rep(0,k) tres<-.Fortran("jspd",m=as.integer(m), n=as.integer(n), k=as.integer(k), Adata=as.double(Adata), jb=as.integer(jb), Bdata=as.double(Bdata), d=as.integer(d), tv=as.double(tv), na=as.integer(na), nb=as.integer(nb), nd=as.integer(nd) ) # print(str(tres)) jb<-tres$jb d<-tres$d[1:jb] Bdata<-tres$Bdata[1:(jb*k)] # print(Bdata) if (m > n) d <- -d # reset index B <- matrix(Bdata, nrow=k, byrow=FALSE) # convert to matrix form result<-list(B=B, d=d) } ######################################################################################### -- View this message in context: http://r.789695.n4.nabble.com/writing-spdiags-function-for-R-tp4552626p4675010.html Sent from the R help mailing list archive at Nabble.com.