Dear R-list,
I am in the process of translating a long function written in Matlab
into R (mainly because I am a big of fan of R, and folks will not
have to pay to use it :). In the translation of this function
I got stack because they use spdiags, which, as far as I can tell
it is not available in R. I have explored the Matrix package, from
which I borrowed some of the functions (e.g., sparseMatrix), but
I could not actually find an equivalent to spdiags (please, let
me know if it is there somewhere).
So, I have written my own spdiags function (below); following
also a suggestion in an old, and perhaps unique post, about
this issue.
It works only for square matrices (that's my need), however I
have a couple of issues, mainly related to computational
efficiency:
1) if I use it with a sparseMatrix, it throws a tedious warning
"<sparse>[ <logic> ] : .M.sub.i.logical() maybe
inefficient";
can I suppress this warning somehow, this is slowing the computation
very radically;
2) I can go around this problem by translating a sparseMatrix back
into a logical matrix before I run spdiags on it. However, the loop
gets very slow for large matrices (e.g., 2000x2000), which is the
kind of matrices I have to handle. If you look in the code,
I have placed a system.time() where the code is slowing down, and
it takes about:
user system elapsed
0.28 0.05 0.33
to complete an iteration...thus, I was wondering whether there is
a more efficient way to do what I am doing...also, if you spot
other places where the function could be optimized I would be
very glad to hear it!
thank you very much in advance for your kind help,
Best,
Moreno
#######################################################################
## it works only for square matrices
## it could work with sparse matrices but it spits a tedious warning
## it is definitely inefficient compared to the original matlab code
## choose below different matrices to test the function.
# r = c(2,3,5,5); c = c(2,1,4,5)
# A = sparseMatrix(r, c)
# A = replicate(1000, rnorm(1000) )
# A = rbind(c(1,2,3),c(2,3,4),c(3,4,5))
spdiags = function(A){
# Find all nonzero diagonals
i = rep(seq(1, nrow(A),1),nrow(A));
j = sort(i);
d = sort(j-i);
# d = d(find(diff([-inf; d]))); ## from Matlab ...
# d = c(d[which(diff(d) == 1)], d[length(d)] ) ## this emulate
above but needs to stick in last element
d = unique(d); ##this should work just fine and it is simpler
p = length(d); ##the nr. col of the new matrix
m = nrow(A); n = ncol(A);
B = matrix(0, nrow = min(c(m,n)), ncol = p);
for (k in 1:p){
# print(k)
cl = vector();
if (m >= n){
i = max(1, 1+d[k]):min(n, m+d[k]);
} else { i = max(1, 1-d[k]):min(m,n-d[k]); }
system.time(
if (length(i) != 0){
B[i,k] = A[ col(A) == row (A) - d[k]]
} )
}
return (list( B = B, d = d) )
}
--
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
Moreno I. Coco <M.I.Coco <at> sms.ed.ac.uk> writes: [snip snip snip]> > So, I have written my own spdiags function (below); following > also a suggestion in an old, and perhaps unique post, about > this issue. > > It works only for square matrices (that's my need), however I > have a couple of issues, mainly related to computational > efficiency: > > 1) if I use it with a sparseMatrix, it throws a tedious warning > "<sparse>[ <logic> ] : .M.sub.i.logical() maybe inefficient"; > can I suppress this warning somehow, this is slowing the computation > very radically;quick answer: use suppressMessages()> > 2) I can go around this problem by translating a sparseMatrix back > into a logical matrix before I run spdiags on it. However, the loop > gets very slow for large matrices (e.g., 2000x2000), which is the > kind of matrices I have to handle. If you look in the code, > I have placed a system.time() where the code is slowing down, and > it takes about: >I'm not quite sure how to do it, but I think you should look at the ?band function in Matrix. In combination with diag() of a suitably truncated matrix, you should be able to extract bands of sparse matrices efficiently ...
Dear R-list,
l have been working on a translation of a matlab library into R, it took me
a while but I am almost there to submit it to CRAN...however, for this
library to be computationally competitive I need to solve an issue with the
home-made R version of the spdiags.m function (an old issue coming back).
What I have now working is:
# A = rbind(c(1,3,1,0), c(1,1,0,2), c(2,1,0,0) )
# A = replicate(100, rnorm(50))
A = replicate(1000, rnorm(1000))
exdiag <- function(mat, off) {mat[row(mat)+off == col(mat)]}
spdiags <- function(A){
require(Matrix)
indx = which(A != 0, arr.ind = T)
i = indx[,1]; j = indx[,2]
d = sort(j-i);
d = d[which(diff(c(-Inf, d) ) != 0)]
m = nrow(A); n = ncol(A);
p = length(d)
empty = vector()
A = as.matrix(A) ## make A logical, otherwise exdiag throws horrible
warnings
## might have to condition the above command on the type of input matrix
B = Matrix(0, nrow = min(c(m,n)), ncol = p, sparse = TRUE);
system.time(
for (k in 1:p){
print(k)
if (m >= n){
i = max(c(1, 1 + d[k])):min(c(n, m + d[k]) )
} else {
i = max(c(1, 1 - d[k])):min(c(m, n-d[k]) )
}
if (length(i) != 0){
B[i, k] = exdiag(A, d[k])
# print(B)
# print(i)
# print(k)
}
}
)
return (list( B = B, d = d) )
}
the advantages are: 1) that it works with matrices of any dimension
(previous versions worked only with squared matrices). 2) it does not need
other functions, beside exdiag to work.
However, I run into serious issues with computational efficiency. As I am
extracting non-zero diagonals and filling B in a for loop, the time it takes
to complete each iteration highly depends on the dimensions of the matrix.
Try the different starting A matrices to test this.
Unfortunately, I really need to optimize this aspect, as the people who I am
expecting to use this library would have really long time-series, i.e. it
would make .m win over .R
So, as I am wondering if there is a more efficient solution to do this
(perhaps using apply?) that I cannot currently see. Perhaps using sparse
matrices could help, but exdiag throws a warning: "<sparse>[
<logic> ] :
.M.sub.i.logical() maybe inefficient" (noted also in previous responses to
this post of mine).
Using the function ?band has the drawback of padding the matrix with 0
(except the diagonal considered, e.g., band(A,1,1) ). Often, however, you
want to keep the 0s along the diagonal of interest, distinguished from the
0s of the other diagonals. Thus, doing something like: which(band(A,1,1) !0,
arr.ind = TRUE) to get the non-zero element would also discard the 0s
along the diagonal.
The nice getband,R posted by Ben above, makes a combined use of diag and
band, together with a system of indeces, and tries to work around the issue
of the zero padding. However, I did not manage to use/adapt this function
to work with matrices that are not squared. And, as I am not really 100% a
programmer, I did not find a solution to it...
Moreover, I believe the problem of computational efficiency would still be
there, even fixing the dimensionality issue.
I really hope somebody out there can help me to figure this out.
thanks a lot for your precious time and attention,
Moreno
--
View this message in context:
http://r.789695.n4.nabble.com/writing-spdiags-function-for-R-tp4552626p4674540.html
Sent from the R help mailing list archive at Nabble.com.
I am posting here the brilliant solutions, gently provided by Prof JC Nash
<nashjc at uottawa.ca>, to me; so that people struggling in the future
with
the same issue can find a way through.
FYI, compared to the original Matlab implementation: 1) "it does not handle
the case with more than one input, and
2) (m > n) matrices give the B matrix columns in a different order, but the
d vector of indices will also be changed
accordingly, so the set of columns is OK, just ordered differently" cif.
JN.
Copied below the .R version of the spdiags code, a Fortran implementation of
it, and an R wrapper to run the .f
thanks again JN, your help was really invaluable :)
#########################################################################################
R version
#########################################################################################
spdiagsj <- function(A) { # A is a matrix
m <- dim(A)[[1]]
n <- dim(A)[[2]]
k <- min(m, n) # length of diagonals
Bdata<-NULL # start with nothing in B matrix (as vector)
jb<-0 # column index of "last" column saved for B
d<-NULL # index vector of diagonals from A
# d contains 0 for the principal diagonal, -i for i'th lower
# diagonal (prefaced with zeros), +j for j'th upper diagonal
# (suffixed by zeros)
q<-(m-1)+n # There are m-1 subdiagonals and n-1 superdiagonals + main
diagonal
if (m > n) { # tall matrix
Adata <- as.vector(t(A)) # convert to vector BY ROWS
} else { # fat or square matrix (m <= n)
Adata <- as.vector(A) # convert to vector BY COLUMNS
}
# Augment the data with columns of zeros fore and aft
Adata<-c(rep(0,(k-1)*k), Adata, rep(0,(k-1)*k))
cat("Augmented Adata with ",length(Adata)," elements:\n")
print(Adata)
for (i in 1:q) {
tv <- c(Adata[[k*(i-1)+1]], rep(0,k-1)) # top element of augmented
column
# plus enough zeros to pad it out (some zeros may be replaced
below)
# i.e., start at 1, then m+1, 2*m+1 etc.
qx<-min((q-i), (k-1)) #
cat(" qx=",qx,"\n")
if (qx > 0) { # qx will be 0 when we are at last
superdiagonal,i.e., i == q
for (j in 1:qx) { # get the rest of the diagonal elements
tv[[j+1]] <- Adata[[k*(i+j-1)+j+1]]
}
}
if (any(tv != 0)){ # check for non-zeros, if there are, then save
jb<-jb+1 # next column of B
d<-c(d,(i-k)) # record the index
Bdata<-c(Bdata, tv) # save the diagonal as column of B in
vector form
}
}
if (m > n) d <- -d # reset index
cat("Bdata:"); print(Bdata)
B <- matrix(Bdata, nrow=k, byrow=FALSE) # convert to matrix form
result<-list(B=B, d=d)
}
cat("Matlab example 1\n")
dta <- c(0, 5, 0, 10, 0, 0, 0, 0, 6, 0, 11, 0, 3, 0, 0, 7, 0, 12, 1, 4, 0,
0, 8, 0, 0, 2, 5, 0, 0, 9)
A1 <- matrix(dta, nrow=5, ncol=6, byrow=TRUE)
print(A1)
res1<-spdiagsj(A1)
print(res1)
tmp<-readline("Next")
cat("Matlab example 2\n")
n<-10 # choose 10 for an example
A2<-matrix(rep(0, n*n), nrow=n, ncol=n)
for (i in 1:n) {
for (j in 1:n) {
if (i == j) A2[i, j] <- -2
if ( (i == (j-1)) || (i == (j+1))) A2[i,j] <- 1
}
}
print(A2)
res2<-spdiagsj(A2)
print(res2)
tmp<-readline("Next")
cat("Matlab example 3\n")
dta3 <- c(11, 0, 13, 0, 0, 22, 0, 24, 0, 0, 33, 0, 41, 0, 0, 44,
0, 52, 0, 0, 0, 0, 63, 0, 0, 0, 0, 74)
A3 <- matrix(dta3, nrow=7, ncol=4, byrow=TRUE)
print(A3)
res3<-spdiagsj(A3)
print(res3)
tmp<-readline("Next")
cat("try transpose\n")
A3T<-t(A3)
print(A3T)
res3T<-spdiagsj(A3T)
print(res3T)
tmp<-readline("Next")
cat("Example 5B \n")
dta5b1<-c(6, 0, 13, 0, 0, 0, 7, 0, 14, 0, 1, 0, 8, 0, 15, 0, 2, 0, 9, 0, 0,
0, 3, 0, 10)
A5b1<-matrix(dta5b1, nrow=5, ncol=5, byrow=TRUE)
print(A5b1)
res5b1<-spdiagsj(A5b1)
print(res5b1)
#########################################################################################
Fortran version
#########################################################################################
subroutine jspd(m, n, k, Adata, jb, Bdata, d, tv, na, nb, nd)
C Central part of spdiags for R
C m and n are row and column sizes of A (underlying matrix)
C jb will be number of returned diagonals
C returns jb, Bdata, d
integer m, n, na, nb, nd, jb, d(nd)
integer i, j, k, kend, q, mn, kk1, js, je, qx
double precision Adata(na), Bdata(nb), tv(k)
LOGICAL not0
C k = min(m, n)
C ?? check if k=1
C Bdata<-NULL # start with nothing in B matrix (as vector)
jb = 0
kend=0
C column index of "last" column saved for B
C d = NULL # index vector of diagonals from A
C # d contains 0 for the principal diagonal, -i for i'th lower
C # diagonal (prefaced with zeros), +j for j'th upper diagonal
C # (suffixed by zeros)
q = (m-1)+n
C There are m-1 subdiagonals and n-1 superdiagonals + main diagonal
C assume we have already built Adata for tall or fat matrix
C # Augment the data with columns of zeros fore and aft
mn = m*n
C print *,"Original Adata"
C print 1000, (Adata(i), i=1,mn)
do 10 i=1,mn
Bdata(i) = Adata(i)
10 continue
kk1 = (k-1)*k
do 15 i=1,kk1
Adata(i)=0.0
15 continue
js = kk1+1
je = kk1+mn
do 20 i=js,je
Adata(i) = Bdata(i-js+1)
20 continue
js = je+1
je = je+kk1
do 25 i=js,je
Adata(i) = 0.0
25 continue
C print *, "Augmented Adata with ", je," elements"
C print 1000,(Adata(i), i=1,je)
1000 FORMAT(1H ,25f4.0)
do 100 i=1,q
qx = k*(i-1)+1
C print *,"Top element is no ",qx,"=",Adata(qx)
tv(1) = Adata(qx)
do 30 j=1,(k-1)
tv(j+1) = 0.0
30 continue
qx = min(q-i, k-1)
C print *,"qx=",qx
if (qx .gt. 0) then
C qx will be 0 when we are at last superdiagonal,i.e., i == q
do 35 j=1,qx
tv(j+1) = Adata(k*(i+j-1)+j+1)
35 continue
endif
not0 = .FALSE.
do 40 j=1,k
if (tv(j) .ne. 0.0) not0 = .TRUE.
40 continue
C if (.NOT. not0) print *," Zero column for i=",i
if (not0) then
C print *,"Nonzero column for i=",i
jb = jb+1
d(jb) = (i-k)
C print *, " New jb=",jb," d element =",d(jb)
C print 1000,(tv(j), j=1,k)
do 45 j=1,k
Bdata(kend+j) = tv(j)
45 continue
kend = kend + k
C save the diagonal as column of B in vector form
endif
100 continue
C print *,"Bdata ", jb, " columns"
C qx = k*jb
C print 1000,(Bdata(j), j=1,qx)
return
end
#########################################################################################
R Wrapper
#########################################################################################
spDIAGS <- function(A) { # A is a matrix
m <- dim(A)[[1]]
n <- dim(A)[[2]]
k <- min(m, n) # length of diagonals
if (m > n) { # tall matrix
Adata <- as.vector(t(A)) # convert to vector BY ROWS
} else { # fat or square matrix (m <= n)
Adata <- as.vector(A) # convert to vector BY COLUMNS
}
# print(Adata)
la<-length(Adata)
na<-m*n+2*k*(k-1)
Adata <- c(Adata, rep(0,(na-la)))
nb<-(m+n)*k
nd<-m+n-1
jb<-0
Bdata<-rep(0,nb)
d<-rep(0,nd)
tv<-rep(0,k)
tres<-.Fortran("jspd",m=as.integer(m), n=as.integer(n),
k=as.integer(k),
Adata=as.double(Adata), jb=as.integer(jb),
Bdata=as.double(Bdata), d=as.integer(d), tv=as.double(tv),
na=as.integer(na), nb=as.integer(nb), nd=as.integer(nd) )
# print(str(tres))
jb<-tres$jb
d<-tres$d[1:jb]
Bdata<-tres$Bdata[1:(jb*k)]
# print(Bdata)
if (m > n) d <- -d # reset index
B <- matrix(Bdata, nrow=k, byrow=FALSE) # convert to matrix form
result<-list(B=B, d=d)
}
#########################################################################################
--
View this message in context:
http://r.789695.n4.nabble.com/writing-spdiags-function-for-R-tp4552626p4675010.html
Sent from the R help mailing list archive at Nabble.com.