I need to write a nonhierarchical clustering routine, and I'm studying the way hclust (in the mva library) is built in R to see how things are done and what I can modify. I ran f2c on the hclust.f file (so I could read it in a language I know!) and there is one thing I don't quite understand about the way it gets called and the way it returns values.

That Fortran function gets called in hclust.R, like this:

    hclust <- function(d, method="complete")
    {
        <snip>
        n <- attr(d, "Size")
        if(is.null(n))
            stop("invalid dissimilarities")
        labels <- attr(d, "Labels")

        len <- n*(n-1)/2
        hcl <- .Fortran("hclust",
                        n = as.integer(n),
                        len = as.integer(len),
                        method = as.integer(method),
                        ia = integer(n),
                        ib = integer(n),
                        crit = double(n),
                        membr = double(n),
                        nn = integer(n),
                        disnn = double(n),
                        flag = logical(n),
                        diss = as.double(d), PACKAGE="mva")
        <snip>

In this call, the "actual data" input is an integer "n", an integer "len", an integer "method", and "diss", which is the vector obtained by coercing the bottom left part of a dissimilarity matrix into a vector of doubles.

So, I believe that means d is like this

    x    x    x
    1.1  x    x
    3.3  4.4  x

and diss is {1.1,3.3,4.4}.

Now here is the question. All of the other variables listed as input, ia, ib, crit, membr, nn, disnn, and flag, are just empty vectors being passed through to be used inside the hclust function. Right? And the changes made in those vectors inside the Fortran function are permanent, so the other functions in the hclust.R file can use those values? (like passing in a pointer in C?)

--
Paul E. Johnson               email: pauljohn at ukans.edu
Dept. of Political Science    http://lark.cc.ukans.edu/~pauljohn
University of Kansas          Office: (785) 864-9086
Lawrence, Kansas 66045        FAX: (785) 864-5700
-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe" (in the "body", not the subject !)
To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Friedrich Leisch
2000-Aug-28 14:43 UTC
[R] R function calling. Do I understand this right?
>>>>> On Mon, 28 Aug 2000 08:31:51 -0500,
>>>>> Paul E Johnson (PEJ) wrote:

PEJ> I need to write a nonhierarchical clustering routine and I'm studying
PEJ> the way hclust (in the mva library) is built in R to see how things are
PEJ> done and what I can modify. I ran f2c on the hclust.f file (so I could
PEJ> read it in a language I know!) and there is one thing I don't quite
PEJ> understand about the way it gets called and the way it returns values.
PEJ> That Fortran function gets called in hclust.R, like this:

PEJ> hclust <- function(d, method="complete")
PEJ> {
PEJ>     <snip>
PEJ>     n <- attr(d, "Size")
PEJ>     if(is.null(n))
PEJ>         stop("invalid dissimilarities")
PEJ>     labels <- attr(d, "Labels")

PEJ>     len <- n*(n-1)/2
PEJ>     hcl <- .Fortran("hclust",
PEJ>                     n = as.integer(n),
PEJ>                     len = as.integer(len),
PEJ>                     method = as.integer(method),
PEJ>                     ia = integer(n),
PEJ>                     ib = integer(n),
PEJ>                     crit = double(n),
PEJ>                     membr = double(n),
PEJ>                     nn = integer(n),
PEJ>                     disnn = double(n),
PEJ>                     flag = logical(n),
PEJ>                     diss = as.double(d), PACKAGE="mva")
PEJ>     <snip>

PEJ> In this call, the "actual data" input is an integer "n", an integer
PEJ> "len", an integer "method", and "diss", which is the vector obtained by
PEJ> coercing the bottom left part of a dissimilarity matrix into a vector of
PEJ> doubles.

PEJ> So, I believe that means d is like this

PEJ>     x    x    x
PEJ>     1.1  x    x
PEJ>     3.3  4.4  x

PEJ> And diss is {1.1,3.3,4.4}

yes:

R> d <- dist(matrix(1:5, 5))
R> d
  1 2 3 4
2 1
3 2 1
4 3 2 1
5 4 3 2 1
R> as.double(d)
 [1] 1 2 3 4 1 2 3 1 2 1

PEJ> Now here is the question. All of the other variables listed as input,
PEJ> ia, ib, crit, membr, nn, disnn, and flag, are just empty vectors being
PEJ> passed through to be used inside the hclust function. Right? And the
PEJ> changes made in those vectors inside the Fortran function are permanent,
PEJ> so the other functions in the hclust.R file can use those values? (like
PEJ> passing in a pointer in C?)
Well, yes and no: a call to .C or .Fortran returns a list with the modified vectors (and the originals are not modified). This duplication is skipped only if you set DUP=FALSE (but using this is not recommended, see the help page of .C).

All these issues are explained in detail in the "Writing R Extensions" manual, which comes with the sources of R or can be downloaded from CRAN.

.f
Prof Brian D Ripley
2000-Aug-29 06:47 UTC
[R] R function calling. Do I understand this right?
On Mon, 28 Aug 2000, Paul E Johnson wrote:

> I need to write a nonhierarchical clustering routine and I'm studying
> the way hclust (in the mva library) is built in R to see how things are
> done and what I can modify. I ran f2c on the hclust.f file (so I could
> read it in a language I know!) and there is one thing I don't quite
> understand about the way it gets called and the way it returns values.
> That Fortran function gets called in hclust.R, like this:
>
> hclust <- function(d, method="complete")
> {
>     <snip>
>     n <- attr(d, "Size")
>     if(is.null(n))
>         stop("invalid dissimilarities")
>     labels <- attr(d, "Labels")
>
>     len <- n*(n-1)/2
>     hcl <- .Fortran("hclust",
>                     n = as.integer(n),
>                     len = as.integer(len),
>                     method = as.integer(method),
>                     ia = integer(n),
>                     ib = integer(n),
>                     crit = double(n),
>                     membr = double(n),
>                     nn = integer(n),
>                     disnn = double(n),
>                     flag = logical(n),
>                     diss = as.double(d), PACKAGE="mva")
>     <snip>
>
> In this call, the "actual data" input is an integer "n", an integer
> "len", an integer "method", and "diss", which is the vector obtained by
> coercing the bottom left part of a dissimilarity matrix into a vector of
> doubles.
>
> So, I believe that means d is like this
>
>     x    x    x
>     1.1  x    x
>     3.3  4.4  x
>
> And diss is {1.1,3.3,4.4}
>
> Now here is the question. All of the other variables listed as input,
> ia, ib, crit, membr, nn, disnn, and flag, are just empty vectors being
> passed through to be used inside the hclust function. Right? And the
> changes made in those vectors inside the Fortran function are permanent,
> so the other functions in the hclust.R file can use those values? (like
> passing in a pointer in C?)

No. Copies of variables are made on entry and on exit. Thus the .C (or .Fortran) call returns a list of vectors with the names given. You cannot alter the versions in the calling R function; you have to use the components of the returned result explicitly.
It's normal to name only the components you are going to use subsequently, but some people name them all as an aide-memoire. (I am ignoring DUP=FALSE in .C in this discussion.)

If you are going to write in C you might want to start with .Call these days.

--
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595