Jesse D Lecy
2007-Nov-25 18:42 UTC
[R] accessing the "address" of items in a recursive list
Dear useRs,
I am working on a project involving the clustering of a large dataset. I need
to extract specific sub-clusters from the parent dendrogram for further
analysis. The data is too large for convenient tools such as identify.hclust
(which selects the group of interest on the plot), so
alternatively I have saved the plot as a large image file so that it can be
printed or viewed piecemeal. My problem is this: when I find the subclusters
that I need to analyze I have no way to reference that specific component of the
dendrogram in order to extract it:
smallDend <- dend[[i]][[j]][[k]]
where i, j, k refer to levels of the dendrogram (its "address").
I would like to print the "address" of the smaller dendrograms on the
edge similar to this:
addr <- function(n) {
  if (!is.leaf(n)) {
    attr(n, "edgetext") <- paste("height of", attr(n, "height"))
  }
  n
}
labeledDends <- dendrapply(dend, addr)
Where "i,j,k" is printed instead of "height". But I have
not been able to figure out how to ask each dendrogram its address within the
dendrapply function. Can anyone help me with this?
Many thanks,
Jesse
PhD Student in Social Science
Syracuse University
j daniel wrote:
>
> I would like to print the "address" of the smaller dendrograms on the edge
> similar to this:
>
> addr <- function(n) {
>   if(!is.leaf(n)) {
>     attr(n, "edgetext") <- paste("height of", attr(n,"height"))
>   }
>   n
> }
> labeledDends <- dendrapply(dend, addr)
>
> Where "i,j,k" is printed instead of "height". But I have not been able to
> figure out how to ask each dendrogram its address within the dendrapply
> function. Can anyone help me with this?
>

Hi!

Load the following four functions into R and then execute this (assuming your
dendrogram is in the variable 'dend'):

newdend <- dendrapplyGlobal(dend, "height", function(x){names(x)}, attrNameTo="edgetext")

# From the specified dendrogram X a vector of all values of the specified
# node attribute is extracted, modified by the function FUN, and a new
# dendrogram is created using the new values for the attribute. Optionally,
# a different attribute can be set using 'attrNameTo'.
dendrapplyGlobal <- function(X, attrName, FUN, ..., attrNameTo=NULL)
{
  if (is.null(attrNameTo)) {
    attrNameTo <- attrName
  }
  funcGet <- function(x) {
    attr(x, attrName)
  }
  funcSet <- function(x, value) {
    attr(x, attrNameTo) <- value
    return(x)
  }
  values <- dendrapplyToVector(X, funcGet)
  values <- FUN(values, ...)
  ret <- dendrapplyFromVector(X, values, funcSet)
  return(ret)
}

# Traverses the dendrogram in postorder and applies FUN to each node.
# The result of each evaluation is stored in the resulting vector.
# Additional arguments to FUN can be passed as ...
# The names attribute of the resulting vector is the 'path' to each node.
# This implementation is based on dendrapply() (package stats).
dendrapplyToVector <- function(X, FUN, ...)
{
  FUN <- match.fun(FUN)
  if (!inherits(X, "dendrogram")) stop("'X' is not a dendrogram")
  Napply <- function(d, path="") {
    if (is.leaf(d)) {
      ret <- c(FUN(d, ...))
      names(ret)[1] <- substr(path, start=1, stop=nchar(path)-1)
      return(ret)
    }
    ret <- vector()
    for (j in seq_along(d)) {
      addr <- paste(path, j, ".", sep="")
      ret <- append(ret, Napply(d[[j]], addr))
    }
    ret <- append(ret, FUN(d, ...))
    names(ret)[length(ret)] <- substr(path, start=1, stop=nchar(path)-1)
    return(ret)
  }
  Napply(X)
}

# Traverses the dendrogram X in postorder and constructs a new dendrogram using
# the specified function FUN and vector parVec. Each element of parVec must
# relate to a node in X, which is the case if parVec was created using
# dendrapplyToVector().
# Additional arguments to FUN can be passed as ...
# This implementation is based on dendrapply() (package stats).
dendrapplyFromVector <- function(X, parVec, FUN, ...)
{
  FUN <- match.fun(FUN)
  if (!inherits(X, "dendrogram")) stop("'X' is not a dendrogram")
  Napply <- function(d, v) {
    if (is.leaf(d)) {
      ret <- FUN(d, v, ...)
      return(ret)
    } else {
      ret <- d
      if (!is.list(ret)) ret <- as.list(ret)
      i <- 1
      for (j in seq_along(d)) {
        childrenCount <- getDendrogramNodeCount(d[[j]])
        indices <- i:(i+childrenCount-1)
        ret[[j]] <- Napply(d[[j]], v[indices])
        i <- i + childrenCount
      }
      ret <- FUN(ret, v[i], ...)
    }
    return(ret)
  }
  Napply(X, parVec)
}

# Returns the number of nodes in a dendrogram.
getDendrogramNodeCount <- function(dend)
{
  if (!is.leaf(dend)) {
    childrenSum <- 0
    for (child in dend) {
      childrenSum <- childrenSum + getDendrogramNodeCount(child)
    }
    return(childrenSum+1)
  }
  return(1)
}

hth,
Florian

--
View this message in context: http://www.nabble.com/accessing-the-%22address%22-of-items-in-a-recursive-list-tp13938566p15019892.html
Sent from the R help mailing list archive at Nabble.com.
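If only the edge labels are needed, a single recursive function that carries the index path down the tree may be enough. What follows is a minimal sketch under stated assumptions, not part of the four functions above: labelAddresses() and getByAddress() are hypothetical names, the built-in USArrests data stands in for the real dataset, and the address is written as "i,j,k" so that it maps directly onto dend[[i]][[j]][[k]].

```r
# Hypothetical helper (not from the thread): walk the dendrogram and store
# each inner node's index path in its "edgetext" attribute.
labelAddresses <- function(node, path = integer(0)) {
  if (is.leaf(node)) return(node)           # leaves are left unlabelled
  for (j in seq_along(node)) {
    node[[j]] <- labelAddresses(node[[j]], c(path, j))
  }
  if (length(path) > 0) {                   # the root has no incoming edge
    attr(node, "edgetext") <- paste(path, collapse = ",")
  }
  node
}

# Hypothetical helper: fetch the sub-dendrogram stored at an address
# such as "2,1", i.e. dend[[2]][[1]].
getByAddress <- function(dend, address) {
  for (i in as.integer(strsplit(address, ",")[[1]])) {
    dend <- dend[[i]]
  }
  dend
}

# Assumption: a small built-in dataset stands in for the real data.
dend <- as.dendrogram(hclust(dist(USArrests[1:10, ])))
labeledDends <- labelAddresses(dend)
# plot(labeledDends) would draw the addresses on the edges.
```

Once an address has been read off the printed plot, something like smallDend <- getByAddress(dend, "2,1") should return the corresponding sub-cluster, provided that address points at an inner node of the tree.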