Jesse D Lecy
2007-Nov-25 18:42 UTC
[R] accessing the "address" of items in a recursive list
Dear useRs, I am working on a project involving the clustering of a large dataset. I need to extract specific sub-clusters from the parent dendrogram for further analysis. The data is too large for the use of convenient tools such as identify.clust (it selects the specific group of interest on a graph), so alternatively I have saved the plot as a large image file so that it can be printed or viewed piecemeal. My problem is this: when I find the subclusters that I need to analyze I have no way to reference that specific component of the dendrogram in order to extract it: smallDend <- dend[[i]][[j]][[k]] where i,j,k refer to levels of the dendrogram (it's "address") I would like to print the "address" of the smaller dendrograms on the edge similar to this: addr <- function(n) { if(!is.leaf(n)) { attr(n, "edgetext") <- paste("height of",(attr(n,"height")) } n } labeledDends <- dendrapply(dend, addr) Where "i,j,k" is printed instead of "height". But I have not been able to figure out how to ask each dendrogram its address within the dendrapply function. Can anyone help me with this? Many thanks, Jesse PhD Student in Social Science Syracuse University
j daniel wrote:> > > I would like to print the "address" of the smaller dendrograms on the edge > similar to this: > > addr <- function(n) { > if(!is.leaf(n)) { > attr(n, "edgetext") <- paste("height of",(attr(n,"height")) > } > n > } > labeledDends <- dendrapply(dend, addr) > > Where "i,j,k" is printed instead of "height". But I have not been able to > figure out how to ask each dendrogram its address within the dendrapply > function. Can anyone help me with this? > >Hi! Load the following four functions into R and execute this (assuming your dendrogram is in variable 'dend') newdend <- dendrapplyGlobal(dend, "height",function(x){names(x)},attrNameTo="edgetext") #From the specified dendrogram X a vector of all values of the specified #node attribute is extracted, modified by the function FUN and a new #dendrogram is created using the new values for the attribute. Optionally, #a different attribute can be set using 'attrNameTo' dendrapplyGlobal <- function(X,attrName,FUN,...,attrNameTo=NULL) { if (is.null(attrNameTo)) { attrNameTo <- attrName } funcGet <- function(x){ attr(x,attrName) } funcSet <- function(x,value){ attr(x,attrNameTo) <- value return(x) } values <- dendrapplyToVector(X,funcGet) values <- FUN(values,...) ret <- dendrapplyFromVector(X,values,funcSet) return(ret) } #Traverses the dendrogram in postorder and applies FUN to each node. #The result of each evaluation is stored in the resulting array. #Additional arguments to FUN can be passed as ... #The names attribute of the resulting vector is the 'path' to each node. #This implementation is based on dendrapply(graphics). dendrapplyToVector <- function(X,FUN,...) { FUN <- match.fun(FUN) if (!inherits(X, "dendrogram")) stop("'X' is not a dendrogram") Napply <- function(d,path="") { if (is.leaf(d)) { ret <- c(FUN(d,...)) names(ret)[1] <- substr(path,start=1,stop=nchar(path)-1) return(ret) } ret <- vector() for (j in seq_along(d)) { addr <- paste(path,j,".",sep="") ret <- append(ret,Napply(d[[j]],addr)) } ret <- append(ret,FUN(d,...)) names(ret)[length(ret)] <- substr(path,start=1,stop=nchar(path)-1) return(ret) } Napply(X) } #Traverses the dendrogram X in postorder and constructs a new dendrogram using #the specified function FUN and vector parVec. Each element of parVec must #relate to a node in X, which is the case if parVec was created using #dendrapplyToVector(). #Additional arguments to FUN can be passed as ... #This implementation is based on dendrapply(graphics). dendrapplyFromVector <- function(X,parVec,FUN,...) { FUN <- match.fun(FUN) if (!inherits(X, "dendrogram")) stop("'X' is not a dendrogram") Napply <- function(d,v) { if (is.leaf(d)) { ret <- FUN(d,v,...) return(ret) } else { ret <- d if (!is.list(ret)) ret <- as.list(ret) i <- 1 for (j in seq_along(d)) { childrenCount <- getDendrogramNodeCount(d[[j]]) indices <- i:(i+childrenCount-1) ret[[j]] <- Napply(d[[j]],v[indices]) i <- i + childrenCount } ret <- FUN(ret,v[i],...) } return(ret) } Napply(X,parVec) } #Returns the number of nodes in a dendrogram. getDendrogramNodeCount <- function(dend) { if (!is.leaf(dend)){ childrenSum <- 0 for (child in dend) { childrenSum <- childrenSum + getDendrogramNodeCount(child) } return(childrenSum+1) } return(1) } -- View this message in context: http://www.nabble.com/accessing-the-%22address%22-of-items-in-a-recursive-list-tp13938566p15019890.html Sent from the R help mailing list archive at Nabble.com.
j daniel wrote:> > > I would like to print the "address" of the smaller dendrograms on the edge > similar to this: > > addr <- function(n) { > if(!is.leaf(n)) { > attr(n, "edgetext") <- paste("height of",(attr(n,"height")) > } > n > } > labeledDends <- dendrapply(dend, addr) > > Where "i,j,k" is printed instead of "height". But I have not been able to > figure out how to ask each dendrogram its address within the dendrapply > function. Can anyone help me with this? > >Hi! Load the following four functions into R and execute this (assuming your dendrogram is in variable 'dend') newdend <- dendrapplyGlobal(dend, "height",function(x){names(x)},attrNameTo="edgetext") #From the specified dendrogram X a vector of all values of the specified #node attribute is extracted, modified by the function FUN and a new #dendrogram is created using the new values for the attribute. Optionally, #a different attribute can be set using 'attrNameTo' dendrapplyGlobal <- function(X,attrName,FUN,...,attrNameTo=NULL) { if (is.null(attrNameTo)) { attrNameTo <- attrName } funcGet <- function(x){ attr(x,attrName) } funcSet <- function(x,value){ attr(x,attrNameTo) <- value return(x) } values <- dendrapplyToVector(X,funcGet) values <- FUN(values,...) ret <- dendrapplyFromVector(X,values,funcSet) return(ret) } #Traverses the dendrogram in postorder and applies FUN to each node. #The result of each evaluation is stored in the resulting array. #Additional arguments to FUN can be passed as ... #The names attribute of the resulting vector is the 'path' to each node. #This implementation is based on dendrapply(graphics). dendrapplyToVector <- function(X,FUN,...) { FUN <- match.fun(FUN) if (!inherits(X, "dendrogram")) stop("'X' is not a dendrogram") Napply <- function(d,path="") { if (is.leaf(d)) { ret <- c(FUN(d,...)) names(ret)[1] <- substr(path,start=1,stop=nchar(path)-1) return(ret) } ret <- vector() for (j in seq_along(d)) { addr <- paste(path,j,".",sep="") ret <- append(ret,Napply(d[[j]],addr)) } ret <- append(ret,FUN(d,...)) names(ret)[length(ret)] <- substr(path,start=1,stop=nchar(path)-1) return(ret) } Napply(X) } #Traverses the dendrogram X in postorder and constructs a new dendrogram using #the specified function FUN and vector parVec. Each element of parVec must #relate to a node in X, which is the case if parVec was created using #dendrapplyToVector(). #Additional arguments to FUN can be passed as ... #This implementation is based on dendrapply(graphics). dendrapplyFromVector <- function(X,parVec,FUN,...) { FUN <- match.fun(FUN) if (!inherits(X, "dendrogram")) stop("'X' is not a dendrogram") Napply <- function(d,v) { if (is.leaf(d)) { ret <- FUN(d,v,...) return(ret) } else { ret <- d if (!is.list(ret)) ret <- as.list(ret) i <- 1 for (j in seq_along(d)) { childrenCount <- getDendrogramNodeCount(d[[j]]) indices <- i:(i+childrenCount-1) ret[[j]] <- Napply(d[[j]],v[indices]) i <- i + childrenCount } ret <- FUN(ret,v[i],...) } return(ret) } Napply(X,parVec) } #Returns the number of nodes in a dendrogram. getDendrogramNodeCount <- function(dend) { if (!is.leaf(dend)){ childrenSum <- 0 for (child in dend) { childrenSum <- childrenSum + getDendrogramNodeCount(child) } return(childrenSum+1) } return(1) } hth, Florian -- View this message in context: http://www.nabble.com/accessing-the-%22address%22-of-items-in-a-recursive-list-tp13938566p15019892.html Sent from the R help mailing list archive at Nabble.com.