Hi all,

I'm very new to R and have read a few tutorials, but I'm having difficulty figuring out how to plot a minimum spanning tree. I have a csv file called matrix.csv that contains an n-by-n matrix of distances between strains of bacteria.

It looks like:

id,strain1, strain2,strain3
strain1,0,.2,.8
strain2,.3,0,.7
strain3,.4,.6,0

I've been messing around with some code I found on the web that draws an MST, but I think it does so with random values instead of values provided in a dataset (like my csv file). Here's what I tried:

x <- runif(100)
y <- runif(100)
nearest_neighbour <- function (x, y, d=dist(cbind(x,y)), ...) {
  n <- length(x)
  stopifnot(length(x) == length(y))
  d <- as.matrix(d)
  stopifnot( dim(d)[1] == dim(d)[2] )
  stopifnot( length(x) == dim(d)[1] )
  i <- 1:n
  j <- apply(d, 2, function (a) order(a)[2])
  segments(x[i], y[i], x[j], y[j], ...)
}
plot(x, y,
     main="Nearest neighbour graph",
     xlab = "", ylab = "")
nearest_neighbour(x, y)

This draws the nearest neighbours, and then:

plot(x, y,
     main = "Minimum spanning tree",
     xlab = "", ylab = "")
nearest_neighbour(x, y, lwd=10, col="grey")
points(x,y)
library(ape)
r <- mst(dist(cbind(x, y)))
i <- which(r==1, arr.ind=TRUE )
segments(x[i[,1]], y[i[,1]], x[i[,2]], y[i[,2]],
         lwd = 2, col = "blue")

This draws the MST. This would be perfect for me! However, I don't know how to make it use my dataset... Any help (including links to helpful tutorials!) would be awesome.

Thanks!
~josh

--
View this message in context: http://www.nabble.com/Minimum-Spanning-Tree-tp22934813p22934813.html
Sent from the R help mailing list archive at Nabble.com.
Josh,

I would recommend using a package that supports networks, e.g. igraph, but there are others as well.

You can read in the data using 'read.csv()', transform it to a matrix with 'as.matrix()', and then create an igraph object from it with 'graph.adjacency()'.

Then call 'minimum.spanning.tree()' to calculate the tree and 'plot()' to plot it. E.g. for your example file:

library(igraph)
D <- read.csv("/tmp/matrix.csv")
D <- D[,-1]    # we don't need the first column
G <- graph.adjacency(as.matrix(D), weighted=TRUE)

## Some graphical parameters
V(G)$label <- V(G)$name
V(G)$shape <- "rectangle"
V(G)$color <- "white"
V(G)$size <- 40

## MST and plot
mst <- minimum.spanning.tree(G)
lay <- layout.reingold.tilford(G, mode="all")
plot(mst, layout=lay)

Best,
Gabor

On Tue, Apr 7, 2009 at 8:00 PM, jpearl01 <joshearl1 at hotmail.com> wrote:
> Hi all, I'm very new to R and read a few tutorials, however I'm having
> difficulty trying to figure out how to plot a minimum spanning tree.
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Gabor Csardi <Gabor.Csardi at unil.ch> UNIL DGM
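For readers who want to see what a minimum spanning tree computation does with the example matrix, here is a minimal base-R sketch of Prim's algorithm. It is not the igraph implementation and not part of the original reply. The helper name prim_mst() is hypothetical, and averaging the two triangles of the matrix is an assumption made here because the example distances are not symmetric.

```r
# Example distances from the original post, entered by hand.
strains <- paste0("strain", 1:3)
M <- matrix(c(0,  .2, .8,
              .3, 0,  .7,
              .4, .6, 0),
            nrow = 3, byrow = TRUE,
            dimnames = list(strains, strains))

# Assumption: an MST needs undirected weights, so the asymmetric example
# matrix is symmetrised by averaging M[i, j] and M[j, i].
S <- (M + t(M)) / 2

# prim_mst() is a hypothetical helper, not a function from any package.
# It grows the tree from the first vertex, always adding the cheapest
# edge that connects a tree vertex to a vertex outside the tree.
prim_mst <- function(S) {
  n <- nrow(S)
  in_tree <- c(TRUE, rep(FALSE, n - 1))
  from <- character(0)
  to <- character(0)
  weight <- numeric(0)
  while (!all(in_tree)) {
    # rows: vertices already in the tree; columns: vertices still outside
    cand <- S[in_tree, !in_tree, drop = FALSE]
    k <- which(cand == min(cand), arr.ind = TRUE)[1, ]
    new_vertex <- colnames(cand)[k[2]]
    from <- c(from, rownames(cand)[k[1]])
    to <- c(to, new_vertex)
    weight <- c(weight, cand[k[1], k[2]])
    in_tree[colnames(S) == new_vertex] <- TRUE
  }
  data.frame(from = from, to = to, weight = weight,
             stringsAsFactors = FALSE)
}

tree <- prim_mst(S)
print(tree)
```

The result is an edge list with one row per tree edge. Given 2-D coordinates for the strains (for instance from stats::cmdscale() applied to S), each row could be drawn with segments(), in the same way as the code in the original post.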
Josh, please stay on the list. Thanks. Answers below.

On Tue, Apr 7, 2009 at 10:15 PM, Josh Earl <joshearl1 at hotmail.com> wrote:
> Gabor,
>
> Thanks so much for your quick reply! I think this will do exactly what I'm
> after. I'm getting a couple errors though, one is (if I do)
> D<-D[,-1]
> I get the error:
> Error in graph.adjacency.dense(adjmatrix, mode = mode, weighted = weighted, :
>   not a square matrix

That means that your input file is different from the one you showed in the first email. Anyway, if you need to remove a column then remove it, otherwise don't.

> I can remove both the first row and column to maintain the matrix nxn size,
> however then I receive the error:
> Error in vector("double", length) : vector size cannot be NA/NaN
>
> I don't suppose this has anything to do with the diagonal being all 0's?

No, I don't think so. Check what is in D; it should be a "square" data frame containing numbers.

Gabor

> Thanks for your help!
>
> ~josh
>
>> Date: Tue, 7 Apr 2009 20:35:16 +0200
>> Subject: Re: [R] Minimum Spanning Tree
>> From: csardi at rmki.kfki.hu
>> To: joshearl1 at hotmail.com
>> CC: r-help at r-project.org
>>
>> Josh, I would recommend to use a package that supports networks, e.g.
>> igraph, but there are others as well.
>> [...]

--
Gabor Csardi <Gabor.Csardi at unil.ch> UNIL DGM
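The advice to check what is in D can be sketched as a few base-R checks. This is only an illustration under stated assumptions: the data below are an inline copy of the example from the first message, the real matrix.csv may be laid out differently, and using row.names = 1 instead of D[,-1] is a choice made here to keep the strain labels. Whether any of these checks explains the errors reported above is not known from the thread.

```r
# Inline copy of the example file from the first message. Assumption: the
# real matrix.csv has the same layout, with row labels in the first column.
csv_text <- "id,strain1, strain2,strain3
strain1,0,.2,.8
strain2,.3,0,.7
strain3,.4,.6,0"

# row.names = 1 turns the first column into row labels instead of data, so
# no column has to be dropped afterwards. With a file on disk, replace
# text = csv_text with the file path.
D <- read.csv(text = csv_text, row.names = 1, strip.white = TRUE)

# Look at the column types: every column should be numeric.
str(D)

M <- as.matrix(D)

# The three properties the adjacency matrix is expected to have.
stopifnot(nrow(M) == ncol(M))  # square
stopifnot(is.numeric(M))       # numbers only, no text columns
stopifnot(!anyNA(M))           # no missing values
```

If one of the stopifnot() calls fails on the real file, the failing condition points at what differs from the example, for instance an extra header row, a text column, or empty cells.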