Hi all, I'm very new to R and read a few tutorials, however I'm having
difficulty trying to figure out how to plot a minimum spanning tree. I have
a csv file that contains an n-by-n matrix of distances between strains of
bacteria called matrix.csv.
Looks like:
id,strain1, strain2,strain3
strain1,0,.2,.8
strain2,.3,0,.7
strain3,.4,.6,0
I've been messing around with some information I've found on the web
that
prints out an mst, however I think it does it with random values, instead of
values provided in a dataset (like from my csv file). Here's what I tried
looks like:
x <- runif(100)
y <- runif(100)
nearest_neighbour <- function (x, y, d=dist(cbind(x,y)), ...) {
n <- length(x)
stopifnot(length(x) == length(y))
d <- as.matrix(d)
stopifnot( dim(d)[1] == dim(d)[2] )
stopifnot( length(x) == dim(d)[1] )
i <- 1:n
j <- apply(d, 2, function (a) order(a)[2])
segments(x[i], y[i], x[j], y[j], ...)
}
plot(x, y,
main="Nearest neighbour graph",
xlab = "", ylab = "")
nearest_neighbour(x, y)
This gets the Nearest Neighbors, and then:
plot(x, y,
main = "Minimum spanning tree",
xlab = "", ylab = "")
nearest_neighbour(x, y, lwd=10, col="grey")
points(x,y)
library(ape)
r <- mst(dist(cbind(x, y)))
i <- which(r==1, arr.ind=TRUE )
segments(x[i[,1]], y[i[,1]], x[i[,2]], y[i[,2]],
lwd = 2, col = "blue")
This prints the mst. This would be perfect for me! However I don't know
how to make this use my dataset... Any help (including links to helpful
tutorials!) would be awesome,
Thanks!
~josh
--
View this message in context:
http://www.nabble.com/Minimum-Spanning-Tree-tp22934813p22934813.html
Sent from the R help mailing list archive at Nabble.com.
Josh, I would recommend to use a package that supports networks, e.g.
igraph, but there are others as well.
You can read in the data using 'read.csv()', transform it to a matrix
with 'as.matrix()', and then create an igraph object from it with
'graph.adjacency()'.
Then call 'minimum.spanning.tree()' to calculate the tree and
'plot()'
to plot it. E.g. for your example file:
library(igraph)
D <- read.csv("/tmp/matrix.csv")
D <- D[,-1] # we don't need the first column
G <- graph.adjacency(as.matrix(D), weighted=TRUE)
## Some graphical parameters
V(G)$label <- V(G)$name
V(G)$shape <- "rectangle"
V(G)$color <- "white"
V(G)$size <- 40
## MST and plot
mst <- minimum.spanning.tree(G)
lay <- layout.reingold.tilford(G, mode="all")
plot(mst, layout=lay)
Best,
Gabor
On Tue, Apr 7, 2009 at 8:00 PM, jpearl01 <joshearl1 at hotmail.com>
wrote:>
> Hi all, I'm very new to R and read a few tutorials, however I'm
having
> difficulty trying to figure out how to plot a minimum spanning tree. ?I
have
> a csv file that contains an n-by-n matrix of distances between strains of
> bacteria called matrix.csv.
>
> Looks like:
> id,strain1, strain2,strain3
> strain1,0,.2,.8
> strain2,.3,0,.7
> strain3,.4,.6,0
>
> I've been messing around with some information I've found on the
web that
> prints out an mst, however I think it does it with random values, instead
of
> values provided in a dataset (like from my csv file). ?Here's what I
tried
> looks like:
> x <- runif(100)
> y <- runif(100)
> nearest_neighbour <- function (x, y, d=dist(cbind(x,y)), ...) {
> ?n <- length(x)
> ?stopifnot(length(x) == length(y))
> ?d <- as.matrix(d)
> ?stopifnot( dim(d)[1] == dim(d)[2] )
> ?stopifnot( length(x) == dim(d)[1] )
> ?i <- 1:n
> ?j <- apply(d, 2, function (a) order(a)[2])
> ?segments(x[i], y[i], x[j], y[j], ...)
> }
> plot(x, y,
> ? ? main="Nearest neighbour graph",
> ? ? xlab = "", ylab = "")
> nearest_neighbour(x, y)
>
> This gets the Nearest Neighbors, and then:
>
> plot(x, y,
> ? ? main = "Minimum spanning tree",
> ? ? xlab = "", ylab = "")
> nearest_neighbour(x, y, lwd=10, col="grey")
> points(x,y)
> library(ape)
> r <- mst(dist(cbind(x, y)))
> i <- which(r==1, arr.ind=TRUE )
> segments(x[i[,1]], y[i[,1]], x[i[,2]], y[i[,2]],
> ? ? ? ? lwd = 2, col = "blue")
>
> This prints the mst. ?This would be perfect for me! ?However I don't
know
> how to make this use my dataset... Any help (including links to helpful
> tutorials!) would be awesome,
>
> Thanks!
> ~josh
>
>
>
> --
> View this message in context:
http://www.nabble.com/Minimum-Spanning-Tree-tp22934813p22934813.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
--
Gabor Csardi <Gabor.Csardi at unil.ch> UNIL DGM
Josh, please stay on the list. Thanks. Answers below. On Tue, Apr 7, 2009 at 10:15 PM, Josh Earl <joshearl1 at hotmail.com> wrote:> > > Gabor, > > Thanks so much for your quick reply!? I think this will do exactly what I'm > after.? I'm getting a couple errors though, one is (if I do) > D<-D[,-1] > I get the error: > Error in graph.adjacency.dense(adjmatrix, mode = mode, weighted = weighted, > : > ? not a square matrixThat means that your input file is different from the one you showed in the first email. Anyway, if you need to remove a column then remove it, otherwise don't.> I can remove both the first row and column to maintain the matrix nxn size, > however then I recieve the error: > Error in vector("double", length) : vector size cannot be NA/NaN > > I don't suppose this has anything to do with the diagonal being all 0's?No, I don't think so. Check what is in D, it should be a "square" data frame containing numbers. Gabor> Thanks for your help! > > ~josh > > > > >> Date: Tue, 7 Apr 2009 20:35:16 +0200 >> Subject: Re: [R] Minimum Spanning Tree >> From: csardi at rmki.kfki.hu >> To: joshearl1 at hotmail.com >> CC: r-help at r-project.org >> >> Josh, I would recommend to use a package that supports networks, e.g. >> igraph, but there are others as well. >> >> You can read in the data using 'read.csv()', transform it to a matrix >> with 'as.matrix()', and then create an igraph object from it with >> 'graph.adjacency()'. >> >> Then call 'minimum.spanning.tree()' to calculate the tree and 'plot()' >> to plot it. E.g. for your example file: >> >> library(igraph) >> D <- read.csv("/tmp/matrix.csv") >> D <- D[,-1] # we don't need the first column >> G <- graph.adjacency(as.matrix(D), weighted=TRUE) >> >> ## Some graphical parameters >> V(G)$label <- V(G)$name >> V(G)$shape <- "rectangle" >> V(G)$color <- "white" >> V(G)$size <- 40 >> >> ## MST and plot >> mst <- minimum.spanning.tree(G) >> lay <- layout.reingold.tilford(G, mode="all") >> plot(mst, layout=lay) >> >> Best, >> Gabor >> >> On Tue, Apr 7, 2009 at 8:00 PM, jpearl01 <joshearl1 at hotmail.com> wrote: >> > >> > Hi all, I'm very new to R and read a few tutorials, however I'm having >> > difficulty trying to figure out how to plot a minimum spanning tree. ?I >> > have >> > a csv file that contains an n-by-n matrix of distances between strains >> > of >> > bacteria called matrix.csv. >> > >> > Looks like: >> > id,strain1, strain2,strain3 >> > strain1,0,.2,.8 >> > strain2,.3,0,.7 >> > strain3,.4,.6,0 >> > >> > I've been messing around with some information I've found on the web >> > that >> > prints out an mst, however I think it does it with random values, >> > instead of >> > values provided in a dataset (like from my csv file). ?Here's what I >> > tried >> > looks like: >> > x <- runif(100) >> > y <- runif(100) >> > nearest_neighbour <- function (x, y, d=dist(cbind(x,y)), ...) { >> > ?n <- length(x) >> > ?stopifnot(length(x) == length(y)) >> > ?d <- as.matrix(d) >> > ?stopifnot( dim(d)[1] == dim(d)[2] ) >> > ?stopifnot( length(x) == dim(d)[1] ) >> > ?i <- 1:n >> > ?j <- apply(d, 2, function (a) order(a)[2]) >> > ?segments(x[i], y[i], x[j], y[j], ...) >> > } >> > plot(x, y, >> > ? ? main="Nearest neighbour graph", >> > ? ? xlab = "", ylab = "") >> > nearest_neighbour(x, y) >> > >> > This gets the Nearest Neighbors, and then: >> > >> > plot(x, y, >> > ? ? main = "Minimum spanning tree", >> > ? ? xlab = "", ylab = "") >> > nearest_neighbour(x, y, lwd=10, col="grey") >> > points(x,y) >> > library(ape) >> > r <- mst(dist(cbind(x, y))) >> > i <- which(r==1, arr.ind=TRUE ) >> > segments(x[i[,1]], y[i[,1]], x[i[,2]], y[i[,2]], >> > ? ? ? ? lwd = 2, col = "blue") >> > >> > This prints the mst. ?This would be perfect for me! ?However I don't >> > know >> > how to make this use my dataset... Any help (including links to helpful >> > tutorials!) would be awesome, >> > >> > Thanks! >> > ~josh >> > >> > >> > >> > -- >> > View this message in context: >> > http://www.nabble.com/Minimum-Spanning-Tree-tp22934813p22934813.html >> > Sent from the R help mailing list archive at Nabble.com. >> > >> > ______________________________________________ >> > R-help at r-project.org mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide >> > http://www.R-project.org/posting-guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> > >> >> >> >> -- >> Gabor Csardi <Gabor.Csardi at unil.ch> UNIL DGM > > ________________________________ > Windows Live?: Keep your life in sync. Check it out.-- Gabor Csardi <Gabor.Csardi at unil.ch> UNIL DGM
Seemingly Similar Threads
- Legend/Substitute/Plotmath problem
- [Bridge] Is there any possible to realize spanning tree/rapid spanning tree in hardware?
- [Bridge] combining vlan tagging and spanning tree
- Minimum Spanning Trees
- [Bridge] help setting up a linux bridge with spanning tree to allow multiple vlans accross multiple uplinks