Steven Lembark
2010-May-19 14:10 UTC
[R] Where is the construction of a dist object from raw data described?
Any reference to the appropriate documentation would be most appreciated.

I am using the TSP package for clustering of HIV genetic sequences. The distances have already been computed and are available as either upper-triangular or square, i.e.:

  a 1 2 3
  b 4 5
  c 6
  d

or

  a 0 1 2 3
  b 1 0 4 5
  c 2 4 0 6
  d 3 5 6 0

The TSP package takes a "dist" object.

Catch: The only way I can see to get a dist object is with dist(), which computes the distances itself rather than taking them as-is.

Q: How does one convert either of the structures above into a "dist" object without having to first feed them through dist()?

I can easily split the labels into a separate output file, leaving me with the rownames and colnames values for the result in a separate place, if that makes it any easier to explain how to get the numeric values into a dist.

Google, searching r-project.org, and the R Nutshell book all lead me back to dist() or daisy().

thanks

-- 
Steven Lembark                85-09 90th St.
Workhorse Computing           Woodhaven, NY, 11421
lembark at wrkhors.com        +1 888 359 3508
Gavin Simpson
2010-May-19 16:18 UTC
[R] Where is the construction of a dist object from raw data described?
On Wed, 2010-05-19 at 10:10 -0400, Steven Lembark wrote:
> Any reference to the appropriate documentation would
> be most appreciated.
>
> I am using the TSP package for clustering of HIV
> genetic sequences. The distances have already been
> computed and are available as either upper-triangular
> or square, i.e.:
>
>   a 1 2 3
>   b 4 5
>   c 6
>   d
>
> or
>
>   a 0 1 2 3
>   b 1 0 4 5
>   c 2 4 0 6
>   d 3 5 6 0
>
> The TSP package takes a "dist" object.
>
> Catch: The only way I can see to get a dist
> object is with dist(), which computes the
> distances itself rather than taking them
> as-is.
>
> Q: How does one convert either of the structures
> above into a "dist" object without having to
> first feed them through dist()?

as.dist() will convert the square matrix into a dist object.

I'm not sure of a convenient way of importing data in the "upper" form you show without either going via a matrix or building the dist object by hand. So if this really is an either/or situation, the square form is always available, and loading the full dissimilarity matrix into RAM is not a problem, I'd stick with as.dist().

HTH

G

-- 
Dr. Gavin Simpson             [t] +44 (0)20 7679 0522
ECRC, UCL Geography,          [f] +44 (0)20 7679 0565
Pearson Building,             [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London          [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                 [w] http://www.freshwaters.org.uk
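
For reference, the as.dist() route suggested in the reply can be sketched in R as below. This is only a minimal sketch, assuming the four-sequence example from the question; the variable names (ids, sq, upper_vals, m, d_square, d_upper) are illustrative and do not come from the thread. The upper-triangular case goes via a matrix, as the reply suggests, and relies on as.dist() reading the lower triangle of its argument.

```r
# Labels and square matrix from the example in the question.
ids <- c("a", "b", "c", "d")
sq <- matrix(c(0, 1, 2, 3,
               1, 0, 4, 5,
               2, 4, 0, 6,
               3, 5, 6, 0),
             nrow = 4, byrow = TRUE,
             dimnames = list(ids, ids))

# Square form: as.dist() takes the values as they are (no recomputation).
d_square <- as.dist(sq)

# Upper-triangular form: the same distances, row by row, without the diagonal.
upper_vals <- c(1, 2, 3,
                4, 5,
                6)

# Fill the lower triangle of an empty matrix. R fills column by column, and
# the lower triangle read by column has the same order as the upper triangle
# read by row, so the values land in the right cells.
m <- matrix(0, nrow = 4, ncol = 4, dimnames = list(ids, ids))
m[lower.tri(m)] <- upper_vals
d_upper <- as.dist(m)

print(d_upper)
```

Both d_square and d_upper hold the same six distances with the labels a to d, and either can be handed to code that expects a "dist" object.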