Never mind this question; I was able to solve the issue by using
as.dist() instead of dist().
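
For anyone who finds this thread later, here is a minimal sketch of the
difference. It assumes the matrix from my original message is typed in
directly rather than read from MyData.csv, and the names m, d_recomputed
and d_given are only for illustration.

```r
# Distance values from the original message (diagonal kept as given there)
m <- matrix(c(1,   0.1, 0.9,
              0.1, 1,   0.9,
              0.9, 0.9, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("John", "Jake", "William"),
                            c("John", "Jake", "William")))

# dist() treats each row as a point and computes NEW Euclidean distances
# between the rows, so the original values are not preserved
d_recomputed <- dist(m)

# as.dist() takes the lower triangle of the matrix as the distances
# themselves; the diagonal is not used
d_given <- as.dist(m)

# Classical MDS on the distances as given
fit <- cmdscale(d_given, eig = TRUE, k = 2)

print(d_recomputed)
print(d_given)
```

With dist(), the John/Jake distance comes out as sqrt(1.62), about 1.27,
while John/William comes out as sqrt(0.66), about 0.81, which is the
reverse of what I intended. With as.dist(), the 0.1 and 0.9 values are
passed to cmdscale() unchanged.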
Thanks,
Kerrio
On Mon, Dec 30, 2013 at 2:38 PM, Kerrio Brown <kerriobrown@gmail.com>
wrote:
> Hi,
>
> I'm trying to understand what the appropriate input is for the dist()
> function (
> http://stat.ethz.ch/R-manual/R-patched/library/stats/html/dist.html)
>
> If I run the dist() function on a matrix, is it correct to have *distance*
> values in the original matrix?
>
> The original values in my matrix are the actual Euclidean distances I want
> to feed into a multi-dimensional scaling algorithm (MDS). My concern is
> that the dist() function is going to treat the original values as something
> other than distances. For instance, two data points of 0.9 may be
> considered exactly similar, even though 0.9 is supposed to mean a large
> difference between variables.
>
> For the original values below:
>
>          John  Jake  William
> John     1     0.1   0.9
> Jake     0.1   1     0.9
> William  0.9   0.9   1
>
> John and Jake are very close (0.1)
> William and John are very distant (0.9)
> William and Jake are very distant (0.9)
>
> But the plotted distances do not reflect anything close to these original
> distance values.
>
> Here is my script using the csv data copied above as MyData.csv (dist
> function in red):
>
> #Read data
>
> MyData <- read.csv("MyData.csv", header = TRUE)
>
> MyMatrix <- as.matrix(MyData)
>
>
> #Compute/translate(?) distances
>
> d <- dist(MyMatrix)
>
> fit <- cmdscale(d,eig=TRUE, k=2)
>
>
>
> # plot solution
>
> x <- fit$points[,1]
>
> y <- fit$points[,2]
>
> plot(x, y, xlab="Coordinate 1", ylab="Coordinate 2", main="Metric MDS",
>      type="n")
>
> text(x, y, labels = names(MyData), cex=.7)
>
>
> One option would be to skip using the dist() function, but I can't seem to
> get the data in the right format for cmdscale, which has to look like this:
>
>
>
>          John  Jake  William
> John
> Jake     0.1
> William  0.9   0.9
>
>
> Thank you for any clarification you can provide!
>
> -Kerrio
>
>