I'm not sure this list is the right place for this report.
I noticed some erratic behaviour in sammon(). Running sammon() on
two nearly identical data sets gives very different results.
Below is an example. I create an initial configuration with
cmdscale() and store it in 'vec1'. I write this to a file and
read it back in as 'vec2'. According to cor() on the three
columns of 'vec1' and 'vec2', they are identical. However, when
I initialise sammon() from 'vec1' or from 'vec2', I get different
results. (SAMMON() is a wrapper function around sammon(), and
MDS() is a wrapper around cmdscale().)
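A note on the cor() check: a correlation that prints as 1 only shows
that the columns are linearly related to within print precision, not
that the two matrices are bit-for-bit equal. A minimal sketch of a
stricter check follows. It assumes that the round trip through a text
file keeps only about 7 significant digits; the real WriteVectorFile()
and ReadVectorFile() are not shown here, so write.table() and
read.table() stand in for them, and the matrix is made-up data.

```r
# Made-up stand-in for the cmdscale() output: a 10 x 3 numeric matrix.
set.seed(1)
vec1 <- matrix(rnorm(30), ncol = 3)

# Assumed round trip: values are written with about 7 significant digits
# and read back. This is a guess at what the wrapper functions do.
tmp <- tempfile()
write.table(signif(vec1, 7), tmp, row.names = FALSE, col.names = FALSE)
vec2 <- as.matrix(read.table(tmp))
dimnames(vec2) <- NULL
unlink(tmp)

# Three ways of asking "are these the same?", from weakest to strongest.
same_cor  <- cor(vec1[, 1], vec2[, 1])  # correlation of the first columns
max_diff  <- max(abs(vec1 - vec2))      # largest element-wise difference
same_bits <- identical(vec1, vec2)      # exact equality of the two objects

print(same_cor)
print(max_diff)
print(same_bits)
```

If max_diff is non-zero, the two starting configurations are not the
same input as far as an iterative optimiser is concerned, even though
cor() reports 1.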
I did this on a Linux machine (R version 1.3.1):
> dst <- ReadDistFile("PA.lnk")
Loading required package: mva
> vec1 <- MDS(dst, 3)
> WriteVectorFile(vec1, "outfile")
> vec2 <- ReadVectorFile("outfile")
> cor(vec1[,1], vec2[,1])
[1] 1
> cor(vec1[,2], vec2[,2])
[1] 1
> cor(vec1[,3], vec2[,3])
[1] 1
> v1 <- SAMMON(dst, 3, y=vec1)
Loading required package: MASS
Initial stress : 0.20243
stress after 10 iters: 0.11869, magic = 0.018
stress after 20 iters: 0.07572, magic = 0.043
stress after 30 iters: 0.05346, magic = 0.491
stress after 40 iters: 0.04985, magic = 0.500
stress after 50 iters: 0.04945, magic = 0.500
stress after 60 iters: 0.04931, magic = 0.500
stress after 70 iters: 0.04925, magic = 0.500
> v2 <- SAMMON(dst, 3, y=vec2)
Initial stress : 0.20243
stress after 10 iters: 0.11869, magic = 0.018
stress after 20 iters: 0.07572, magic = 0.043
stress after 30 iters: 0.05369, magic = 0.491
stress after 30 iters: 0.05369
> cor(v1[,1], v2[,1])
[1] 0.958089
> cor(v1[,2], v2[,2])
[1] 0.979837
> cor(v1[,3], v2[,3])
[1] 0.9412055
I also tried it on HP-UX, and got different results again:
> dst <- ReadDistFile("PA.lnk")
Loading required package: mva
> vec1 <- MDS(dst, 3)
> WriteVectorFile(vec1, "outfile")
> vec2 <- ReadVectorFile("outfile")
> cor(vec1[,1], vec2[,1])
[1] 1
> cor(vec1[,2], vec2[,2])
[1] 1
> cor(vec1[,3], vec2[,3])
[1] 1
> v1 <- SAMMON(dst, 3, y=vec1)
Loading required package: MASS
Initial stress : 0.20243
stress after 10 iters: 0.11869, magic = 0.018
stress after 20 iters: 0.07572, magic = 0.043
stress after 28 iters: 0.06761
> v2 <- SAMMON(dst, 3, y=vec2)
Initial stress : 0.20243
stress after 10 iters: 0.11869, magic = 0.018
stress after 20 iters: 0.07572, magic = 0.043
stress after 30 iters: 0.06719, magic = 0.020
stress after 40 iters: 0.06115, magic = 0.009
stress after 50 iters: 0.05198, magic = 0.500
stress after 60 iters: 0.04968, magic = 0.500
stress after 70 iters: 0.04933, magic = 0.500
stress after 80 iters: 0.04924, magic = 0.225
> cor(v1[,1], v2[,1])
[1] 0.9106865
> cor(v1[,2], v2[,2])
[1] 0.9727502
> cor(v1[,3], v2[,3])
[1] 0.9411287
I even tried compiling with gcc with and without optimisation,
and I got different results for exactly the same input (no
saving to file first).
So, I gather that sammon() is an unstable function, extremely
sensitive to the tiniest of variations. Is this inherent to the
sammon algorithm, or is there something wrong with how it is
implemented in R?
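One plausible reason why optimisation flags alone could change the
outcome (an assumption on my part, not something checked against the
sammon() source) is that floating-point addition is not associative,
so a compiler that reorders or regroups sums can change the last bits
of a result. A small illustration in R:

```r
# The same three numbers summed in two different groupings.
left  <- (0.1 + 0.2) + 0.3
right <- 0.1 + (0.2 + 0.3)

# The two sums differ in the last bit, although both print as 0.6
# at the default print precision.
sums_equal <- (left == right)
gap        <- abs(left - right)

print(c(left, right))
print(sums_equal)
print(gap)
```

A difference this small in one step can be enough to send an
iterative procedure down a different path if a later decision, such
as a step-size change or a convergence test, depends on a comparison
that is close to its threshold.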
--
Peter Kleiweg
http://www.let.rug.nl/~kleiweg/