thr3ads.net - R devel - [Rd] Erratic behaviour of sammon() [Nov 2001]

Peter Kleiweg

2001-Nov-01 21:15 UTC

[Rd] Erratic behaviour of sammon()

I'm not sure this list is the right place for this thing.

I noticed some erratic behaviour in sammon(). Running sammon on
two nearly identical sets of data results in very different
results. Below is an example. I create an initial configuration
with cmdscale() and store it into 'vec1'. I write this to file,
and read it back in again to 'vec2'. According to cor() on the
three columns of 'vec1' and 'vec2', they are identical. However,
if I use sammon() with initialising from 'vec1' or 'vec2', I get
different results. (SAMMON() is a wrapper function).

This I did on a Linux machine (R version 1.3.1):

	> dst <- ReadDistFile("PA.lnk")
	Loading required package: mva
	> vec1 <- MDS(dst, 3)
	> WriteVectorFile(vec1, "outfile")
	> vec2 <- ReadVectorFile("outfile")
	> cor(vec1[,1], vec2[,1])
	[1] 1
	> cor(vec1[,2], vec2[,2])
	[1] 1
	> cor(vec1[,3], vec2[,3])
	[1] 1
	> v1 <- SAMMON(dst, 3, y=vec1)
	Loading required package: MASS
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  30 iters: 0.05346, magic = 0.491
	stress after  40 iters: 0.04985, magic = 0.500
	stress after  50 iters: 0.04945, magic = 0.500
	stress after  60 iters: 0.04931, magic = 0.500
	stress after  70 iters: 0.04925, magic = 0.500
	> v2 <- SAMMON(dst, 3, y=vec2)
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  30 iters: 0.05369, magic = 0.491
	stress after  30 iters: 0.05369
	> cor(v1[,1], v2[,1])
	[1] 0.958089
	> cor(v1[,2], v2[,2])
	[1] 0.979837
	> cor(v1[,3], v2[,3])
	[1] 0.9412055

I also tried it on HP-UX, and got different results again:

	> dst <- ReadDistFile("PA.lnk")
	Loading required package: mva
	> vec1 <- MDS(dst, 3)
	> WriteVectorFile(vec1, "outfile")
	> vec2 <- ReadVectorFile("outfile")
	> cor(vec1[,1], vec2[,1])
	[1] 1
	> cor(vec1[,2], vec2[,2])
	[1] 1
	> cor(vec1[,3], vec2[,3])
	[1] 1
	> v1 <- SAMMON(dst, 3, y=vec1)
	Loading required package: MASS
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  28 iters: 0.06761
	> v2 <- SAMMON(dst, 3, y=vec2)
	Initial stress        : 0.20243
	stress after  10 iters: 0.11869, magic = 0.018
	stress after  20 iters: 0.07572, magic = 0.043
	stress after  30 iters: 0.06719, magic = 0.020
	stress after  40 iters: 0.06115, magic = 0.009
	stress after  50 iters: 0.05198, magic = 0.500
	stress after  60 iters: 0.04968, magic = 0.500
	stress after  70 iters: 0.04933, magic = 0.500
	stress after  80 iters: 0.04924, magic = 0.225
	> cor(v1[,1], v2[,1])
	[1] 0.9106865
	> cor(v1[,2], v2[,2])
	[1] 0.9727502
	> cor(v1[,3], v2[,3])
	[1] 0.9411287

I even tried compiling with gcc with and without optimisation,
and I got different results for exactly the same input (no
saving to file first).

So, I gather that sammon() is an unstable function, extremely
sensitive to the tiniest of variations. Is this inherent to the
sammon algorithm, or is there something wrong with how it is
implemented in R?
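
A note on the cor() check in the transcripts: a correlation of 1 only shows that the columns are perfectly linearly related to within printing precision, not that they are bit-for-bit equal. Below is a minimal sketch of a stricter comparison. It assumes the file round trip keeps about six significant digits. WriteVectorFile() and ReadVectorFile() are not shown in this thread, so write.table()/read.table() and the built-in eurodist data are used as stand-ins.

```r
# Stand-in data: a 3-column configuration from classical MDS on the
# built-in 'eurodist' distances (not the poster's PA.lnk data).
vec1 <- cmdscale(eurodist, k = 3)

# Assumption: the text file keeps roughly 6 significant digits.
# signif() makes that rounding explicit before the round trip.
tmp <- tempfile()
write.table(signif(vec1, 6), tmp, row.names = FALSE, col.names = FALSE)
vec2 <- as.matrix(read.table(tmp))

# The correlation is 1 to printing precision ...
col_cor <- sapply(1:3, function(j) cor(vec1[, j], vec2[, j]))
print(col_cor)

# ... yet the values themselves are not equal.
max_diff <- max(abs(unname(vec1) - unname(vec2)))
print(max_diff)
print(identical(unname(vec1), unname(vec2)))
```

If max_diff is nonzero, the two starting configurations handed to sammon() differ, even though cor() cannot tell them apart.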

-- 
Peter Kleiweg
let.rug.nl/~kleiweg

-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-
r-devel mailing list -- Read ci.tuwien.ac.at/~hornik/R/R-FAQ.html
Send "info", "help", or "[un]subscribe"
(in the "body", not the subject !)  To:
r-devel-request@stat.math.ethz.ch
_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._

Prof Brian Ripley

2001-Nov-02 08:09 UTC

[Rd] Erratic behaviour of sammon()

On Thu, 1 Nov 2001, Peter Kleiweg wrote:
> I'm not sure this list is the right place for this thing.
Nor am I, especially with the subject: it seems that it is repeatable from
what you actually reported.
> I noticed some erratic behaviour in sammon(). Running sammon on
> two nearly identical sets of data results in very different
> results. Below is an example. I create an initial configuration
> with cmdscale() and store it into 'vec1'. I write this to file,
> and read it back in again to 'vec2'. According to cor() on the
> three columns of 'vec1' and 'vec2', they are identical. However,
> if I use sammon() with initialising from 'vec1' or 'vec2', I get
> different results. (SAMMON() is a wrapper function).
[...]
> So, I gather that sammon() is an unstable function, extremely
> sensitive to the tiniest of variations. Is this inherent to the
> sammon algorithm, or is there something wrong with how it is
> implemented in R?
It is inherent to the Sammon algorithm, on some datasets.
There are lots of similar phenomena in real-life statistics away from
convex optimization problems.
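
One way to probe this sensitivity on a given dataset is to start sammon() from two configurations that differ only by tiny noise and compare the fits. The sketch below is an illustration under assumptions, not the poster's setup: it uses the built-in eurodist data, and the noise level (sd = 1e-6) and seed are arbitrary choices. Whether the two runs diverge depends on the data, so no particular outcome is claimed.

```r
library(MASS)  # provides sammon()

d  <- eurodist                 # stand-in distances, not PA.lnk
y0 <- cmdscale(d, k = 3)       # classical MDS start, as in the thread

# Perturb the start by noise far below printing precision.
set.seed(1)                    # arbitrary seed, for repeatability only
y1 <- y0 + matrix(rnorm(length(y0), sd = 1e-6), nrow(y0))

fit0 <- sammon(d, y = y0, k = 3, trace = FALSE)
fit1 <- sammon(d, y = y1, k = 3, trace = FALSE)

# Compare the final stress values of the two runs.
print(c(fit0$stress, fit1$stress))

# Compare fitted interpoint distances rather than raw coordinates,
# so that a rotation or reflection of the solution is not counted
# as a difference.
print(cor(c(dist(fit0$points)), c(dist(fit1$points))))
```

Comparing final stress and interpoint distances, rather than coordinate columns, separates two cases: runs that reach the same quality of fit by different routes, and runs that stop at genuinely different local minima.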

-- 
Brian D. Ripley,                  ripley@stats.ox.ac.uk
Professor of Applied Statistics,  stats.ox.ac.uk/~ripley
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272860 (secr)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
