I am running GTM on the same data space points but changing the number of
latent space points, the number of basis functions, and the parameter sigma.
I found a combination of these parameters that works fine.
On the other hand, on page 7 of the paper "The Generative Topographic
Mapping" by Svensén, Bishop, and Williams, it is stated: "There is no
over-fitting if the number of sample points is increased since the number of
degrees of freedom in the model is controlled by the mapping function".
However, since you translated the code from MatLab to R, I am pretty sure
you know what causes the following messages, which routines generate
them, and under which circumstances. Once I have these details clear, I can
possibly try to avoid the event that causes them:
"gtm_trn: Warning -- M-Step matrix singular, using pinv.\n"
1: In chol.default(A, pivot = TRUE) : matrix not positive definite
2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = FALSE, minSing = 0.01) :
Using 40 out of 40 eigenvalues
Thank you so much.
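
For what it is worth, here is a minimal base-R sketch of one way such a matrix can become numerically singular. It is not the package's code: `gauss_basis` is a hypothetical stand-in for a Gaussian basis function matrix, the sizes are made up, and `crossprod(FI)` is only an unweighted analogue of the matrix solved in the M-step. The assumption being illustrated is that wide, heavily overlapping basis functions give nearly collinear columns, so only a few eigenvalues stay above a threshold such as minSing.

```r
set.seed(1)

# Hypothetical stand-in for a Gaussian basis matrix (not gtm_gbf itself):
# rows are latent points, columns are basis functions, plus a bias column.
gauss_basis <- function(centers, sigma, X) {
  # squared Euclidean distance between every latent point and every centre
  d2 <- outer(rowSums(X^2), rowSums(centers^2), "+") - 2 * X %*% t(centers)
  cbind(exp(-pmax(d2, 0) / (2 * sigma^2)), 1)
}

X  <- matrix(runif(200 * 3), ncol = 3)   # 200 latent points in 3-D (made-up size)
MU <- matrix(runif(40 * 3),  ncol = 3)   # 40 basis-function centres (made-up size)

# Count the eigenvalues of t(FI) %*% FI that exceed a threshold.
n_large_eig <- function(sigma, threshold = 0.01) {
  FI <- gauss_basis(MU, sigma, X)
  A  <- crossprod(FI)                    # unweighted analogue of the M-step matrix
  ev <- eigen(A, symmetric = TRUE, only.values = TRUE)$values
  sum(ev > threshold)
}

wide   <- n_large_eig(sigma = 1)     # basis functions overlap almost completely
narrow <- n_large_eig(sigma = 0.1)   # basis functions are far better separated
cat("eigenvalues above threshold: wide =", wide, ", narrow =", narrow, "\n")
```

If this picture is right, reducing sigma or the number of basis functions relative to the spread of the latent points would be the first things to try.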
-----Original Message-----
From: Ondrej Such [mailto:ondrej.such@gmail.com]
Sent: Thu 01/04/2010 17.07
To: mauede@alice.it
Subject: Re: Generative Topographic Map
Hello Maura,
Thank you for your email.
Marcus Svensen, one of the method's authors, works at Microsoft and can give you
more insight into how it works. Email: marcussv@microsoft.com; home page:
research.microsoft.com/en-us/um/people/markussv. I also found his
Ph.D. thesis very insightful.
I believe it is never useful to have as many points in latent space as
there are data points. And with <2000 points I can't imagine going beyond
latent dimension 3 or 4 when applying GTM.
I've heard that GTM is useful mostly in situations where the data follow a
relatively smooth manifold.
2010/4/1 <mauede@alice.it>
> Thank you. I figured that out myself last night. I always forget that
> read.table does not actually read data into a matrix.
> The GTM MatLab toolbox comes with a nice guide to using the package, which
> could well become an R vignette.
> Anyway, I got the singular matrix warnings myself and do not know whether I
> should be concerned about it or not.
> Moreover, I do not know how to avoid that.
> I will go through some other experiments keeping the data space samples and
> dimensionality fixed and changing some of the input parameters.
> I stress that our goal is NOT visualization. We do not know the intrinsic
> dimensionality of the data space samples. Therefore we can only proceed by
> trial and error: that is, we vary the dimensionality of the embedding space.
> In this experiment the dimensionality of the data space is 7, so we start out
> projecting our original data to a 1D embedding space, then we try out a 2D
> embedding space, ..., all the way up to a 6D embedding space. Since we do
> not know the intrinsic dimensionality of the original data, we need a way
> to evaluate the reliability of the projection. To assess that, we map
> the data back from the embedding to the data space, where we calculate the
> RMSD between the original data and the reconstructed ones. Basically, using
> RMSD, we need as many reconstructed points as original ones. Such a
> requirement is met by choosing as many points in the latent space as in
> the data space. Can such a choice be the cause of the matrix singularity?
> Furthermore, is the number of basis functions related to the number of
> space points somehow ?
> Unfortunately, even the GTM MatLab documentation does not explicitly provide
> any clear criteria for choosing the parameters or their interdependence, if any.
> Thank you,
> Maura
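
The trial-and-error procedure described above (embed in 1 to 6 dimensions, map back, compare by RMSD) can be sketched in base R. This is only an illustration under stated assumptions: PCA via `prcomp` is swapped in as a linear stand-in for GTM, the data are synthetic, and `rmsd` is a hypothetical helper that averages over all matrix entries, which may differ from the RMSD definition actually used.

```r
set.seed(42)

# Synthetic stand-in for the data: 500 samples in 7-D lying near a 2-D plane.
latent <- matrix(rnorm(500 * 2), ncol = 2)
mixing <- matrix(rnorm(2 * 7), nrow = 2)
dat    <- latent %*% mixing + matrix(rnorm(500 * 7, sd = 0.05), ncol = 7)

# Hypothetical helper: root mean squared deviation over all entries.
rmsd <- function(a, b) sqrt(mean((a - b)^2))

# PCA is used here only as a linear stand-in for the GTM projection.
pca <- prcomp(dat, center = TRUE, scale. = FALSE)

rmsd_by_dim <- sapply(1:6, function(k) {
  # project onto the first k components, then map back to the 7-D data space
  recon <- pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])
  recon <- sweep(recon, 2, pca$center, "+")
  rmsd(dat, recon)
})
print(round(rmsd_by_dim, 4))
```

With a stand-in like this, the embedding dimension at which the RMSD stops dropping noticeably is a rough hint at the intrinsic dimensionality.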
> -----Original Message-----
> From: Ondrej Such [mailto:ondrej.such@gmail.com]
> Sent: Thu 01/04/2010 11.16
> To: mauede@alice.it
> Subject: Re: Generative Topographic Map
> Hello,
> the problem that's tripping the package is that T is a data.frame and not a
> matrix.
> Simply replacing
> T <- read.table("DHA_TNH.txt")
> with
> T <- as.matrix(read.table("DHA_TNH.txt"))
> makes the code run (though warnings about singular matrices remain; I'm not
> sure to what degree that is worrisome). I'd be curious as to how you would
> suggest improving the documentation.
> Hope this helps,
> --Ondrej
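
The data.frame versus matrix issue can be reproduced in a few lines of base R. This is a small sketch, not taken from the package: the temporary file and the matrix `G` are hypothetical stand-ins for the real data file and for whatever matrix the package multiplies the data by.

```r
# Write a tiny numeric table to a temporary file (stand-in for the data file).
tmp <- tempfile(fileext = ".txt")
write.table(matrix(1:6, nrow = 3), tmp, row.names = FALSE, col.names = FALSE)

df <- read.table(tmp)   # read.table() returns a data.frame, not a matrix
m  <- as.matrix(df)     # explicit conversion to a numeric matrix
G  <- diag(3)           # any numeric matrix with 3 columns (hypothetical)

# %*% rejects a data.frame operand; capture the error instead of stopping.
res_df <- tryCatch(G %*% df, error = function(e) e)
res_m  <- G %*% m       # works once the data are a matrix

print(class(df))
print(inherits(res_df, "error"))
print(dim(res_m))
unlink(tmp)
```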
> 2010/3/31 <mauede@alice.it>
> > I tried to use R version of package
> > I noticed the original MatLab package is much better documented.
> > I had a look at the R demo code "gtm_demo" and found that variable Y is
> > used before being created:
> > I wrote my own few lines as follows:
> > inDir <- "C:/Documents and Settings/Monville/Alanine Dipeptide/DBP1/DHA"
> >
> > setwd(inDir)
> > T <- read.table("DHA_TNH.txt")
> > L <- 3
> > X <- matrix(nrow=nrow(T),ncol=3,byrow=TRUE)
> > MU <- matrix(nrow=round(nrow(T)/5), ncol=L)
> >
> > for(i in 1:ncol(X)) {
> >   for(j in 1:nrow(X)) {
> >     X[j,i] <- RANDU()
> >   }
> > }
> >
> > for(i in 1:ncol(MU)) {
> >   for(j in 1:nrow(MU)) {
> >     MU[j,i] <- RANDU()
> >   }
> > }
> > sigma <-1
> >
> > FI <- gtm_gbf(MU,sigma,X)
> > W <- gtm_ri(T,FI)
> > Y= FI%*%W
> > b = gtm_bi(Y)
> > lambda <- 0.001
> > for (m in 1:15) {
> >   trnResult = gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = TRUE,
> >                       minSing = 0.01)
> >   W = trnResult$W
> >   b = trnResult$beta
> >   Y = FI %*% W
> > }
> >
> > I ran the above script on my own data, representing 1969 samples of 7
> > dihedral angles of a folding molecule (attached).
> > Upon running the first iteration of the training function "gtm_trn" I get
> > the following error, which I cannot interpret.
> > Any help and/or suggestion is welcome:
> >
> > > trnResult = gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = TRUE, minSing = 1.)
> > Error in gtmGlobalR %*% T :
> > requires numeric/complex matrix/vector arguments
> > In addition: Warning messages:
> > 1: In chol.default(A, pivot = TRUE) : matrix not positive definite
> > 2: In gtm_trn(T, FI, W, lambda, 1, b, 2, quiet = TRUE, minSing = 1) :
> > Using 7 out of 395 eigenvalues
> >
> > Thank you in advance,
> > Maura