Dear List, I am looking for a possibility to fit a mixture model under R using maximum likelihood estimation. Venables and Ripley describe a solution working under S+ (in MASS, 3. ed., p. 263) which requires the D system function and deriv3. This solution does not seem to be portable to R or at least I do not realise how. Is there anyone who a) knows how one could make the MASS-method run under R? or b) has written an EM-algorithm to fit mixture models in R that I could use? I would be glad to hear from someone. Sincerely, Jobst Dr. Jobst Landgrebe AG Wurst (Molecular Neurogenetics) MPI of Psychiatry Kraepelinstr. 2-10 D-80804 Munich phone +49 89 30622626 fax +49 89 30622642 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Mon, 3 Sep 2001, Jobst Landgrebe wrote:> Dear List, > > I am looking for a possibility to fit a mixture model under R using > maximum likelihood estimation. > Venables and Ripley describe a solution working under S+ (in MASS, 3. ed., > p. 263) which requires the D system function and deriv3. This solution does not > seem to be portable to R or at least I do not realise how. > > Is there anyone who > > a) knows how one could make the MASS-method run under R?You can optimize without derivatives. ms() is not in R either.> or > > b) has written an EM-algorithm to fit mixture models in R that I could > use?There's one (rather buried) in package mda. -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Hi Jobst, ----- Original Message ----- From: "Jobst Landgrebe" <landgreb at mpipsykl.mpg.de> To: <r-help at lists.R-project.org> Sent: Tuesday, September 04, 2001 7:10 AM Subject: [R] mixture distributions> Dear List, > > Venables and Ripley describe a solution working under S+ (in MASS, 3. ed., > p. 263) which requires the D system function and deriv3. This solutiondoes not> seem to be portable to R or at least I do not realise how.It is, there is a version for R. Look in CRAN (http://cran.r-project.org/). It is called VR package (Venables and Ripley).> Is there anyone who > > a) knows how one could make the MASS-method run under R?Depends on the OS platform are using, install the package then type library( MASS ) in R.> or > > b) has written an EM-algorithm to fit mixture models in R that I could > use? > > > I would be glad to hear from someone. > > > Sincerely, > > Jobst > > > > Dr. Jobst Landgrebe > AG Wurst (Molecular Neurogenetics) > MPI of Psychiatry > Kraepelinstr. 2-10 > D-80804 Munich > phone +49 89 30622626 > fax +49 89 30622642 > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-> r-help mailing list -- Readhttp://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html> Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch >_._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._. _._ -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
here are some functions I wrote and tested in Splus; originally based on some else's version for two components. I haven't tried them in R yet... I expect there is plenty of room for improvement. albyn ---------------------------------------------------------------------- "Mixture.em" <- function(y, mu, v, p=1/length(mu), eps = .00001, max.iter=25, prnt=F,probs=F) { p <- p/sum(p) # make sure probabilities add to 1 iter <- 1 n <- length(y) k <- length(mu) done <- 0 loglik0 <- -(2^1000) while(!done && iter <= max.iter) { # # E step: guess probability y[i] is from group j # L <- Norm.mixd(y,p,mu,v) Lsum <- apply(L,1,"sum") loglik <- sum(log(Lsum)) bayes <- sweep(L,1,Lsum,"/") # # M step: estimate component parameters given # group assignment probabilities # N <- apply(bayes,2,sum) mu <- as.numeric(y %*% bayes)/N Y <- matrix(y, nrow=n, ncol=k) M <- matrix(mu, nrow=n, ncol=k, byrow=T) SQ <- bayes*(Y-M)^2 v <- apply(SQ,2,sum)/N p <- N/sum(N) if(abs((loglik-loglik0)/loglik) < eps) done <- 1 if(prnt){ cat("step:",k,"\n") cat("means:",mu,"\n \n") } loglik0 <- loglik iter <- iter + 1 } if( !probs) list(mu=mu,v=v,p=p,loglik=loglik) else list(mu=mu,v=v,p=p,loglik=loglik,P=bayes) } Norm.mixd <- function(y,p,m,v) { # evaluates the components of the normal mixture density determined # by the vector of means mu, variances v, and proportions p, for the # vector of data y. # Pi is assumed to be predefined # returns a matrix with rows corresponding to obs. # and columns corresponding to mixture components, ie. # d(i,j)= f(y[i] | m[j],v[j],p[j]) k<-length(p) n<-length(y) a<-matrix(y,nrow=n,ncol=k) b<-matrix(m,nrow=n,ncol=k,byrow=T) d<-sweep((a-b)^2,2,v,"/")/2 b<-matrix(p,nrow=n,ncol=k,byrow=T) b<-sweep(b,2,sqrt(2*Pi*v),"/") d<- b*exp(-d) d } "norm.mix.plot"<- function(p, m, v) { # creates data structure containing $x, $y for plotting the # mixture density specified by the parameters in params s <- sqrt(v) if(length(s) == 1) s <- rep(s, length(p)) if(length(p) != length(m)) fatal("invalid parameter specification") k <- length(m) xlim <- c(min(m - 3 * s), max(m + 3 * s)) x <- seq(xlim[1], xlim[2], length = 400) y <- matrix(0, nrow = k, ncol = length(x)) for(i in 1:k) { y[i, ] <- p[i] * dnorm(x, m[i], s[i]) ifelse(is.na(y[i, ]), 0, y[i, ]) } y <- apply(y, 2, "sum") z <- list(x = x, y = y) z } -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Dear R users: I am trying to get the runs statistics for a very long binary sequences and I ran into trouble with the sped when using (probably too many) "if" and "for" statements in my program. Let me explain what I would like to see as the final output. Say, I have a sequence 1001101000. The runs vector should be r=(1,2,2,1,1,3), i.e. counting similar subsequences. Also, it could be treated as an independent question, the longest run is of interest. Just to give you an idea the length of the original sequences will be of order 10^6 and greater. Any help greatly appreciated. Janusz. ** Janusz Kawczak ** ** UNC at Charlotte, Department of Mathematics, Room 350F ** ** 9201 University City Blvd. ** ** Charlotte, NC, 28223-0001, U.S.A. ** ** Tel.: (704) 687-2566 (W) (704) 921-0273 (H) Fax.: (704) 687-6415 ** -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On Tue, 4 Sep 2001, Janusz Kawczak wrote:> Dear R users: > > I am trying to get the runs statistics for a very long binary sequences > and I ran into trouble with the sped when using (probably too many) "if" > and "for" statements in my program. Let me explain what I would like to > see as the final output. Say, I have a sequence 1001101000. The runs vector > should be r=(1,2,2,1,1,3), i.e. counting similar subsequences. Also, it > could be treated as an independent question, the longest run is of > interest.> inp <- as.numeric(strsplit("1001101000", "")[[1]]) > inp[1] 1 0 0 1 1 0 1 0 0 0> rle(inp)$lengths[1] 1 2 2 1 1 3> Just to give you an idea the length of the original sequences will be of > order 10^6 and greater.That might tax rle, but you can do it in pieces if needed. However, on a 1GHz 512Mb machine:> inp <- rbinom(1e6, 1, 0.4) > system.time(foo <- rle(inp)$lengths)[1] 3.49 0.25 3.74 0.00 0.00 -- Brian D. Ripley, ripley at stats.ox.ac.uk Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/ University of Oxford, Tel: +44 1865 272861 (self) 1 South Parks Road, +44 1865 272860 (secr) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Jobst Landgrebe wrote:> > Dear List, > > I am looking for a possibility to fit a mixture model under R using > maximum likelihood estimation. > Venables and Ripley describe a solution working under S+ (in MASS, 3. ed., > p. 263) which requires the D system function and deriv3. This solution does not > seem to be portable to R or at least I do not realise how. > > Is there anyone who > > a) knows how one could make the MASS-method run under R? > > or > > b) has written an EM-algorithm to fit mixture models in R that I could > use? > > I would be glad to hear from someone. > > Sincerely, > > JobstDear Jobst, Check out the R package mclust at http://www.ci.tuwien.ac.at/R/src/contrib/PACKAGES.html#mclust It enables you to fit Gaussian Mixture models using the EM algoritm. Cheers, Jonathan> Dr. Jobst Landgrebe > AG Wurst (Molecular Neurogenetics) > MPI of Psychiatry > Kraepelinstr. 2-10 > D-80804 Munich > phone +49 89 30622626 > fax +49 89 30622642 > -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- > r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html > Send "info", "help", or "[un]subscribe" > (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch > _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._-- Jonathan L. Marchini, (home) jonathan.marchini at balliol.ox.ac.uk Department of Statistics, (work) marchini at stats.ox.ac.uk University of Oxford, http://www.stats.ox.ac.uk/~marchini 1 South Parks Road, Tel: +44 1865 272593 (work) Oxford OX1 3TG, UK Fax: +44 1865 272595 -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
Is this a bug or a feature?> seq(0:4) # colon[1] 1 2 3 4 5> 0:4[1] 0 1 2 3 4> seq(0,4)[1] 0 1 2 3 4 I am aware that the first usage is incorrect, but I was surprised that I did not get any sort of syntax error -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._
On 03-May-2002 John Aitchison wrote:> Is this a bug or a feature? > >> seq(0:4) # colon > [1] 1 2 3 4 5 >> 0:4 > [1] 0 1 2 3 4 >> seq(0,4) > [1] 0 1 2 3 4 > > I am aware that the first usage is incorrect, but I was surprised that > I did not get any sort of syntax errorFeature. With a single (vector) argument, seq returns the sequence 1:n where n is the length of the vector. This is described on the help page ( see argument "along" ). Martyn -.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.-.- r-help mailing list -- Read http://www.ci.tuwien.ac.at/~hornik/R/R-FAQ.html Send "info", "help", or "[un]subscribe" (in the "body", not the subject !) To: r-help-request at stat.math.ethz.ch _._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._._