I have a data matrix X (n x k, say), each row of which constitutes an observation of a k-dimensional random variable which I am willing, if not happy, to assume to be Gaussian, with mean ``mu'' and covariance matrix ``Sigma''. Distinct rows of X may be assumed to correspond to independent realizations of this random variable.

Most rows of X (all but 240 out of 6000+ rows) contain one or more missing values. If I am willing to assume that missing entries are missing completely at random (MCAR) then I can estimate the covariance matrix Sigma via maximum likelihood, by employing the EM algorithm. Or so I believe.

Has this procedure been implemented in R in an accessible form? I've had a bit of a scrounge through the searching facilities, and have gone through the FAQ, and have found nothing that I could discern to be directly relevant.

Thanks for any pointers that anyone may be able to give.

cheers,

Rolf Turner
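For readers who want to see what the procedure in the question amounts to, here is a minimal base-R sketch of EM for a multivariate normal with missing entries. It is an illustration under stated assumptions, not the code of any package: `em_mvn()` and its arguments are hypothetical names, the data are simulated, every row is assumed to have at least one observed value, and the row-by-row loop favours readability over speed.

```r
# EM sketch for the mean and covariance of a multivariate normal with NAs.
# em_mvn() is a hypothetical helper written for this note, not a package function.
em_mvn <- function(X, max_iter = 500, tol = 1e-6) {
  X <- as.matrix(X)
  n <- nrow(X); k <- ncol(X)
  miss <- is.na(X)
  stopifnot(all(rowSums(!miss) > 0))  # each row needs at least one observed value

  # Starting values: available-case means and a diagonal covariance.
  mu <- colMeans(X, na.rm = TRUE)
  Sigma <- diag(apply(X, 2, var, na.rm = TRUE), k)

  for (iter in seq_len(max_iter)) {
    sum_x  <- numeric(k)       # running sum of E[x_i | observed part]
    sum_xx <- matrix(0, k, k)  # running sum of E[x_i x_i' | observed part]
    for (i in seq_len(n)) {
      m <- miss[i, ]; o <- !m
      xi <- X[i, ]
      C <- matrix(0, k, k)     # conditional covariance, nonzero on the missing block
      if (any(m)) {
        # E-step: regress the missing coordinates on the observed ones.
        B <- Sigma[m, o, drop = FALSE] %*% solve(Sigma[o, o, drop = FALSE])
        xi[m] <- mu[m] + drop(B %*% (xi[o] - mu[o]))
        C[m, m] <- Sigma[m, m, drop = FALSE] - B %*% Sigma[o, m, drop = FALSE]
      }
      sum_x  <- sum_x + xi
      sum_xx <- sum_xx + tcrossprod(xi) + C
    }
    # M-step: ML estimates from the expected sufficient statistics.
    mu_new <- sum_x / n
    Sigma_new <- sum_xx / n - tcrossprod(mu_new)
    delta <- max(abs(mu_new - mu), abs(Sigma_new - Sigma))
    mu <- mu_new; Sigma <- Sigma_new
    if (delta < tol) break
  }
  list(mu = mu, Sigma = Sigma, iterations = iter)
}

# Simulated example (all values below are made up for illustration).
set.seed(1)
n <- 1000
Sigma_true <- matrix(c(2.0, 0.8, 0.3,
                       0.8, 1.0, 0.4,
                       0.3, 0.4, 1.5), 3, 3)
X <- matrix(rnorm(n * 3), n, 3) %*% chol(Sigma_true)
X[matrix(runif(n * 3) < 0.2, n, 3)] <- NA        # MCAR holes
X <- X[rowSums(!is.na(X)) > 0, , drop = FALSE]   # drop rows with nothing observed
fit <- em_mvn(X)
fit$Sigma
```

With complete data the loop reduces to the usual ML estimates (column means, and the sample covariance with divisor n rather than n - 1), which is a handy sanity check. For 6000+ rows, grouping rows by missingness pattern before the E-step would avoid repeating the same `solve()` call.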
Hi Rolf!

Have a look at the 'norm' package.

This does just what you're asking for (assuming multivariate normal, and allowing MAR missingness, i.e. the probability of a value being missing may depend on observed values, but must not depend on unobserved ones).

Read the documentation for the various functions *very* carefully! Drop me a line if you want more info.

Best wishes,
Ted.

On 15-Aug-07 21:16:32, Rolf Turner wrote:
> I have a data matrix X (n x k, say) each row of which constitutes
> an observation of a k-dimensional random variable which I am willing,
> if not happy, to assume to be Gaussian, with mean ``mu'' and
> covariance matrix ``Sigma''.
> [...]

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 15-Aug-07 Time: 23:17:48
------------------------------ XFMail ------------------------------
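As a rough sketch of how the 'norm' suggestion is typically used: the calls below follow the package's documented sequence of `prelim.norm()`, `em.norm()` and `getparam.norm()`, run on simulated data. The data and the name `fit` are made up for illustration, and the block is guarded so that it does nothing if 'norm' is not installed. Check the package documentation before relying on any detail here.

```r
# Simulated 3-column data with MCAR holes (illustration only).
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)
X[, 2] <- X[, 2] + 0.5 * X[, 1]                  # induce some correlation
X[matrix(runif(n * 3) < 0.2, n, 3)] <- NA
X <- X[rowSums(!is.na(X)) > 0, , drop = FALSE]   # drop rows with nothing observed

fit <- NULL
if (requireNamespace("norm", quietly = TRUE)) {
  s     <- norm::prelim.norm(X)                # preprocess: group rows by missingness pattern
  theta <- norm::em.norm(s, showits = FALSE)   # EM estimate, in norm's packed parameter form
  fit   <- norm::getparam.norm(s, theta)       # unpack into a list with mu and sigma
}
fit
```

`em.norm()` returns the parameters in an internal packed form, so `getparam.norm()` is needed to recover the mean vector and covariance matrix.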
Ted Harding wrote:
> Hi Rolf!
>
> Have a look at the 'norm' package.
> [...]

Also, consider mlest() in the mvnmle package.

-- 
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894
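A minimal sketch of the mlest() suggestion, assuming the 'mvnmle' package is installed and that `mlest()` returns a list containing `muhat` and `sigmahat`, as its documentation describes. The data are simulated and the name `fit` is made up for illustration. The block is guarded so that it does nothing if the package is absent.

```r
# Simulated 3-column data with MCAR holes (illustration only).
set.seed(2)
n <- 200
X <- matrix(rnorm(n * 3), n, 3)
X[, 3] <- X[, 3] - 0.4 * X[, 1]                  # induce some correlation
X[matrix(runif(n * 3) < 0.2, n, 3)] <- NA
X <- X[rowSums(!is.na(X)) > 0, , drop = FALSE]   # drop rows with nothing observed

fit <- NULL
if (requireNamespace("mvnmle", quietly = TRUE)) {
  fit <- mvnmle::mlest(X)   # direct numerical maximisation of the likelihood
}
if (!is.null(fit)) {
  print(fit$muhat)      # estimated mean vector
  print(fit$sigmahat)   # estimated covariance matrix
}
```

Unlike the EM route, this maximises the observed-data likelihood with a general-purpose optimiser, so run time may grow quickly with the number of columns.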
On 15-Aug-07 21:16:32, Rolf Turner wrote:
> I have a data matrix X (n x k, say) each row of which constitutes
> an observation of a k-dimensional random variable which I am willing,
> if not happy, to assume to be Gaussian, with mean ``mu'' and
> covariance matrix ``Sigma''. Distinct rows of X may be assumed to
> correspond to independent realizations of this random variable.
>
> Most rows of X (all but 240 out of 6000+ rows) contain one or more
> missing values.
> [...]

One question, Rolf: How big is k (no. of columns)?

If it's greater than 30, you may have problems with 'norm', since the function prelim.norm() builds up its image of the places where there are missing values as "packed integers" with code:

    r <- 1 * is.na(x)
    ....
    mdp <- as.integer((r %*% (2^((1:ncol(x)) - 1))) + 1)

i.e. 'r' would be n x k and have 1s where your X had missing values, 0s elsewhere. Then each row of 'r' is converted into a 32-bit integer whose "1" bits correspond to the 1s in 'r'.

You'll get "NA" warnings if k>30, and things could go wrong!

In that case, I hope Chuck's suggestion works!

Best wishes,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 16-Aug-07 Time: 00:10:33
------------------------------ XFMail ------------------------------
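The overflow Ted describes can be reproduced in base R without the package. The sketch below re-implements only the quoted packing expression for a single row. `pack_pattern()` is a hypothetical helper name, and whether current versions of 'norm' still use this expression is not something this example checks.

```r
# Pack one row of 0/1 missingness indicators into a single number,
# following the quoted expression: sum of r_j * 2^(j - 1), plus 1.
pack_pattern <- function(r) sum(r * 2^(seq_along(r) - 1)) + 1

code30 <- pack_pattern(rep(1, 30))   # all 30 columns missing: 2^30
code31 <- pack_pattern(rep(1, 31))   # all 31 columns missing: 2^31

int30 <- as.integer(code30)                     # fits in a 32-bit integer
int31 <- suppressWarnings(as.integer(code31))   # exceeds .Machine$integer.max, so NA

# A pattern key that does not depend on integer width (illustration only):
r <- 1 * is.na(matrix(c(1, NA, 3, NA, NA, 6), 2, 3, byrow = TRUE))
keys <- apply(r, 1, paste, collapse = "")       # one 0/1 string per row
```

Under this expression, with k = 31 only a row with every entry missing overflows. From k = 32 upward, any row with a missing value in column 32 or beyond does. String keys like `keys` are one way to group rows by pattern for any k, though they do not by themselves make 'norm' accept wider data.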