Hi list,

I am wondering whether there is a way to use the EM algorithm to handle missing data and get a completed data set in R.

I usually do this in SPSS, because EM in SPSS "fills in" the estimated values for the missing data, and the completed data set can then be saved and used for further analysis. I have not found a way to get a completed data set like this in R or SAS. With Amelia or MICE, the data set is imputed several times, and the resulting imputed data sets are not combined. I understand that parameter estimates can still be obtained by combining the estimates from each imputed data set, but it would be more convenient to have a single combined data set for some analyses, for example ANOVA with IVs that have more than two categories. In that case, as far as I know, the only way to get the main effect of the whole IV is to estimate the parameters in a single data set. If the separate imputed data sets are used, the results show an effect for each category of the IV rather than one main effect for the IV as a whole. Readers and reviewers sometimes want to see how big the effect is for the whole IV, not for each category of that IV.

This is one of the reasons I cannot fully move from SPSS to R. Any suggestions?

Thank you very much.

ya
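A minimal sketch of the kind of EM fill-in described above, in base R. It assumes the variables are numeric, roughly multivariate normal, and missing at random. The function name em_impute and its arguments are hypothetical (not from any package), and nothing here is claimed to reproduce the SPSS implementation. It alternates between replacing each missing value with its conditional mean given the observed values in that row (E-step) and re-estimating the mean vector and covariance matrix (M-step).

```r
# Hypothetical helper: EM for a multivariate normal model with missing values.
# Returns the data with missing cells replaced by their conditional means.
em_impute <- function(x, max_iter = 200, tol = 1e-6) {
  x <- as.matrix(x)
  n <- nrow(x)
  p <- ncol(x)
  miss <- is.na(x)
  # Starting values from the available cases
  mu <- colMeans(x, na.rm = TRUE)
  sigma <- diag(apply(x, 2, var, na.rm = TRUE), p)
  filled <- x
  for (iter in seq_len(max_iter)) {
    extra <- matrix(0, p, p)  # summed conditional covariances of missing parts
    # E-step: conditional means (and covariances) row by row
    for (i in seq_len(n)) {
      m <- miss[i, ]
      if (!any(m)) next
      if (all(m)) {
        filled[i, ] <- mu
        extra <- extra + sigma
        next
      }
      o <- !m
      # Regression coefficients of missing on observed variables
      b <- sigma[m, o, drop = FALSE] %*% solve(sigma[o, o, drop = FALSE])
      filled[i, m] <- mu[m] + b %*% (x[i, o] - mu[o])
      extra[m, m] <- extra[m, m] +
        sigma[m, m, drop = FALSE] - b %*% sigma[o, m, drop = FALSE]
    }
    # M-step: update the mean and the (maximum likelihood) covariance
    mu_new <- colMeans(filled)
    centered <- sweep(filled, 2, mu_new)
    sigma_new <- (crossprod(centered) + extra) / n
    delta <- max(abs(mu_new - mu), abs(sigma_new - sigma))
    mu <- mu_new
    sigma <- sigma_new
    if (delta < tol) break
  }
  list(completed = filled, mu = mu, sigma = sigma, iterations = iter)
}

# Toy data: x2 depends on x1, and 40 values of x2 are set to missing
set.seed(1)
n <- 200
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n, sd = 0.8)
dat <- cbind(x1 = x1, x2 = x2)
dat[sample(n, 40), "x2"] <- NA

res <- em_impute(dat)
head(res$completed)
```

A single completed data set of this kind treats the filled-in values as if they had been observed, so standard errors from later analyses will tend to be too small. That is the main reason multiple imputation keeps several data sets and pools the results.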
Search yourself!

1. Google on "Impute missing data in R"
2. See the ?RSiteSearch function
3. Go to Rseek.org and enter in keywords.
4. Download and install the sos package and use its search functionality.

R has multiple packages and functions with multiple approaches to missing data imputation. Choose what you need.

-- Bert

On Sat, Jul 21, 2012 at 4:58 AM, ya <xinxi813 at 163.com> wrote:
> [...]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
It is not clear what you actually want. Do you want to save imputed data sets for further analysis? That is pretty simple. What do you mean by combining the data sets? Are you confusing single imputation with multiple imputation?

In addition to the packages you mentioned, there are many others. See the Official Statistics & Survey Methodology Task View:

http://cran.r-project.org/web/views/OfficialStatistics.html

----------------------------------------------
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

> -----Original Message-----
> From: r-help-bounces at r-project.org [mailto:r-help-bounces at r-project.org] On Behalf Of ya
> Sent: Saturday, July 21, 2012 6:59 AM
> To: r-help
> Subject: [R] combined EM dataset for missing data?
>
> [...]
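
As a rough illustration of the two points raised in this reply, here is a sketch using the mice package mentioned in the original question. It assumes a recent mice version is installed (the D1() function is not available in old releases) and uses the nhanes2 example data that ships with mice, in which age is a factor with three categories. The variable choices and the file name are only for illustration. The sketch shows how to extract and save a single completed data set, and how to get one pooled test for a whole multi-category factor without collapsing the imputations into one data set.

```r
library(mice)  # assumes the mice package is installed

# Multiple imputation of the example data (5 imputed data sets)
imp <- mice(nhanes2, m = 5, seed = 123, printFlag = FALSE)

# Extract the first completed data set and save it like any data frame
completed1 <- complete(imp, 1)
out_file <- file.path(tempdir(), "completed1.csv")  # illustrative path
write.csv(completed1, out_file, row.names = FALSE)

# All imputed data sets stacked in long format (.imp and .id columns added)
stacked <- complete(imp, action = "long")

# Overall effect of the whole factor: fit the model with and without age
# in every imputed data set, then pool the comparison into a single test
fit_full    <- with(imp, lm(bmi ~ age + chl))
fit_reduced <- with(imp, lm(bmi ~ chl))
overall <- D1(fit_full, fit_reduced)
print(overall)
```

The pooled comparison gives one multi-parameter test for age as a whole, which is the kind of "main effect of the whole IV" asked about, while still reflecting the uncertainty across imputations. Stacking the imputations and analysing them as one large data set would overstate the sample size.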