Bullinger, Johannes
2025-Feb-20 09:19 UTC
[R] Discrepancy in imputed data results across different operating systems
Hi,

I am currently facing an issue with data imputation in R using the random forest method from the MICE package. Specifically, I have imputed data that was missing completely at random (MCAR) using the following R code:

    # IMPUTATION
    set.seed(500)
    # Perform multiple imputation
    imputed_data <- mice(JPBR, m = 5, method = 'rf', maxit = 50)

Initially, I performed the imputation on a Mac running macOS Big Sur version 11.7.10 with R version 4.2.3.

However, when I attempted to replicate the results using the same seed, data, and script on a Windows 10 Enterprise machine with R version 4.4.1, the imputed results differed from those obtained on the Mac.

Concerned by this discrepancy, I ran the same script on another Windows 10 Enterprise system using R version 4.2.3. The results were consistent with the previous Windows imputation but still differed from the Mac results.

Furthermore, I conducted the imputation on a different Mac running macOS Sonoma with R version 4.4.1. Interestingly, the results from this setup matched the original imputation outcome from the Mac, suggesting that the issue may not be related to the R version but rather to the operating system.

I am seeking insights or suggestions on why these discrepancies occur and how to ensure consistent imputation results across different operating systems. Any guidance or advice would be greatly appreciated.

Best,
Johannes

__________________________________
Johannes Bullinger, M.Sc.
Developmental Psychology Department
Ludwig-Maximilians-University of Munich
he/him; er/ihn
Mail: johannes.bullinger at lmu.de
Duncan Murdoch
2025-Feb-20 13:28 UTC
[R] Discrepancy in imputed data results across different operating systems
Normally I'd say you should post a reproducible example where you get a discrepancy, but that's already been done in an open bug report here: https://github.com/amices/mice/issues/688 .

Duncan Murdoch
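For anyone wanting to check whether they see the same cross-platform difference, the following is a minimal diagnostic sketch, not something posted in the thread and not a fix. It assumes the `nhanes` example data that ships with mice stands in for the poster's `JPBR` data, that a random forest backend for `method = "rf"` is installed, and that the output file names (`imputation_<OS>.rds`) are purely illustrative. It records the environment alongside the imputed values so that two machines can be compared directly.

```r
library(mice)

# Record the settings that could plausibly differ between machines.
env_info <- list(
  r_version    = R.version.string,
  os           = Sys.info()[["sysname"]],   # e.g. "Darwin" or "Windows"
  rng_kind     = RNGkind(),
  mice_version = as.character(packageVersion("mice"))
)

# nhanes is an example data set from mice; substitute your own data.
# maxit is kept small here only to make the check quick.
imp <- mice(nhanes, m = 5, method = "rf", maxit = 5,
            seed = 500, printFlag = FALSE)

# Stack all completed data sets into one long data frame.
completed <- complete(imp, action = "long")

# Hypothetical file name: one file per operating system.
out_file <- paste0("imputation_", env_info$os, ".rds")
saveRDS(list(env = env_info, data = completed), file = out_file)

# After copying both files to one machine, compare them.
files <- c("imputation_Darwin.rds", "imputation_Windows.rds")
if (all(file.exists(files))) {
  mac <- readRDS(files[1])
  win <- readRDS(files[2])
  str(mac$env)
  str(win$env)
  # TRUE if the imputed values agree; otherwise a description of differences.
  print(all.equal(mac$data, win$data))
}
```

If the environment lists agree apart from the operating system and `all.equal()` still reports differences, that output is a compact piece of evidence to attach to the bug report linked above.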