Hello,

I have been toying with the survey package's withReplicates function, which lets users easily extend the survey package to support any weighted statistic. A number of ML algorithms in various packages accept weights, and it is fairly easy to use them with withReplicates. Below is a naive example:

library(survey)
library(rpart)
library(gbm)
data(api)

# create survey object
dstrat <- svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
rstrat <- as.svrepdesign(dstrat)

# try rpart
predr <- as.data.frame(withReplicates(rstrat, function(w, data) {
  predict(rpart(api00~ell+meals+mobility, data=data, weights=w))
}))

# try gbm
predg <- as.data.frame(withReplicates(rstrat, function(w, data) {
  predict(gbm(api00~ell+meals+mobility, data=data, weights=w, n.trees=100))
}))

# try regular svyglm
preds <- as.data.frame(predict(svyglm(api00~ell+meals+mobility, rstrat)))

head(data.frame(predr, predg, preds))

With rpart, the standard errors are absurdly large and clearly incorrect. With gbm, the results seem reasonable.

I see in this extremely old post that you can't use quantile regression with withReplicates for some survey designs and expect to get reasonable results:

https://stat.ethz.ch/pipermail/r-help/2008-August/171620.html

Quantiles and survey statistics are messy business, so that issue may be unique to quantile regression. Based on that post, though, it would seem that both the function and the survey design need to have certain properties for withReplicates to generate valid SEs. This is not mentioned in the withReplicates documentation.

So my question is: what properties does an ML algorithm or survey design need for withReplicates to generate valid SEs?

Kind Regards,
Carl Ganz
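P.S. As a sanity check on the setup, here is a minimal sketch (not part of the example above) that runs withReplicates on a smooth statistic, the weighted mean of api00, and compares it with svymean on the same replicate design. The names rep_mean and svy_mean are only illustrative. If the two standard errors agree, that would suggest the replicate weights themselves are fine and that any trouble comes from how a particular statistic responds to them.

```r
library(survey)
data(api)

# Same stratified design as in the example, converted to replicate weights
dstrat <- svydesign(id=~1, strata=~stype, weights=~pw, data=apistrat, fpc=~fpc)
rstrat <- as.svrepdesign(dstrat)

# Weighted mean of api00, recomputed once per set of replicate weights
rep_mean <- withReplicates(rstrat, function(w, data) {
  weighted.mean(data$api00, w)
})

# Built-in estimator on the same replicate design, for comparison
svy_mean <- svymean(~api00, rstrat)

# Point estimates and standard errors side by side
print(c(withReplicates=as.numeric(coef(rep_mean)), svymean=as.numeric(coef(svy_mean))))
print(c(withReplicates=as.numeric(SE(rep_mean)), svymean=as.numeric(SE(svy_mean))))
```

The same wrapper pattern, function(w, data) returning a numeric vector, is what the rpart and gbm calls use, so swapping the body is the only change needed to test another estimator.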