Dear List,

I have developed two models I want to use to predict a response: one with a binary response and one with an ordinal response. My original plan was to divide the data into test (300 entries) and training (1000 entries) sets and check the power of the model by looking at the percentage of correct predictions. However, I have been told by a colleague that 1300 entries is far too few to partition the data set, and that I should instead use the whole data set, determine the power of the model with scores such as the c-index and Brier score, and use bootstrapping. I understand how to bootstrap in R, but I have never used it on predicted values.

My questions are:

1. Using the boot() command, how do I use this to test the power of my predictive model?

2. Is it possible to bootstrap the Brier score, or is this not necessary?

3. (This is a separate point I am struggling with; I thought I would include it here instead of posting again!) I have selected the most likely model by AIC from a set of candidate GLMM models. However, as the GLMM fit has no predict method, I have taken the best model, dropped the random effects, refitted it as a glm, and used the predict function from that fit. Is this OK?

Thanks
Sam
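[A minimal sketch of how boot() could be pointed at the Brier score, assuming a data frame dat with a 0/1 outcome y and predictors x1 and x2 (all placeholder names, not from the original post). The model is refit on each resample and scored on the full data; rms::validate() performs the full optimism correction for you, so this only illustrates the mechanics.]

  library(boot)

  brier_fun <- function(data, indices) {
    d <- data[indices, ]                                  # bootstrap resample
    m <- glm(y ~ x1 + x2, family = binomial, data = d)    # refit on the resample
    p <- predict(m, newdata = data, type = "response")    # score on the original data
    mean((p - data$y)^2)                                  # Brier score
  }

  set.seed(1)
  b <- boot(dat, brier_fun, R = 200)
  b                           # bootstrap distribution of the Brier score
  boot.ci(b, type = "perc")   # percentile interval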
Split-sample validation is highly unstable with your sample size. The rms package can help with bootstrapping or cross-validation, assuming you have all modeling steps repeated for each resample.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University
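[As an illustration of the rms workflow mentioned here (formula and variable names are hypothetical), a binary-response fit could be validated along these lines; for the ordinal response, lrm() with an ordered factor, or orm(), works the same way.]

  library(rms)

  f <- lrm(y ~ x1 + x2, data = dat, x = TRUE, y = TRUE)   # x, y needed by validate()/calibrate()
  validate(f, method = "boot", B = 200)    # optimism-corrected Dxy, Brier score (B), etc.
  calibrate(f, method = "boot", B = 200)   # bootstrap-corrected calibration curve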
Thanks for this. I had used

  validate(model0, method = "boot", B = 200)

to get an index.corrected Brier score. However, I also want to bootstrap the predicted probabilities output from predict(model1, type = "response") to get an idea of confidence, or am I best just using se.fit = TRUE and then calculating the 95% CI? Does what I want to do make sense?

Thanks
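[If the se.fit route is taken, a common approach is to get the standard errors on the link scale and back-transform, rather than working on the response scale directly. A sketch, assuming model1 is a binomial glm with the default logit link and newdat is the data to predict for (both placeholder names):]

  pr <- predict(model1, newdata = newdat, type = "link", se.fit = TRUE)

  est   <- plogis(pr$fit)                      # predicted probability
  lower <- plogis(pr$fit - 1.96 * pr$se.fit)   # approximate 95% CI, back-transformed
  upper <- plogis(pr$fit + 1.96 * pr$se.fit)

[Note that these intervals reflect parameter uncertainty in a single fitted model; bootstrapping the predictions by refitting on each resample would additionally capture the variability of the model-fitting process.]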
It all depends on the ultimate use of the results.

Frank

-----
Frank Harrell
Department of Biostatistics, Vanderbilt University