Noah Silverman
2009-Aug-19 08:04 UTC
[R] Performance measure for probabilistic predictions
Hello,

I'm using an SVM to fit a predictive model, but I'm most interested in the probability output. This is easy enough to calculate.

My challenge is how to measure the relative performance of the SVM for different settings/parameters/etc.

The area under the ROC curve (AUC) comes to mind, but I'm NOT interested in predicting true vs false. I am interested in finding the most accurate probability predictions possible.

I've seen some literature where the probability range is cut into segments and then the predicted probability in each segment is compared to the actual observed frequency. This looks nice, but I need a more tangible numeric measure. One thought was a measure of "probability accuracy" for each range, but I'm not sure how to calculate this.

Any thoughts?

-N
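For readers unfamiliar with the segment-based comparison described in the question, here is a minimal sketch of one way it could be computed. It is written in Python rather than R purely for illustration, and the function names, the equal-width bins, and the count-weighted summary are all assumptions, not something taken from the literature the poster mentions. The result depends on the choice of cut points.

```python
def binned_calibration(probs, outcomes, n_bins=10):
    """Group predictions into equal-width probability bins.

    Returns one row per non-empty bin:
    (bin_low, bin_high, count, mean_predicted, observed_rate).
    The bin count and equal-width layout are illustrative assumptions.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 falls in the last bin
        bins[idx].append((p, y))

    rows = []
    for i, members in enumerate(bins):
        if not members:
            continue  # skip empty bins rather than report undefined rates
        n = len(members)
        mean_pred = sum(p for p, _ in members) / n
        obs_rate = sum(y for _, y in members) / n
        rows.append((i / n_bins, (i + 1) / n_bins, n, mean_pred, obs_rate))
    return rows


def weighted_calibration_gap(rows):
    """Count-weighted mean of |mean predicted - observed rate| over bins.

    A hypothetical single-number summary of the table above.
    """
    total = sum(n for _, _, n, _, _ in rows)
    return sum(n * abs(mp - obs) for _, _, n, mp, obs in rows) / total
```

Calling `binned_calibration(probs, outcomes)` on held-out predictions gives the per-segment table, and `weighted_calibration_gap` reduces it to one number, where smaller means predicted and observed frequencies agree more closely within the chosen bins.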
Frank E Harrell Jr
2009-Aug-19 12:21 UTC
[R] Performance measure for probabilistic predictions
Noah Silverman wrote:
> Hello,
>
> I'm using an SVM for predicting a model, but I'm most interested in the
> probability output. This is easy enough to calculate.
>
> My challenge is how to measure the relative performance of the SVM for
> different settings/parameters/etc.
>
> An AUC curve comes to mind, but I'm NOT interested in predicting true vs
> false. I am interested in finding the most accurate probability
> predictions possible.
>
> I've seen some literature where the probability range is cut into
> segments and then the predicted probability is compared to the actual.
> This looks nice, but I need a more tangible numeric measure. One
> thought was a measure of "probability accuracy" for each range, but how
> to calculate this.
>
> Any thoughts?
>
> -N

Noah,

This is a big area but I'm glad you are interested in probability accuracy rather than the more frequently (mis)-used classification accuracy. There are many measures available. For independent test samples the val.prob function in the Design package provides many. When making a calibration plot to demonstrate absolute prediction accuracy, it is not a good idea to bin the predicted probabilities. val.prob uses loess to produce a smooth calibration curve.

Frank

--
Frank E Harrell Jr
Professor and Chair, Department of Biostatistics
School of Medicine, Vanderbilt University
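As a concrete illustration of a single-number measure of probability accuracy that needs no binning, here is a minimal sketch of the Brier score (one of the statistics val.prob reports) and the mean logarithmic score. It is written in Python rather than R purely for illustration; the function names and the clipping constant `eps` are assumptions, and this is not val.prob's implementation. For both measures, lower is better.

```python
import math


def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of the 0/1 outcomes under the predictions.

    Probabilities are clipped to [eps, 1 - eps] to avoid log(0); the value
    of eps is an arbitrary illustrative choice.
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


# Toy, made-up held-out predictions from two hypothetical SVM settings.
outcomes = [0, 0, 1, 1]
setting_a = [0.2, 0.4, 0.6, 0.9]
setting_b = [0.5, 0.5, 0.5, 0.5]
print(brier_score(setting_a, outcomes), brier_score(setting_b, outcomes))
print(log_loss(setting_a, outcomes), log_loss(setting_b, outcomes))
```

Computing either score on the same independent test sample for each candidate setting gives directly comparable numbers.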