Hi,

This is probably going to be one of those "it depends what you want" kinds of answers, but I'm very curious to see whether the group has an opinion or some general suggestions. The actual experiment is too complicated for a quick e-mail, but I'll summarize well enough (hopefully) to get the concepts across.

- Binary classification problem
- Using an SVM (e1071) to train a model
- Experimenting with different features, costs, etc.
- Training data and test data are completely separate data sets drawn from the same population.

The general concept was to train on a large set of data and then test on a medium-sized set of unseen data. We're looking for the best classification performance on future unlabeled data.

Here is the puzzle. Comparing two versions of the model:

A - Lower R2 (r-squared) score but higher percentage labeled correctly on the test data
B - Higher R2 score but lower percentage labeled correctly on the test data

We're using the val.prob function from the Design library to evaluate our models. Additionally, the graphs from val.prob are interesting:

A - Our "non-parametric" line mostly parallels the ideal line but sits just a bit above it.
B - Our "non-parametric" line mostly parallels the ideal line but sits just a bit below it.

If I understand things correctly, with model A the actual probability is slightly higher than our predicted probability (not a bad thing for our application - better to under-predict than to over-predict). One thought was that R2 measures the distance from the "ideal line": with model A we are a touch further from the ideal line, but in a better position than model B.

Does anybody have any insight?

Thanks,
-N
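
P.S. For reference, here is a minimal sketch of the setup described above. The data frame names train and test and the 0/1 outcome column y are placeholders, not our actual objects; costs, kernels, and features are what we're varying.

library(e1071)
library(Design)   # provides val.prob(); the newer rms package supersedes Design

## Train the SVM with probability estimates enabled so it can be calibrated later.
fit <- svm(factor(y) ~ ., data = train, cost = 1, probability = TRUE)

## Predicted labels and class probabilities on the held-out test set.
pred <- predict(fit, newdata = test, probability = TRUE)
p1   <- attr(pred, "probabilities")[, "1"]   # probability of the positive class

## Percentage labeled correctly (the accuracy figure compared above).
100 * mean(as.character(pred) == as.character(test$y))

## Calibration plot plus summary statistics (including the R2 index);
## the "non-parametric" curve is what we compare against the ideal line.
val.prob(p1, test$y)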