similar to: Performance measure for probabilistic predictions

2009 Sep 07
2
Confused - better empirical results with error in data
Hi, I have a strange one for the group. We have a system that predicts probabilities using a fairly standard SVM (e1071). We are looking at probabilities of a binary outcome. The input data is generated by a Perl script that calculates a bunch of things, fetches data from a database, etc. We train the system on 30,000 examples and then test the system on an unseen set of 5,000 records.
2009 Aug 25
1
Clogit or LRM?
Hello, I believe that I'm getting very close in my modeling application. I've come across a challenge that I am unable to solve and would really appreciate the group's opinion. I've been using the val.prob function from the Design library (Thanks Frank!!) to both evaluate and visualize my model. From the scores and graph, it appears that my model is very accurate in
2009 Aug 04
1
Build a dataframe row by row?
Hi, Time for another of my "newbie" questions. Is it possible to build up a data.frame "row by row" as I go? I'm going to be running a bunch of experiments (many in a loop) to test different things. I'm using AUC as my main performance measure. My thought was to add a row to a data.frame for each iteration and then have a nice summary report at the end. I found
2009 Aug 20
1
Calculating loess value
Hello, I'm attempting to evaluate the accuracy of the probability predictions for my model. As previously discussed here, the AUC is not a good measure as I'm not concerned with classification accuracy but probability accuracy. It was suggested to me that the loess function would be a good measure to look at. I can see some libraries (Design) will plot the loess function as a curve
2009 Aug 21
1
Question about validating predicted probabilities
Hello, Frank was nice enough to point me to the val.prob function of the Design library. It creates a beautiful graph that really helps me visualize how well my model is predicting probabilities. By default, there are two lines on the graph: 1) fitted logistic calibration curve 2) nonparametric fit using lowess. Right now, the nonparametric line doesn't look very good. The
2009 Aug 04
1
Save model and predictions from svm
Hello, I'm using the e1071 package for training an SVM. It seems to be working well. This question has two parts: 1) Once I've trained an SVM model, I want to USE it within R at a later date to predict various new data. I see the write.svm command, but don't know how to LOAD the model back in so that I can use it tomorrow. How can I do this? 2) I would like to add the
2008 Jun 12
1
About McNeil Hanley test for a portion of AUC!
Dear all, I am trying to compare the performances of several methods using the AUC0.1 and not the whole AUC (meaning I wanted to compare two AUCs whose x axis only goes to 0.1, not 1). I came to know about the McNeil Hanley test from Bernardo Rangel Tura and I referred to the original paper for the calculation of "r", which is an argument of the function cROC. I can only find the
2012 Jun 06
1
Data scientist // Berlin-based startup using probabilistic models in ecommerce
Fluidshopping is a Berlin-based startup working on a customer analytics tool for online retailers. Customer Lifetime Value (CLV) is the mythical 'magic number', the amount of money a particular customer will ever bring in. Knowing your CLV makes it trivial to: - optimize marketing spend for different inbound channels, - identify your highest value customers, - identify those in danger
2008 Jul 17
1
Comparing differences in AUC from 2 different models
Hi, I would like to compare differences in AUC from 2 different models, glm and gam for predicting presence / absence. I know that in theory the model with a higher AUC is better, but what I am interested in is if statistically the increase in AUC from the glm model to the gam model is significant. I also read quite extensive discussions on the list about ROC and AUC but I still didn't find
2009 Aug 30
1
SVM coefficients
Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the model and do see a "coefficients" item, but printing it returns a NULL result.
2008 Jan 05
1
AUC values from LRM and ROCR
Dear List, I am trying to assess the prediction accuracy of an ordinal model fit with LRM in the Design package. I used predict.lrm to predict on an independent dataset and am now attempting to assess the accuracy of these predictions. From what I have read, the AUC is good for this because it is threshold independent. I obtained the AUC for the fit model output from the c score (c =
2009 Aug 02
2
Strange column shifting with read.table
Hi, I am reading in a dataframe from a CSV file. It has 70 columns. I do not have any kind of unique "row id". rawdata <- read.table("r_work/train_data.csv", header=T, sep=",", na.strings=0) When training an svm, I keep getting an error. So, as an experiment, I wrote the data back out to a new file so that I could see what the svm function sees.
2011 Jan 07
2
Stepwise SVM Variable selection
I have a data set with about 30,000 training cases and 103 variables. I've trained an SVM (using the e1071 package) for a binary classifier {0,1}. The accuracy isn't great. I used a grid search over the C and G parameters with an RBF kernel to find the best settings. I remember that for least squares, R has a nice stepwise function that will try combining subsets of variables to find
2008 Dec 04
2
Logistic Regression: variable selection based on p value?
Hi, When I use logistic regression, each variable has a p value associated with it. Do I only include the variables that have a statistically significant p value (<0.05), or are there situations when I should include variables when their p values are high? I had heard that if a variable has a high p value but it's not the terminal variable, keep it; otherwise, take it out. Not sure if
2010 Apr 29
2
can not print probabilities in svm of e1071
> x <- train[,c( 2:18, 20:21, 24, 27:31)] > y <- train$out > > svm.pr <- svm(x, y, probability = TRUE, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10) > > pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)], decision.values = TRUE, probability = TRUE) > attr(pred, "decision.values")[1:4,]
2009 Sep 22
2
Pull Coefficients from MCMCpack models
Hi, I've been testing some models with the MCMCpack library. I can run the process and get a nice model "object". I can easily see the summary and even plot it. I can't seem to figure out how to: 1) Access the final coefficients in the model 2) Turn the coefficients into a model so I can then run predictions using them. A summary command will SHOW me the coefficients, but
2006 Nov 24
1
How to find AUC in SVM (kernlab package)
Dear all, I was wondering if someone can help me. I am learning SVM for classification in my research with the kernlab package. I want to assess classification performance using Area Under Curve (AUC). I know the ROCR package can do this job, but all the examples I found in the ROCR package already include predictions, for example, ROCR.hiv {ROCR}. My problem is how to produce predictions in SVM and to find
2011 Feb 21
3
ROC from R-SVM?
Hi, Does anyone know how I can show an ROC curve for R-SVM? I understand that in R-SVM we are not optimizing over the SVM cost parameter. Any example ROC code for R-SVM or guidance would be really useful. Thanks, Angel.
2010 Jun 01
4
Plot multiple columns
I'm running a long MCMC chain that is generating samples for 22 variables. I have each run of the chain as a row in a matrix. So: Chain[,1] is the column with all the samples for variable one. Chain[,2] is the column with all the samples for variable 2, etc. I'd like to fit all 22 on a single page to print a nice summary. It is OK if the graphs are small, I just need to show the
2007 Jan 24
2
Logistic regression model + precision/recall
Hi, I am using a logistic regression model named lrm (Design). Right now I am using Area Under Curve (AUC) for testing my model. But now I have to calculate precision/recall of the model on test cases. For lrm, precision and recall would be simply defined with the help of 2 terms below: True Positive (TP) - Number of test cases where class 1 is given probability >= 0.5. False Negative (FP) -