thr3ads.net - similar to: "Performance measure for probabilistic predictions"

Displaying 20 results from an estimated 10000 matches similar to: "Performance measure for probabilistic predictions"

Confused - better empirical results with error in data

2009 Sep 07

Hi, I have a strange one for the group. We have a system that predicts probabilities using a fairly standard SVM (e1071). We are looking at probabilities of a binary outcome. The input data is generated by a Perl script that calculates a bunch of things, fetches data from a database, etc. We train the system on 30,000 examples and then test the system on an unseen set of 5,000 records.

Clogit or LRM?

2009 Aug 25

Hello, I believe that I'm getting very close in my modeling application. I've come across a challenge that I am unable to solve and would really appreciate the group's opinion. I've been using the val.prob function from the Design library (Thanks Frank!!) to both evaluate and visualize my model. From the scores and graph, it appears that my model is very accurate in

Build a dataframe row by row?

2009 Aug 04

Hi, Time for another of my "newbie" questions. Is it possible to build up a data.frame "row by row" as I go? I'm going to be running a bunch of experiments (many in a loop) to test different things. I'm using AUC as my main performance measure. My thought was to add a row to a data.frame for each iteration and then have a nice summary report at the end. I found

Calculating loess value

2009 Aug 20

Hello, I'm attempting to evaluate the accuracy of the probability predictions for my model. As previously discussed here, the AUC is not a good measure, as I'm not concerned with classification accuracy but with probability accuracy. It was suggested to me that the loess function would be a good measure to look at. I can see some libraries (Design) will plot the loess function as a curve

Question about validating predicted probabilities

2009 Aug 21

Hello, Frank was nice enough to point me to the val.prob function of the Design library. It creates a beautiful graph that really helps me visualize how well my model is predicting probabilities. By default, there are two lines on the graph: 1) a fitted logistic calibration curve, 2) a nonparametric fit using lowess. Right now, the nonparametric line doesn't look very good. The

Save model and predictions from svm

2009 Aug 04

Hello, I'm using the e1071 package for training an SVM. It seems to be working well. This question has two parts: 1) Once I've trained an SVM model, I want to USE it within R at a later date to predict various new data. I see the write.svm command, but don't know how to LOAD the model back in so that I can use it tomorrow. How can I do this? 2) I would like to add the

About Mcneil Hanley test for a portion of AUC!

2008 Jun 12

Dear all, I am trying to compare the performances of several methods using the AUC0.1 and not the whole AUC (meaning I wanted to compare two AUCs whose x axis only goes to 0.1, not 1). I came to know about the Mcneil Hanley test from Bernardo Rangel Tura and I referred to the original paper for the calculation of "r", which is an argument of the function cROC. I can only find the

Data scientist // Berlin-based startup using probabilistic models in ecommerce

2012 Jun 06

Fluidshopping is a Berlin-based startup working on a customer analytics tool for online retailers. Customer Lifetime Value (CLV) is the mythical 'magic number', the amount of money a particular customer will ever bring in. Knowing your CLV makes it trivial to: - optimize marketing spend for different inbound channels, - identify your highest value customers, - identify those in danger

Comparing differences in AUC from 2 different models

2008 Jul 17

Hi, I would like to compare differences in AUC from 2 different models, glm and gam for predicting presence / absence. I know that in theory the model with a higher AUC is better, but what I am interested in is if statistically the increase in AUC from the glm model to the gam model is significant. I also read quite extensive discussions on the list about ROC and AUC but I still didn't find

SVM coefficients

2009 Aug 30

Hello, I'm using the svm function from the e1071 package. It works well and gives me nice results. I'm very curious to see the actual coefficients calculated for each input variable. (Other packages, like RapidMiner, show you this automatically.) I've tried looking at attributes for the model and do see a "coefficients" item, but printing it returns a NULL result.

AUC values from LRM and ROCR

2008 Jan 05

Dear List, I am trying to assess the prediction accuracy of an ordinal model fit with LRM in the Design package. I used predict.lrm to predict on an independent dataset and am now attempting to assess the accuracy of these predictions. From what I have read, the AUC is good for this because it is threshold independent. I obtained the AUC for the fit model output from the c score (c =

Strange column shifting with read.table

2009 Aug 02

Hi, I am reading in a dataframe from a CSV file. It has 70 columns. I do not have any kind of unique "row id". rawdata <- read.table("r_work/train_data.csv", header=T, sep=",", na.strings=0) When training an svm, I keep getting an error. So, as an experiment, I wrote the data back out to a new file so that I could see what the svm function sees.

Stepwise SVM Variable selection

2011 Jan 07

I have a data set with about 30,000 training cases and 103 variables. I've trained an SVM (using the e1071 package) for a binary classifier {0,1}. The accuracy isn't great. I used a grid search over the C and G parameters with an RBF kernel to find the best settings. I remember that for least squares, R has a nice stepwise function that will try combining subsets of variables to find

Logistic Regression: variable selection based on p value?

2008 Dec 04

Hi, When I use logistic regression, each variable has a p value associated with it. Do I only include the variables that have a statistically significant p value (<0.05), or are there situations when I should include variables when their p values are high? I had heard that if a variable has a high p value but it's not the terminal variable, keep it; otherwise, take it out. Not sure if

can not print probabilities in svm of e1071

2010 Apr 29

> x <- train[,c( 2:18, 20:21, 24, 27:31)]
> y <- train$out
>
> svm.pr <- svm(x, y, probability = TRUE, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10)
>
> pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)], decision.values = TRUE, probability = TRUE)
> attr(pred, "decision.values")[1:4,]

Pull Coefficients from MCMCpack models

2009 Sep 22

Hi, I've been testing some models with the MCMCpack library. I can run the process and get a nice model "object". I can easily see the summary and even plot it. I can't seem to figure out how to: 1) Access the final coefficients in the model 2) Turn the coefficients into a model so I can then run predictions using them. A summary command will SHOW me the coefficients, but

How to find AUC in SVM (kernlab package)

2006 Nov 24

Dear all, I was wondering if someone can help me. I am learning SVM for classification in my research with the kernlab package. I want to measure classification performance using Area Under Curve (AUC). I know the ROCR package can do this job, but all the examples I found in the ROCR package already include predictions, for example, ROCR.hiv {ROCR}. My problem is how to produce predictions from an SVM and to find

ROC from R-SVM?

2011 Feb 21

Hi, does anyone know how I can show an ROC curve for R-SVM? I understand that in R-SVM we are not optimizing over the SVM cost parameter. Any example ROC code for R-SVM or guidance would be really useful. Thanks, Angel.

Plot multiple columns

2010 Jun 01

I'm running a long MCMC chain that is generating samples for 22 variables. I have each run of the chain as a row in a matrix. So: Chain[,1] is the column with all the samples for variable one. Chain[,2] is the column with all the samples for variable 2, etc. I'd like to fit all 22 on a single page to print a nice summary. It is OK if the graphs are small, I just need to show the

Logistic regression model + precision/recall

2007 Jan 24

Hi, I am using the logistic regression model lrm (Design). Right now I am using Area Under Curve (AUC) for testing my model. But now I have to calculate precision/recall of the model on test cases. For lrm, precision and recall would be simply defined with the help of the 2 terms below: True Positive (TP) - Number of test cases where class 1 is given probability >= 0.5. False Negative (FP) -
