similar to: Analyzing Poor Performance Using naiveBayes()

Displaying 20 results from an estimated 5000 matches similar to: "Analyzing Poor Performance Using naiveBayes()"

2008 Jun 25
1
Extract naiveBayes details
Hey, I just like to know how to extract details from the naiveBayes model (package e1071). I mean, for each possible value the model defines how much it influences the outcome. I want to sort those probabilities and show the values with the highest impact. How could I do that? PS: I tried using []'s to get to the model's internals, however, all I get is a "list" not a
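One way to approach the question above, sketched under assumptions: a fitted e1071 `naiveBayes` object stores its per-predictor conditional probability tables in `$tables` and the class counts in `$apriori`. The toy data frame `d` and the helper name `long` are made up for illustration; only factor predictors are handled here (for numeric predictors the tables hold means and standard deviations instead of probabilities).

```r
library(e1071)  # provides naiveBayes()

# Hypothetical toy data: two factor predictors and a two-level class.
d <- data.frame(
  outlook = factor(c("sunny", "sunny", "rain", "rain",
                     "overcast", "overcast", "sunny", "rain")),
  windy   = factor(c("yes", "no", "yes", "no", "no", "yes", "no", "yes")),
  play    = factor(c("no", "yes", "no", "yes", "yes", "yes", "yes", "no"))
)

model <- naiveBayes(play ~ ., data = d)

# model$tables is a named list with one table per predictor:
# rows are class levels, columns are predictor levels, cells are P(value | class).
long <- do.call(rbind, lapply(names(model$tables), function(v) {
  tab <- as.data.frame(as.table(model$tables[[v]]))
  names(tab) <- c("class", "value", "prob")
  cbind(variable = v, tab)
}))

# Sort so the largest conditional probabilities come first.
long <- long[order(-long$prob), ]
print(long)
```

Indexing the model with `$tables` or `[["tables"]]` rather than single brackets returns the tables themselves instead of a one-element list.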
2012 Feb 09
2
AUC, C-index and p-value of Wilcoxon
Dear all, I am using the ROCR library to compute the AUC and also the Hmisc library to compute the C-index of a predictor and a group variable. The results of AUC and C-index are similar and give a value of about 0.57. The Wilcoxon p-value is <0.001! Why is the AUC so small while the p-value is highly significant? Is the AUC based on the Wilcoxon calculation? Many thanks, Lina
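A small base-R sketch of the relationship asked about above: the AUC equals the Wilcoxon (Mann-Whitney) statistic divided by the number of case/control pairs, so the AUC is an effect size while the p-value also depends on sample size. The simulated data, the group sizes, and the names `cases` and `controls` are assumptions chosen for illustration; the shift of 0.25 standard deviations corresponds to a population AUC of about 0.57.

```r
set.seed(1)

# Simulated scores: a small shift between groups, but many observations.
n        <- 2000
controls <- rnorm(n, mean = 0)
cases    <- rnorm(n, mean = 0.25)

# wilcox.test() reports W, the number of (case, control) pairs
# in which the case score exceeds the control score.
w <- wilcox.test(cases, controls)

# Dividing W by the number of pairs gives the AUC.
auc <- unname(w$statistic) / (length(cases) * length(controls))

# Direct check: proportion of pairs where the case outranks the control.
auc_direct <- mean(outer(cases, controls, ">"))

cat("AUC from W:      ", auc, "\n")
cat("AUC by counting: ", auc_direct, "\n")
cat("Wilcoxon p-value:", w$p.value, "\n")
```

With a few thousand observations per group, even a weak separation yields a very small p-value, so a modest AUC and a highly significant test are not contradictory.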
2012 Feb 07
2
predict.naiveBayes() bug in e1071 package
Hi, I'm currently using the R package e1071 to train naive Bayes classifiers and came across a bug: when the posterior probabilities of all classes are small, the results from the predict.naiveBayes function become NaN. This is an issue with the treatment of the log-transformed probabilities inside the predict.naiveBayes function. Here is an example to demonstrate the problem (you might need
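The kind of failure described above can be reproduced in base R without any package: exponentiating very negative log-probabilities underflows to zero, and normalising then gives 0/0. The sketch below shows the general log-sum-exp remedy; the values in `log_post` are invented, and this is not a claim about how e1071 implements or fixed its prediction code.

```r
# Hypothetical unnormalised log-posteriors for three classes.
log_post <- c(a = -1200, b = -1205, c = -1210)

# Naive normalisation: exp() underflows to 0 for every class, so 0 / 0 = NaN.
naive <- exp(log_post) / sum(exp(log_post))

# Stable normalisation: subtract the maximum before exponentiating.
# The largest term becomes exp(0) = 1, so the sum can no longer be zero.
shifted <- exp(log_post - max(log_post))
stable  <- shifted / sum(shifted)

print(naive)
print(stable)
```

Subtracting the maximum leaves the normalised probabilities unchanged mathematically, because the common factor cancels between numerator and denominator.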
2009 Jun 30
2
NaiveBayes fails with one input variable (caret and klarR packages)
Hello, We have a system which creates thousands of regression/classification models and in cases where we have only one input variable NaiveBayes throws an error. Maybe I am mistaken and I shouldn't expect to have a model with only one input variable. We use R version 2.6.0 (2007-10-03). We use caret (v4.1.19), but have tested similar code with klaR (v.0.5.8), because caret relies on
2012 Dec 19
2
pROC and ROCR give different values for AUC
Packages pROC and ROCR both calculate/approximate the Area Under (Receiver Operator) Curve. However the results are different. I am computing a new variable as a predictor for a label. The new variable is a (non-linear) function of a set of input values, and I'm checking how different parameter settings contribute to prediction. All my settings are predictive, but some are better. The AUC i
2007 Nov 01
1
RWeka and naiveBayes
Hi, I'm trying to use RWeka to use a NaiveBayes classifier (the Weka version). However, it crashes whenever there is an NA in the class Gender. Here is the code I have, with d2 as the data frame. The first call to NB doesn't make R crash but the second call does. NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayesSimple") d2[,64]<-d2$Gender=="M"
2012 Feb 10
2
naiveBayes: slow predict, weird results
I did this: nb <- naiveBayes(users, platform) pl <- predict(nb,users) nrow(users) ==> 314781 ncol(users) ==> 109 1. naiveBayes() was quite fast (~20 seconds), while predict() was slow (tens of minutes). why? 2. the predict results were completely off the mark (quite the opposite of the expected overfitting). suffice it to show the tables: pl: android blackberry ipad
2010 Nov 03
2
[klaR package] [NaiveBayes] warning message numerical 0 probability
Hi, I run R 2.10.1 under Ubuntu 10.04 LTS (Lucid Lynx) and klaR version 0.6-4. I compute a model over a two-class dataset (composed of 700 examples). To that aim, I use the function NaiveBayes provided in the package klaR. When I then use the prediction function, predict(my_model, new_data), I get the following warning: "In FUN(1:747[[747L]], ...) : Numerical 0 probability with
2010 Jun 30
1
help on naivebayes function in R
Hi, I have written a code in R for classifying microarray data using naive bayes, the code is given below: library(e1071) train<-read.table("Z:/Documents/train.txt",header=T); test<-read.table("Z:/Documents/test.txt",header=T); cl <- c(c(rep("ALL",10), rep("AML",10))); cl <- factor(cl) model <- NaiveBayes(train,cl);
2007 Aug 22
1
"subscript out of bounds" Error in predict.naivebayes
I'm trying to fit a naive Bayes model and predict on a new data set using the functions naivebayes and predict (package = e1071). R version 2.5.1 on a Linux machine My data set looks like this. "class" is the response and k1 - k3 are the independent variables. All of them are factors. The response has 52 levels and k1 - k3 have 2-6 levels. I have about 9,300 independent variables
2010 Oct 22
2
Random Forest AUC
Guys, I used Random Forest with a couple of data sets I had to predict for binary response. In all the cases, the AUC of the training set is coming to be 1. Is this always the case with random forests? Can someone please clarify this? I have given a simple example, first using logistic regression and then using random forests to explain the problem. AUC of the random forest is coming out to be
2006 Mar 20
1
How to compare areas under ROC curves calculated with ROC R package
I might be missing something but I thought that AUC was a measure for comparing ROC curves, so there is nothing else needed to "compare" them. The larger the AUC, the higher the correlation of the 2 variables compared. No other measures or calculations are needed. Jarek Tuszynski -----Original Message----- From: r-help-bounces at stat.math.ethz.ch [mailto:r-help-bounces at stat.math.ethz.ch] On
2009 Feb 19
1
Bug in predict function for naiveBayes?
Dear all, I tried a simple naive Bayes classification on an artificial dataset, but I have trouble getting the predict function to work with the type="class" specification. With type="raw", it works perfectly, but with type="class" I get the following error: Error in as.vector(x, mode) : invalid 'mode' argument Data : mixture.train is a training set with 100
2011 Jul 22
4
glmnet with binary logistic regression
Hi all, I am using the glmnet R package to run LASSO with binary logistic regression. I have over 290 samples with outcome data (0 for alive, 1 for dead) and over 230 predictor variables. I am currently using LASSO to reduce the number of predictor variables. I am using the cv.glmnet function to do 10-fold cross validation on a sequence of lambda values which I let glmnet determine. I then take
2007 Feb 15
1
Problem in summaryBy
The R script below gives values of 1 for all minimum values when I use a custom function in summaryBy. I get the correct values when I use FUN=min directly. Any help is much appreciated. The continuous information provided in this forum is fabulous as are the different R packages available. Rene # Simulated simplified data Subj <- rep(1:4, each=6) Analyte <-
2008 Jul 17
1
Comparing differences in AUC from 2 different models
Hi, I would like to compare differences in AUC from 2 different models, glm and gam for predicting presence / absence. I know that in theory the model with a higher AUC is better, but what I am interested in is if statistically the increase in AUC from the glm model to the gam model is significant. I also read quite extensive discussions on the list about ROC and AUC but I still didn't find
2010 Jan 22
2
Computing Confidence Intervals for AUC in ROCR Package
Dear R-philes, I am plotting ROC curves for several cross-validation runs of a classifier (using the function below). In addition to the average AUC, I am interested in obtaining a confidence interval for the average AUC. Is there a straightforward way to do this via the ROCR package? plot_roc_curve <- function(roc.dat, plt.title) { #print(str(vowel.ROC)) pred <-
2008 Jul 24
1
[Fwd: Re: Coefficients of Logistic Regression from bootstrap - how to get them?]
Thank you Frank and all for your advice. Here I attach the raw data from Pawinski's paper. I have obtained permission from the corresponding author to post it here for everyone. The only condition of use is that the authors retain ownership of the data, and any publication resulting from these data must be managed by them. The dataset is composed as follows: patient number / MMF dose in
2011 Aug 02
2
Help with aggregate syntax for a multi-column function please.
Dear R-experts: I am using a function called AUC whose arguments are data, time, id, and dv. data is the name of the dataframe, time is the independent variable column name, id is the subject id and dv is the dependent variable. The function computes area under the curve by trapezoidal rule, for each subject id. I would like to embed this in aggregate to further subset by each
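A base-R sketch of the per-subject computation described above. `aggregate()` applies its function to one column at a time, which makes a function that needs two columns (time and the dependent variable) awkward to plug in; splitting the data frame by subject avoids that. The helper `trap_auc`, the data frame `pk`, and its values are hypothetical stand-ins, not the poster's `AUC` function or data.

```r
# Trapezoidal-rule area under a curve for one subject.
trap_auc <- function(time, dv) {
  o <- order(time)            # make sure points are in time order
  t <- time[o]
  y <- dv[o]
  # Sum of interval widths times the mean height of each interval.
  sum(diff(t) * (head(y, -1) + tail(y, -1)) / 2)
}

# Hypothetical data: two subjects, four sampling times each.
pk <- data.frame(
  id   = rep(1:2, each = 4),
  time = rep(c(0, 1, 2, 4), times = 2),
  dv   = c(0, 10, 8, 4,
           0,  6, 6, 2)
)

# Split into one data frame per subject, then apply the two-column function.
auc_by_id <- sapply(split(pk, pk$id), function(g) trap_auc(g$time, g$dv))
print(auc_by_id)
```

To group by more than one variable, `split()` accepts a list of factors, for example `split(pk, list(pk$id, pk$period))` if a `period` column existed.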
2010 May 19
1
col allocation is not right
plot(svm.auc, col=2, main="ROC curves comparing classification performance\n of six machine learning models") legend(0.5, 0.6, c(ns, nb, nr, nt, nl,ne), 2:6, 9) # Draw a legend. plot(bo.auc, col=3, add=T) # add=TRUE draws on the existing chart plot(rf.auc, col=4, add=T) plot(tree.auc, col=5, add=T) plot(nn.auc, col=6, add=T) plot(en.auc, col=9,lty="dotted",lwd=3, add=T) Hi,