Displaying 20 results from an estimated 5000 matches similar to: "Analyzing Poor Performance Using naiveBayes()"
2008 Jun 25
1
Extract naiveBayes details
Hey,
I'd just like to know how to extract details from the naiveBayes model
(package e1071). I mean, for each possible value the model defines how much
it influences the outcome. I want to sort those probabilities and show the
values with the highest impact.
How could I do that?
PS: I tried using []'s to get to the model's internals, however, all I get
is a "list" not a
2012 Feb 09
2
AUC, C-index and p-value of Wilcoxon
Dear all,
I am using the ROCR library to compute the AUC and also the Hmisc library
to compute the C-index of a predictor and a group variable. The results of
AUC and C-index are similar and give a value of about 0.57. The Wilcoxon
p-value is <0.001! Why does the AUC show such a small value while the p-value
is highly significant? Is the AUC based on the Wilcoxon calculation?
Many thanks,
Lina
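A note on the question above: the AUC and the Wilcoxon rank-sum statistic W
are two views of the same quantity, AUC = W / (n1 * n2). The p-value therefore
tests whether the AUC differs from 0.5, not whether it is large, and with enough
observations a modest AUC can still be highly significant. Below is a minimal
base-R sketch, assuming simulated data. The sample sizes, the 0.25 shift, and
all variable names are made up for illustration and do not come from the
original post.

```r
# Hypothetical simulated data: a weak but real shift between two large groups.
set.seed(1)
n_pos <- 2000
n_neg <- 2000
score_pos <- rnorm(n_pos, mean = 0.25)  # predictor values in the "positive" group
score_neg <- rnorm(n_neg, mean = 0)     # predictor values in the "negative" group

# Wilcoxon rank-sum (Mann-Whitney) test; W counts the pairs where pos > neg.
wt <- wilcox.test(score_pos, score_neg)

# The AUC is W rescaled by the number of positive/negative pairs.
auc_from_w <- unname(wt$statistic) / (n_pos * n_neg)

# The same AUC computed directly from ranks, as a cross-check.
r <- rank(c(score_pos, score_neg))
auc_from_ranks <- (sum(r[seq_len(n_pos)]) - n_pos * (n_pos + 1) / 2) /
  (n_pos * n_neg)

print(auc_from_w)   # modest AUC, only a little above 0.5
print(wt$p.value)   # very small p-value because the groups are large
```

The test and the AUC answer different questions: the p-value reflects how sure
we are that the AUC is not 0.5, while the AUC itself measures how well the
predictor separates the groups.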
2012 Feb 07
2
predict.naiveBayes() bug in e1071 package
Hi,
I'm currently using the R package e1071 to train naive bayes
classifiers and came across a bug: When the posterior probabilities of
all classes are small, the results from the predict.naiveBayes function
become NaN. This is an issue with the treatment of the
log-transformed probabilities inside the predict.naiveBayes function.
Here is an example to demonstrate the problem (you might need
2009 Jun 30
2
NaiveBayes fails with one input variable (caret and klaR packages)
Hello,
We have a system which creates thousands of regression/classification models and in cases where we have only one input variable NaiveBayes throws an error. Maybe I am mistaken and I shouldn't expect to have a model with only one input variable.
We use R version 2.6.0 (2007-10-03). We use caret (v4.1.19), but have tested similar code with klaR (v.0.5.8), because caret relies on
2012 Dec 19
2
pROC and ROCR give different values for AUC
Packages pROC and ROCR both calculate/approximate the Area Under the (Receiver Operating Characteristic) Curve. However, the results are different.
I am computing a new variable as a predictor for a label. The new variable is a (non-linear) function of a set of input values, and I'm checking how different parameter settings contribute to prediction. All my settings are predictive, but some are better.
The AUC i
2007 Nov 01
1
RWeka and naiveBayes
Hi
I'm trying to use RWeka to use a NaiveBayes classifier (the Weka
version). However, it crashes whenever there is an NA in the class
Gender.
Here is the code I have, with d2 as the data frame.
The first call to NB doesn't make R crash but the second call does.
NB <- make_Weka_classifier("weka/classifiers/bayes/NaiveBayesSimple")
d2[,64]<-d2$Gender=="M"
2012 Feb 10
2
naiveBayes: slow predict, weird results
I did this:
nb <- naiveBayes(users, platform)
pl <- predict(nb,users)
nrow(users) ==> 314781
ncol(users) ==> 109
1. naiveBayes() was quite fast (~20 seconds), while predict() was slow
(tens of minutes). Why?
2. The predict results were completely off the mark (quite the opposite
of the expected overfitting). Suffice it to show the tables:
pl:
android blackberry ipad
2010 Nov 03
2
[klaR package] [NaiveBayes] warning message numerical 0 probability
Hi,
I run R 2.10.1 under Ubuntu 10.04 LTS (Lucid Lynx) with klaR version 0.6-4.
I compute a model over a two-class dataset (composed of 700 examples).
To that end, I use the function NaiveBayes provided in the package
klaR.
When I then call the prediction function, predict(my_model, new_data),
I get the following warning:
"In FUN(1:747[[747L]], ...) : Numerical 0 probability with
2010 Jun 30
1
help on naivebayes function in R
Hi,
I have written some code in R for classifying microarray data using naive
Bayes; the code is given below:
library(e1071)
train<-read.table("Z:/Documents/train.txt",header=T);
test<-read.table("Z:/Documents/test.txt",header=T);
cl <- c(c(rep("ALL",10), rep("AML",10)));
cl <- factor(cl)
model <- NaiveBayes(train,cl);
2007 Aug 22
1
"subscript out of bounds" Error in predict.naivebayes
I'm trying to fit a naive Bayes model and predict on a new data set using
the functions naiveBayes and predict (package = e1071).
R version 2.5.1 on a Linux machine
My data set looks like this. "class" is the response and k1 - k3 are the
independent variables. All of them are factors. The response has 52 levels
and k1 - k3 have 2-6 levels. I have about 9,300 independent variables
2010 Oct 22
2
Random Forest AUC
Guys,
I used Random Forest with a couple of data sets I had, to predict a binary
response. In all the cases, the AUC of the training set comes out to be 1.
Is this always the case with random forests? Can someone please clarify
this?
I have given a simple example, first using logistic regression and then
using random forests to explain the problem. AUC of the random forest is
coming out to be
2006 Mar 20
1
How to compare areas under ROC curves calculated with ROC R package
I might be missing something but I thought that AUC was a measure for
comparing ROC curves, so there is nothing else needed to "compare" them. The
larger the AUC, the higher the correlation between the 2 variables compared.
No other measures or calculations are needed.
Jarek Tuszynski
-----Original Message-----
From: r-help-bounces at stat.math.ethz.ch
[mailto:r-help-bounces at stat.math.ethz.ch] On
2009 Feb 19
1
Bug in predict function for naiveBayes?
Dear all,
I tried a simple naive Bayes classification on an artificial dataset, but I
have trouble getting the predict function to work with the type="class"
specification. With type="raw", it works perfectly, but with type="class" I
get the following error:
Error in as.vector(x, mode) : invalid 'mode' argument
Data : mixture.train is a training set with 100
2011 Jul 22
4
glmnet with binary logistic regression
Hi all,
I am using the glmnet R package to run LASSO with binary logistic
regression. I have over 290 samples with outcome data (0 for alive, 1 for
dead) and over 230 predictor variables. I am currently using LASSO to reduce
the number of predictor variables.
I am using the cv.glmnet function to do 10-fold cross validation on a
sequence of lambda values which I let glmnet determine. I then take
2007 Feb 15
1
Problem in summaryBy
The R script below gives values of 1 for all minimum values when I use a
custom function in summaryBy. I get the correct values when I use FUN=min
directly. Any help is much appreciated.
The continuous information provided in this forum is fabulous as are the
different R packages available.
Rene
# Simulated simplified data
Subj <- rep(1:4, each=6)
Analyte <-
2008 Jul 17
1
Comparing differences in AUC from 2 different models
Hi,
I would like to compare the difference in AUC between 2 models, glm and gam, for predicting presence/absence. I know that in theory the model with the higher AUC is better, but what I want to know is whether the increase in AUC from the glm model to the gam model is statistically significant. I have also read the quite extensive discussions on the list about ROC and AUC, but I still didn't find
2010 Jan 22
2
Computing Confidence Intervals for AUC in ROCR Package
Dear R-philes,
I am plotting ROC curves for several cross-validation runs of a
classifier (using the function below). In addition to the average
AUC, I am interested in obtaining a confidence interval for the
average AUC. Is there a straightforward way to do this via the ROCR
package?
plot_roc_curve <- function(roc.dat, plt.title) {
#print(str(vowel.ROC))
pred <-
2008 Jul 24
1
[Fwd: Re: Coefficients of Logistic Regression from bootstrap - how to get them?]
Thank you, Frank and all, for your advice.
Here I attach the raw data from the Pawinski's paper. I have obtained
permission from the corresponding Author to post it here for everyone.
The only condition of use is that the Authors retain ownership of the
data, and any publication resulting from these data must be managed by them.
The dataset is composed as follows: patient number / MMF dose in
2011 Aug 02
2
Help with aggregate syntax for a multi-column function please.
Dear R-experts:
I am using a function called AUC whose arguments are data, time, id, and
dv.
data is the name of the dataframe,
time is the independent variable column name,
id is the subject id and
dv is the dependent variable.
The function computes area under the curve by trapezoidal rule, for each
subject id.
I would like to embed this in aggregate to further subset by each
2010 May 19
1
col allocation is not right
plot(svm.auc, col=2, main="ROC curves comparing classification performance\n
of six machine learning models")
legend(0.5, 0.6, c(ns, nb, nr, nt, nl,ne), 2:6, 9) # Draw a legend.
plot(bo.auc, col=3, add=T) # add=TRUE draws on the existing chart
plot(rf.auc, col=4, add=T)
plot(tree.auc, col=5, add=T)
plot(nn.auc, col=6, add=T)
plot(en.auc, col=9,lty="dotted",lwd=3, add=T)
Hi,