Displaying 20 results from an estimated 20000 matches similar to: "Cost-sensitive classification"
2009 Jun 17
1
gbm for cost-sensitive binary classification?
I recently used gbm for a binary classification problem. As expected, it gives very good results, as measured by area under the ROC curve with 7-fold cross-validation. However, the application (malware detection) is cost-sensitive: a FP (classifying a clean sample as dirty) is much worse than a FN (missing a dirty sample). I would like to tune the gbm model so that it is biased toward a very low FP rate.
For this
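One common way to approach the goal described in this post, independent of gbm itself, is to leave the fitted model alone and move the decision threshold on its predicted scores. The sketch below is only an illustration under assumptions: the helper name `threshold_for_fpr`, the 1% target, and the simulated scores are all made up here and are not from the original thread. In practice the threshold would be chosen on held-out data rather than on the training set.

```r
# Assumption: scores are predicted probabilities of "dirty" and
# labels are coded 0 = clean, 1 = dirty.
threshold_for_fpr <- function(scores, labels, max_fpr = 0.01) {
  clean_scores <- scores[labels == 0]
  # type = 1 returns an observed score q such that at least (1 - max_fpr)
  # of the clean scores are <= q, so at most max_fpr of them exceed q.
  unname(quantile(clean_scores, probs = 1 - max_fpr, type = 1))
}

# Simulated scores, for illustration only
set.seed(1)
labels <- rep(c(0, 1), each = 1000)
scores <- plogis(c(rnorm(1000, -2), rnorm(1000, 2)))

thr <- threshold_for_fpr(scores, labels, max_fpr = 0.01)

flagged <- scores > thr               # classify as dirty only above thr
fpr <- mean(flagged[labels == 0])     # false-positive rate on clean samples
fnr <- mean(!flagged[labels == 1])    # false-negative rate on dirty samples
```

Lowering `max_fpr` pushes the threshold up, which trades a lower FP rate for a higher FN rate.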
2003 Feb 12
1
rpart v. lda classification.
I've been groping my way through a classification/discrimination
problem, from a consulting client. There are 26 observations, with 4
possible categories and 24 (!!!) potential predictor variables.
I tried using lda() on the first 7 predictor variables and got 24 of
the 26 observations correctly classified. (Training and testing both
on the complete data set --- just to get started.)
I
2005 Jan 25
0
Collapsing solution to the question discussed above: Re: multi-class classification using rpart
You could break your 3 class problem into several (2 or 3) 2 class problems,
and then use Andy's suggestion (see the CART book). There are several ways
to break the problem into 2 class problems, and several ways to combine the
resulting classifiers. Tom Dietterich, Jerry Friedman, Trevor Hastie and Rob
Tibshirani, among others, have articles on the question, in places like
Annals of
2005 Mar 18
1
How to show which variables include in plot of classification tree
Dear all
For my research, I am currently learning about classification.
I was trying some examples with classification tree packages, such as
tree and rpart. For instance,
the Pima.te dataset has 8 variables (including class = type):
library(rpart)
library(MASS)  # Pima.te ships with MASS, not datasets
pima.rpart <- rpart(type ~ npreg+glu+bp+skin+bmi+ped+age,data=Pima.te,
method='class')
plot(pima.rpart, uniform=TRUE)
text(pima.rpart)
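A minimal sketch of one way to list the variables a fitted rpart tree actually splits on, assuming that is what the question is after (the post is cut off here). It reads the `var` column of the fit's `frame` component, where terminal nodes are marked `<leaf>`. The name `used_vars` is made up for this example, and Pima.te is loaded from MASS.

```r
library(rpart)
library(MASS)  # provides Pima.te

pima.rpart <- rpart(type ~ npreg + glu + bp + skin + bmi + ped + age,
                    data = Pima.te, method = "class")

# frame$var holds the split variable of every node; leaves are "<leaf>"
node_vars <- as.character(pima.rpart$frame$var)
used_vars <- unique(node_vars[node_vars != "<leaf>"])
print(used_vars)
```

Predictors that appear in the formula but not in `used_vars` were not chosen for any split and so will not show up in the plot.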
2011 Nov 04
1
Decision tree model using rpart ( classification
Hi Experts,
I am new to R and am using a decision tree model to derive segmentation rules.
A) Using behavioural data (attributes defining customer behaviour, for example
balances, number of accounts, etc.)
1. Clustering: Cluster the behavioural data into a suitable number of clusters
2. Decision Tree: Use an rpart classification tree to generate rules for
segmentation, using the cluster number (cluster id) as the target
2004 Mar 13
4
nnet classification accuracy vs. other models
I was wondering if anybody ever tried to compare the classification
accuracy of nnet to other (rpart, tree, bagging) models. From what I
know, there is no reason to expect a significant difference in
classification accuracy between these models, yet in my particular case
I get about 10% error rate for tree, rpart and bagging model and 80%
error rate for nnet, applied to the same data.
Thanks.
2009 Aug 02
0
rpart: which is correct?
I am using rpart in classification mode and am confused about this
particular model's predictions.
> predict(fit, train[8,])
-1 1
8 0.5974089 0.4025911
> predict(fit, train[8,], type="class")
1
Levels: -1 1
So, it seems like there is a 60% chance of being class -1 according to
the "prob" output (which is the default for classification) but gives
2003 Apr 10
1
Classification problem - rpart
I am performing a binary classification using a classification tree.
Ironically, the data themselves are 2483 tree (real biological ones)
locations as described by a suite of environmental variables (slope, soil
moisture, radiation load, etc). I want to separate them from an equal number
of random points. Doing eda on the data shows that there is substantial
difference between the tree and random
2006 Aug 24
0
Classification tree with a random variable
Hi,
I am planning on using classification trees to build a predictive model for data which includes a random variable. I intend to use the R functions 'rpart' (and potentially also 'randomForest' and 'bagging').
I have a data set with 390 data points. The response variable is binary. There are a large number of variables (>20, both categorical and continuous). The
2005 Oct 14
1
Predicting classification error from rpart
Hi,
I think I'm missing something very obvious, but I am missing it, so I
would be very grateful for help. I'm using rpart to analyse data on
skull base morphology, essentially predicting sex from one or several
skull base measurements. The sex of the people whose skulls are being
studied is known, and lives as a factor (M,F) in the data. I want to
get back predictions of gender, and
2008 Mar 06
1
Rpart and bagging - how is it done?
Hi there.
I was wondering if somebody knows how to perform a bagging procedure on a
classification tree without running the classifier with weights.
Let me first explain why I need this and then give some details of what I
have found out so far.
I am thinking about implementing the bagging procedure in Matlab. Matlab
has a simple classification tree function (in their Statistics toolbox) but
2005 Jul 01
1
p-values for classification
Dear All,
I'm classifying some data with various methods (binary classification). I'm interpreting the results via a confusion matrix, from which I calculate the sensitivity and the FDR. The classifiers are trained on 575 data points and my test set has 50 data points.
I'd like to calculate p-values for obtaining <=FDR and >=sensitivity for each classifier. I was thinking about
2011 Oct 19
0
R classification
Hello, I am glad to be writing to you.
I am currently writing my M.Sc. thesis in Applied Statistics, titled "Data Mining Classifiers and Predictive Models Validation and Evaluation".
I am planning to compare several DM classifiers, such as NN, kNN, SVM, decision trees, and Naïve Bayes, according to their predictive accuracy, interpretability, scalability, running time, etc.
I have
2012 Aug 02
0
Changing the classification threshold for cost function
Dear All
I am trying to perform leave-one-out cross validation on a logistic
regression model using cv.glm from the boot package in R.
As I understand it, the standard cost function:
cost<-function(r,pi=0) mean(abs(r-pi)>0.5)
Uses a 50% risk threshold to classify cases as positive or negative and
calculates the prediction error based on this.
I would like to alter this threshold to,
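A hedged sketch of how a cost function with an adjustable threshold might look, following the argument names of the cost function quoted in this post (`r` is the observed 0/1 outcome, `pi` the predicted probability). The helper name `make_threshold_cost` and the toy vectors are made up for illustration. The target threshold in the original post is cut off, so 0.3 below is an arbitrary value.

```r
# Hypothetical helper: returns a cost function for a chosen threshold.
make_threshold_cost <- function(threshold = 0.5) {
  function(r, pi = 0) {
    predicted_positive <- pi > threshold
    mean(predicted_positive != (r == 1))  # misclassification rate
  }
}

cost_30 <- make_threshold_cost(0.3)

# Toy values, for illustration only
observed  <- c(1, 1, 0, 0)
predicted <- c(0.9, 0.35, 0.4, 0.1)
err <- cost_30(observed, predicted)
```

A function built this way has the two-argument form that `cv.glm` expects for its `cost` argument, so it could be passed as `cost = cost_30`.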
2011 Aug 08
1
Classification trees problem.
Hello Everyone,
I'm building a classification tree with categorical explanatory variables using the rpart library, and I would like to make predictions for some new data inputs. I don't know which function to use or how to do it. Can someone help? Here's the code that I'm using.
library(rpart)
fit <- rpart(Kyphosis ~ Age + Number + Start, data=kyphosis)
plot(fit)
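A minimal sketch of prediction for new inputs with the fit above, using `predict()` with a `newdata` data frame. The two rows in `new_cases` are invented values for illustration; the column names have to match the predictors in the formula.

```r
library(rpart)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis)

# Hypothetical new cases, for illustration only
new_cases <- data.frame(Age = c(12, 100), Number = c(3, 6), Start = c(14, 5))

pred_class <- predict(fit, newdata = new_cases, type = "class")  # predicted label
pred_prob  <- predict(fit, newdata = new_cases, type = "prob")   # class probabilities
```

`type = "class"` returns a factor of predicted labels, while `type = "prob"` returns a matrix with one column per class.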
2010 Dec 14
1
rpart - how to estimate the “meaningful” predictors for an outcome (in classification trees)
Hi dear R-help members,
When building a CART model (specifically classification tree) using rpart,
it is sometimes obvious that there are variables (X's) that are meaningful
for predicting some of the outcome (y) variables - while other predictors
are relevant for other outcome variables (y's only).
*How can it be estimated, which explanatory variable is "used" for which of
2006 Jul 18
1
Classification error rate increased by bagging - any ideas?
Hi,
I'm analysing some anthropometric data on fifty odd skull bases. We know the
gender of each skull, and we are trying to develop a predictor to identify the
sex of unknown skulls.
Rpart with cross-validation produces two models - one of which predicts gender
for Males well, and Females poorly, and the other does the opposite (Females
well, and Males poorly). In both cases the error
2007 Jan 29
3
comparing random forests and classification trees
Hi,
I have done an analysis using 'rpart' to construct a Classification Tree. I
am wanting to retain the output in tree form so that it is easily
interpretable. However, I am wanting to compare the 'accuracy' of the tree
to a Random Forest to estimate how much predictive ability is lost by using
one simple tree. My understanding is that the error automatically displayed
by the two
2002 Mar 13
0
rpart error with 0-frequency factor levels (with partial fix) (PR#1378)
(I'm sending to r-bugs because rpart is one of the recommended packages and
is always installed. I'm also sending it directly to Dr. Ripley, as the
maintainer.)
rpart working as a classifier does not work (produces no splits) when the
class indicator has no instances of one of the factor levels, as long as the
factor level is not the final level. I have at least a partial fix, which I
2011 Feb 11
4
About classification methods.
Dear R users,
I'm new to R and don't know much yet.
I want to classify some data (two classes, many features, and a very large data set) using R.
For this, I want to use at least a Support Vector Machine, a Bayes-theory-based classifier, Discriminant Analysis, and a regression-based method.
Which packages should I use, and can I compare the classifiers' results by their predictions?
Thank you.