Displaying 20 results from an estimated 6000 matches similar to: "Changing the classification threshold for cost function"
2007 Apr 27
0
Logistic Regression Question: Risk Threshold
Hi,
I am working on problem 2 of Chapter 8 in Data Analysis and Graphics Using R and don't know how to approach the second half of the question:
In the data set (an artificial one of 3121 patients, that is similar to a subset of the data analyzed in Stiell et al., 2001) head.injury, obtain a logistic regression model relating clinically.important.brain.injury to other variables. Patients
2011 Jul 14
0
Cost-sensitive classification
Hi, everybody!
I want to perform cost-sensitive classification using rpart as the base
classifier.
Is it possible?
Nissim
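
One possible approach, as a minimal sketch rather than a definitive answer: rpart accepts a loss matrix through parms = list(loss = ...) when method = "class". The example below uses the kyphosis data that ships with rpart; the 5:1 cost ratio is an assumption chosen purely for illustration.

```r
# Sketch only: cost-sensitive classification tree via rpart's loss matrix.
library(rpart)

# Rows = true class, columns = predicted class, in factor-level order
# ("absent", "present"); the diagonal must be zero.
# Assumed costs: missing a "present" case costs 5, a false alarm costs 1.
loss <- matrix(c(0, 1,
                 5, 0), nrow = 2, byrow = TRUE)

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class", parms = list(loss = loss))

# Confusion table on the training data (optimistic; use held-out data in practice).
pred <- predict(fit, kyphosis, type = "class")
confusion <- table(truth = kyphosis$Kyphosis, predicted = pred)
print(confusion)
```

Raising the off-diagonal entry for a given true class makes the tree more reluctant to misclassify that class, usually at the price of more errors on the other one.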
2012 Nov 19
2
Classification methods - which one?
Dear all,
I searched for some classification methods and I have no clue whether I picked the right ones.
My problem: I have a matrix with 17000 rows and 33 columns (genes and patients). The patients are grouped into 3 diseases.
Now I want to classify the patients, and of course I want to know which rows are more helpful for the classification than others.
I tried SVM and random forest. Do you think this
2009 Jun 17
1
gbm for cost-sensitive binary classification?
I recently used gbm for a binary classification problem. As expected, it gets very good results, based on area under the ROC curve with 7-fold cross-validation. However, the application (malware detection) is cost-sensitive: getting a FP (classifying a clean sample as a dirty one) is much worse than getting a FN (missing a dirty sample). I would like to tune the gbm model so that it is biased toward a very low FP rate.
For this
2004 Jun 11
1
ROC for threshold value, biometrics
Hello,
I am just a beginner with R 1.9.0.
I am trying to construct a predictive score for the development of liver
cancer in cirrhotic patients. So the dependent variable is binary (cancer
yes or no). The independent variables are biological data. The aim is to
find a cut-off value which (theoretically) separates the normal from the
pathological state for each biological variable.
How can I step in procedure to
2011 Oct 31
1
Question on estimating standard errors with noisy signals using the quantreg package
Dear all,
My question might be more of a statistics question than a question on R,
although it's on how to apply the 'quantreg' package. Please accept my
apologies if you believe I am strongly misusing this list.
To be very brief, the problem is that I have data on only a random draw, not
all of doctors' patients. I am interested in the, say, median number of
patients of
2010 Oct 18
0
Question about legend parameters
Hello!
The code below works - if you run it you'll see a stacked area chart
generated based on the data example.
I only have one understanding question about the legend location (the
very last snippet of code):
legend(par()$usr[2],
       mean(par()$usr[3:4]),
       rev(order.of.vars),
       xpd=T,
       bty="n",
       pch=15,
       col=all.colors[rev(order.of.colors)])
I see that par()$usr[2] = 14763.72
2010 Oct 04
1
adding a legend to the plot (but outside of it)
Hello!
My code below creates a data frame and a plot for it.
However, I can't figure out how to add a legend that is not ON the
plot itself, but outside of it (e.g., to the right of my graph or
below it). I tried something (I put the line par(xpd=T,
mar=par()$mar+c(0,0,0,4)) right before my plot command), but that
screwed up all my gridlines - they cover the whole graph and do not
coincide with
2010 Oct 04
1
reducing distances between tickmarks
Hello, everybody!
I have code below that creates a data set and then a stacked bar
chart based on that data set.
No need to look at it - just notice please that my horizontal axis is
a date variable (x=my.data$date).
I have a question about the last 2 lines of this code:
grid(nx=NULL,ny=NULL,col = "lightgray", lty = "dotted",lwd = par("lwd"))
axis(1, las = 2)
Could
2010 Sep 27
1
stacked area chart
Dear R-ers!
Asking for your help with building the stacked area chart for the
following simple data (several variables - with date on the X axis):
### Creating a data set
my.data<-data.frame(date=c(20080301,20080402,20080503,20090301,20090402,20090503,20100301,20100402,20100503),
x=c(1.1,1.0,1.6,1,2,1.5,2.1,1.3,1.9),y=c(-4,-3,-6,-5,-7,-5.2,-6,-4,-4.9),
2011 Sep 27
0
Workflow for binary classification problem using svm via e1071 package
Dear R-list!
I am using the e1071 package in R to solve a binary classification problem
in a dataset of around 180 predictor variables (blood metabolites) of two
groups of subjects (patients and healthy controls). I am confused regarding
the correct way to assess the classification accuracy of the trained svm.
(A) The svm command allows one, via the 'cross=k' parameter, to
specify a
2010 Oct 05
2
is there a way to avoid "traveling" grid?
Hello!
If you run the whole code below, it'll produce a stacked diagram. And
it looks good - because the tick-marks are aligned with the grid.
However, if I stretch the graph window, grid becomes misaligned with
the tickmarks. Or, rather, it seems aligned for the first and the last
tick mark, but not for tickmarks in between.
Can it be addressed?
Thank you!
Dimitri
### Creating a data set
2013 Jun 28
0
[LLVMdev] [LNT] Question about results reliability in LNT infrustructure
On 28 June 2013 19:45, Chris Matthews <chris.matthews at apple.com> wrote:
> Given this tradeoff I think we want to tend towards false positives (over
> false negatives) strictly as a matter of compiler quality.
>
False hits are not binary, but (at least) two-dimensional. You can't say
it's better to have any amount of false positives than any amount of false
negatives
2013 Jan 18
1
scaling of nonbinROC penalties
Dear R Helpers
I am having difficulty understanding how to use the penalty matrix for the nomROC function in package 'nonbinROC'.
The documentation says that the values of the penalty matrix code the
penalty function L[i,j] in which 0 <= L[i,j] <= 1 for
j > i. It gives an example that if we have an ordered response with 4 categories, then we might wish to penalise larger
2009 May 10
1
Function recommendation for this study...
Hi,
I'm not used to thinking along these lines, and wanted to ask your advice:
Suppose you have a sample of around 100, consisting of patients according to
doctors, in which patients and doctors are given a questionnaire with
categorical responses. Each patient somehow has roughly 3 doctors, or 3
rows of data. The goal is to assess by category of each question or DV the
agreement between
2013 Jun 28
2
[LLVMdev] [LNT] Question about results reliability in LNT infrustructure
I should describe the cost of false negatives and false positives, since I think it matters for how this problem is approached. False negatives mean we miss a real regression --- we don’t want that. False positives mean somebody has to spend some time looking at and reproducing the regression when there is not one --- bad too. Given this tradeoff I think we want to tend towards false positives
2017 May 11
4
Using queue priorities to add agents
Hi,
I have a scenario that I am failing to implement using the Queue app, but
which I had thought would be commonplace...
1) (this bit works fine) I want a queue caller to have access to the basic
set of agents initially, with an overflow to additional agents if they are
busy - This is done using penalty:
queues.conf:
member => SIP/dev1,0,Agent1
member => SIP/dev2,0,Agent2
member =>
2012 Feb 01
1
Function to compute multi-response, multi-rater kappa?
I'm looking for a function in R that extends kappa to multiple raters when
there is more than one response per subject. For example, say a group of
doctors have to assign diseases to patients. Each patient will be assigned
one to many diseases, and the number of doctors assigning diseases to any
one patient will be two to many.
Here's an extremely simple example of the type of data I
2005 Jun 20
1
(no subject)
R friends,
I am using R 2.1.0 on Win XP. I have a problem working with lists; probably I
do not understand how to use them.
Let's suppose that a set of patients visit a clinic once a year for 4 years;
on each visit a test, say 'eib', is performed with results 0 or 1.
The patients do not all visit the clinic all 4 times; they missed a lot
of visits.
The test is considered positive if it
2006 Apr 03
0
Weighted Sensitivity, PPV etc.
All,
Appreciate any leads on the following:
In a recent blind-validation study of a depression screening instrument
we used a two-stage sampling design.
In stage 1, we used a broad paper-and-pencil screen to identify likely
positives (say 30% of entire sample). In stage 2 we conducted in-depth
interviews with the 30% of likely positives plus another 20% of the
negatives as controls.