Displaying 20 results from an estimated 1000 matches similar to: "can not print probabilities in svm of e1071"
2010 May 05
2
probabilities in svm output in e1071 package
svm.fit<-svm(as.factor(out) ~ ., data=all_h, method="C-classification",
kernel="radial", cost=bestc, gamma=bestg, cross=10) # model fitting
svm.pred<-predict(svm.fit, hh, decision.values = TRUE, probability = TRUE)
# tried to find the probabilities, but cannot find them.
attr(svm.pred, "probabilities")
> attr(svm.pred, "probabilities")
1 0
1 0 0
2 0
2010 Jun 24
1
help in SVM
Hi guys,
I used the following code to run an SVM and get predictions on the new data set hh.
dim(all_h)
[1] 2034 24
dim(hh) # it contains all the variables besides the variables in the all_h data set.
[1] 640 415
require(e1071)
svm.tune<-tune(svm, as.factor(out) ~ ., data=all_h,
ranges=list(gamma=2^(-5:5), cost=2^(-5:5)))# find the best parameters.
bestg<-svm.tune$best.parameters[[1]]
2010 Jul 01
5
ROC curve in R
Hi,
I have a fairly large amount of genomic data. I have created a data frame
which has "Reference" as one column and "Variation" as another. I want to
plot a ROC curve based on these 2 columns. I have searched the R manual but
I could not understand it. Can anybody help me with the R code for plotting a ROC
curve?
Thanks
ashu6886
2011 Apr 09
3
In svm(), how to connect quantitative prediction result to categorical result?
Hi,
I am learning to use the SVM functions of the e1071 package for prediction. I found that when the response in the training data is of type "factor", svm.predict() predicts categories directly; but if the response variable is "numerical", the predicted values from svm are continuous numbers. How can I map these continuous numbers to categories? (for
2010 Apr 26
3
R.GBM package
Hi, dear Greg,
I am new to the gbm package. Can boosted decision trees be implemented in the
'gbm' package? Or can 'gbm' only be used for regression?
If so, do I need to combine the rpart and gbm commands?
Thanks so much!
--
Sincerely,
Changbin
2007 Dec 06
1
R on a multi core unix box
Hi,
I installed the snow package on a Unix box that has multiple cores. To be
able to exploit the multiple cores (on one PC), do I still need to install
the Rmpi package (or rpvm)? Another question: if I run a Bayesian simulation
on the multiple cores after setting them up correctly (using snow), do you
think there will be a noticeable speedup?
Thanks,
Saeed
---
linux centos
4 dual core
2010 Aug 26
1
Importance of levels in a factor variable
I have a dataset of multiple variables and a response. For example,
> str(x)
'data.frame': 3557238 obs. of 44 variables:
$ response : Factor w/ 2 levels
$ var2: Factor w/5000 levels
If var2, for example, is a factor with 5000 levels, what is the best
approach to determine which of these levels are the most important to
include in building the model, and which ones to discard?
2008 Jan 08
1
Invoking R on BSD
Thanks to Saeed Abu Nimeh. I used pkg_add to install R package on 4.4BSD.
My directory now has the following:
BUILDDIR Makefrag.cc_lo config.log m4 tests
Makeconf Makefrag.cxx config.status po tools
Makefile R-2.6.1 doc roots
Makefile.bak R-2.6.1.tar.gz etc share
Makefrag.cc SVN-REVISION libtool
2010 Apr 21
2
?rpart
Hi, dear R community,
Last Friday I used this code and it worked, but today it does not run:
> fit.dimer <- rpart(outcome ~., method="class", data=p.df)
Error in `[.data.frame`(frame, predictors) : undefined columns selected
Does anyone have comments or suggestions? Thanks in advance!
--
Sincerely,
Changbin
2011 Sep 01
3
how to split a data frame by two variables
Hi, dear R community,
I want to split a data frame by using two variables: let and g
> x = data.frame(num =
c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45), let =
letters[1:5], g = 1:2)
> x
num let g
1 10 a 1
2 11 b 2
3 12 c 1
4 43 d 2
5 23 e 1
6 14 a 2
7 52 b 1
8 52 c 2
9 12 d 1
10 23 e 2
11 21 a 1
12 23 b 2
13 32 c 1
14
2009 Mar 27
1
ROCR package finding maximum accuracy and optimal cutoff point
If we use the ROCR package to find the accuracy of a classifier
pred <- prediction(svm.pred, testset[,2])
perf.acc <- performance(pred,"acc")
Do we find the maximum accuracy as follows (is there a simpler way?):
> max(perf.acc@x.values[[1]])
Then to find the cutoff point that maximizes the accuracy, do we do the
following (is there a simpler way?):
> cutoff.list <-
2010 Nov 04
4
how to work with long vectors
Hi, dear R community,
I have a data set like the one below. What I want to do is calculate the
cumulative coverage. The following code works for a small data set (#rows =
100), but when I feed it the whole data set, it is still running after 24 hours.
Can someone give some suggestions for working with long vectors?
id reads
Contig79:1 4
Contig79:2 8
Contig79:3 13
Contig79:4 14
Contig79:5 17
2010 Apr 06
2
help output figures in R
somfunc<- function (file) {
aa_som<-scale(file)
final.som<-som(data=aa_som, rlen=10000, grid=somgrid(5,4, "hexagonal"))
pdf(file="/home/cdu/changbin/file.pdf") #output graphic file.
plot(final.som, main="Unsupervised SOM")
dev.off()
}
I have many different files. If I want each output PDF file to have the same
name as the dataset I feed to the function
2010 May 18
2
get the row sums
> head(en.id.pr)
valid.gene_id b.pred rf.pred svm.pred
1521 2500151211 0 0 0
366 639679745 0 0 0
1965 2502081603 1 1 1
1420 644148030 1 1 1
1565 2500626489 1 1 1
1816 2501711016 1 1 1
> p.pred <- data.frame(en.id.pr, sum=apply(en.id.pr[,2:4], 1, sum)) #
2010 Apr 23
1
help in conditional histogram
Dear Dr. Sarkar,
When I tried to run the code, I found the following problem:
> h<- sample(1:14, 319, rep=T)
> c<- sample(1:14, 608, rep=T)
> n<- sample(1:14, 1140, rep=T)
> vt<-c(h, c, n)
> ta<-rep(c("h", "c", "n"), c(319, 608, 1140))
>
> to<-data.frame(vt,ta)
> library(lattice)
Attaching package: 'lattice'
2011 Sep 22
2
create variables through a loop
Hi, dear R community,
I am trying to create new variables and put them into a data frame through a
loop.
My original data set:
head(first)
probe_name chr_id position array1
1 C-7SARK 1 849467 10
2 C-4WYLN 1 854278 10
3 C-3BFNY 1 854471 10
4 C-7ONNE 1 874460 10
5 C-6HYCN 1 874571 10
6 C-7SCGC 1 874609 10
I have
2010 May 05
3
sort the data set by one variable
> #sort the data by predicted probability
> b.order<-bo.id.pred[(order(-predict)),]
> b.order[1:20,]
gene_id predict
43 637882902 0.07823997
53 638101634 0.66256490
61 639084581 0.08587504
41 637832824 0.02461066
25 637261662 0.11613879
22 637240022 0.06350477
62 639084582 0.02238538
63 639097718 0.06792841
44 637943079 0.04532625
80 640158389 0.06582658
3 637006517 0.57648451
2010 Oct 25
1
help with adding lines to current plot
Hi, dear R community,
I am using the following code to plot. The lines code runs, but
the line was not drawn on the previous plot and did not show up.
How come?
# specify the data for missense simulation
x <- seq(0,10, by=1)
y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0.932, 0.924, 0.918,
0.920) # missense
z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107,
2010 May 25
4
R eat my data
Hi, dear R community,
My original file has 1932 lines, but when I read it into R, it has only 1068
lines. How come?
cdu@nuuk:~/operon$ wc -l id_name_gh5.txt
1932 id_name_gh5.txt
> gene_name<-read.table("/home/cdu/operon/id_name_gh5.txt", sep="\t",
skip=0, header=F, fill=T)
> dim(gene_name)
[1] 1068 3
--
Sincerely,
Changbin
--
Changbin Du
DOE Joint Genome
2010 Jun 15
1
output from the gbm package
Hi, dear Greg and R community,
I have one question about the output of the gbm package. The output of boosting
should be f(x). From it, how do I calculate the probability for each
observation in the data set?
Since it is stochastic, how can one guarantee that each observation in the training
data is selected at least once? If some observations are not selected, how is the
training error calculated?
Thanks!
--