Displaying 20 results from an estimated 600 matches similar to: "strange data set output"
2010 Apr 06
2
help output figures in R
somfunc<- function (file) {
aa_som<-scale(file)
final.som<-som(data=aa_som, rlen=10000, grid=somgrid(5,4, "hexagonal"))
pdf(file="/home/cdu/changbin/file.pdf") #output graphic file.
plot(final.som, main="Unsupervised SOM")
dev.off()
}
I have many different files. How can I write a PDF file named after each
dataset I feed to the function?
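One possible approach, sketched under the assumption that the function is given the path of each dataset rather than the data itself, is to derive the PDF name from the input file name. The names `pdf_name_for`, `plot_to_pdf`, and `out_dir` are hypothetical, and the `plot()` call is only a placeholder for the SOM plot.

```r
# Hypothetical helper: build "<dataset name>.pdf" from the input path.
pdf_name_for <- function(path, out_dir = tempdir()) {
  stem <- tools::file_path_sans_ext(basename(path))  # "set1.txt" -> "set1"
  file.path(out_dir, paste0(stem, ".pdf"))
}

# Hypothetical wrapper: open a PDF named after the dataset, plot, close.
plot_to_pdf <- function(path, out_dir = tempdir()) {
  out <- pdf_name_for(path, out_dir)
  pdf(file = out)
  on.exit(dev.off())                  # close the device even if plotting fails
  plot(1:10, main = basename(path))   # placeholder for plot(final.som, ...)
  invisible(out)
}
```

Calling `plot_to_pdf()` once per input file, for example inside `lapply()` over a vector of paths, would then give one PDF per dataset.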
2010 May 05
3
sort the data set by one variable
> #sort the data by predicted probability
> b.order<-bo.id.pred[(order(-predict)),]
> b.order[1:20,]
gene_id predict
43 637882902 0.07823997
53 638101634 0.66256490
61 639084581 0.08587504
41 637832824 0.02461066
25 637261662 0.11613879
22 637240022 0.06350477
62 639084582 0.02238538
63 639097718 0.06792841
44 637943079 0.04532625
80 640158389 0.06582658
3 637006517 0.57648451
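The rows above are not in decreasing order of `predict`, which suggests that `order()` was given some other object called `predict` rather than the column of `bo.id.pred`. A minimal sketch of the usual fix, using made-up scores rather than the poster's data, refers to the column explicitly:

```r
# Made-up scores for illustration only (not the poster's data).
bo.id.pred <- data.frame(gene_id = c("g1", "g2", "g3", "g4"),
                         predict = c(0.08, 0.66, 0.02, 0.58))

# Sort rows by the predict column, largest first.
b.order <- bo.id.pred[order(-bo.id.pred$predict), ]
# Equivalent: bo.id.pred[order(bo.id.pred$predict, decreasing = TRUE), ]
head(b.order, 20)
```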
2010 May 18
2
get the row sums
> head(en.id.pr)
valid.gene_id b.pred rf.pred svm.pred
1521 2500151211 0 0 0
366 639679745 0 0 0
1965 2502081603 1 1 1
1420 644148030 1 1 1
1565 2500626489 1 1 1
1816 2501711016 1 1 1
> p.pred <- data.frame(en.id.pr, sum=apply(en.id.pr[,2:4], 1, sum)) #
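As a small sketch, the same row totals can be computed with `rowSums()`, which is vectorised and usually faster than `apply(..., 1, sum)`. The data frame below is rebuilt from the six rows printed above; the original row names are not reproduced.

```r
# The six rows shown by head(en.id.pr), rebuilt by hand.
en.id.pr <- data.frame(
  valid.gene_id = c(2500151211, 639679745, 2502081603,
                    644148030, 2500626489, 2501711016),
  b.pred   = c(0, 0, 1, 1, 1, 1),
  rf.pred  = c(0, 0, 1, 1, 1, 1),
  svm.pred = c(0, 0, 1, 1, 1, 1)
)

# Sum the three prediction columns for each row.
p.pred <- data.frame(en.id.pr, sum = rowSums(en.id.pr[, 2:4]))
p.pred
```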
2010 Apr 26
3
R.GBM package
Hi, dear Greg,
I am new to the gbm package. Can boosted decision trees be implemented in the
'gbm' package, or can 'gbm' only be used for regression?
If so, do I need to combine the rpart and gbm commands?
Thanks so much!
--
Sincerely,
Changbin
--
2010 May 05
2
probabilities in svm output in e1071 package
svm.fit<-svm(as.factor(out) ~ ., data=all_h, method="C-classification",
kernel="radial", cost=bestc, gamma=bestg, cross=10) # model fitting
svm.pred<-predict(svm.fit, hh, decision.values = TRUE, probability = TRUE)
# Trying to find the probabilities, but cannot find them.
attr(svm.pred, "probabilities")
> attr(svm.pred, "probabilities")
1 0
1 0 0
2 0
2010 Apr 29
2
can not print probabilities in svm of e1071
> x <- train[,c( 2:18, 20:21, 24, 27:31)]
> y <- train$out
>
> svm.pr <- svm(x, y, probability = TRUE, method="C-classification",
kernel="radial", cost=bestc, gamma=bestg, cross=10)
>
> pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)],
decision.values = TRUE, probability = TRUE)
> attr(pred, "decision.values")[1:4,]
2010 Nov 04
4
how to work with long vectors
Hi, dear R community,
I have a data set like the one below, and I want to calculate the
cumulative coverage. My code works for a small data set (100 rows),
but when I feed it the whole data set, it is still running after 24 hours.
Can someone give some suggestions for long vectors?
id reads
Contig79:1 4
Contig79:2 8
Contig79:3 13
Contig79:4 14
Contig79:5 17
2010 Oct 25
1
help with adding lines to current plot
Hi, dear R community,
I am using the following code to plot. The lines() call runs without error, but
the line is not drawn on the existing plot and does not show up.
How come?
# specify the data for missense simulation
x <- seq(0,10, by=1)
y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0.932, 0.924, 0.918,
0.920) # missense
z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107,
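One common cause, offered here only as an assumption since the plotting calls are not visible, is the y-axis range: `y` lies between about 0.92 and 0.95 while `z` starts near 0.07, so a line for `z` would fall outside the axis limits that `plot(x, y)` picks. A sketch that sets `ylim` to cover both series follows; only the six `z` values visible above are used, paired with the first six `x` values.

```r
x <- seq(0, 10, by = 1)
y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0.932, 0.924, 0.918,
       0.920)
z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107)  # only the values visible above

pdf(NULL)  # draw without writing a file; drop this line for interactive use
ylim <- range(c(y, z))                   # axis range that covers both series
plot(x, y, type = "l", ylim = ylim, ylab = "value")
lines(x[seq_along(z)], z, col = 2)       # now inside the plotting region
dev.off()
```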
2010 Apr 15
2
r-loop
HI, Dear community,
I am building the following loop,
ww<-function(file) {
lossw<-vector()
for (x in seq(0.1, 0.9, by=0.1)) {
cat('xweight ', x, '\n')
lossw[i] <- cross.validation(file, x)$avg
}
return(lossw) }
My question is: how do I index lossw[i]?
for (i in 1:9)
for (x in seq(0.1, 0.9, by=0.1))
Thanks so much!
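A minimal sketch of one way to index the result: loop over positions with `seq_along()` and look up the weight at each position. The `cross.validation()` defined here is a hypothetical stand-in that only returns a list with an `avg` element so that the loop can run.

```r
# Hypothetical stand-in for the poster's cross.validation().
cross.validation <- function(file, x) list(avg = (x - 0.5)^2)

ww <- function(file) {
  weights <- seq(0.1, 0.9, by = 0.1)
  lossw <- numeric(length(weights))      # preallocate the result vector
  for (i in seq_along(weights)) {        # i is the position, weights[i] the value
    cat("xweight ", weights[i], "\n")
    lossw[i] <- cross.validation(file, weights[i])$avg
  }
  lossw
}

res <- ww(NULL)
```

With this pattern a single loop is enough; there is no need for separate loops over `i` and `x`.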
2010 May 25
4
R eat my data
Hi, dear R community,
My original file has 1932 lines, but when I read it into R it has only 1068
lines. How come?
cdu@nuuk:~/operon$ wc -l id_name_gh5.txt
1932 id_name_gh5.txt
> gene_name<-read.table("/home/cdu/operon/id_name_gh5.txt", sep="\t",
skip=0, header=F, fill=T)
> dim(gene_name)
[1] 1068 3
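A frequent reason for missing rows with `read.table()` is that quote characters (such as an apostrophe in a gene name) or `#` characters in the data are interpreted as quoting or comments. Whether that applies to this file is an assumption. The sketch below uses a small made-up file and turns both features off:

```r
# Small made-up file: one field has an apostrophe, another a '#'.
path <- tempfile(fileext = ".txt")
writeLines(c("1\tgeneA\talpha",
             "2\t5'-nucleotidase\tbeta",
             "3\tgene#3\tgamma",
             "4\tgeneD\tdelta"), path)

# Disable quote and comment handling so every line becomes one row.
gene_name <- read.table(path, sep = "\t", header = FALSE, fill = TRUE,
                        quote = "", comment.char = "",
                        stringsAsFactors = FALSE)
dim(gene_name)

# Compare with the raw line count of the file.
length(readLines(path))
```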
--
Sincerely,
Changbin
--
Changbin Du
DOE Joint Genome
2011 Sep 01
3
how to split a data frame by two variables
HI, Dear R community,
I want to split a data frame by using two variables: let and g
> x = data.frame(num =
c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45), let =
letters[1:5], g = 1:2)
> x
num let g
1 10 a 1
2 11 b 2
3 12 c 1
4 43 d 2
5 23 e 1
6 14 a 2
7 52 b 1
8 52 c 2
9 12 d 1
10 23 e 2
11 21 a 1
12 23 b 2
13 32 c 1
14
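A short sketch of one way to do this with base R: pass both variables to `split()` as a list, which creates one group per combination of `let` and `g`. The data frame is the one defined in the question.

```r
x <- data.frame(num = c(10, 11, 12, 43, 23, 14, 52, 52, 12, 23,
                        21, 23, 32, 31, 24, 45, 56, 56, 76, 45),
                let = letters[1:5], g = 1:2)

# One data frame per (let, g) combination; drop = TRUE omits empty groups.
parts <- split(x, list(x$let, x$g), drop = TRUE)
names(parts)
parts[["a.1"]]
```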
2010 Nov 01
2
how to save this result in a vector
Hi, dear R community,
I have the following code to calculate the cumulative coverage. I want to
save the output in a vector. How can I do this?
test<-seq(10, 342, by=2)
#cover is a vector
cover_per<-function (cover) {
for (i in min(cover):max(cover)) {print(100*sum(ifelse(cover >= i, 1,
0))/length(cover))}
}
result<-cover_per(test)
> result
NULL
Can anyone help me with this?
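`result` is `NULL` because the function only prints each value and the `for` loop itself returns nothing. A minimal sketch that returns the percentages as a vector instead:

```r
test <- seq(10, 342, by = 2)

# For each threshold i, the percentage of values in cover that are >= i.
cover_per <- function(cover) {
  thresholds <- min(cover):max(cover)
  vapply(thresholds, function(i) 100 * mean(cover >= i), numeric(1))
}

result <- cover_per(test)
head(result)
```

Here `mean(cover >= i)` replaces `sum(ifelse(cover >= i, 1, 0)) / length(cover)`; both give the fraction of values at or above the threshold.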
2010 Jun 15
1
output from the gbm package
Hi, dear Greg and R community,
I have a question about the output of the gbm package. The output of boosting
should be f(x). From it, how do I calculate the probability for each
observation in the data set?
Since it is stochastic, how can I guarantee that each observation in the training
data is selected at least once? If some observations are not selected, how do I
calculate the training error?
Thanks!
--
2010 Dec 16
1
my function does not work for large data set
Dear R community,
I have a function that works for a small data set but does not work on a large
data set. Can anyone help me with this?
> #create new variable by dividing each aa dimer by total_length.
> imper<-function(x, file) {
+ round(x/file$length, 5)
+ }
> dim(test)
[1] 999 2402
> test[varname[2:2401]]<-
2010 Sep 24
1
How to read this file into R.
Dear community,
I have one file named ca_boost_feature.txt,
Feature selection (Boosting:0.0025,5)!
H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G
log file: ep
I want to use the second line of this file. How do I read it into R?
varr<-read.table("/home/cdu/operon/carbonic/ca_boost_feature.txt", sep=" ",
skip=1, header=F, strip.white=TRUE, nrows=1)
Warning message:
In
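As an alternative sketch that avoids `read.table()` altogether, the second line can be read as text with `readLines()` and split on whitespace. The temporary file below just reproduces the three lines quoted in the question.

```r
# Recreate the three lines quoted above in a temporary file.
path <- tempfile(fileext = ".txt")
writeLines(c("Feature selection (Boosting:0.0025,5)!",
             "H.2.C C.1.D C.3.R E.0.N C.2.S C.0.G H.3.G",
             "log file: ep"), path)

second <- readLines(path, n = 2)[2]                     # keep only line 2
varr <- strsplit(trimws(second), "[[:space:]]+")[[1]]   # split on whitespace
varr
```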
2010 Apr 21
2
?rpart
HI, Dear R community,
Last Friday I used this code and it worked, but today it does not run.
> fit.dimer <- rpart(outcome ~., method="class", data=p.df)
Error in `[.data.frame`(frame, predictors) : undefined columns selected
Does anyone have comments or suggestions? Thanks in advance!
--
Sincerely,
Changbin
--
2010 May 19
1
col allocation is not right
plot(svm.auc, col=2, main="ROC curves comparing classification performance\n
of six machine learning models")
legend(0.5, 0.6, c(ns, nb, nr, nt, nl,ne), 2:6, 9) # Draw a legend.
plot(bo.auc, col=3, add=T) # add=TRUE draws on the existing chart
plot(rf.auc, col=4, add=T)
plot(tree.auc, col=5, add=T)
plot(nn.auc, col=6, add=T)
plot(en.auc, col=9,lty="dotted",lwd=3, add=T)
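In `legend(x, y, legend, fill, col, ...)` the fourth and fifth positional arguments are `fill` and `col`, so `2:6, 9` sets five fill colours and a single line colour rather than six line colours. A sketch that names the arguments, with placeholder labels and placeholder curves standing in for the ROC objects:

```r
# Placeholder labels; the poster's ns, nb, nr, nt, nl, ne are not shown.
labels <- c("svm", "boosting", "rf", "tree", "nnet", "ensemble")
cols <- c(2:6, 9)                        # same colours as the plot() calls above
ltys <- c(rep("solid", 5), "dotted")
lwds <- c(rep(1, 5), 3)

pdf(NULL)  # draw without writing a file; drop this line for interactive use
plot(c(0, 1), c(0, 1), type = "n",
     xlab = "False positive rate", ylab = "True positive rate")
fpr <- seq(0, 1, by = 0.05)
for (k in seq_along(cols)) {             # placeholder curves, not real ROC data
  lines(fpr, fpr^(1 / (k + 1)), col = cols[k], lty = ltys[k], lwd = lwds[k])
}
legend(0.5, 0.6, legend = labels, col = cols, lty = ltys, lwd = lwds)
dev.off()
```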
Hi,
2010 Jun 19
1
question about boosting(Adaboosting. M1)
Hi, guys,
I am trying to use the AdaBoost.M1 algorithm to integrate three models.
I found that the sum of the weights for the models is not equal to one.
How should I deal with this?
Thanks, any response or suggestions are appreciated!
--
Sincerely,
Changbin
--
2010 Oct 12
1
need help with nnet
Hi, dear R community,
My data set has 2409 variables; the last one is the response variable. I have
used nnet after feature selection and it worked. This time I am using
nnet to fit a model without feature selection, and I got the following error
message:
> dim(train)
[1] 1827 2409
nnet.fit<-nnet(as.factor(out) ~ ., data=train, size=3, rang=0.3,
decay=5e-4, maxit=500) # model
2011 Feb 24
2
create dummy variables by for loop
Hi, dear R community,
I am trying to create 100 dummy variables like the following:
ack$id_1 <- (ack$ID==1)*1
ack$id_2 <- (ack$ID==2)*1
..
.
ack$id_100 <- (ack$ID==100)*1
I used the following code:
for(i in 1:100){
ack$id_[i] <- (ack$ID==i)*1
}
But only one column is created. Can anyone help me?
Thanks a lot!
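`ack$id_[i]` refers to element `i` of a single column called `id_`, which is why only one column appears. A minimal sketch that builds each column name with `paste0()` and assigns it with `[[`; the small `ack` data frame here is made up for illustration.

```r
# Made-up data for illustration; only the ID column matters here.
ack <- data.frame(ID = c(1, 2, 3, 2, 100))

for (i in 1:100) {
  # Creates columns id_1, id_2, ..., id_100, each a 0/1 indicator.
  ack[[paste0("id_", i)]] <- (ack$ID == i) * 1
}

dim(ack)
```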
--
Sincerely,
Changbin
--