thr3ads.net - similar to: "?rpart"

Displaying 20 results from an estimated 9000 matches similar to: "?rpart"

how to Store loop output from a function

2010 May 26

Hi, dear R community, I am writing the following function to create one data set (*tree.pred*) and one vector (*valid.out*) from loops. Later, I want to use the data set from this loop to plot curves. I have tried return and list, but I cannot use the *tree.pred* data and *valid.out* vector. auc.tree<- function(msplit,mbucket) { * tree.pred<-data.frame()
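
One common way to get both objects back out of a function is to fill them inside the loop and return them together in a named list. The sketch below is only an illustration: `collect_results` and its loop body are hypothetical placeholders, not the poster's `auc.tree`.

```r
# Hypothetical stand-in for the poster's function: the loop body is a placeholder.
collect_results <- function(n_iter) {
  pred_list <- vector("list", n_iter)   # one data frame per iteration
  valid_out <- numeric(n_iter)          # one number per iteration

  for (i in seq_len(n_iter)) {
    pred_list[[i]] <- data.frame(iter = i, value = i^2)
    valid_out[i] <- i / n_iter
  }

  # Bind once at the end, then return both objects in a named list.
  list(tree.pred = do.call(rbind, pred_list), valid.out = valid_out)
}

res <- collect_results(3)
res$tree.pred   # data frame with one row per iteration
res$valid.out   # numeric vector with one value per iteration
```

The caller has to assign the result (`res <- ...`) and then pick the pieces out with `$`; objects created inside the function are not visible outside it otherwise.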

my function does not work for large data set

2010 Dec 16

Dear R community, I have one function that works for a small data set but does not work on a large data set. Can anyone help me with this? > #create new variable by dividing each aa dimer by total_length. > imper<-function(x, file) { + round(x/file$length, 5) + } > dim(test) [1] 999 2402 > test[varname[2:2401]]<-
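
The snippet is cut off before the failing call, so the actual cause cannot be read from it. As a rough sketch of one alternative, the whole block of columns can be divided in a single vectorised step. The data frame and the column names `AA` and `AC` below are made up; only the `length` column comes from the post.

```r
# Made-up data: two count columns plus the 'length' column from the post.
test <- data.frame(AA = c(2, 4, 6), AC = c(1, 3, 5), length = c(10, 20, 40))
cols <- c("AA", "AC")   # hypothetical stand-in for varname[2:2401]

# Dividing a data frame by a vector with one value per row divides
# every selected column element-wise by that vector.
test[cols] <- round(test[cols] / test$length, 5)
test
```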

exact the variables used in tree construction

2010 May 12

> fit.dimer <- rpart(as.factor(out) ~ ., method="class", data=p_df) > > fit.dimer$frame[, "var"] [1] NE WC <leaf> TA <leaf> <leaf> WG WD WW WC [11] <leaf> <leaf> <leaf> CT <leaf> FC <leaf> YG QT <leaf> [21] <leaf> <leaf> NW DP DY <leaf> SK

how to extract the variables used in decision tree

2010 May 11

Hi, dear R community, how can I extract the variables actually used in tree construction? I want to extract these variables and combine them with other variables as my features in the next model-building step. > printcp(fit.dimer) Classification tree: rpart(formula = outcome ~ ., data = p_df, method = "class") Variables actually used in tree construction: [1] CT DP DY FC NE NW QT SK TA WC WD WG WW
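
The variables that `printcp()` lists can also be pulled out of the fitted object itself, since `fit$frame$var` stores the split variable of every node and `<leaf>` for terminal nodes. A minimal sketch, using the `kyphosis` data that ships with rpart as a stand-in for the poster's `p_df`:

```r
library(rpart)

# kyphosis ships with rpart; it stands in for the poster's p_df.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

# fit$frame$var holds the split variable for each node and "<leaf>" for leaves.
node_vars <- as.character(fit$frame$var)
used_vars <- sort(unique(node_vars[node_vars != "<leaf>"]))
used_vars

# Keep only those columns (plus the response) for the next modelling step.
next_data <- kyphosis[, c("Kyphosis", used_vars)]
```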

change the for loops with lapply

2010 Sep 07

cv.fold<-function(i, size=3, rang=0.3){ cat('Fold ', i, '\n') out.fold.c <-((i-1)*c.each.part +1):(i*c.each.part) out.fold.n <-((i-1)*n.each.part +1):(i*n.each.part) train.cv <- n.cc[-out.fold.c, c(2:2401, 2417)] train.nv <- n.nn[-out.fold.n, c(2:2401, 2417)] train.v<-rbind(train.cv, train.nv) #training data for feature
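
A function that takes the fold number as its first argument, like `cv.fold` above, can be handed straight to `lapply()`. The sketch below shows the pattern on a much smaller, hypothetical function (`fold_rows`) that only computes the held-out row indices; it is not the poster's full cross-validation.

```r
# Hypothetical per-fold function: returns the held-out row indices for fold i.
fold_rows <- function(i, each_part) {
  ((i - 1) * each_part + 1):(i * each_part)
}

n_folds <- 3
each_part <- 4

# for-loop version
out_loop <- vector("list", n_folds)
for (i in seq_len(n_folds)) {
  out_loop[[i]] <- fold_rows(i, each_part)
}

# lapply version: same result, no manual pre-allocation or indexing
out_lapply <- lapply(seq_len(n_folds), fold_rows, each_part = each_part)

identical(out_loop, out_lapply)
```

Extra arguments such as `size` and `rang` in the original can be passed the same way, after the function name.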

question about read.columns

2011 Jun 22

Hi, dear R community, I have a large data set named dd.txt with 2402 variables. The columns are: a1, b1, ..z1, a11, b11, ...z11, a111, b111, ..z111.. If I don't know the relative position of the columns, but I know I need the following variables: var<-c(a1, c1,a11,b11,f111), can I use read.columns to read the data into R? I have tried the following code, but it does not work
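
Whatever `read.columns` does with this file, base `read.table()` can select columns by name without knowing their positions: read the header once, then mark unwanted columns as `"NULL"` in `colClasses`. A minimal sketch, with a small temporary file and made-up values standing in for dd.txt:

```r
# Build a small tab-separated file to stand in for dd.txt.
path <- tempfile(fileext = ".txt")
demo <- data.frame(a1 = 1:3, b1 = 4:6, c1 = 7:9, a11 = 10:12)
write.table(demo, path, sep = "\t", row.names = FALSE, quote = FALSE)

wanted <- c("a1", "c1")   # columns to keep, by name

# Read only the first data row to learn the column names and their order.
header <- names(read.table(path, header = TRUE, sep = "\t", nrows = 1))

# "NULL" tells read.table to skip a column; NA lets it guess the type.
classes <- ifelse(header %in% wanted, NA, "NULL")
dd <- read.table(path, header = TRUE, sep = "\t", colClasses = classes)
names(dd)
```

Note that the wanted names are quoted strings here; `c(a1, c1, ...)` without quotes would look for R objects called `a1`, `c1`, and so on.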

R.GBM package

2010 Apr 26

Hi, dear Greg, I am new to the gbm package. Can boosted decision trees be implemented in the 'gbm' package, or can 'gbm' only be used for regression? If it can, do I need to combine the rpart and gbm commands? Thanks so much! -- Sincerely, Changbin

probabilities in svm output in e1071 package

2010 May 05

svm.fit<-svm(as.factor(out) ~ ., data=all_h, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10) # model fitting svm.pred<-predict(svm.fit, hh, decision.values = TRUE, probability = TRUE) # look for the probabilities, but cannot find them. attr(svm.pred, "probabilities") > attr(svm.pred, "probabilities") 1 0 1 0 0 2 0
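
In e1071, class probabilities are only available at prediction time if the model was fitted with `probability = TRUE`, which the `svm()` call above does not set. The sketch below shows the pattern on `iris`, which is only a stand-in for the poster's data; the split sizes are arbitrary. It also uses `type = "C-classification"`, which, as far as I know, is the argument name `svm()` expects rather than `method`.

```r
library(e1071)

# iris stands in for the poster's data; split into training and new rows.
set.seed(1)
idx <- sample(nrow(iris), 100)
train <- iris[idx, ]
new_rows <- iris[-idx, ]

# probability = TRUE must be set when fitting, not only when predicting.
fit <- svm(Species ~ ., data = train, type = "C-classification",
           kernel = "radial", probability = TRUE)

pred <- predict(fit, new_rows, probability = TRUE)
probs <- attr(pred, "probabilities")   # one row per observation, one column per class
head(probs)
```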

help in attach function

2010 Apr 07

Hi, R community, this morning I ran into the following problem several times when I tried to attach the data set. When I closed the current console and reopened the R console, the problem disappeared. But as time passed, the problem occurred again. Can anyone help me with this? > attach(total) The following object(s) are masked from total ( position 3 ) : acid base cell_evalue
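
The "masked from total ( position 3 )" message suggests that `total` was already on the search path from an earlier `attach()`. One way to sidestep this, sketched here with a made-up data frame, is to use `with()` instead, or to pair each `attach()` with a `detach()`.

```r
# Hypothetical data frame standing in for the poster's 'total'.
total <- data.frame(acid = c(1.2, 3.4, 5.6), base = c(0.5, 0.7, 0.9))

# with() evaluates the expression inside the data frame without
# touching the search path, so nothing gets masked.
mean_ratio <- with(total, mean(acid / base))

# If attach() is used anyway, pair every attach() with a detach().
attach(total)
mean(acid)
detach(total)
```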

help in SVM

2010 Jun 24

Hi, guys, I used the following code to run SVM and get predictions on the new data set hh. dim(all_h) [1] 2034 24 dim(hh) # it contains all the variables besides the variables in all_h data set. [1] 640 415 require(e1071) svm.tune<-tune(svm, as.factor(out) ~ ., data=all_h, ranges=list(gamma=2^(-5:5), cost=2^(-5:5)))# find the best parameters. bestg<-svm.tune$best.parameters[[1]]

help in output file

2010 Apr 19

Hi, dear R community, I am using the following code to grow and plot a tree: # Classification Tree with rpart library(rpart) pdf(file="/home/cdu/changbin/dimer_tree.pdf") # grow tree fit.dimer <- rpart(outcome ~ ., method="class", data=p.dimer[,2:402]) plotcp(fit.dimer) # visualize cross-validation results # plot tree plot(fit.dimer, uniform=TRUE,
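
The snippet ends before the plotting code is complete, so it is not clear whether the device was ever closed. A PDF written with `pdf()` is only finished once `dev.off()` is called; the sketch below shows the full open, plot, close sequence. The `kyphosis` data and the temporary output path are stand-ins for the poster's data and file.

```r
library(rpart)

out_file <- file.path(tempdir(), "tree_demo.pdf")   # hypothetical output path

fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis, method = "class")

pdf(file = out_file)
plotcp(fit)                    # page 1: cross-validation results
plot(fit, uniform = TRUE)      # page 2: the tree
text(fit, use.n = TRUE, cex = 0.8)
dev.off()                      # the file is only complete after dev.off()

file.exists(out_file)
```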

can not print probabilities in svm of e1071

2010 Apr 29

> x <- train[,c( 2:18, 20:21, 24, 27:31)] > y <- train$out > > svm.pr <- svm(x, y, probability = TRUE, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10) > > pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)], decision.values = TRUE, probability = TRUE) > attr(pred, "decision.values")[1:4,]

help output figures in R

2010 Apr 06

somfunc<- function (file) { aa_som<-scale(file) final.som<-som(data=aa_som, rlen=10000, grid=somgrid(5,4, "hexagonal")) pdf(file="/home/cdu/changbin/file.pdf") #output graphic file. plot(final.som, main="Unsupervised SOM") dev.off() } I have many different files; if I want to output a pdf file with the same name as each dataset I feed to the function
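
Because `"/home/cdu/changbin/file.pdf"` is a fixed string, every call overwrites the same file. One option is to pass the dataset name as a second argument and build the file name from it. In this sketch `plot_to_pdf`, the example data, and the temporary output directory are all hypothetical, and a plain scatter plot stands in for the SOM plot.

```r
# Sketch: the plotting call is a placeholder for the SOM plot in the post.
plot_to_pdf <- function(data, name, out_dir = tempdir()) {
  out_file <- file.path(out_dir, paste0(name, ".pdf"))
  pdf(file = out_file)
  on.exit(dev.off())                    # close the device even if plotting fails
  plot(scale(data), main = name)
  invisible(out_file)
}

datasets <- list(first = cbind(x = 1:10, y = (1:10)^2),
                 second = cbind(x = 1:10, y = sqrt(1:10)))

# One PDF per dataset, named after the list element.
files <- mapply(plot_to_pdf, datasets, names(datasets))
basename(files)
```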

get the row sums

2010 May 18

> head(en.id.pr) valid.gene_id b.pred rf.pred svm.pred 1521 2500151211 0 0 0 366 639679745 0 0 0 1965 2502081603 1 1 1 1420 644148030 1 1 1 1565 2500626489 1 1 1 1816 2501711016 1 1 1 > p.pred <- data.frame(en.id.pr, sum=apply(en.id.pr[,2:4], 1, sum)) #
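
For plain row totals, `rowSums()` gives the same result as `apply(..., 1, sum)` and is usually faster on large tables. A small sketch using the first three rows shown in the post:

```r
# First three rows of the table shown in the post.
en.id.pr <- data.frame(valid.gene_id = c(2500151211, 639679745, 2502081603),
                       b.pred = c(0, 0, 1),
                       rf.pred = c(0, 0, 1),
                       svm.pred = c(0, 0, 1))

# rowSums() is the vectorised equivalent of apply(..., 1, sum).
p.pred <- data.frame(en.id.pr, sum = rowSums(en.id.pr[, 2:4]))
p.pred$sum
```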

how to split a data frame by two variables

2011 Sep 01

Hi, dear R community, I want to split a data frame using two variables: let and g > x = data.frame(num = c(10,11,12,43,23,14,52,52,12,23,21,23,32,31,24,45,56,56,76,45), let = letters[1:5], g = 1:2) > x num let g 1 10 a 1 2 11 b 2 3 12 c 1 4 43 d 2 5 23 e 1 6 14 a 2 7 52 b 1 8 52 c 2 9 12 d 1 10 23 e 2 11 21 a 1 12 23 b 2 13 32 c 1 14
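
`split()` accepts a list of grouping variables, so both columns can be used at once. The sketch below reuses the data frame from the post.

```r
x <- data.frame(num = c(10, 11, 12, 43, 23, 14, 52, 52, 12, 23,
                        21, 23, 32, 31, 24, 45, 56, 56, 76, 45),
                let = letters[1:5], g = 1:2)

# Passing a list of factors splits on every observed combination;
# drop = TRUE discards combinations that never occur.
groups <- split(x, list(x$let, x$g), drop = TRUE)

names(groups)      # "a.1", "b.1", ...
groups[["a.1"]]    # rows where let == "a" and g == 1
```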

help in conditional histogram

2010 Apr 23

Dear Dr. Sarkar, when I tried to run the code, I found the following problem: > h<- sample(1:14, 319, rep=T) > c<- sample(1:14, 608, rep=T) > n<- sample(1:14, 1140, rep=T) > vt<-c(h, c, n) > ta<-rep(c("h", "c", "n"), c(319, 608, 1140)) > > to<-data.frame(vt,ta) > library(lattice) Attaching package: 'lattice'
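
The error itself is cut off in the snippet, so the following is only a sketch of how a conditional histogram of this data could be drawn with lattice. It keeps the group sizes from the post but renames the vectors (`h_vals`, `c_vals`, `n_vals` are made-up names) so that none of them shares a name with the base function `c()`.

```r
library(lattice)

set.seed(1)
# Same sizes as in the post; names chosen so that none of them masks c().
h_vals <- sample(1:14, 319, replace = TRUE)
c_vals <- sample(1:14, 608, replace = TRUE)
n_vals <- sample(1:14, 1140, replace = TRUE)

to <- data.frame(vt = c(h_vals, c_vals, n_vals),
                 ta = factor(rep(c("h", "c", "n"), c(319, 608, 1140))))

# One histogram panel per level of 'ta'.
p <- histogram(~ vt | ta, data = to, layout = c(1, 3))
print(p)
```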

output from the gbm package

2010 Jun 15

Hi, dear Greg and R community, I have one question about the output of the gbm package. The output of boosting should be f(x); from it, how do I calculate the probability for each observation in the data set? Since it is stochastic, how can it be guaranteed that each observation in the training data is selected at least once? If some observations are not selected, how is the training error calculated? Thanks. --

help with adding lines to current plot

2010 Oct 25

Hi, dear R community, I am using the following code to plot. The lines() call runs without error, but the line is not drawn on the previous plot and does not show up. Why? # specify the data for missense simulation x <- seq(0,10, by=1) y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932, 0.939, 0.932, 0.924, 0.918, 0.920) # missense z <- c(0.068, 0.082, 0.080, 0.099, 0.108, 0.107,
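
Judging from the values that are visible, `y` sits around 0.92 to 0.95 while `z` starts near 0.07, so a likely explanation is that the second series falls outside the y-axis range fixed by the first `plot()` call. The sketch below sets `ylim` to cover both. The `y` values are from the post; `z` is truncated there, so a made-up series is used instead.

```r
x <- seq(0, 10, by = 1)
y <- c(0.952, 0.947, 0.943, 0.941, 0.933, 0.932,
       0.939, 0.932, 0.924, 0.918, 0.920)       # from the post
z <- seq(0.07, 0.17, length.out = length(x))     # made-up second series

# plot() fixes the axis range from y alone, so a later lines(x, z)
# would fall below the visible region. Set ylim to cover both series.
y_range <- range(c(y, z))
plot(x, y, type = "b", ylim = y_range, xlab = "x", ylab = "value")
lines(x, z, type = "b", lty = 2)
```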

R eat my data

2010 May 25

Hi, dear R community, my original file has 1932 lines, but when I read it into R, it has only 1068 lines. Why? cdu@nuuk:~/operon$ wc -l id_name_gh5.txt 1932 id_name_gh5.txt > gene_name<-read.table("/home/cdu/operon/id_name_gh5.txt", sep="\t", skip=0, header=F, fill=T) > dim(gene_name) [1] 1068 3 -- Sincerely, Changbin -- Changbin Du DOE Joint Genome
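
A frequent cause of rows "disappearing" in `read.table()` is a quote character (`'` or `"`) or a `#` inside a field, since by default these start a quoted string or a comment. Whether that is what happened here cannot be told from the snippet. The sketch below uses a small made-up file to show the two arguments that switch this behaviour off.

```r
# Small demo file: names containing apostrophes and a '#' character.
path <- tempfile(fileext = ".txt")
writeLines(c("1\tO'Brien\tx",
             "2\tSmith\ty",
             "3\tD'Arcy #2\tz",
             "4\tJones\tw"), path)

n_lines <- length(readLines(path))   # same count as `wc -l`

# quote = "" stops ' and " from opening a quoted field that can
# swallow the following lines; comment.char = "" keeps text after '#'.
gene_name <- read.table(path, sep = "\t", header = FALSE, fill = TRUE,
                        quote = "", comment.char = "")

nrow(gene_name) == n_lines
```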

how to label the som notes by the majority vote

2010 Jun 02

Hi, dear R community, I am using the following code to fit the SOM. I tried to label the nodes by majority vote, either through mapping or prediction. I attached my output: the left one doesn't have any labels in the nodes, and the right one has more than one label in each node. I need to have only one label for each node, either by majority vote or prediction. Can anyone give some suggestions or
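
Independently of the SOM package, a majority vote per node only needs two vectors: the node each observation was mapped to and its known label. The sketch below uses made-up vectors for both. With the kohonen package the mapping would, I believe, come from the fitted object's `unit.classif` component, but treat that as an assumption.

```r
# Hypothetical inputs: 'unit' is the SOM node each observation maps to
# and 'label' is its known class.
unit  <- c(1, 1, 1, 2, 2, 3, 3, 3, 3)
label <- c("a", "a", "b", "b", "b", "a", "c", "c", "c")

# For each node, tabulate the labels and keep the most frequent one.
# Ties go to the label that sorts first.
majority <- tapply(label, unit, function(l) names(which.max(table(l))))
majority
```

The result has one label per node that received at least one observation; empty nodes do not appear and would need a separate rule.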
