thr3ads.net - similar to: "Re-evaluating the tree in the random forest"

Displaying 20 results from an estimated 2000 matches similar to: "Re-evaluating the tree in the random forest"

2003 Aug 20

RandomForest

Hello, When I plot or look at the error rate vector for a random forest (rf$err.rate), it looks like a descending function, except for the first few points of the vector, whose error rate values are lower (sometimes much lower) than the general level of error rates for a forest with that number of trees once the error rates stop descending. Does it mean that there is a tree(s) (that is built the first in

2010 Mar 01

Random Forest prediction questions

Hi, I need help with randomForest prediction. I ran the following code: > iris.rf <- randomForest(Species ~ ., data=iris, > importance=TRUE,keep.forest=TRUE, proximity=TRUE) > pr<-predict(iris.rf,iris,predict.all=T) > iris.rf$votes[53,] setosa versicolor virginica 0.0000000 0.8074866 0.1925134 > table(pr$individual[53,])/500 versicolor virginica 0.928

2007 Apr 29

randomForest gives different results for formula call v. x, y methods. Why?

Just out of curiosity, I took the default "iris" example in the RF helpfile... but seeing the admonition against using the formula interface for large data sets, I wanted to play around a bit to see how the various options affected the output. Found something interesting I couldn't find documentation for... Just like the example... > set.seed(12) # to be sure I have

2011 Nov 17

tuning random forest. An unexpected result

Dear Researchers, I am using RF (in regression mode) to analyze several metrics extracted from images. I am tuning RF with a loop over different ranges of mtry, tree and nodesize, selecting the values with the lowest OOB MSE: mtry from 1 to 5, nodesize from 1 to 10, tree from 1 to 500, using this paper as a reference: Palmer, D. S., O'Boyle, N. M., Glen, R. C., & Mitchell, J. B. O. (2007). Random Forest Models

2008 Mar 09

sampsize in Random Forests

Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging between 1-100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of

2004 Oct 13

random forest -optimising mtry

Dear R-helpers, I'm working on mass spectra in randomForest/R, and following the recommendations for the case of noisy variables, I don't want to use the default mtry (sqrt of nvariables), but I'm not sure up to which proportion mtry/nvariables it makes sense to increase mtry without "overtuning" RF. Let me describe my example: I have 106 spectra belonging to 4 classes, the

2008 Jul 20

confusion matrix in randomForest

I have a question on the output generated by randomForest in classification mode, specifically, the confusion matrix. The confusion matrix lists the various classes and how the forest classified each one, plus the classification error. Are these numbers essentially averages over all the trees in the forest? If so, is there a way I can get the standard deviation values out of the randomForest,

2005 Mar 22

Error: Can not handle categorical predictors with more than 32 categories.

Hi All, My question is in regards to an error generated when using randomForest in R. Is there a special way to format the data in order to avoid this error, or am I completely confused on what the error implies? "Error in randomForest.default(m, y, ...) : Can not handle categorical predictors with more than 32 categories." This is generated from the command line: >

2005 Aug 26

problem with certain data sets when using randomForest

Hi, Since I've had no replies to my previous post about my problem, I am posting it again in the hope that someone notices it. The problem is that the randomForest function doesn't accept datasets whose instances contain only a subset of all the classes. So the dataset with instances that either belong to class "a" or "b" from the levels "a", "b" and

2003 Apr 12

rpart vs. randomForest

Greetings. I'm trying to determine whether to use rpart or randomForest for a classification tree. Has anybody tested efficacy formally? I've run both and the confusion matrix for rf beats rpart. I've looked at the rf help page and am unable to figure out how to extract the tree. But more than that I'm looking for a more comprehensive user's guide for randomForest including

2008 May 05

Problems using rfImpute

Hello R-user! I am running R 2.7.0 on a Power Book (Tiger). (I am still an R and statistics beginner.) I tried rfImpute (randomForest) and, as far as I understood, it should replace NAs using a proximity matrix: > set.seed(100000) > Subset5Imputed<-rfImpute(Sex~., data=Subset5) ntree OOB 1 2 300: 11.78% 12.36% 11.21% ntree OOB 1 2 300: 12.07% 12.64%

2009 Aug 13

randomForest question--problem with ntree

Hi, I would like to use a random Forest model to get an idea about which variables from a dataset may have some prognostic significance in a smallish study. The default for the number of trees seems to be 500. I tried changing the default to ntree=2000 or ntree=200 and the results appear identical. Have changed mtry from mtry=5 to mtry=6 successfully. Have seen same problem on both a Windows

2014 Jul 07

[LLVMdev] Splitting basic block results in unknown instruction type assertion

Hello, I would like to see if this issue is a result of a misunderstanding on my part before I file a bug. I am using LLVM 3.4, built from the source tarballs. My system's uname is "Darwin tyler-air 12.5.0 Darwin Kernel Version 12.5.0: Sun Sep 29 13:33:47 PDT 2013; root:xnu-2050.48.12~1/RELEASE_X86_64 x86_64". All I'm trying to do is add a runtime check after all call

2007 Aug 10

rfImpute

I am having trouble with the rfImpute function in the randomForest package. Here is a sample... clunk.roughfix<-na.roughfix(clunk) > > clunk.impute<-rfImpute(CONVERT~.,data=clunk) ntree OOB 1 2 300: 26.80% 3.83% 85.37% ntree OOB 1 2 300: 18.56% 5.74% 51.22% Error in randomForest.default(xf, y, ntree = ntree, ..., do.trace = ntree, : NA not

2010 Jan 11

Help me! using random Forest package, how to calculate Error Rates in the training set ?

I am now learning random forest and using the randomForest package. I can get the OOB error rate and the test set error rate; now I want to get the training set error rate. How can I do that? pgp.rf<-randomForest(x.tr,y.tr,x.ts,y.ts,ntree=1e3,keep.forest=FALSE,do.trace=1e2) Using this code I can get the OOB and test set error rates. If I replace x.ts and y.ts with x.tr and y.tr, respectively, is the error rate

2009 Apr 07

Concern with randomForest

Hi all, When running a randomForest run using the following command: forestplas=randomForest(Prev~.,data=plas,ntree=200000) print(forestplas) I get the following result: Call: randomForest(formula = Prev ~ ., data = plas, ntree = 2e+05, importance = TRUE) Type of random forest: regression Number of trees: 2e+05 No. of variables tried at each split: 5

2005 Sep 04

HELP - How Do I Separate incoming channels from the others on a PRI

Okay, here is the background. I have a PRI with 15 active channels on it. I originally set up all of them in group=1, and all outgoing and incoming calls used this group. The phone number that I have associated with these channels ends with 750, and that is how I direct the calls, i.e. in my extensions.conf I have: exten => 750,1,Dial(SIP/120,20) All this works fine. Now I have the need

2017 Oct 27

Cannot Compute Box's M (Three Days Trying...)

It can't be this hard, right? I really need a shove in the right direction here. Been spinning wheels for three days. Cannot get past the errors. I'm doing something wrong, obviously, since I can easily compute the Box's M right there in RStudio. But I don't see what is wrong below with the coding equivalent. The entire code snippet is below. The code fails below on the call to

2017 Oct 29

Renjin?

Hi All, OK, in the "back to the drawing board" department, I found what looks like a much better solution to using R in Java. Renjin. Looking at the docs and then trying a quick example, didn't quite work. Of course I'm missing something. Although I'm telling the engine to require ("biotools") just like I would in R itself, when I get to the line of code that

2017 Oct 28

Cannot Compute Box's M (Three Days Trying...)

I'm not sure what you mean. Could you please be more specific? If I print the string, I get: boxM(boxMVariable[, -5], boxMVariable[, 5]) From this code: . . .
// assign the data to a variable.
rConnection.assign("boxMVariable", myDf);
// create a string command with that variable name.
String boxVariable = "boxM(boxMVariable[, -5], boxMVariable[, 5])";
