thr3ads.net - search: "classwt"

Displaying 17 results from an estimated 17 matches for "classwt".

How to use classwt parameter option in RandomForest

2008 May 21

...actor variables using random forests in R. The variable Y acts like an ordinal variable, but I recoded it as a factor variable. I ran a simulation and got an OOB estimate of error rate of 60%. I validated against some external datasets and got about 59% misclassification error. I would like to tinker with the classwt option in the function randomForest to see if I can get better performance from the model. My confusion arises from how to define these weights. If I say classwt = c(3,6,9,1,2,3), how exactly do the levels get weighted? If this were a 6x6 matrix, I could put a number in each cell to adjust the weights. How...

help with RandomForest classwt option

2007 Jan 28

Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with "down sampling". From checking rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford. I did not find the down sampling option. I am wondering if you have any experience dealing with this problem. Do you know any method o...

random forest question

2004 Jan 20

Hi, here are three results of random forest (version 4.0-1). The results seem to be more or less the same which is strange because I changed the classwt. I hoped that for example classwt=c(0.45,0.1,0.45) would result in fewer cases classified as class 2. Did I understand something wrong? Christian x1rf <- randomForest(x=as.data.frame(mfilters[cvtrain,]), y=as.factor(traingroups), xtest=as.data.frame(m...

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

2005 Oct 27

...data and was wondering if, in a "0" v "1" classification forest, some combo of these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). Not sure how to specify these terms... from the docs, we have: classwt: Priors of the classes. Need not add up to one. Ignored for regression. So is this something like "... classwt=c(.90,.10)" ? I didn't see the syntax demonstrated. Similar for "strata" and "sampsize" though there is a default for sampsize that makes sense... not su...

Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?

2005 Oct 27

"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be use...

Examples of "classwt", "strata", and "sampsize" in randomForest?

2005 Oct 25

...unbalance data and was wondering if, in a "0" v "1" classification forest, these options might yield better predictions when the proportion of one class is low (less than 10% in a sample of 2,000 observations). Not sure how to specify these terms... from the docs, we have: classwt: Priors of the classes. Need not add up to one. Ignored for regression. So is this something like "... classwt=c(.90,.10)" ? I didn't see the syntax demonstrated. Similar for "strata" and "sampsize" though there is a default for sampsize that makes sense... not su...
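
The question above asks how "strata" and "sampsize" are specified. A minimal sketch of one common pattern, balanced down-sampling, follows. It is an illustration under stated assumptions, not the answer given in the thread: the data are simulated, and every variable name (x, y, n_min, samp, rf) is hypothetical. Only the randomForest arguments strata, sampsize and ntree come from the package itself.

```r
# Sketch only: simulated two-class data with roughly 8% in the rare class.
set.seed(42)
n <- 2000
y <- factor(rbinom(n, 1, 0.08), levels = c(0, 1))
x <- data.frame(
  x1 = rnorm(n) + 1.5 * (y == "1"),  # weakly informative predictor
  x2 = rnorm(n)                      # pure noise predictor
)

# Draw the same number of cases from each class for every tree,
# using the size of the smaller class as the per-class sample size.
n_min <- min(table(y))
samp  <- rep(n_min, nlevels(y))

# Fit only if the randomForest package is installed.
if (requireNamespace("randomForest", quietly = TRUE)) {
  rf <- randomForest::randomForest(x, y, strata = y, sampsize = samp,
                                   ntree = 200)
  print(rf$confusion)
}
```

Here sampsize has one entry per stratum, so each tree is grown on n_min cases from each class. Whether this improves predictions on a given data set has to be checked against the per-class error rates.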

class weights with Random Forest

2011 Sep 13

Hi All, I am looking for a reference that explains how the randomForest function in the randomForest package uses the classwt parameter. Here: http://tolstoy.newcastle.edu.au/R/e4/help/08/05/12088.html Andy Liaw suggests not using classwt. And according to: http://r.789695.n4.nabble.com/R-help-with-RandomForest-classwt-option-td817149.html it has "not been implemented" as of 2007. However it improved classif...

error when using svm routine: Error in if (any(co)) { : missing value where TRUE/FALSE needed

2010 Mar 08

Hi, I met with this error message with the following data set. Do you know how to resolve it? Thanks.
> data<-read.table("c://temp3//abc.csv", sep = ",", header=T)
> classwt<-c( 0.5806452, 0.4193548)
> y<-data[,1]
> x<-data[,2:ncol(data)]
> print(y)
 [1] 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 1 1 1 1 1
[36] 1 1 1 1 2 2 2 2 2 2 2 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2
> print(x)
  rs2289472 rs1551398 rs7927894
1 CT AA...

imbalanced classes

2006 Jan 25

...t package to compare two classes of data, the number of cases in each class relatively low, 28 in class 1 and 9 in class 2. I'd really like to use the R environment to analyze this data, however I'm finding it difficult to put much trust in the results of my analysis. As you've stated, the classwt variables do not do much, and I've tried working with the cutoff and sampsize variables as well, with limited success in balancing error rates between the two classes. It was unclear to me how to use the cutoff parameter correctly. If you have any recommendations here, it would be appreciat...

CART vs. Random Forest

2002 Sep 25

According to Dr. Breiman, the RF should be a more accurate method than a single tree. However, the performance of each method seems to depend on the proportion of the outcome variable in my case. My data set is a typical classification problem (predict bad guys). When I ran both of them with different proportions of the outcome variable (there's a criterion to measure the degree of bad behavior), I

Random Forest with highly imbalanced data

2004 May 12

Hi group, I am trying to do a RF with approx 250,000 cases. My objective is to determine the risk factors of a person being readmitted to hospital (response=1) or else (response=0). Only 10%, or 25,000 cases were readmitted. I've heard about down-sampling and class weight approach and am wondering if R can do it. Even some reference to articles will help. From the statistical point

random forest proximities

2007 Feb 05

Good Day, I'm using the randomForest package to perform a classification. If I supply weights to the optional classwt argument, are proximity values computed as a weighted average? I understand that the forest will possibly change as a function of the particular weights I supply. Thanks in advance. Mike Michael Fugate Los Alamos National Laboratory Mail Stop MS-F600, Los Alamos, NM 87545 (505) 667-0398

What is the default nPerm for regression in randomForest?

2010 May 05

Could not find it in ?randomForest. Thank you for your help! -- Dimitri Liakhovitski Ninah.com Dimitri.Liakhovitski at ninah.com

How to optimize or build a better random forest?

2012 Oct 17

...sibsp pclass2 pclass3 sexmale "factor" "numeric" "integer" "factor" "factor" "factor"
> sapply(split(train,train$survived),function(x) dim(x)[1])
  0   1
549 342
> rf <- randomForest(train[,-1], train[,1], ntree=10000,classwt=c(549/891,342/891),importance=TRUE,do.trace=FALSE)
OOB estimate of error rate: 17.73%
Confusion matrix:
    0   1 class.error
0 500  49  0.08925319
1 109 233  0.31871345
[[alternative HTML version deleted]]

Strategies to deal with unbalanced classification data in randomForest

2012 Mar 03

...ate. This approach I've mostly drawn from here:
## http://stat-www.berkeley.edu/users/breiman/RandomForests/cc_home.htm#balance
## This might not be appropriate, however, as of September it looks like Breiman's method wasn't used in R
df.rf.weights<-randomForest(cls~var1+var2+var3, data=df,classwt=c(1, 600), importance=TRUE)
## Nevertheless, what I am concerned about is the effect an unbalanced data set has on my randomForest model
## For example:
par(mfrow=c(1,3))
plot(df.rf)
plot(df.rf.downsamp)
plot(df.rf.weights)
presents three very different scenarios and I am having trouble resolvin...

RandomForest

2003 Aug 20

Hello, When I plot or look at the error rate vector for a random forest (rf$err.rate) it looks like a descending function, except for the first few points of the vector, whose error rates are lower (sometimes much lower) than the general level of error rates for a forest with such a number of trees when the error rates stop descending. Does it mean that there is a tree(s) (that is built the first in

highly biased PCA data?

2004 Nov 04

Hello, supposing that I have two or three clear categories for my data, let's say pet preference across fish, cat, dog. Let's say most people rate their preference as being mostly one of the categories. I want to do pca on the data to see three 'groups' of people, one group for fish, one for cat and one for dog. I would like to see the odd person who likes both or all three in the
