similar to: NA and NaN randomForest

Displaying 20 results from an estimated 4000 matches similar to: "NA and NaN randomForest"

2010 Sep 21
5
removed data is still there!
I'm confused, hope someone can point out what is not obvious to me. I thought I was creating a new data frame by 'deleting' rows from an existing data frame - I've tried 2 methods. But this new data frame seems to remember values from its parent - even though there are no occurrences. Where does it get the values versicolor and virginica from and give them a count of 0? What
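
A minimal sketch of what is likely going on (using the built-in iris data): subsetting a data frame does not drop unused factor levels, so table() and summary() still list them with a count of 0 until the factor is rebuilt.

    ir <- iris[iris$Species == "setosa", ]   # rows removed, but ...
    table(ir$Species)                        # versicolor and virginica still shown with count 0
    ir$Species <- factor(ir$Species)         # rebuild the factor (or use droplevels())
    table(ir$Species)                        # now only setosa is listed
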
2010 Jan 15
1
randomForest maxnodes
Has anyone successfully used the maxnodes feature in randomForest? I tried setting it, but when it is non-NULL I always get back a forest in which all trees have size 1. I am using a continuous response (regression). Any help would be appreciated. Thanks.
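
A quick way to check whether maxnodes is being honoured, sketched on a built-in regression data set (the poster's data and the behaviour of their randomForest version are unknown):

    library(randomForest)
    set.seed(1)
    aq <- airquality[complete.cases(airquality), ]
    rf <- randomForest(Ozone ~ ., data = aq, ntree = 100, maxnodes = 10)
    treesize(rf)   # number of terminal nodes per tree; should be capped at 10
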
2012 Aug 01
3
Neuralnet Error
I require some help in debugging this code: library(neuralnet) ir<-read.table(file="iris_data.txt",header=TRUE,row.names=NULL) ir1 <- data.frame(ir[1:100,2:6]) ir2 <- data.frame(ifelse(ir1$Species=="setosa",1,ifelse(ir1$Species=="versicolor",0,""))) colnames(ir2)<-("Output") ir3 <- data.frame(rbind(ir1[1:4],ir2))
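
A sketch of one way the data preparation might be corrected, assuming the intent is a numeric 0/1 target bound column-wise to the predictors (the original uses rbind and a character-valued ifelse); the built-in iris data stands in for iris_data.txt:

    library(neuralnet)
    ir1 <- iris[1:100, ]                                   # setosa and versicolor rows only
    ir1$Output <- ifelse(ir1$Species == "setosa", 1, 0)    # numeric target, not character
    ir3 <- ir1[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width", "Output")]
    net <- neuralnet(Output ~ Sepal.Length + Sepal.Width + Petal.Length + Petal.Width,
                     data = ir3, hidden = 2, linear.output = FALSE)
    plot(net)
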
2007 Jan 28
2
help with RandomForest classwt option
Hello there, I am working on an extremely unbalanced two-class classification problem. I want to use "classwt" together with "down sampling". By checking rfNews() in R, it looks like classwt is not working yet. Then I looked at the software from Salford. I did not find the down-sampling option. I am wondering if you have any experience dealing with this problem. Do you
2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging from 1 to 100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of
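
A sketch of the usual pattern, assuming x, y and studySites are as described in the post: when strata is a factor, sampsize can be a vector with one entry per stratum giving how many cases are drawn from that stratum for each tree.

    library(randomForest)
    sites <- factor(studySites)
    rf <- randomForest(x = x, y = y,
                       strata = sites,
                       sampsize = rep(5, nlevels(sites)))   # 5 cases per study site per tree
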
2007 Apr 29
1
randomForest gives different results for formula call v. x, y methods. Why?
Just out of curiosity, I took the default "iris" example in the RF helpfile... but seeing the admonition against using the formula interface for large data sets, I wanted to play around a bit to see how the various options affected the output. Found something interesting I couldn't find documentation for... Just like the example... > set.seed(12) # to be sure I have
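
A way to reproduce the comparison, resetting the seed before each fit so that any remaining difference comes from the interface rather than from the random draws:

    library(randomForest)
    set.seed(12)
    rf.formula <- randomForest(Species ~ ., data = iris)
    set.seed(12)
    rf.xy <- randomForest(x = iris[, 1:4], y = iris$Species)
    rf.formula$confusion
    rf.xy$confusion
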
2005 Aug 26
2
problem with certain data sets when using randomForest
Hi, Since I've had no replies to my previous post about my problem I am posting it again in the hope someone notices it. The problem is that the randomForest function doesn't take datasets whose instances contain only a subset of all the classes. So the dataset with instances that either belong to class "a" or "b" from the levels "a", "b" and
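
A sketch of the problem and the usual workaround: the response factor keeps its unused levels after subsetting, and randomForest refuses a response with empty classes until the factor is rebuilt.

    library(randomForest)
    ir <- iris[iris$Species != "virginica", ]
    levels(ir$Species)                        # "virginica" is still a level with 0 cases
    # randomForest(Species ~ ., data = ir)    # stops with an error about empty classes in y
    ir$Species <- factor(ir$Species)          # drop the unused level
    rf <- randomForest(Species ~ ., data = ir)
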
2005 Jul 21
4
RandomForest question
Hello, I'm trying to find out the optimal number of variables tried at each split (mtry parameter) for a randomForest classification. The classification is binary and there are 32 explanatory variables (mostly factors, each with up to 4 levels, but also some numeric variables) and 575 cases. I've seen that although there are only 32 explanatory variables the best classification performance is reached when
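
One way to search over mtry is tuneRF(), which steps mtry up and down from the default and compares out-of-bag error; iris is only a stand-in for the poster's 32-variable data set:

    library(randomForest)
    set.seed(1)
    tuned <- tuneRF(x = iris[, 1:4], y = iris$Species,
                    stepFactor = 1.5, improve = 0.01, ntreeTry = 500)
    tuned   # mtry values tried and their OOB error
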
2007 Oct 31
1
seg fault with randomForest ( ... , xtest )
Dear R-help, what are the limits on xtest? > NOT_A.rf <- randomForest (log10(Y[!A] ) ~ . , data = notA_desc , proximity=T ,xtest = A_desc) *** caught segfault *** address 0x9cdd000, cause 'memory not mapped' Segmentation fault I don't think that the matrices are large: notA_desc is 651 obs of 27 variables, A_desc is 17 obs of 27 variables. thanks in advance, Clayton
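
A hedged guess at the usual cause: xtest must contain exactly the same predictor columns (names, order, factor levels) as the training predictors, and no response column. A sketch using the x/y interface, keeping the poster's object names purely for illustration:

    library(randomForest)
    train.x <- notA_desc                    # 651 x 27 predictors
    train.y <- log10(Y[!A])
    test.x  <- A_desc[, names(train.x)]     # align test columns with the training set
    rf <- randomForest(x = train.x, y = train.y, xtest = test.x, proximity = TRUE)
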
2008 Jul 20
1
confusion matrix in randomForest
I have a question on the output generated by randomForest in classification mode, specifically, the confusion matrix. The confusion matrix lists the various classes and how the forest classified each one, plus the classification error. Are these numbers essentially averages over all the trees in the forest? If so, is there a way I can get the standard deviation values out of the randomForest,
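
For what it's worth, the confusion matrix reported by randomForest is built from the out-of-bag votes aggregated over the whole forest rather than averaged tree by tree, so there is no built-in standard deviation; per-tree predictions can be pulled out with predict(..., predict.all = TRUE) and summarised by hand:

    library(randomForest)
    set.seed(1)
    rf <- randomForest(Species ~ ., data = iris, ntree = 500)
    rf$confusion                             # OOB confusion matrix and class errors
    pr <- predict(rf, iris, predict.all = TRUE)
    dim(pr$individual)                       # one column of predicted classes per tree
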
2009 Feb 12
2
barplot() x axes are not updated after removal of categories from the dataframe
Hi all, I'd be grateful for your help. I am a new user struggling with a barplot issue. I am plotting categories (X axis) and their mean count (Y axis) with barplot(). The first call to barplot works fine. I remove records from the dataframe using final = final[!final$varname == "some value", ] I echo the dataframe and the records are no longer in the dataframe. When I call plot again
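
The usual explanation, sketched with iris: removing rows does not remove the corresponding factor levels, so barplot() keeps drawing empty slots for them until the levels are dropped.

    final <- iris[iris$Species != "virginica", ]
    barplot(table(final$Species))                # virginica still labelled, with a zero bar
    final$Species <- droplevels(final$Species)   # or factor(final$Species)
    barplot(table(final$Species))                # only the remaining categories appear
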
2012 Dec 10
3
splitting dataset based on variable and re-combining
I have a dataset and I wish to use two different models to predict. Both models are SVM. The reason for two different models is based on the sex of the observation. I wish to be able to make predictions and have the results be in the same order as my original dataset. To illustrate I will use iris: # Take Iris and create a dataframe of just two Species, setosa and versicolor, shuffle them
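
A sketch of one way to do the split/fit/recombine while preserving the original row order, with iris standing in for the real data and Species standing in for sex (e1071::svm is assumed for the SVM fits):

    library(e1071)
    dat <- iris[iris$Species != "virginica", ]
    dat$Species <- droplevels(dat$Species)
    dat$id <- seq_len(nrow(dat))                 # remember the original row order
    parts <- split(dat, dat$Species)             # one chunk per group
    preds <- lapply(parts, function(d) {
      fit <- svm(Sepal.Length ~ Sepal.Width + Petal.Length, data = d)
      data.frame(id = d$id, pred = predict(fit, d))
    })
    out <- do.call(rbind, preds)
    out <- out[order(out$id), ]                  # back in the original order
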
2010 Sep 22
2
randomForest - partialPlot - Reg
Dear R Group, I had an observation that in some cases, when I use the randomForest model to create a partialPlot in R using the package "randomForest", the y-axis displays values that are more than -1! It is a classification problem that I was trying to address. Any insights as to why the y-axis can display values of more than -1 for some variables? Am I missing something? Thanks Regards
2008 Apr 29
1
randomForest and ordered factors
Hello R-user! I am running R 2.7.0 on a PowerBook (Tiger). (I am still an R and statistics beginner.) I am trying to find the most important variables for dividing my dataset according to a categorical variable. code: Test.rf4<-randomForest(Sex~.,na.action=na.roughfix, data=Subset4, importance=TRUE, proximity=TRUE, ntree=10000, do.trace=1000, keep.forest=FALSE) My dataset also contains ordered
2008 Sep 02
2
cluster a distance(analogue)-object using agnes(cluster)
I am trying to perform a clustering using an existing dissimilarity matrix that I calculated using distance (analogue). I tried two different things. One of them worked and one did not, and I don't understand why. Here is the code for the non-working example: library(cluster) library(analogue) iris2<-as.data.frame(iris) str(iris2) 'data.frame': 150 obs. of 5 variables: $ Sepal.Length: num 5.1 4.9 4.7
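
A guess at the likely fix: analogue's distance() returns a plain matrix, and agnes() only treats its input as a dissimilarity when it is a "dist" object (or diss = TRUE is set). A minimal sketch, assuming distance() is called on a single data frame of the numeric columns:

    library(cluster)
    library(analogue)
    iris2 <- iris[, 1:4]
    d <- distance(iris2, method = "euclidean")   # a plain matrix of dissimilarities
    ag <- agnes(as.dist(d))                      # convert so agnes treats it as a dissimilarity
    plot(ag)
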
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work too well. (It's what was in version 3.x of the original Fortran code by Breiman and Cutler, not the one in the new Fortran code.) I'd advise against using it. "sampsize" and "strata" can be use in conjunction. If "strata" is not specified, the class labels will be used.
2009 Feb 26
1
Random Forest confusion matrix
Dear R users, I have a question on the confusion matrix generated by function randomForest. I used the entire data set to generate the forest, for example:
> print(iris.rf)
Call:
 randomForest(formula = Species ~ ., data = iris, importance = TRUE, keep.forest = TRUE)
Confusion matrix:
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
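
For reference, that confusion matrix comes from the out-of-bag predictions; re-predicting the training data with the fitted forest uses every tree and is typically much more optimistic:

    library(randomForest)
    set.seed(1)
    iris.rf <- randomForest(Species ~ ., data = iris, importance = TRUE, keep.forest = TRUE)
    iris.rf$confusion                            # out-of-bag confusion matrix
    table(observed  = iris$Species,
          predicted = predict(iris.rf, iris))    # resubstitution confusion matrix
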
2005 Jun 01
3
x[x$a=="q",,drop=TRUE]
I'm trying to select a subset of a dataframe while dropping some factors. While the dataset gets smaller, all factor levels remain and I need to get rid of them. Strangely enough, I am almost certain that the same code on the same data worked OK earlier today - and it is not the first time that I have not been able to replicate earlier results with this command (I know, I might just be going
2011 Sep 20
1
randomForest - NaN in %IncMSE
Hi, I am having a problem using varImpPlot in randomForest. I get the error message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need finite 'xlim' values" When I print $importance, several variables have NaN under %IncMSE. There are no NaNs in the original data. Can someone help me figure out what is happening here? Thanks!
2005 Sep 08
2
Re-evaluating the tree in the random forest
Dear mailinglist members, I was wondering if there was a way to re-evaluate the instances of a tree (in the forest) again after I have manually changed a splitpoint (or split variable) of a decision node. Here's an illustration: library("randomForest") forest.rf <- randomForest(formula = Species ~ ., data = iris, do.trace = TRUE, ntree = 3, mtry = 2, norm.votes = FALSE) # I am
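
As far as I can tell there is no documented way to push edited splits back into a randomForest object, but getTree() exposes the split variables and split points one would need to inspect first; a minimal sketch reusing the call from the post:

    library(randomForest)
    forest.rf <- randomForest(formula = Species ~ ., data = iris, do.trace = TRUE,
                              ntree = 3, mtry = 2, norm.votes = FALSE)
    tr <- getTree(forest.rf, k = 1, labelVar = TRUE)   # structure of the first tree
    head(tr)
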