Displaying 20 results from an estimated 3000 matches similar to: "problem with certain data sets when using randomForest"
2007 Apr 29
1
randomForest gives different results for formula call v. x, y methods. Why?
Just out of curiosity, I took the default "iris" example in the RF
helpfile...
but seeing the admonition against using the formula interface for large data
sets, I wanted to play around a bit to see how the various options affected
the output. Found something interesting I couldn't find documentation for...
Just like the example...
> set.seed(12) # to be sure I have
2009 Feb 26
1
Random Forest confusion matrix
Dear R users,
I have a question on the confusion matrix generated by function randomForest. 
I used the entire data
set to generate the forest, for example:
> print(iris.rf) 
Call: 
 randomForest(formula = Species ~ ., data = iris, importance = TRUE,     
keep.forest = TRUE) 
confusion
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
2008 Jul 20
1
confusion matrix in randomForest
I have a question on the output generated by randomForest in classification
mode, specifically, the confusion matrix.  The confusion matrix lists the
various classes and how the forest classified each one, plus the
classification error.  Are these numbers essentially averages over all the
trees in the forest?  If so, is there a way I can get the standard deviation
values out of the randomForest,
2017 Oct 29
3
Renjin?
Hi All,
OK, in the "back to the drawing board" department, I found what looks like a much better solution to using R in Java. Renjin.
Looking at the docs and then trying a quick example, didn't quite work.
Of course I'm missing something.
Although I'm telling the engine to require ("biotools") just like I would in R itself, when I get to the line of code that
2017 Oct 27
4
Cannot Compute Box's M (Three Days Trying...)
It can't be this hard, right? I really need a shove in the right direction here. Been spinning wheels for three days. Cannot get past the errors.
I'm doing something wrong, obviously, since I can easily compute the Box's M right there in RStudio.
But I don't see what is wrong below with the coding equivalent.
The entire code snippet is below. The code fails below on the call to
2017 Oct 28
2
Cannot Compute Box's M (Three Days Trying...)
I'm not sure what you mean. Could you please be more specific?
If I print the string, I get:  boxM(boxMVariable[, -5], boxMVariable[, 5])
From this code:
.
.
.
// assign the data to a variable.
rConnection.assign("boxMVariable", myDf);
// create a string command with that variable name.
String boxVariable = "boxM(boxMVariable[, -5], boxMVariable[, 5])";
2017 Oct 28
2
Cannot Compute Box's M (Three Days Trying...)
Thanks Duncan. Awesome ideas!
I think we're getting closer!
I tried what you suggested and got a possibly better error...
.
.
.
rConnection.assign("boxMVariable", myDf);
String resultBV = "str(boxMVariable)";   // your suggestion.
RESULTING ERROR:
Error in format.default(nam.ob, width = max(ncn), justify = "left") :  invalid 'width' argument
(No idea
2017 Oct 28
2
Cannot Compute Box's M (Three Days Trying...)
Hey Duncan,
Hard to debug? That's an understatement. Eyes bleeding....
In any case, I tried all your suggestions. To get "integer" for the final column, I had to change the code to get integers instead of strings.
double[] d1 = ((REXPVector) ((RList) tableRead).get(0)).asDoubles();
double[] d2 = ((REXPVector) ((RList) tableRead).get(1)).asDoubles();
double[] d3 = ((REXPVector)
2010 Mar 01
1
Random Forest prediction questions
Hi,
I need help with the randomForest prediction. I run the following code:
 
> iris.rf <- randomForest(Species ~ ., data=iris,
> importance=TRUE,keep.forest=TRUE, proximity=TRUE)
> pr<-predict(iris.rf,iris,predict.all=T)
> iris.rf$votes[53,]
    setosa versicolor  virginica 
 0.0000000  0.8074866  0.1925134 
> table(pr$individual[53,])/500
versicolor  virginica 
     0.928   
2017 Oct 29
2
Cannot Compute Box's M (Three Days Trying...)
Thanks Duncan. I can't tell you how helpful all your terrific replies have been.
I think the biggest surprise is that nobody appears to be using Java and R together like I'm trying to do. I suppose it should be a surprise since there are no books on the subject and almost no technical documentation other than a few sites here and there.
-----
I originally had the "int" as the
2010 Jun 09
4
question about "mean"
Hi there:
     I have a question about generating mean value of a data.frame. Take
iris data for example, if I have a data.frame looking like the following:
---------------------
    Sepal.Length Sepal.Width Petal.Length Petal.Width    Species
1            5.1         3.5          1.4         0.2     setosa
2            4.9         3.0          1.4         0.2
2010 Sep 21
5
removed data is still there!
I'm confused, hope someone can point out what is not obvious to me.
I thought I was creating a new data frame by 'deleting' rows from an
existing dataframe - I've tried 2 methods.
But this new data frame seems to remember values from its parent - even
though there are no occurrences.
Where does it get the values versicolor and virginica from and give them a
count of 0?
What
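The behaviour described in this post is what R does with factors: subsetting a data frame keeps every level of a factor column, even levels that no longer occur. A minimal sketch, assuming the poster subset iris down to a single species (the names setosa_only and cleaned are illustrative, not from the post):

```r
# Keep only the setosa rows; the Species factor still carries all three levels.
setosa_only <- iris[iris$Species == "setosa", ]
table(setosa_only$Species)   # versicolor and virginica still appear, with count 0

# droplevels() removes factor levels that no longer occur in the data.
cleaned <- droplevels(setosa_only)
table(cleaned$Species)
```

droplevels() on the whole data frame cleans every factor column at once; applying it to a single column, as in droplevels(setosa_only$Species), works too.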
2007 Apr 24
1
NA and NaN randomForest
Dear R-help,
This is about randomForest's handling of NA and NaNs in test set data.
Currently, if the test set data contains an NA or NaN then 
predict.randomForest will skip that row in the output.
I would like to change that behavior to outputting an NA.
Can this be done with flags to randomForest?
If not can some sort of wrapper be built to put the NAs back in?
thanks,
Clayton
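One possible shape for the wrapper asked about here, as a sketch only: predict on the complete rows, then put NA back at the positions of the rows that were skipped. The helper name predict_with_na is hypothetical and not part of randomForest. The demonstration uses lm() purely as a stand-in so it runs in base R, and it assumes predict() returns a plain vector or factor with one element per row.

```r
# Hypothetical helper (not part of randomForest): predict only on complete
# rows, then put NA back in the positions of the rows that were skipped.
predict_with_na <- function(model, newdata, ...) {
  ok <- complete.cases(newdata)
  preds <- predict(model, newdata[ok, , drop = FALSE], ...)
  # Map each original row to its position in preds, or NA if it was skipped.
  pos <- rep(NA_integer_, nrow(newdata))
  pos[ok] <- seq_len(sum(ok))
  unname(preds)[pos]
}

# Stand-in demonstration with lm() so the sketch runs without extra packages.
fit <- lm(Sepal.Length ~ Sepal.Width, data = iris)
test_set <- iris[1:5, ]
test_set$Sepal.Width[2] <- NA
p <- predict_with_na(fit, test_set)
```

Note that complete.cases() looks at every column of newdata, so a missing value in a column the model does not use would also blank out that row.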
2011 Feb 18
1
segfault during example(svm)
If I do:
> library("e1071")
> example(svm)
I get:
svm> data(iris)
svm> attach(iris)
svm> ## classification mode
svm> # default with factor response:
svm> model <- svm(Species ~ ., data = iris)
svm> # alternatively the traditional interface:
svm> x <- subset(iris, select = -Species)
svm> y <- Species
svm> model <- svm(x, y) 
svm>
2018 Mar 23
2
aggregate() naming -- bug or feature
In the examples below, the first loses the name attached by foo(), the second retains names attached by bar().  Is this an intentional difference?  I'd prefer that the names be retained in both cases.
foo <- function(x) { c(mean = base::mean(x)) }
bar <- function(x) { c(mean = base::mean(x), sd = stats::sd(x))}
aggregate(iris$Sepal.Length, by = list(iris$Species), FUN = foo)
#>     
2018 Mar 23
1
aggregate() naming -- bug or feature
On Fri, Mar 23, 2018 at 6:43 PM, Rui Barradas <ruipbarradas at sapo.pt> wrote:
> Hello,
>
> Not exactly an answer but here it goes.
> If you use the formula interface the names will be retained.
Also if you pass named arguments:
aggregate(iris["Sepal.Length"], by = iris["Species"], FUN = foo)
#      Species Sepal.Length
# 1     setosa        5.006
# 2
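The formula interface mentioned in the quoted reply can be sketched as follows, reusing foo() and bar() as defined in the original post. The result names res_foo and res_bar are illustrative only.

```r
# Helper functions as defined in the original post.
foo <- function(x) { c(mean = base::mean(x)) }
bar <- function(x) { c(mean = base::mean(x), sd = stats::sd(x)) }

# Formula interface: the grouping and value columns keep their own names.
res_foo <- aggregate(Sepal.Length ~ Species, data = iris, FUN = foo)
names(res_foo)

# With a multi-value FUN the value column is a matrix whose column names
# come from the names that bar() attaches.
res_bar <- aggregate(Sepal.Length ~ Species, data = iris, FUN = bar)
colnames(res_bar$Sepal.Length)
```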
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well.  (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.)  I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction.  If "strata" is not
specified, the class labels will be used.
2005 Sep 26
3
How to get the rowindices without using which?
Hi,
I was wondering if it is possible to get the
rowindices without using the function "which" because
I don't have a restriction criteria. Here's an example
of what I mean:
# take 10 randomly selected instances
iris[sample(1:nrow(iris), 10),]
# output
    Sepal.Length Sepal.Width Petal.Length Petal.Width Species
76           6.6         3.0          4.4         1.4
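One way to get at the row indices here without which(), as a sketch: draw the indices first, keep them in a variable, and subset with that variable. The names idx, picked and recovered are illustrative, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Draw the row indices first and keep them in a variable.
idx <- sample(nrow(iris), 10)

# Use the stored indices to take the subset.
picked <- iris[idx, ]

# The same indices can also be recovered from the row names, because
# iris has default row names "1" to "150".
recovered <- as.integer(rownames(picked))
```

Recovering indices from row names only works when the data frame has default integer row names and the sampled indices are unique, which holds for sample() without replacement on iris.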
2012 Jun 11
1
saving sublist lda object with save.image()
Greetings R experts,
I'm having some difficulty recovering lda objects that I've saved within sublists using the save.image() function. I am running a script that exports a variety of different information as a list, included within that list is an lda object. I then take that list and create a list of that with all the different replications I've run. Unfortunately I've been
2003 Jun 13
1
problem with latex of object summary reverse
Hi,
I have the following problem (library Hmisc loaded, 
iris data loaded, R Version 1.7.0  (2003-04-16), packages 
updated, running on a linux Debian i386):
> summary(Species~Sepal.Length,method="reverse")->a
> a
Descriptive Statistics by Species
+------------+-----------------+-----------------+-----------------+
|            |setosa           |versicolor       |virginica