Displaying 20 results from an estimated 4000 matches similar to: "randomForest 4.3-0 released"
2004 Jan 12
0
new version of randomForest (4.0-7)
Dear R users,
I've just released a new version of randomForest (available on CRAN now).
This version contains quite a number of new features and bug fixes
compared to versions prior to 4.0-x (and a few more since 4.0-1).
For those not familiar with randomForest, it's an ensemble
classifier/regression tool. Please see
http://www.math.usu.edu/~adele/forests/ for more detailed information,
2012 Aug 07
0
predicting test dataset response from training dataset with randomForest
Hi
I am new to R so I apologize if this is trivial.
I am trying to predict the resistance or susceptibility of my
sequences to a certain drug with a randomForest function from a file
with amino acids on each of the positions in the protein. I ran the
following:
> library(randomForest)
>
> path <- "C:\\..."
> path2 <- "..."
> name <-
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone,
I have another "Random Forest" package question:
- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the "% increase in MSE" and "IncNodePurity" exactly as can be
found from the "importance" section of the model results.
- However, the plot does not, in fact, match the "importance"
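A minimal sketch of one likely source of such a mismatch, assuming a regression forest fitted with importance=TRUE (mtcars is only a stand-in data set, not the poster's): importance() scales by default, i.e. it divides the raw permutation importance stored in $importance by its "standard error" in $importanceSD, and varImpPlot draws those scaled values.

```r
library(randomForest)

set.seed(1)  # illustrative seed only
# mtcars is a stand-in regression data set (assumption, not from the thread)
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

raw    <- rf$importance[, "%IncMSE"]     # unscaled permutation importance
scaled <- importance(rf, type = 1)[, 1]  # scale = TRUE is the default
manual <- raw / rf$importanceSD          # raw value divided by its "standard error"

# the scaled column is what varImpPlot(rf) shows in its %IncMSE panel
print(cbind(raw, scaled, manual))
```

Calling importance(rf, scale = FALSE) or varImpPlot(rf, scale = FALSE) should give numbers that agree with $importance directly.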
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest.
Once I've run randomForest and got my object, I get their importances:
rfresult$importance
I also get the "standard errors" of the permutation-based importance
measure: rfresult$importanceSD
I have 2 questions:
1. Because I am dealing with regressions, I am getting an importance object
(rfresult$importance) with two
2005 Oct 27
1
Repost: Examples of "classwt", "strata", and "sampsize" in randomForest?
"classwt" in the current version of the randomForest package doesn't work
too well. (It's what was in version 3.x of the original Fortran code by
Breiman and Cutler, not the one in the new Fortran code.) I'd advise
against using it.
"sampsize" and "strata" can be used in conjunction. If "strata" is not
specified, the class labels will be used.
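As a rough sketch of using the two arguments together (iris and the sample sizes below are illustrative assumptions, not taken from the thread): passing a sampsize vector with one entry per stratum draws that many cases from each stratum for every tree.

```r
library(randomForest)

set.seed(1)  # illustrative seed only
# iris is a stand-in data set; here the strata are simply the class labels,
# which is also what randomForest falls back to when strata is omitted
rf <- randomForest(Species ~ ., data = iris,
                   strata   = iris$Species,
                   sampsize = c(10, 10, 10))  # 10 cases per stratum for each tree
print(rf)
```

With unbalanced classes, giving the rare class the same sampsize entry as the common one is a simple way to balance what each tree sees.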
2010 Aug 06
1
Error on random forest variable importance estimates
Hello,
I am using the R randomForest package to classify variable stars. I have
a training set of 1755 stars described by (too) many variables. Some of
these variables are highly correlated.
I believe that I understand how randomForest works and how the variable
importance are evaluated (through variable permutations). Here are my
questions.
1) variable importance error? Are there any ways
2011 Sep 14
1
substitute games with randomForest::partialPlot
I'm having trouble calling randomForest::partialPlot programmatically.
It tries to use name of the (R) variable as the data column name.
Example:
library(randomForest)
iris.rf <- randomForest(Species ~ ., data=iris, importance=TRUE, proximity=TRUE)
partialPlot(iris.rf, iris, Sepal.Width) # works
partialPlot(iris.rf, iris, "Sepal.Width") # works
(function(var.name)
2005 May 13
0
randomForest partialPlot x.var through function
All,
I'm trying to set up a function which calls the partialPlot function but
am getting an error that I can't seem to solve. Here's a simplified
version of the function and error...
> pplot <-
function(rf,pred.var){partialPlot(x=rf,pred.data=acoust,x.var=pred.var)}
>
> attach(acoust)
> acoust.rf <-
2002 Dec 17
0
new version of randomForest
A new version of the randomForest package is now available on CRAN. The
DESCRIPTION is:
Package: randomForest
Title: Breiman's random forest for classification and regression
Version: 3.4-1
Depends: R (>= 1.5.0)
Author: Fortran original by Leo Breiman and Adele Cutler, R port by Andy
Liaw and Matthew Wiener.
Description: Classification and regression based on a forest of trees using
2005 Mar 23
0
Question on class 1, 2 output for RandomForest
The `1' and `2' columns are the error rates within those classes. E.g., the
last row of the `1' column should correspond to the class.error for "-", and
the last row of the `2' column to the class.error for "+". (I would
have thought that that should be fairly obvious, but I guess not. It mimics
what Breiman and Cutler's Fortran code does.) I suspect
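To see the per-class error columns described here, one can inspect the err.rate component of a fitted classification forest. This is only a sketch on stand-in data (two iris species instead of the poster's "-" and "+" classes):

```r
library(randomForest)

set.seed(1)  # illustrative seed only
# two-class stand-in data: drop one iris species (assumption, not the poster's data)
two <- droplevels(subset(iris, Species != "setosa"))
rf  <- randomForest(Species ~ ., data = two, ntree = 200)

print(colnames(rf$err.rate))          # "OOB" plus one column per class
print(tail(rf$err.rate, 1))           # error rates after the last tree
print(rf$confusion[, "class.error"])  # compare with the per-class columns above
```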
2011 Aug 19
0
sign of the y axis in partialPlot for randomForest regression
Hi everybody,
I used randomForest to regress invertebrates abundances in least impaired
river reaches from some environmental parameters. Then I used these models
to predict invertebrates abundances in impaired reaches.
Now I would like to model the deviation (observation - prediction) with a
set of chemical parameters to see if the deviations from predictions could
be explained with water
2006 Jul 26
0
randomForest question [Broadcast]
When mtry is equal to total number of features, you just get regular bagging
(in the R package -- Breiman & Cutler's Fortran code samples variables with
replacement, so you can't do bagging with that). There are cases when
bagging will do better than random feature selection (i.e., RF), even in
simulated data, but I'd say not very often.
HTH,
Andy
From: Arne.Muller at
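The bagging case mentioned in this reply can be sketched as follows, assuming the R package and using iris as a stand-in data set: setting mtry to the number of predictors makes every predictor a candidate at each split.

```r
library(randomForest)

set.seed(1)  # illustrative seed only
p <- ncol(iris) - 1  # number of predictors (all columns except Species)

# mtry = p: every predictor is tried at each split, i.e. plain bagging
bag <- randomForest(Species ~ ., data = iris, mtry = p)
# default mtry for classification is floor(sqrt(p))
rf  <- randomForest(Species ~ ., data = iris)

print(c(bagging = bag$mtry, forest = rf$mtry))
```

Comparing the OOB error of the two fits on one's own data is a quick way to check whether random feature selection helps.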
2010 Nov 16
1
Force evaluation of variable when calling partialPlot
Greg,
Two thoughts:
1. It might be possible that 'vars' is a reserved word of sorts and if you change the name of your vector RF might be happier
2. A way that works for me is to call importance as follows:
sel.imp <- importance(sel.rf, class=NULL, scale=TRUE, type=NULL)
and then use the 'names' of the imp data frame to be absolutely clear to RF you are talking about the
2011 Aug 04
1
randomForest partial dependence plot variable names
Hello,
I am running randomForest models on a number of species. I would like to be
able to automate the printing of dependence plots for the most important
variables in each model, but I am unable to figure out how to enter the
variable names into my code. I had originally thought to extract them from
the $importance matrix after sorting by metric (e.g. %IncMSE), but the
importance matrix is n
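One workaround that is often suggested for this, sketched here on stand-in data (mtcars, and the choice of the top three variables, are assumptions): take the row names of the importance matrix after ordering, and call partialPlot through do.call so that it receives the column name as a value rather than the name of the loop variable.

```r
library(randomForest)

set.seed(1)  # illustrative seed only
# mtcars stands in for one species' data set (assumption)
rf <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

imp <- importance(rf, type = 1)  # one-column matrix of %IncMSE
top <- rownames(imp)[order(imp[, 1], decreasing = TRUE)][1:3]

for (v in top) {
  # do.call passes the value of v, so partialPlot sees the column name itself
  do.call(partialPlot, list(x = rf, pred.data = mtcars, x.var = v))
}
```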
2011 Sep 20
1
randomForest - NaN in %IncMSE
Hi
I am having a problem using varImpPlot in randomForest. I get the error
message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need
finite 'xlim' values"
When I print $importance, several variables have NaN under %IncMSE. There
are no NaNs in the original data. Can someone help me figure out what is
happening here?
Thanks!
2012 Mar 03
0
Strategies to deal with unbalanced classification data in randomForest
Hello all,
I have become somewhat confused with options available for dealing
with a highly unbalanced data set (10000 in one class, 50 in the
other). As a summary I am unsure:
a) if I am performing the two class weighting methods properly,
b) if the data are too unbalanced and that this type of analysis is
appropriate and
c) if there is any interaction between the weighting for class
imbalances
2010 Sep 22
2
randomForest - partialPlot - Reg
Dear R Group
I had an observation that in some cases, when I use the randomForest model
to create partialPlot in R using the package "randomForest"
the y-axis displays values that are more than -1!
It is a classification problem that I was trying to address.
Any insights as to how the y axis can display value more than -1 for some
variables?
Am I missing something?
Thanks
Regards
2012 Nov 22
1
Partial dependence plot in randomForest package (all flat responses)
Hi,
I'm trying to make a partial plot with package randomForest in R. After I
fit my random forest object I type
partialPlot(data.rforest, pred.data=act2, x.var=centroid, "C")
where data.rforest is my randomforest object, act2 is the original dataset,
centroid is one of the predictors and C is one of the classes in my response
variable.
Whatever predictor or response class I
2008 Mar 09
1
sampsize in Random Forests
Hi all,
I have a dataset where each point is assigned to a class A, B, C, or
D. Each point is also assigned to a study site. Each study site is
coded with a number ranging between 1-100. This information is stored
in the vector studySites.
I want to run randomForests using stratified sampling, so I chose the option
strata = factor(studySites)
But I am not sure how to control the number of