Displaying 20 results from an estimated 3000 matches similar to: "question regarding "varImpPlot" results vs. model$importance data on package "RandomForest""
2010 Jun 30
2
anyone know why package "RandomForest" na.roughfix is so slow??
Hi all,
I am using the package "random forest" for random forest predictions. I
like the package. However, I have fairly large data sets, and it can often
take *hours* just to go through the "na.roughfix" call, which simply goes
through and cleans up any NA values to either the median (numerical data) or
the most frequent occurrence (factors).
I am going to start
2011 Feb 08
4
manipulating the Date & Time classes
Hello,
This is mostly to developers, but in case I missed something in my
literature search, I am sending this to the broader audience.
- Are there any plans in the works to make "time" classes a bit more
friendly to the rest of the "R" world? I am not suggesting to allow for
fancy functions to manipulate times, per se, or to figure out how to
properly
2011 Sep 20
1
randomForest - NaN in %IncMSE
Hi
I am having a problem using varImpPlot in randomForest. I get the error
message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need
finite 'xlim' values"
When I print $importance, several variables have NaN under %IncMSE. There
are no NaNs in the original data. Can someone help me figure out what is
happening here?
Thanks!
2010 Dec 20
1
ideas, modeling highly discrete time-series data
Hello all,
First of all, thanks to those of you who helped me a week or so ago
managing a time series with varying gaps between the data series in 'R'.
(My final preferred solution was to use "its" function & then
forecast(Arima( ) ). )
My next question is a general statistical question where I'd like some
advice, for those willing / able to proffer any wisdom:
2010 Dec 03
2
How to get 'R' to talk BACK to other languages / scripts??
Hey everyone,
I know that I can call 'R' from other scripts, and that I can make
command calls from 'R' (e.g., using system() ). But how can I get 'R' to
RETURN values to the script that called it. E.g., I would like to be able
to do something like the following (as a simpler example) from a bash
script:
#!/bin/bash
myTest=echo /usr/local/bin/R --no-restore
2006 Nov 30
1
strange error from R CMD check about xaxp
Dear R-devel,
Kurt had alerted me to the problem that the randomForest package that I
maintain has been failing checks in R-devel. However, I just can't see
why or where it's failing. I'd very much appreciate any pointer.
The failure occurs when running the example code in varImpPlot.Rd:
> varImpPlot(mtcars.rf)
Error in par(opar) : invalid value specified for graphical parameter
2010 Dec 17
2
how to convert "sloppy data" into a time series?
Hi All,
First let me state that I did search for a while on r-help, google, and
using the "sos" package inside of 'R', without much luck. I want to know
how to create a univariate time series from a set of data that will have
huge time gaps in it. For instance, here is a snapshot of a piece of data
that I would like to analyze:
*Row queued_time
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest.
Once I've run randomForest and got my object, I get their importances:
rfresult$importance
I also get the "standard errors" of the permutation-based importance
measure: rfresult$importanceSD
I have 2 questions:
1. Because I am dealing with regressions, I am getting an importance object
(rfresult$importance) with two
2005 May 09
1
Random Forests 4.5-10 varImpPlot (PR#7844)
Full_Name: Daniel Normolle
Version: 2.0.1
OS: Linux/Fedora Core 3
Submission from: (NULL) (141.214.17.5)
varImpPlot in Random Forests 4.5-10 produces the error "incorrect number of
subscripts on matrix" (and no plot) when applied to a randomForest object. This
error did not occur with 4.5-4 or earlier versions.
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables
(predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
my results object "rf" contains predictor importances:
rf$importance
I am seeing two columns:
%IncMSE IncNodePurity
V1 -0.01683558 58.10910
V2 0.04000299 71.27579
V3 0.01974636
2010 Aug 24
3
odd behavior of "summary" function
Hello All,
Using the standard "summary" function in 'R', I ran across some odd
behavior that I cannot understand. Easy to reproduce:
Typing:
summary(c(6,207936))
Yields:
Min. 1st Qu. Median Mean 3rd Qu. Max.
6 51990 104000 104000 156000 207900
None of these values are correct except for the minimum. If I perform
"quantile(c(6,
2011 Jan 12
2
syntax for extending a line in a script??
Hello,
A hopefully simple question. I use 'R' through emacs, but I suspect the
following would occur with any manner of text editor:
- my editor has a normally quite handy feature where it will
automatically indent to the appropriate level when I start a new line.
However, this occasionally creates cases where there is no friendly way to
break a long line of code into
2010 Jul 29
2
ggplot2 histograms... a subtle error found
Hello all,
I have a peculiar and particular bug that I stumbled across with
ggplot2. I cannot seem to replicate it with anything other than my specific
data set.
Here is the problem:
- when I try to plot a histogram, allowing for ggplot2 to decide the
binwidths itself, I get the following error:
- stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to
2010 Jul 16
1
garbage collection & memory leaks in 'R', it seems...
Hello developers,
I noticed that if I am running 'R', type "rm(list=objects())" and
"gc()", 'R' will still be consuming (a lot) more memory than when I then
close 'R' and re-open it. In my ignorance, I'm presuming this is something
in 'R' where it doesn't really do a great job of garbage collection... at
least not nearly as well as
2013 May 17
2
Selecting A List of Columns
Dear R Helpers,
I need help with a slightly unusual situation in which I am trying to
select some columns from a data frame. I know how to use the subset
statement with column names as in:
x=as.data.frame(matrix(c(1,2,3,
1,2,3,
1,2,2,
1,2,2,
1,1,1),ncol=3,byrow=T))
all.cols<-colnames(x)
to.keep<-all.cols[1:2]
Kept<-subset(x,select=to.keep)
Kept
2010 Jun 24
1
how can I evaluate a formula passed as a string?
Hey everyone,
I've been using 'R' long enough that I should have some idea of what the
heck either expression() or eval() are really ever useful for. I come
across another instance where I WISH they would be useful, but I cannot get
them to work.
Here is the crux of what I would like to do:
presume df looks like this
A B C
=== === ===
M 45 0
M
2004 Jul 08
0
randomForest 4.3-0 released
Dear all,
Version 4.3-0 of the randomForest package is now available on CRAN (in
source; binaries will follow in due course). There are some interface
changes and a few new features, as well as bug fixes. For those who had
used previous versions, the important things to note are: 1. there's a
namespace now, and 2. some functions have been renamed. The list of changes
since 4.0-7 (last
2010 Oct 01
2
trouble with RODBC -- chopping off part of column names
Hello all,
I have a strange / interesting problem that might be 'R' settings
themselves, or it might be something with the OS.
I am using the RODBC library. I have a script that goes out and, before
making a query for a big data set, will first query for the column names of
the data set. The column names could sometimes be quite long (e.g., "Time
Background Estimation