similar to: question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"

Displaying 20 results from an estimated 3000 matches similar to: "question regarding "varImpPlot" results vs. model$importance data on package "RandomForest""

2010 Jun 30
2
anyone know why package "RandomForest" na.roughfix is so slow??
Hi all, I am using the package "random forest" for random forest predictions. I like the package. However, I have fairly large data sets, and it can often take *hours* just to go through the "na.roughfix" call, which simply goes through and cleans up any NA values to either the median (numerical data) or the most frequent occurrence (factors). I am going to start
2011 Feb 08
4
manipulating the Date & Time classes
Hello, This is mostly to developers, but in case I missed something in my literature search, I am sending this to the broader audience. - Are there any plans in the works to make "time" classes a bit more friendly to the rest of the "R" world? I am not suggesting to allow for fancy functions to manipulate times, per se, or to figure out how to properly
2011 Feb 08
4
manipulating the Date & Time classes
Hello, This is mostly to developers, but in case I missed something in my literature search, I am sending this to the broader audience. - Are there any plans in the works to make "time" classes a bit more friendly to the rest of the "R" world? I am not suggesting to allow for fancy functions to manipulate times, per se, or to figure out how to properly
2011 Sep 20
1
randomForest - NaN in %IncMSE
Hi I am having a problem using varImpPlot in randomForest. I get the error message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need finite 'xlim' values" When print $importance, several variables have NaN under %IncMSE. There are no NaNs in the original data. Can someone help me figure out what is happening here? Thanks! [[alternative HTML
2010 Dec 20
1
ideas, modeling highly discrete time-series data
Hello all, First of all, thanks so those of you who helped me a week or so ago managing a time series with varying gaps between the data series in 'R'. (My final preferred solution was to use "its" function & then forecast(Arima( ) ). ) My next question is a general statistical question where I'd like some advice, for those willing / able to proffer any wisdom:
2010 Dec 03
2
How to get 'R' to talk BACK to other languages / scripts??
Hey everyone, I know that I can call 'R' from other scripts, and that I can make command calls from 'R' (e.g., using system() ). But how can I get 'R' to RETURN values to the script that called it. E.g., I would like to be able to do something like the following (as a simpler example) from a bash script: #!/bin/bash myTest=echo /usr/local/bin/R --no-restore
2006 Nov 30
1
strange error from R CMD check about xaxp
Dear R-devel, Kurt had alerted me to the problem that the randomForest package that I maintain has been failing checks in R-devel. However, I just can't see why or where it's failing. I'd very much appreciate any pointer. The failure occur when running the example code in varImpPlot.Rd: > varImpPlot(mtcars.rf) Error in par(opar) : invalid value specified for graphical parameter
2010 Dec 17
2
how to convert "sloppy data" into a time series?
Hi All, First let me state that I did search for a while on r-help, google, and using the "sos" package inside of 'R', without much luck. I want to know how to create a univariate time series from a set of data that will have huge time gaps in it. For instance, here is a snapshot of a piece of data that I would like to analyze: *Row queued_time
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest. Once I've run randomForest and got my object, I get their importances: rfresult$importance I also get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two
2005 May 09
1
Random Forests 4.5-10 varImpPlot (PR#7844)
Full_Name: Daniel Normolle Version: 2.0.1 OS: Linux/Fedora Core 3 Submission from: (NULL) (141.214.17.5) varImpPlot in Random Forests 4.5-10 produces the error "incorrect number of subscripts on matrix" (and no plot) when applied to a randomForest object. This error did not occur with 4.5-4 or earlier versions.
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric. rf<-randomForest(formula, data=mydata, importance=T, etc.) my results object "rf" contains predictor importances: rf$importance I am seeing two columns: %IncMSE IncNodePurity V1 -0.01683558 58.10910 V2 0.04000299 71.27579 V3 0.01974636
2010 Aug 24
3
odd behavior of "summary" function
Hello All, Using the standard "summary" function in 'R', I ran across some odd behavior that I cannot understand. Easy to reproduce: Typing: summary(c(6,207936)) Yields:: Min. *1st Qu. Median Mean 3rd Qu. Max.* 6 *51990 104000 104000 156000 207900* None of these values are correct except for the minimum. If I perform "quantile(c(6,
2011 Jan 12
2
syntax for extending a line in a script??
Hello, A hopefully simple question. I use 'R' through emacs, but I suspect the following would occur with any manner of text editor: - my editor has a normally quite handy feature where it will automatically indent to the appropriate level when I start a new line. However, this occasionally creates cases where there is no friendly way to break a long line of code into
2010 Jul 29
2
ggplot2 histograms... a subtle error found
Hello all, I have a peculiar and particular bug that I stumbled across with ggplot2. I cannot seem to replicate it with anything other than my specific data set. Here is the problem: - when I try to plot a histogram, allowing for ggplot2 to decide the binwidths itself, I get the following error: - stat_bin: binwidth defaulted to range/30. Use 'binwidth = x' to
2010 Jul 16
1
garbage collection & memory leaks in 'R', it seems...
Hello developers, I noticed that if I am running 'R', type "rm(list=objects())" and "gc()", 'R' will still be consuming (a lot) more memory than when I then close 'R' and re-open it. In my ignorance, I'm presuming this is something in 'R' where it doesn't really do a great job of garbage collection... at least not nearly as well as
2013 May 17
2
Selecting A List of Columns
Dear R Helpers, I need help with a slightly unusual situation in which I am trying to select some columns from a data frame. I know how to use the subset statement with column names as in: x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T)) all.cols<-colnames(x) to.keep<-all.cols[1:2] Kept<-subset(x,select=to.keep) Kept
2010 Jun 24
1
how can I evaluate a formula passed as a string?
Hey everyone, I've been using 'R' long enough that I should have some idea of what the heck either expression() or eval() are really ever useful for. I come across another instance where I WISH they would be useful, but I cannot get them to work. Here is the crux of what I would like to do: presume df looks like this A B C === === === M 45 0 M
2004 Jul 08
0
randomForest 4.3-0 released
Dear all, Version 4.3-0 of the randomForest package is now available on CRAN (in source; binaries will follow in due course). There are some interface changes and a few new features, as well as bug fixes. For those who had used previous versions, the important things to note are: 1. there's a namespace now, and 2. some functions have been renamed. The list of changes since 4.0-7 (last
2004 Jul 08
0
randomForest 4.3-0 released
Dear all, Version 4.3-0 of the randomForest package is now available on CRAN (in source; binaries will follow in due course). There are some interface changes and a few new features, as well as bug fixes. For those who had used previous versions, the important things to note are: 1. there's a namespace now, and 2. some functions have been renamed. The list of changes since 4.0-7 (last
2010 Oct 01
2
trouble with RODBC -- chopping off part of column names
Hello all, I have a strange / interesting problem that might be 'R' settings themselves, or it might be something with the OS. I am using the RODBC library. I have a script that goes out and, before making a query for a big data set, will first query for the column names of the data set. The column names could sometimes be quite long (e.g., "Time Background Estimation