thr3ads.net - similar to: "randomForest partial dependence plot variable names"

Displaying 20 results from an estimated 2000 matches similar to: "randomForest partial dependence plot variable names"

randomForest: predictor importance (for regressions)

2010 May 05

I have a question about predictor importances in randomForest. Once I've run randomForest and got my object, I get their importances: rfresult$importance I also get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two

question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"

2010 Jul 13

Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"

Which column in randomForest importances (for regression) is MSE and which IncNodePurity

2010 May 05

I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric. rf<-randomForest(formula, data=mydata, importance=T, etc.) my results object "rf" contains predictor importances: rf$importance I am seeing two columns:

           %IncMSE IncNodePurity
    V1 -0.01683558      58.10910
    V2  0.04000299      71.27579
    V3  0.01974636

Selecting A List of Columns

2013 May 17

Dear R Helpers, I need help with a slightly unusual situation in which I am trying to select some columns from a data frame. I know how to use the subset statement with column names as in:

    x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
    all.cols<-colnames(x)
    to.keep<-all.cols[1:2]
    Kept<-subset(x,select=to.keep)
    Kept
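As a rough illustration of the column selection described in this post, the sketch below keeps columns named in a character vector, once with subset() as in the question and once with plain bracket indexing. The object names kept_subset and kept_bracket are made up for this example and do not come from the original thread.

```r
# Example data frame; R generates the column names V1, V2, V3.
x <- as.data.frame(matrix(c(1, 2, 3,
                            1, 2, 3,
                            1, 2, 2,
                            1, 2, 2,
                            1, 1, 1), ncol = 3, byrow = TRUE))

# Character vector holding the names of the columns to keep.
to.keep <- colnames(x)[1:2]

# subset() accepts the character vector through its select argument.
kept_subset <- subset(x, select = to.keep)

# Bracket indexing does the same; drop = FALSE keeps a data frame
# even when only one column is selected.
kept_bracket <- x[, to.keep, drop = FALSE]
```

Bracket indexing is often preferred inside functions, since subset() evaluates select non-standardly and can behave unexpectedly if a column happens to share a name with the variable holding the selection.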

Question on: Random Forest Variable Importance for Regression Problems

2010 Apr 28

I am trying to use the package RandomForest to perform regression. The variable importance estimates are given as: "%IncMSE" and "IncNodePurity". Can anyone explain to me what these refer to and how they are calculated? I found a lot of information on variable importance measures for classification problems, but nothing on regression. Thanks a lot. Mareike

randomForest - NaN in %IncMSE

2011 Sep 20

Hi, I am having a problem using varImpPlot in randomForest. I get the error message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need finite 'xlim' values". When I print $importance, several variables have NaN under %IncMSE. There are no NaNs in the original data. Can someone help me figure out what is happening here? Thanks!

randomforests - how to classify

2010 May 04

Hi, I'm experimenting with random forests and want to perform a binary classification task. I've tried some of the sample codes in the help files and things run, but I get a message to the effect 'you don't have very many unique values in the target - are you sure you want to do regression?' (sorry, don't know exact message but r is busy now so can't check). In

randomForest - partialPlot - Reg

2010 Sep 22

Dear R Group, I observed that in some cases, when I use a randomForest model to create a partialPlot in R using the package "randomForest", the y-axis displays values that are more than -1! It is a classification problem that I was trying to address. Any insights as to how the y-axis can display values more than -1 for some variables? Am I missing something? Thanks Regards

Partial Dependence and RandomForest

2012 Apr 11

Hello all~ I am interested in clarifying something more conceptual, so I won't be providing any data or code here. From what I understand, partial dependence plots can help you understand the relative dependence on a variable, and the subsequent values of that variable, after "averaging out the effects" of the other input variables. This is great, but what I am interested in
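The "averaging out" idea in this post can be sketched by hand. The example below is only an illustration under stated assumptions: it uses lm() on the built-in airquality data in place of a randomForest model so that it runs with base R alone, and partial_dependence() is a hypothetical helper written for this sketch, not a function from the randomForest package.

```r
# Complete cases of a few airquality columns (built-in dataset).
dat <- na.omit(airquality[, c("Ozone", "Temp", "Wind", "Solar.R")])

# Stand-in model (assumption): any model with a predict() method would do.
fit <- lm(Ozone ~ Temp + Wind + Solar.R, data = dat)

# Hypothetical helper: for each grid value, set `var` to that value in
# every row, predict, and average the predictions over the other inputs.
partial_dependence <- function(model, data, var, grid) {
  sapply(grid, function(v) {
    tmp <- data
    tmp[[var]] <- v
    mean(predict(model, newdata = tmp))
  })
}

grid <- seq(min(dat$Temp), max(dat$Temp), length.out = 20)
pd <- partial_dependence(fit, dat, "Temp", grid)
```

Calling plot(grid, pd, type = "l") would draw the resulting curve. For a linear model the curve is a straight line whose slope equals the Temp coefficient; a random forest would generally give a non-linear curve from the same procedure.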

Odd behaviour in within.list() when deleting 2+ variables

2017 Jun 26

The behaviour of within() with list input changes if you delete 2 or more variables, compared to deleting one:

    l <- list(x=1, y=2, z=3)
    within(l, {
        rm(z)
    })
    #$x
    #[1] 1
    #
    #$y
    #[1] 2

    within(l, {
        rm(y)
        rm(z)
    })
    #$x
    #[1] 1
    #
    #$y
    #NULL
    #
    #$z
    #NULL

When 2 or more variables are deleted, the list entries are instead set to NULL. Is this intended?

partialPlot in a Randomforest

2018 Jan 07

Hello R users. Could someone tell me what the two axes are in the plot produced by applying partialPlot to a Randomforest? I find that: "Partial dependence plot gives a graphical depiction of the marginal effect of a variable on the class probability (classification) or response (regression)", which tells us how the response variable varies as a function of the variable considered, holding the

Partial dependence plot in randomForest package (all flat responses)

2012 Nov 22

Hi, I'm trying to make a partial plot with package randomForest in R. After I perform my random forest object I type partialPlot(data.rforest, pred.data=act2, x.var=centroid, "C") where data.rforest is my randomforest object, act2 is the original dataset, centroid is one of the predictor and C is one of the classes in my response variable. Whatever predictor or response class I

Odd behaviour in within.list() when deleting 2+ variables

2017 Jun 26

>>>>> peter dalgaard <pdalgd at gmail.com> >>>>> on Mon, 26 Jun 2017 13:43:28 +0200 writes: > This seems to be due to changes made by Martin Maechler in > 2008. Presumably this fixed something, but it escapes my > memory. Yes: The change set (svn -c46441) also contains the following NEWS entry BUG FIXES o

Identifying objects from a data set

2007 Sep 16

Hello. Given a data set called airquality, I want to identify the nature of the objects in it, for example "Ozone". Would it be best to use an is. command such as is.character(airquality$Ozone) ....... I tried attributes(airquality$Ozone) but it came up null. Would there be a better way to identify these objects? Thanking you in advance for your
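One possible way to inspect what the columns of airquality are, sketched with base R only. The variable names col_class, col_type and all_classes are chosen for this illustration.

```r
# class() reports the high-level class of a single column.
col_class <- class(airquality$Ozone)

# typeof() reports the internal storage type.
col_type <- typeof(airquality$Ozone)

# sapply() applies class() to every column of the data frame at once.
all_classes <- sapply(airquality, class)

# str() prints a compact overview of the whole data frame.
str(airquality)
```

attributes() returns NULL here because a plain integer vector carries no attributes, which is consistent with what the poster observed.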

Column renaming

2008 May 05

Dear all, Is there a less cumbersome way to rename a column by name (as opposed to index) than -- names(X)[names(X) == "bob"] <- "sue" ? A semi-related question: how does one get the index of a column by name, something along the lines of col.index( X, "sue") ? Chip Barnaby
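A minimal sketch of both operations in base R, assuming a small made-up data frame X. There is no col.index() function in base R as far as this sketch assumes; match() and which() are used instead.

```r
# Made-up data frame with a column named "bob".
X <- data.frame(alice = 1:3, bob = 4:6)

# Rename a column by name: compare the names, then assign the new name.
names(X)[names(X) == "bob"] <- "sue"

# Index of a column by name: match() gives the first match, or NA if absent.
idx <- match("sue", names(X))

# which() gives all matching positions, or integer(0) if there are none.
idx_all <- which(names(X) == "sue")
```

match() is convenient when a missing column should be detectable as NA, while which() is the safer choice when duplicate column names are possible.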

Simple indexing conundrum

2005 Jul 01

My apologies in advance for my thickness but I can't seem to solve the following, seemingly simple, data manipulation problem: I have a data frame that contains multiple factors and multiple continuous response variables, but duplicates of some factor combinations. The duplicates contain bad data, so I would like to eliminate the duplicates. I would like to retain the entire rows

two easy questions...

2001 Mar 22

Hi all. 1) If I have a dataframe with variable names as follows: PC1 PC2 ... PCn and I want to pass only some of them to a function, e.g. glm(resp~from PC1 to PC10, PC15, etc.,...) is there a faster way than simply writing each variable name in the formula? 2) Again, I have a dataframe, say ali.df, with the following variables: ali1, ali2, ...ali78 I want to sum, for example, ali1+ali2+ali7+from

Using transform to add a date column to a dataframe

2008 Dec 23

I would like to add a column to the airquality dataset that contains the date 1950-01-01 in each row. This method does not appear to work:

    > attach(airquality)
    > data1 <- transform(airquality,Date=as.Date("1950-01-01"))
    Error in data.frame(list(Ozone = c(41L, 36L, 12L, 18L, NA, 28L, 23L, 19L, : arguments imply differing number of rows: 153, 1

I can't decipher what
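One way to sidestep the "differing number of rows" error, sketched under the assumption that the goal is simply a constant Date column: supply a value that already has one element per row, so nothing depends on recycling a length-1 Date.

```r
# Repeat the date once per row so its length matches the data frame.
data1 <- transform(airquality,
                   Date = rep(as.Date("1950-01-01"), nrow(airquality)))

# Quick look at the first rows, including the new column.
head(data1)
```

The attach() call from the post is not needed for this, since transform() is given the data frame directly.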

Newbie help with Sweave

2008 Mar 24

I think I've gotten my Emacs/Sweave/R system set up correctly, thanks to Vincent and Jim, but I haven't been successful getting my first document produced. I'm trying to use one of Friedrich Leisch's examples, http://www.ci.tuwien.ac.at/~leisch/Sweave/example-1.Snw. I cut and pasted the text into a document sweaveexample.Rnw in Emacs. It seemed to be processed successfully with R:

How to define degree=1 in mgcv

2010 Jan 24

Hi all, I have a question on mgcv and ns. I want to compare the results from glm, gam and ns. Take a simple model y~x for example. glm1 = glm(y~x, data=data1) gam1 = gam(y~s(x), data=data1) ns1 = glm(y~ns(x),data=data1) In order to confirm that the result from glm1 is consistent with those from gam1 and ns1, I want to define degree=1 in mgcv and ns. I am wondering if somebody can give me
