thr3ads.net - similar to: "Selecting A List of Columns"

Displaying 20 results from an estimated 400 matches similar to: "Selecting A List of Columns"

Which column in randomForest importances (for regression) is MSE and which IncNodePurity

2010 May 05

I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
My results object "rf" contains predictor importances: rf$importance
I am seeing two columns:
        %IncMSE  IncNodePurity
V1  -0.01683558       58.10910
V2   0.04000299       71.27579
V3   0.01974636

question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"

2010 Jul 13

Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"

randomForest partial dependence plot variable names

2011 Aug 04

Hello, I am running randomForest models on a number of species. I would like to be able to automate the printing of dependence plots for the most important variables in each model, but I am unable to figure out how to enter the variable names into my code. I had originally thought to extract them from the $importance matrix after sorting by metric (e.g. %IncMSE), but the importance matrix is n

Question on: Random Forest Variable Importance for Regression Problems

2010 Apr 28

I am trying to use the package RandomForest to perform regression. The variable importance estimates are given as "%IncMSE" and "IncNodePurity". Can anyone explain to me what these refer to and how they are calculated? I found a lot of information on variable importance measures for classification problems, but nothing on regression. Thanks a lot. Mareike

randomForest: predictor importance (for regressions)

2010 May 05

I have a question about predictor importances in randomForest. Once I've run randomForest and got my object, I get their importances: rfresult$importance I also get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two

randomForest - NaN in %IncMSE

2011 Sep 20

Hi, I am having a problem using varImpPlot in randomForest. I get the error message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need finite 'xlim' values". When I print $importance, several variables have NaN under %IncMSE. There are no NaNs in the original data. Can someone help me figure out what is happening here? Thanks!

dependency and communication between defined classes

2007 Apr 18

Hi, I wanted to know how you handle the case when classes or defines need to communicate with each other. For example, I have an ftpd define and an apachevhost define. Both need to know the path where the vhost is set, and this path is defined by the ftpuser's home directory. How can I ask for information from another define or from other classes? We have already seen that tags are not reliable as they

interpret the importance output?

2012 Aug 27

> importance(rfor.pdp11_t25.comb1,type=1)
          %IncMSE
v1 -0.28956401263
v2  1.92865561147
v3 -0.63443929130
v4  1.58949137047
v5  0.03190940065
I wasn't entirely confident with interpreting these results based on the documentation. Could you please interpret?

Random Forest Variable Importance Interpretation

2009 Jun 24

Hi, I am trying to explore the use of random forests for regression to identify the important environmental/microclimate variables involved in predicting the abundance of a species in different habitats. There are approx. 40 variables and between 200 and 500 data points, depending on the dataset. I have successfully used the randomForest package to conduct the analysis and looked at the %IncMSE

Adding Column to Data Frames Using a Loop

2013 May 01

Dear R Helpers, I am trying to do calculations on multiple data frames and do not want to create a list of them to go through each one. I know that lists have many wonderful advantages, but I believe the better thing is to work df by df for my particular situation. For background, I have already received some wonderful help on how to handle some situations, such as removing columns:

Is there a function to test if all the elements in a vector are unique

2009 Dec 01

length(unique(c(1,2,2)))==length(c(1,2,2)) I use the above test to test if all the elements in a vector are unique. But I'm wondering if there is a convenient function to do so in R library.
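One possible answer to the question above, offered as a minimal sketch rather than the thread's accepted reply: base R's `anyDuplicated()` returns 0 when a vector has no repeated elements, so it can replace the length comparison. The helper name `all_unique` is hypothetical and only used for illustration.

```r
# Hypothetical helper: TRUE when every element of v occurs exactly once.
# anyDuplicated() returns the index of the first duplicate, or 0 if none.
all_unique <- function(v) {
  anyDuplicated(v) == 0L
}

x <- c(1, 2, 2)
res_dup  <- all_unique(x)            # vector from the question
res_uniq <- all_unique(c(1, 2, 3))   # vector with no repeats

# Same answer as the test used in the question
same_as_original <- res_dup == (length(unique(x)) == length(x))
```

Because `anyDuplicated()` can stop at the first repeat, it may do less work on long vectors than building `unique(x)` in full.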

Looping Over Data Frames

2013 Apr 30

Dear R Helpers, I am re-phrasing a question that I put forth earlier today due to some particulars in the solution that I am searching for. Many thanks to those who answered the previous post and to any who would be willing to answer this one. I have a set of data frames. I need to perform some data scrubbing on each of them. I am trying to figure out how to perform the same steps on each

managing data

2006 Jun 17

Dear mailing list, could someone be kind enough to help me solve the following problem? I am trying to write code that will combine two tables "x" and "y". The first columns of both tables are unique identification for the rows. The first column of table "X" is a subset of the first column of "Y". I need to find the matching rows in both tables by looking on their

In plot.zoo the screens and ylim arguments seem incompatible

2009 Apr 02

I am plotting multiple graphs per window with multiple series on each graph. When I try to set ylim I get the error below:
Error in ylim[[idx]] : subscript out of bounds
Am I incorrectly specifying my ylim list or is this a bug? Here is a simple reproduction:
z <- zoo(cbind(a = 1:10, b = 11:20, c = 21:30))
# This works
plot(z, ylim = list(a = c(1,40)))
# This works
plot(z, screens=c(1,2,2))
#

Select only unique rows from a data frame

2013 Feb 07

Hello! I have a data frame with several rows, for example:
x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T))
I would like to find y - a data frame that only has the unique rows from x, i.e.:
1,2,3
1,2,2
1,1,1
Thanks a lot for your hints! Dimitri -- Dimitri Liakhovitski gfk.com <http://marketfusionanalytics.com/>
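A minimal sketch of one way to get the data frame y described above, using only base R. This is an illustration, not the reply given in the original thread: `unique()` has a data frame method that drops repeated rows and keeps the first occurrence of each.

```r
# Data frame from the question: five rows, three of them distinct
x <- as.data.frame(matrix(c(1, 2, 3,
                            1, 2, 3,
                            1, 2, 2,
                            1, 2, 2,
                            1, 1, 1), ncol = 3, byrow = TRUE))

# Keep only the first occurrence of each distinct row
y <- unique(x)

# Equivalent formulation with an explicit logical index
y2 <- x[!duplicated(x), ]
```

The row names of y are those of the retained rows in x; `rownames(y) <- NULL` resets them if a clean 1..n index is wanted.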

Function for Data Frame

2013 Apr 29

Dear R Helpers, I have about 20 data frames that I need to do a series of data scrubbing steps to. I have the list of data frames in a list so that I can use lapply. I am trying to build a function that will do the data scrubbing that I need. However, I am new to functions and there is something fundamental that I am not understanding. I use the return function at the end of the function and

return value in function

2004 Feb 26

Suppose I have a function, for example:
getMatrix <- function(a,b){ A1<-diag(1,2,2) }
If I want to get both A1 and dim(A1) from the function, can I do return(A1,dim(A1)) inside the function? And how can I access A1 and dim(A1) later on?
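A hedged sketch of one common approach to the question above: `return()` accepts a single value in R, so the two results can be bundled in a named list and pulled out with `$`. The element names `A1` and `dims` are chosen here for illustration, and the unused arguments `a` and `b` are kept only to match the question.

```r
# Return both the matrix and its dimensions as one named list
getMatrix <- function(a, b) {
  A1 <- diag(1, 2, 2)            # 2 x 2 identity matrix, as in the question
  list(A1 = A1, dims = dim(A1))  # last expression is the return value
}

res <- getMatrix(1, 2)
m <- res$A1    # the matrix
d <- res$dims  # its dimensions
```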

How to generate table output of t-test

2008 Feb 05

Hi, Given
test <- matrix(c(1, 1,2,2), 2,2)
t <- apply(test, 1, t.test)
How can I obtain a table of p-values, confidence interval etc, instead of
[[1]]
One Sample t-test
data: newX[, i]
t = 3, df = 1, p-value = 0.2048
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-4.853102 7.853102
sample estimates:
mean of x
1.5
[[2]]

is it possible to form matrix of matrices...and multiple arrays

2005 Sep 28

Dear sirs,
1. Kindly tell me, is it possible to form a matrix which contains a number of matrices? For example, if a, b, c, d are matrices, and e is a matrix which contains a, b, c, d as rows and columns.
2. Is it possible to form an array of arrays of arrays? For example, "A" contains two sets of arrays (1,2), and each of A[1] and A[2] individually contains two sets of arrays. I tried like

Need help understanding output from aov and from anova

2009 Jun 03

Hi all, I noticed something strange when I ran aov and anova.
vtot=c(7.29917, 7.29917, 7.29917) #identical values
fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element
When I run:
> anova(lm(vtot~fac))
Analysis of Variance Table
Response: vtot
          Df     Sum Sq    Mean Sq F value Pr(>F)
fac        1 1.6818e-30 1.6818e-30  0.3333 0.6667
Residuals  1
