similar to: Selecting A List of Columns

Displaying 20 results from an estimated 400 matches similar to: "Selecting A List of Columns"

2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables (predictors and the dependent variable) are numeric. rf<-randomForest(formula, data=mydata, importance=T, etc.) my results object "rf" contains predictor importances: rf$importance I am seeing two columns: %IncMSE IncNodePurity V1 -0.01683558 58.10910 V2 0.04000299 71.27579 V3 0.01974636
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone, I have another "Random Forest" package question: - my (presumably incorrect) understanding of the varImpPlot is that it should plot the "% increase in MSE" and "IncNodePurity" exactly as can be found from the "importance" section of the model results. - However, the plot does not, in fact, match the "importance"
2011 Aug 04
1
randomForest partial dependence plot variable names
Hello, I am running randomForest models on a number of species. I would like to be able to automate the printing of dependence plots for the most important variables in each model, but I am unable to figure out how to enter the variable names into my code. I had originally thought to extract them from the $importance matrix after sorting by metric (e.g. %IncMSE), but the importance matrix is n
2010 Apr 28
1
Question on: Random Forest Variable Importance for Regression Problems
I am trying to use the package RandomForest to perform regression. The variable importance estimates are given as: "%IncMSE" and "IncNodePurity". Can anyone explain to me what these refer to and how they are calculated? I found a lot of information on variable importance measures for classification problems, but nothing on regression. Thanks a lot. Mareike
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest. Once I've run randomForest and got my object, I get their importances: rfresult$importance I also get the "standard errors" of the permutation-based importance measure: rfresult$importanceSD I have 2 questions: 1. Because I am dealing with regressions, I am getting an importance object (rfresult$importance) with two
2011 Sep 20
1
randomForest - NaN in %IncMSE
Hi, I am having a problem using varImpPlot in randomForest. I get the error message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need finite 'xlim' values". When I print $importance, several variables have NaN under %IncMSE. There are no NaNs in the original data. Can someone help me figure out what is happening here? Thanks!
2007 Apr 18
20
dependency and communication between defined classes
Hi, I wanted to know how you handle the case when classes or defines need to communicate with each other. For example, I have an ftpd define and an apachevhost define. Both need to know the path where the vhost is set, and this path is defined by the ftpuser's home directory. How can I get information from other defines or other classes? We have already seen that tags are not reliable as they
2012 Aug 27
1
interpret the importance output?
> importance(rfor.pdp11_t25.comb1,type=1) %IncMSE v1 -0.28956401263 v2 1.92865561147 v3 -0.63443929130 v4 1.58949137047 v5 0.03190940065 I wasn't entirely confident with interpreting these results based on the documentation. Could you please interpret?
2009 Jun 24
1
Random Forest Variable Importance Interpretation
Hi, I am trying to explore the use of random forests for regression to identify the important environmental/microclimate variables involved in predicting the abundance of a species in different habitats. There are approximately 40 variables and between 200 and 500 data points, depending on the dataset. I have successfully used the randomForest package to conduct the analysis and looked at the %IncMSE
2013 May 01
3
Adding Column to Data Frames Using a Loop
Dear R Helpers, I am trying to do calculations on multiple data frames and do not want to create a list of them to go through each one. I know that lists have many wonderful advantages, but I believe the better thing is to work df by df for my particular situation. For background, I have already received some wonderful help on how to handle some situations, such as removing columns:
2009 Dec 01
4
Is there a function to test if all the elements in a vector are unique
length(unique(c(1,2,2)))==length(c(1,2,2)) I use the above expression to check whether all the elements in a vector are unique, but I'm wondering if there is a convenient function to do so in R.
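One possible shortcut for the check described above, as a minimal sketch: base R's anyDuplicated() returns the index of the first duplicated element, or 0 when there is none. The variable names x and all_unique are illustrative only and do not come from the thread.

```r
# Example vector taken from the question above
x <- c(1, 2, 2)

# anyDuplicated() returns 0 when no element repeats,
# otherwise the index of the first repeated element
all_unique <- anyDuplicated(x) == 0

print(all_unique)  # FALSE, because 2 appears twice
```

An equivalent spelling is !any(duplicated(x)); either form avoids building the full unique() vector just to compare lengths.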
2013 Apr 30
1
Looping Over Data Frames
Dear R Helpers, I am re-phrasing a question that I put forth earlier today due to some particulars in the solution that I am searching for. Many thanks to those who answered the previous post and to any who would be willing to answer this one. I have a set of data frames. I need to perform some data scrubbing on each of them. I am trying to figure out how to perform the same steps on each
2006 Jun 17
2
managing data
Dear mailing list, could someone kindly help me solve the following problem? I am trying to write code that will combine two tables "x" and "y". The first columns of both tables are unique identifiers for the rows. The first column of table "x" is a subset of the first column of "y". I need to find the matching rows in both tables by looking on their
2009 Apr 02
1
In plot.zoo the screens and ylim arguments seem incompatible
I am plotting multiple graphs per window with multiple series on each graph. When I try to set ylim I get the error below: Error in ylim[[idx]] : subscript out of bounds Am I incorrectly specifying my ylim list or is this a bug? Here is a simple reproduction: z <- zoo(cbind(a = 1:10, b = 11:20, c = 21:30)) # This works plot(z, ylim = list(a = c(1,40))) # This works plot(z, screens=c(1,2,2)) #
2013 Feb 07
1
Select only unique rows from a data frame
Hello! I have a data frame with several rows, for example: x=as.data.frame(matrix(c(1,2,3, 1,2,3, 1,2,2, 1,2,2, 1,1,1),ncol=3,byrow=T)) I would like to find y - a data frame that only has the unique rows from x, i.e.: 1,2,3 1,2,2 1,1,1 Thanks a lot for your hints! Dimitri -- Dimitri Liakhovitski gfk.com <http://marketfusionanalytics.com/>
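A minimal sketch of one way to get the result asked for above, assuming base R only: unique() has a data frame method that drops repeated rows. The data frame x is rebuilt from the question; y is the name the poster used for the desired result.

```r
# Data frame from the question: five rows, two of them repeated
x <- as.data.frame(matrix(c(1, 2, 3,
                            1, 2, 3,
                            1, 2, 2,
                            1, 2, 2,
                            1, 1, 1), ncol = 3, byrow = TRUE))

# unique() on a data frame keeps the first occurrence of each distinct row
y <- unique(x)

print(y)
```

The surviving rows keep their original row names (1, 3 and 5 here); rownames(y) <- NULL would renumber them if that matters.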
2013 Apr 29
3
Function for Data Frame
Dear R Helpers, I have about 20 data frames that I need to do a series of data scrubbing steps to. I have the list of data frames in a list so that I can use lapply. I am trying to build a function that will do the data scrubbing that I need. However, I am new to functions and there is something fundamental that I am not understanding. I use the return function at the end of the function and
2004 Feb 26
2
return value in function
Suppose I have a function, for example: getMatrix <- function(a,b){ A1<-diag(1,2,2) } If I want to get both A1 and dim(A1) from the function, can I do return(A1,dim(A1)) inside the function? And how can I access A1 and dim(A1) later on?
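As a hedged sketch of a common approach to the question above: R does not allow return() with several arguments, so the usual pattern is to bundle the values in a named list and pull them out by name. The element names A1 and dims and the variable res are illustrative choices, not taken from the thread.

```r
# Function from the question, rewritten to return both values in a named list
getMatrix <- function(a, b) {
  A1 <- diag(1, 2, 2)            # 2 x 2 identity matrix
  list(A1 = A1, dims = dim(A1))  # the last expression is the return value
}

res <- getMatrix(1, 2)

# Access the components by name afterwards
print(res$A1)
print(res$dims)
```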
2008 Feb 05
2
How to generate table output of t-test
Hi, Given test <- matrix(c(1, 1,2,2), 2,2) t <- apply(test, 1, t.test) How can I obtain a table of p-values, confidence interval etc, instead of [[1]] One Sample t-test data: newX[, i] t = 3, df = 1, p-value = 0.2048 alternative hypothesis: true mean is not equal to 0 95 percent confidence interval: -4.853102 7.853102 sample estimates: mean of x 1.5 [[2]]
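One way the table asked about above might be built, as a sketch under the assumption that only the statistic, degrees of freedom, p-value, confidence interval and mean are wanted: extract those components from each t.test() result inside apply() and transpose. The column names and the variable res are illustrative.

```r
# Matrix from the question; both rows contain the values 1 and 2
test <- matrix(c(1, 1, 2, 2), 2, 2)

# Run a one-sample t-test per row and keep selected components
res <- t(apply(test, 1, function(row) {
  tt <- t.test(row)
  c(t         = unname(tt$statistic),
    df        = unname(tt$parameter),
    p.value   = tt$p.value,
    conf.low  = tt$conf.int[1],
    conf.high = tt$conf.int[2],
    mean      = unname(tt$estimate))
}))

print(res)  # one row per test, one column per extracted quantity
```

The object returned by t.test() is a list of class "htest", so any other component, such as tt$null.value, can be added to the vector in the same way.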
2005 Sep 28
3
is it possible to form matrix of matrices...and multiple arrays
Dear sirs, 1. Kindly tell me whether it is possible to form a matrix which contains a number of matrices, e.g. if a, b, c, d are matrices and e is a matrix which contains a, b, c, d as rows and columns. 2. Is it possible to form an array of arrays of arrays, e.g. "A" contains two sets of arrays (1,2), and A[1] and A[2] each individually contain two sets of arrays? I tried like
2009 Jun 03
1
Need help understanding output from aov and from anova
Hi all, I noticed something strange when I ran aov and anova. vtot=c(7.29917, 7.29917, 7.29917) #identical values fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element When I run: > anova(lm(vtot~fac)) Analysis of Variance Table Response: vtot Df Sum Sq Mean Sq F value Pr(>F) fac 1 1.6818e-30 1.6818e-30 0.3333 0.6667 Residuals 1