Displaying 20 results from an estimated 400 matches similar to: "Selecting A List of Columns"
2010 May 05
0
Which column in randomForest importances (for regression) is MSE and which IncNodePurity
I've run the function randomForest with importance=T. All my variables
(predictors and the dependent variable) are numeric.
rf<-randomForest(formula, data=mydata, importance=T, etc.)
my results object "rf" contains predictor importances:
rf$importance
I am seeing two columns:
%IncMSE IncNodePurity
V1 -0.01683558 58.10910
V2 0.04000299 71.27579
V3 0.01974636
2010 Jul 13
1
question regarding "varImpPlot" results vs. model$importance data on package "RandomForest"
Hi everyone,
I have another "Random Forest" package question:
- my (presumably incorrect) understanding of the varImpPlot is that it
should plot the "% increase in MSE" and "IncNodePurity" exactly as can be
found from the "importance" section of the model results.
- However, the plot does not, in fact, match the "importance"
2011 Aug 04
1
randomForest partial dependence plot variable names
Hello,
I am running randomForest models on a number of species. I would like to be
able to automate the printing of dependence plots for the most important
variables in each model, but I am unable to figure out how to enter the
variable names into my code. I had originally thought to extract them from
the $importance matrix after sorting by metric (e.g. %IncMSE), but the
importance matrix is n
2010 Apr 28
1
Question on: Random Forest Variable Importance for Regression Problems
I am trying to use the package RandomForest to perform regression.
The variable importance estimates are given as: "%IncMSE" and
"IncNodePurity"
Can anyone explain to me what these refer to and how they are calculated?
I found a lot of information on variable importance measures for
classification problems, but nothing on regression.
Thanks a lot.
Mareike
2010 May 05
1
randomForest: predictor importance (for regressions)
I have a question about predictor importances in randomForest.
Once I've run randomForest and got my object, I get their importances:
rfresult$importance
I also get the "standard errors" of the permutation-based importance
measure: rfresult$importanceSD
I have 2 questions:
1. Because I am dealing with regressions, I am getting an importance object
(rfresult$importance) with two
2011 Sep 20
1
randomForest - NaN in %IncMSE
Hi
I am having a problem using varImpPlot in randomForest. I get the error
message "Error in plot.window(xlim = xlim, ylim = ylim, log = "") : need
finite 'xlim' values"
When I print $importance, several variables have NaN under %IncMSE. There
are no NaNs in the original data. Can someone help me figure out what is
happening here?
Thanks!
2007 Apr 18
20
dependency and communication between defined classes
Hi,
I wanted to know how you handle cases where classes or defines need to
communicate with each other. For example, I have an ftpd define and an
apachevhost define. Both need to know the path where the vhost is set,
and this path is defined by the ftpuser's home directory. How can I ask for
information from another define or another class? We have already seen that
tags are not reliable as they
2012 Aug 27
1
interpret the importance output?
> importance(rfor.pdp11_t25.comb1,type=1)
%IncMSE
v1 -0.28956401263
v2 1.92865561147
v3 -0.63443929130
v4 1.58949137047
v5 0.03190940065
I wasn't entirely confident with interpreting these results based on the
documentation.
Could you please interpret?
2009 Jun 24
1
Random Forest Variable Importance Interpretation
Hi
I am trying to explore the use of random forests for regression to
identify the important environmental/microclimate variables involved in
predicting the abundance of a species in different habitats. There are
approx. 40 variables and between 200 and 500 data points, depending on the
dataset. I have successfully used the randomForest package to conduct
the analysis and looked at the %IncMSE
2013 May 01
3
Adding Column to Data Frames Using a Loop
Dear R Helpers,
I am trying to do calculations on multiple data frames and do not want to
create a list of them to go through each one. I know that lists have many
wonderful advantages, but I believe the better thing is to work df by df
for my particular situation. For background, I have already received some
wonderful help on how to handle some situations, such as removing columns:
2009 Dec 01
4
Is there a function to test if all the elements in a vector are unique
length(unique(c(1,2,2)))==length(c(1,2,2))
I use the above test to test if all the elements in a vector are
unique. But I'm wondering if there is a convenient function to do so
in R library.
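One base R option is anyDuplicated(), which returns 0 when no element
repeats. A minimal sketch follows; the wrapper name all_unique is made up
for illustration and is not a function from any package:

```r
# Hypothetical helper: TRUE when no element of x occurs more than once.
all_unique <- function(x) {
  anyDuplicated(x) == 0L
}

all_unique(c(1, 2, 2))  # has a repeated 2
all_unique(c(1, 2, 3))  # no repeats
```

Unlike the length(unique(x)) comparison, anyDuplicated() can stop at the
first repeated value it finds.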
2013 Apr 30
1
Looping Over Data Frames
Dear R Helpers,
I am re-phrasing a question that I put forth earlier today due to some
particulars in the solution that I am searching for. Many thanks to those
who answered the previous post and to any who would be willing to answer
this one.
I have a set of data frames. I need to perform some data scrubbing on
each of them. I am trying to figure out how to perform the same steps on
each
2006 Jun 17
2
managing data
Dear mailing list, could someone kindly help me solve the following problem?
I am trying to write code that will combine two tables "x" and "y". The
first columns of both tables are unique identifiers for the rows. The
first column of table "x" is a subset of the first column of "y". I need to
find the matching rows in both tables by looking at their
2009 Apr 02
1
In plot.zoo the screens and ylim arguments seem incompatible
I am plotting multiple graphs per window with multiple series on each graph.
When I try to set ylim I get the error below:
Error in ylim[[idx]] : subscript out of bounds
Am I incorrectly specifying my ylim list or is this a bug?
Here is a simple reproduction:
z <- zoo(cbind(a = 1:10, b = 11:20, c = 21:30))
# This works
plot(z, ylim = list(a = c(1,40)))
# This works
plot(z, screens=c(1,2,2))
#
2013 Feb 07
1
Select only unique rows from a data frame
Hello!
I have a data frame with several rows, for example:
x=as.data.frame(matrix(c(1,2,3,
1,2,3,
1,2,2,
1,2,2,
1,1,1),ncol=3,byrow=T))
I would like to find y - a data frame that only has the unique rows from x,
i.e.:
1,2,3
1,2,2
1,1,1
Thanks a lot for your hints!
Dimitri
--
Dimitri Liakhovitski
gfk.com <http://marketfusionanalytics.com/>
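For a case like the one above, base R's unique() has a data frame method
that drops repeated rows. A minimal sketch, reusing the x from the question
(the result name y is taken from the question as well):

```r
# Rebuild the example data frame from the question.
x <- as.data.frame(matrix(c(1, 2, 3,
                            1, 2, 3,
                            1, 2, 2,
                            1, 2, 2,
                            1, 1, 1), ncol = 3, byrow = TRUE))

# Keep only the first occurrence of each distinct row.
y <- unique(x)
y
```

The surviving rows keep their original row names; rownames(y) <- NULL
resets them if that matters.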
2013 Apr 29
3
Function for Data Frame
Dear R Helpers,
I have about 20 data frames that I need to do a series of data scrubbing
steps to. I have the list of data frames in a list so that I can use
lapply. I am trying to build a function that will do the data scrubbing
that I need. However, I am new to functions and there is something
fundamental that I am not understanding. I use the return function at the
end of the function and
2004 Feb 26
2
return value in function
suppose I have a function example:
getMatrix <- function(a,b){
A1<-diag(1,2,2)
}
If I want to get both A1 and dim(A1) from the function, can I do
return(A1,dim(A1)) inside the function? And how can I access A1 and dim(A1) later on?
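R's return() accepts a single value, so a common approach is to bundle the
pieces into a named list. A minimal sketch, assuming the arguments a and b
stay unused as in the question; the element names matrix and dims are
chosen here for illustration:

```r
getMatrix <- function(a, b) {
  A1 <- diag(1, 2, 2)
  # Return both objects together as one named list.
  list(matrix = A1, dims = dim(A1))
}

res <- getMatrix()
res$matrix  # the 2 x 2 identity matrix
res$dims    # its dimensions
```

The elements can then be accessed with $ or [[ ]], e.g. res[["dims"]].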
2008 Feb 05
2
How to generate table output of t-test
Hi,
Given
test <- matrix(c(1, 1,2,2), 2,2)
t <- apply(test, 1, t.test)
How can I obtain a table of p-values, confidence interval etc, instead of
[[1]]
One Sample t-test
data: newX[, i]
t = 3, df = 1, p-value = 0.2048
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
-4.853102 7.853102
sample estimates:
mean of x
1.5
[[2]]
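One way to get a table instead of a list of printed tests is to pull the
wanted fields out of each t.test() result and bind them into a matrix. A
rough sketch under that assumption; the names tab and extract_row and the
column labels are made up for illustration:

```r
test <- matrix(c(1, 1, 2, 2), 2, 2)

# Hypothetical helper: run a one-sample t-test on one row and
# return the fields of interest as a named numeric vector.
extract_row <- function(row) {
  tt <- t.test(row)
  c(t         = unname(tt$statistic),
    df        = unname(tt$parameter),
    p.value   = tt$p.value,
    conf.low  = tt$conf.int[1],
    conf.high = tt$conf.int[2],
    mean      = unname(tt$estimate))
}

# apply() returns one column per input row, so transpose
# to get one row of the table per test.
tab <- t(apply(test, 1, extract_row))
tab
```

Using a name other than t for the results also avoids masking the base
function t(), which the transpose step relies on.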
2005 Sep 28
3
is it possible to form matrix of matrices...and multiple arrays
Dear sirs,
1. Kindly tell me whether it is possible to form a matrix which contains a number of matrices,
e.g. if a, b, c, d are matrices and e is a matrix which contains a, b, c, d as rows and columns.
2. Is it possible to form an array of arrays of arrays,
e.g. "A" contains two sets of arrays (1,2), and A[1] and A[2] each individually contain two sets of arrays?
I tried like
2009 Jun 03
1
Need help understanding output from aov and from anova
Hi all,
I noticed something strange when I ran aov and anova.
vtot=c(7.29917, 7.29917, 7.29917) #identical values
fac=as.factor(c(1,1,2)) #group 1 has first two elements, group 2 has the 3rd element
When I run:
> anova(lm(vtot~fac))
Analysis of Variance Table
Response: vtot
Df Sum Sq Mean Sq F value Pr(>F)
fac 1 1.6818e-30 1.6818e-30 0.3333 0.6667
Residuals 1