Displaying 20 results from an estimated 3000 matches similar to: "Decision Tree and Random Forrest"
2016 Apr 13
0
Decision Tree and Random Forrest
That's great that you are familiar, and thanks for responding. Have you ever
done what I am referring to? I have already spent time going through links
and tutorials about decision trees and random forests and have even used
them both before.
Mike
On Apr 13, 2016 5:32 PM, "Sarah Goslee" <sarah.goslee at gmail.com> wrote:
It sounds like you want classification or regression trees.
2016 Apr 15
0
Decision Tree and Random Forrest
Since you only have 3 predictors, each categorical with a small number of
categories, you can use expand.grid to make a data.frame containing all
possible combinations and give that the predict method for your model to
get all possible predictions.
Something like the following untested code.
newdata <- expand.grid(
Humidity = levels(Humidity), #(High, Medium,Low)
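A minimal sketch of the expand.grid approach described above, offered as an illustration rather than as the poster's own (untested, truncated) code. It assumes a classification tree fitted with rpart. The data frame `golf`, its invented values, the response `Play`, and the object names `fit`, `probs` and `result` are all hypothetical.

```r
library(rpart)  # recommended package shipped with standard R installations

# Hypothetical toy data; the values are invented purely for illustration.
set.seed(1)
n <- 200
golf <- data.frame(
  Humidity       = factor(sample(c("High", "Medium", "Low"), n, replace = TRUE)),
  Pending_Chores = factor(sample(c("Taxes", "None", "Laundry", "Car Maintenance"),
                                 n, replace = TRUE)),
  Wind           = factor(sample(c("High", "Low"), n, replace = TRUE))
)
# Invented response rule, only so that the tree has some structure to find.
golf$Play <- factor(ifelse(golf$Humidity != "High" & golf$Wind == "Low",
                           "Yes", "No"))

fit <- rpart(Play ~ Humidity + Pending_Chores + Wind,
             data = golf, method = "class")

# One row per possible combination of predictor levels (3 * 4 * 2 = 24 rows).
newdata <- expand.grid(
  Humidity       = levels(golf$Humidity),
  Pending_Chores = levels(golf$Pending_Chores),
  Wind           = levels(golf$Wind)
)

# Class probabilities for every combination, one column per response class.
probs  <- predict(fit, newdata, type = "prob")
result <- cbind(newdata, probs)
head(result)
```

Each row of `result` can then be read as a rule of the form "if Humidity is X and Pending_Chores is Y and Wind is Z, the predicted probability of Yes is p".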
2016 Apr 15
1
Decision Tree and Random Forrest
I need the output to have groups, along with the probability that any given
record in that group is in the response class. As in my first email, I need
output of the form: if A and B and C, then there is a 77% chance it will be
D. The examples you provided are simply not similar. They are different and
would take interpretation to get what I need.
On Apr 14, 2016 1:26 AM,
2016 Apr 13
4
Decision Tree and Random Forrest
Ah yes, I will have to use the predict function. But the predict function
will not really get me there. Take the example of a model predicting
whether or not I will play golf (this is the dependent variable), with
three independent variables: Humidity (High, Medium, Low), Pending_Chores
(Taxes, None, Laundry, Car Maintenance) and Wind (High, Low). I would like
rules like
2016 Apr 13
0
Decision Tree and Random Forrest
I think you are missing the point of random forests. But if you just
want to predict using the forest, there is a predict() method that you
can use. Other than that, I certainly don't understand what you mean.
Maybe someone else might.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka
2016 Apr 13
3
Decision Tree and Random Forrest
OK, is there a way to do it with a decision tree? I just need to make the
decision rules. Perhaps I can pick one of the trees used by the random
forest. I am already somewhat familiar with random forests with respect to
bagging and feature sampling, getting the mode from the leaf nodes, and it
being an ensemble technique of many trees. I am just working from the
perspective that I need
2016 Apr 13
0
Decision Tree and Random Forrest
Nope.
Random forests are not decision trees -- they are ensembles (forests)
of trees. You need to go back and read up on them so you understand
how they work. The Hastie/Tibshirani/Friedman "The Elements of
Statistical Learning" has a nice explanation, but I'm sure there are
lots of good web resources, too.
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind is
2016 Apr 13
2
Decision Tree and Random Forrest
Hi, I'm trying to get the top decision rules from a decision tree.
Eventually I would like to do this with R and random forests. There has to
be a way to output the decision rules of each leaf node in an easily
readable way. I am looking at the randomForest and rpart packages and I
don't see anything yet.
Mike
2016 Apr 07
0
simple question on data frames assignment
lapply(colordata2[ -1 ], f )
When you put the parentheses on, you are calling the function yourself before lapply gets a chance. The error pops up because you are giving a vector of numbers (the answer f gave you) to the second argument of lapply instead of a function.
--
Sent from my phone. Please excuse my brevity.
On April 7, 2016 7:31:18 AM PDT, Michael Artz <michaeleartz at
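To illustrate the point made in the reply above, here is a small sketch of the difference between passing a function to lapply and calling it first. The contents of `colordata2` are hypothetical stand-ins, since the original data frame is not shown in full.

```r
# Hypothetical data; only the column layout mirrors the thread.
colordata2 <- data.frame(
  id     = 1:3,
  color1 = c("blue", "red", "blue"),
  color2 = c("red", "blue", "blue")
)

f <- function(col) {
  ifelse(col == "blue", 1, 0)
}

# Correct: pass the function object itself; lapply calls it on each column.
responses <- lapply(colordata2[-1], f)

# Incorrect: f(...) is evaluated first, so lapply receives a numeric vector
# where it expects a function. The error is captured here instead of
# stopping the script.
err <- tryCatch(
  lapply(colordata2[-1], f(colordata2$color1)),
  error = function(e) conditionMessage(e)
)
print(err)
```

An anonymous function works for the same reason that `f` does: `function(col) ...` is itself a function object, not the result of a call.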
2016 Apr 19
1
Interquartile Range
Hi, that did not work for me either. The value returned from that
function was "<rounded mean> - <rounded mean>" :(. Thanks for the reply,
though.
On Tue, Apr 19, 2016 at 10:34 AM, William Dunlap <wdunlap at tibco.com> wrote:
> > That didn't work Jim!
>
> It always helps to say how the suggestion did not work. Jim's
> function had a typo
2016 Apr 07
2
simple question on data frames assignment
Suppose you are not using an anonymous function and had written the
function out instead.
The code below gives me the error "'f(colordata2$color1)' is not a function,
character or symbol". But then how is the anonymous function working?
f <- function(col) {
  ifelse(col == 'blue', 1, 0)
}
responses <- lapply(colordata2[ -1 ], f(colordata2$color1) )
2016 Apr 20
0
Interquartile Range
???
IQR returns a single number.
> IQR(rnorm(10))
[1] 1.090168
To your 2nd response:
"I could have used average, min, max, they all would have returned the
same thing."
I can only respond: huh?? Are all your values identical?
You really need to provide a small reproducible example as requested
by the posting guide -- I certainly don't get it, and I'm done
guessing.
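A short sketch of the distinction at issue in this exchange: IQR() returns one number (the difference between the quartiles), whereas quantile() is what returns the two quartile values themselves. The vector `x` is a made-up example.

```r
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)  # hypothetical sample

iqr_value <- IQR(x)                       # single number: Q3 - Q1
quartiles <- quantile(x, c(0.25, 0.75))   # two numbers: Q1 and Q3

print(iqr_value)   # 4
print(quartiles)
```

If a "Q1 - Q3" style label is what is wanted, it can be built from `quartiles`, but that is a character string for display and not the interquartile range itself.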
2016 Apr 19
2
Interquartile Range
If you show us, not just tell us about, a self-contained example
someone might show you a non-hacky way of getting the job done.
(I don't see an argument to plyr::ddply called 'transform'.)
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Tue, Apr 19, 2016 at 12:18 PM, Michael Artz <michaeleartz at gmail.com>
wrote:
> Oh thanks for that clarification Bert! Hope you enjoyed
2016 Apr 20
2
Interquartile Range
Well, instead of your functions try:
Mode <- function(x) {
  tabx <- table(x)
  tabx[which.max(tabx)]
}
and use R's IQR function instead of yours.
... so I still don't get why you want to return a character string
instead of a value for the IQR;
and the mode of a sample defined as above is generally a bad estimator
of the mode of the distribution. To say more than that would
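As a hedged illustration of what the Mode function suggested above actually returns: the result is a one-element table whose name is the most frequent value and whose entry is its count. The input vectors below are invented for the example.

```r
Mode <- function(x) {
  tabx <- table(x)
  tabx[which.max(tabx)]
}

x <- c(3, 1, 3, 2, 3, 1)  # hypothetical sample
m <- Mode(x)

modal_value <- names(m)       # the most frequent value, as a character string
modal_count <- as.integer(m)  # how many times it occurs

# With ties, which.max() keeps only the first maximum, so one of the tied
# values is silently chosen.
tied <- names(Mode(c(1, 1, 2, 2)))
```

Here `modal_value` is "3" and `modal_count` is 3; use `as.numeric(names(m))` if a numeric value is needed. In the tied example only "1" is reported, which is one reason a sample mode defined this way needs care.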
2016 Apr 19
0
Interquartile Range
> That didn't work Jim!
It always helps to say how the suggestion did not work. Jim's
function had a typo in it - was that the problem? Or did you not
change the call to ddply to use that function. Here is something
that might "work" for you:
library(plyr)
data <- data.frame(groupColumn=rep(1:5,1:5), col1=2^(0:14))
myIqr <- function(x) {
2016 Apr 19
0
Interquartile Range
Hi,
Here is what I am doing
notGroupedAll <- ddply(data
,~groupColumn
,summarise
,col1_mean=mean(col1)
,col2_mode=Mode(col2) #Function I wrote for getting the mode, shown below
,col3_Range=myIqr(col3)
)
groupedAll <- ddply(data
,~groupColumn
,summarise
2016 Apr 19
0
Interquartile Range
Are you aware that there *already is* a function that does this?
?IQR
(Also, your "function" iqr is just a character string and would have
to be parsed and evaluated to become a function. But this is a
TERRIBLE way to do things in R, as it completely circumvents R's
central functional programming paradigm.)
Cheers,
Bert
Bert Gunter
"The trouble with having an open mind
2016 Apr 20
2
Interquartile Range
Again, IQR returns both a .25 and a .75 value and it failed, which is
why I didn't use it before. Also, the first function just returns the same
value repeated. Since they are all the same before the second call, using
the mode function is just a way to grab one value. I could have used
average, min, max; they all would have returned the same thing.
Mike
On Tue, Apr 19, 2016 at 7:24 PM,
2016 Apr 20
0
Interquartile Range
Hi,
Jumping into this thread mainly on the point of the mode of the distribution, while also supporting Bert's comments below on theory.
If the vector 'x' that is being passed to this function is an integer vector, then a tabulation of the integers can yield a 'mode', presuming of course that there is only one unique mode. You may have to decide how you want to handle a
2016 Apr 07
0
simple question on data frames assignment
lapply is not a vectorized function. It is compact to read, but it would not be worth using for this calculation.
However, if your data frame had multiple color columns that you wanted to make responses for, then you might want to use lapply as a more compact version of a for loop to repeat this operation.
colordata2 <- data.frame(id = c(1,2,3,4,5), color1 =