Paul Fisch
2008-Aug-13 18:39 UTC
[R] need help with stat functions(like adaboost, random forests and glm)
Ok, so basically I have a data frame named data_frame. It contains:

    startdate
    startprice
    endpricethreshold1
    endpricethreshold2
    endpricethreshold3

All of the endpricethreshold columns are TRUE/FALSE vectors. Each one is TRUE or FALSE depending on whether the end price was above or below that column's threshold.

Now I want to use, let's say, a generalized linear model to predict which end price thresholds will be true or false, depending on startdate and startprice. So I have a call like:

    glm(endpricethreshold1 ~ ., data=data_frame[,c(1,2,3)], family=binomial(logit));

But for the response term endpricethreshold1 (since I really have tons of endpricethresholds and would like to make this a loop), I don't want to refer to it by its name but instead by its column index, like this:

    glm(data_frame[[3]] ~ ., data=data_frame[,c(1,2,3)], family=binomial(logit));

However, when I do this I get completely different results and I have no idea why.

If anyone could help it would be greatly appreciated.

Thanks,
Paul Fisch
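A likely explanation, sketched on made-up data: in an R formula, `.` expands to every column of `data` that is not already named in the formula. When the response is written as `data_frame[[3]]`, no column name appears on the left-hand side, so `endpricethreshold1` is still counted among the `.` predictors and the model is fit with the response as one of its own inputs. The simulated data below, the numeric coding of `startdate`, and the names `predictors`, `response_cols`, `bad`, `by_name`, and `fits` are all assumptions for illustration, not part of the original post.

```r
# Hypothetical stand-in for the poster's data; values are simulated.
set.seed(1)
n <- 200
data_frame <- data.frame(
  startdate  = seq_len(n),          # assumed numeric day index
  startprice = runif(n, 10, 100)
)
endprice <- data_frame$startprice + rnorm(n, sd = 20)
data_frame$endpricethreshold1 <- endprice > 40
data_frame$endpricethreshold2 <- endprice > 55
data_frame$endpricethreshold3 <- endprice > 70

# The call from the post: the response leaks into the predictors via `.`.
# Warnings about separation are expected here, so they are suppressed.
bad <- suppressWarnings(
  glm(data_frame[[3]] ~ ., data = data_frame[, c(1, 2, 3)],
      family = binomial(link = "logit"))
)
attr(terms(bad), "term.labels")  # includes "endpricethreshold1"

# Reference fit using the column name, as in the first call from the post.
by_name <- glm(endpricethreshold1 ~ ., data = data_frame[, c(1, 2, 3)],
               family = binomial(link = "logit"))

# One way to loop by column index: look up the name and build the formula.
predictors    <- c("startdate", "startprice")
response_cols <- 3:5
fits <- lapply(response_cols, function(i) {
  response <- names(data_frame)[i]
  glm(reformulate(predictors, response = response),
      data = data_frame, family = binomial(link = "logit"))
})
names(fits) <- names(data_frame)[response_cols]
```

Each element of `fits` can then be inspected with `summary(fits[["endpricethreshold1"]])` and so on. Building the formula from the column name keeps the response out of the predictor set, which is the design choice that makes the indexed loop agree with the by-name call.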