Hi, there:

I am wondering if I can find some detailed explanation on gbm or explanation on examples of gbm.

Thanks,

Ed
Weiwei Shi <helprhelp at yahoo.com> writes:

> Hi, there:
> I am wondering if I can find some detailed explanation
> on gbm or explanation on examples of gbm.

What is gbm? Green Belt Movement? Georgie Boy Manufacturing? I'm serious! Well, only sort of, but try Google on "gbm" and you'll find those two expansions and several others like them.

I suppose you mean Gradient Boosting Machine, or Generalized Boosted Regression Models. Have you followed up on the references and examples on its help page?

-- 
Peter Dalgaard, Dept. of Biostatistics, University of Copenhagen
Blegdamsvej 3, 2200 Cph. N, Denmark
(p.dalgaard at biostat.ku.dk)
Ph: (+45) 35327918, FAX: (+45) 35327907
I just got 25 hits from "r-project.org" -> search -> "R site search". Might one or more of these help you?

If they don't solve your problem, I suggest you try the posting guide, "R-project.org/posting-guide.html". If that still doesn't solve your problem, it should at least help you phrase your question to increase the chances of getting a helpful reply.

Hope this helps.

Spencer Graves

Weiwei Shi wrote:

>Hi, there:
>I am wondering if I can find some detailed explanation
>on gbm or explanation on examples of gbm.
>
>thanks,
>
>Ed
>
>______________________________________________
>R-help at stat.math.ethz.ch mailing list
>stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide! R-project.org/posting-guide.html
Hi, there:

Thanks a lot for everyone's prompt replies.

In detail, I am facing a huge amount of data: over 10,000 and 400 vars. This project is very challenging and interesting to me. I tried rpart, which gives me some promising results, but they are not good enough. So I am trying randomForest and gbm now.

My plan for using gbm is like this:

    rt <- rpart(...)
    gbm(formula(rt), ...)

Does this work? (My first question)

My other concern about gbm is scalability, since R seems to load all the data into memory. (My second question) I believe the idea above will run very slowly. (I think I might try TreeNet, though I don't like it since it is commercial.) BTW, sampling might be a good idea in general, but previous experiments suggest it does not work well for my project.

I had read some of the references mentioned earlier before I sent my first email, but I still appreciate any help. You guys are so nice!

BTW, gbm means gradient boosting modeling :)

Ed
> From: Weiwei Shi
>
> Hi, there:
> Thanks a lot for everyone's prompt replies.
>
> In detail, I am facing a huge amount of data: over
> 10,000 and 400 vars. This project is very challenging
> and interesting to me. I tried rpart, which gives me
> some promising results, but they are not good enough.
> So I am trying randomForest and gbm now.
>
> My plan for using gbm is like this:
>     rt <- rpart(...)
>     gbm(formula(rt), ...)
>
> Does this work? (My first question)

Given a machine with sufficient memory and CPU speed, yes.

> My other concern about gbm is scalability, since R
> seems to load all the data into memory. (My second
> question)

We have dealt with data larger than what you described. One thing to avoid is the use of the formula interface if you have _lots_ (like, hundreds) of variables. gbm.fit(), I believe, was created for that reason.

> I believe the idea above will run very slowly. (I
> think I might try TreeNet, though I don't like it
> since it is commercial.) BTW, sampling might be a
> good idea in general, but previous experiments suggest
> it does not work well for my project.

To me being commercial is not a crime. I judge software on quality, ease of use, access to source (if I need it), etc. To me, TreeNet failed on several of those criteria, but it works just fine for some people.

> I had read some of the references mentioned earlier
> before I sent my first email, but I still appreciate
> any help. You guys are so nice!

That's no excuse for not following the posting guide, right?

> BTW, gbm means gradient boosting modeling :)

No. I believe Greg calls it `generalized boosting models'.

Andy

> Ed
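The advice above about skipping the formula interface could look roughly like the following in R. This is only a sketch under stated assumptions: the data are simulated because the real data set is not shown in the thread, the sizes `n` and `p` and all tuning values are made up, and a binary response is assumed purely for illustration. `gbm.fit()` takes the predictors and the response directly as `x` and `y`, so no formula with hundreds of terms has to be built.

```r
# Sketch only: simulated data stands in for the real data set.
# Requires the gbm package to be installed.
library(gbm)

set.seed(1)
n <- 1000   # number of rows (hypothetical)
p <- 20     # number of predictors (hypothetical; the thread mentions about 400)

# Predictors as a plain data frame; no formula is constructed anywhere.
x <- as.data.frame(matrix(rnorm(n * p), nrow = n))

# 0/1 response that depends on the first two columns only.
y <- rbinom(n, size = 1, prob = plogis(x[[1]] - 0.5 * x[[2]]))

fit <- gbm.fit(
  x = x, y = y,
  distribution = "bernoulli",  # assumed: binary outcome
  n.trees = 200,               # illustrative tuning values, not recommendations
  interaction.depth = 3,
  shrinkage = 0.05,
  verbose = FALSE
)

# Relative influence of each predictor; plotit = FALSE suppresses the bar chart.
head(summary(fit, n.trees = 200, plotit = FALSE))
```

With real data, `x` would be the data frame of predictor columns and `y` the response vector, and `distribution` would need to match the type of response.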