Displaying 20 results from an estimated 100 matches similar to: "Which is the final model for a Boosted Regression Trees (GBM)?"
2008 Sep 22
1
gbm error
Good afternoon
Has anyone tried using Dr. Elith's BRT script? I cannot seem to run
gbm.step from the installed gbm package. Is it something external to gbm?
When I run the script itself
<- gbm.step(data=model.data,
gbm.x = colx:coly,
gbm.y = colz,
family = "bernoulli",
tree.complexity = 5,
learning.rate = 0.01,
bag.fraction = 0.5)
... I
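For reference, gbm.step() is not part of the gbm package itself; it ships with the dismo package (it originated in the BRT functions script accompanying Elith et al. 2008). A minimal sketch, assuming a data frame model.data with predictors in columns 3:12 and a 0/1 response in column 13 (the column positions are hypothetical):

library(gbm)
library(dismo)   # provides gbm.step(), which wraps gbm internally

brt_fit <- gbm.step(data = model.data,
                    gbm.x = 3:12,          # predictor columns (hypothetical)
                    gbm.y = 13,            # response column (hypothetical)
                    family = "bernoulli",
                    tree.complexity = 5,
                    learning.rate = 0.01,
                    bag.fraction = 0.5)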
2014 Jul 02
0
How do I call a C++ function (for k-means) within R?
I am trying to call a C++ k-means function within R and I am struggling. I
know that the code below is used to call a C++ function for gbm, but how do I
do it for k-means?
gbm.obj <- .Call("gbm",
Y=as.double(y),
Offset=as.double(offset),
X=as.double(x),
X.order=as.integer(x.order),
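A minimal sketch of the usual route for calling your own C++ code from R: Rcpp::sourceCpp() rather than the low-level .Call() interface that gbm uses internally. The function below is only a toy nearest-centre assignment step, not a full k-means, and every name in it is hypothetical.

library(Rcpp)

sourceCpp(code = '
#include <Rcpp.h>
using namespace Rcpp;

// Assign each row of x to the nearest of the supplied centers (1-based).
// [[Rcpp::export]]
IntegerVector kmeans_assign(NumericMatrix x, NumericMatrix centers) {
  IntegerVector cluster(x.nrow());
  for (int i = 0; i < x.nrow(); ++i) {
    double best = R_PosInf;
    for (int k = 0; k < centers.nrow(); ++k) {
      double d = 0;
      for (int j = 0; j < x.ncol(); ++j) {
        double diff = x(i, j) - centers(k, j);
        d += diff * diff;
      }
      if (d < best) { best = d; cluster[i] = k + 1; }
    }
  }
  return cluster;
}
')

# Usage: assign iris rows to three arbitrary starting centers
x       <- as.matrix(iris[, 1:4])
centers <- x[c(1, 51, 101), ]
head(kmeans_assign(x, centers))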
2013 Mar 24
3
Parallelizing GBM
Dear All,
I am far from being a guru about parallel programming.
Most of the time, I rely on randomForest for data mining large datasets.
I would also like to try the gradient boosted methods in gbm,
but I need parallelization.
I normally rely on gbm.fit for speed reasons, and I usually call it this
way
gbm_model <- gbm.fit(trainRF,prices_train,
offset = NULL,
misc =
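A single boosted fit is inherently sequential, so one common workaround is to parallelize across fits, for example over a grid of hyperparameters. A minimal sketch with foreach/doParallel, reusing the object names from the post (the grid values and settings are illustrative):

library(gbm)
library(foreach)
library(doParallel)

cl <- makeCluster(4)
registerDoParallel(cl)

shrinkage_grid <- c(0.1, 0.05, 0.01)

fits <- foreach(s = shrinkage_grid, .packages = "gbm") %dopar% {
  gbm.fit(x = trainRF, y = prices_train,
          distribution = "gaussian",
          n.trees = 1000,
          shrinkage = s,
          interaction.depth = 3,
          verbose = FALSE)
}

stopCluster(cl)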
2003 Jul 14
0
package announcement: Generalized Boosted Models (gbm)
Generalized Boosted Models (gbm)
This package implements extensions to Y. Freund and R. Schapire's AdaBoost
algorithm and J. Friedman's gradient boosting machine (aka multiple additive
regression trees, MART). It includes regression methods for least
squares, absolute loss, logistic, Poisson, Cox proportional hazards/partial
likelihood, and the AdaBoost exponential loss. It handles
2008 Mar 05
0
Using tune with gbm -- grid search for best hyperparameters
Hello LIST,
I'd like to use tune from e1071 to do a grid search for hyperparameter
values in gbm. However, I cannot get this to work. I note that there is no
wrapper for gbm but that it is possible to use non-wrapped functions (like
lm) without problem. Here's a snippet of code to illustrate.
> data(mtcars)
> obj <-
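One way around the missing wrapper is to let caret run the grid search instead; its method = "gbm" handles the n.trees argument that predict.gbm requires. A minimal sketch on mtcars (the tuning values are illustrative):

library(caret)
data(mtcars)

grid <- expand.grid(n.trees = c(100, 500),
                    interaction.depth = c(1, 3),
                    shrinkage = c(0.1, 0.01),
                    n.minobsinnode = 10)

ctrl <- trainControl(method = "cv", number = 5)

set.seed(1)
fit <- train(mpg ~ ., data = mtcars,
             method = "gbm",
             tuneGrid = grid,
             trControl = ctrl,
             verbose = FALSE)
fit$bestTune   # best combination found by cross-validation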
2009 Jun 17
1
gbm for cost-sensitive binary classification?
I recently used gbm for a binary classification problem. As expected, it gets very good results, based on the area under the ROC curve with 7-fold cross-validation. However, the application (malware detection) is cost-sensitive: getting a FP (classifying a clean sample as a dirty one) is much worse than getting a FN (missing a dirty sample). I would like to tune the gbm model to be biased toward a very low FP rate.
For this
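A common starting point is to up-weight the clean (negative) class so that a FP costs more than a FN, and then to choose a probability cutoff that meets the target FP rate instead of the default 0.5. A minimal sketch; the data frame malware, its 0/1 column dirty, and the 10:1 cost ratio are all hypothetical:

library(gbm)

w <- ifelse(malware$dirty == 1, 1, 10)   # clean samples weighted 10x

fit <- gbm(dirty ~ ., data = malware,
           distribution = "bernoulli",
           weights = w,
           n.trees = 2000,
           shrinkage = 0.01,
           interaction.depth = 3,
           cv.folds = 7)

best <- gbm.perf(fit, method = "cv")
p <- predict(fit, malware, n.trees = best, type = "response")
# then pick the cutoff on p that gives the desired FP rate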
2010 Feb 28
1
Gradient Boosting Trees with correlated predictors in gbm
Dear R users,
I’m trying to understand how correlated predictors impact the Relative
Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman
described “…with single decision trees (referring to Breiman’s CART
algorithm), the relative importance measure is augmented by a strategy
involving surrogate splits intended to uncover the masking of influential
variables by others
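A small simulation sketch of the effect being asked about: when two predictors are nearly collinear, the relative influence that either would receive alone tends to be split between them (all names and numbers are illustrative):

library(gbm)
set.seed(42)

n  <- 2000
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)     # nearly a copy of x1
x3 <- rnorm(n)
y  <- x1 + 0.5 * x3 + rnorm(n)

dat <- data.frame(y, x1, x2, x3)
fit <- gbm(y ~ ., data = dat, distribution = "gaussian",
           n.trees = 1000, shrinkage = 0.05, interaction.depth = 2)

summary(fit, n.trees = 1000, plotit = FALSE)   # relative influence table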
2017 Dec 14
0
Distributions for gbm models
On page 409 of "Applied Predictive Modeling" by Max Kuhn, it states
that the gbm function can accommodate only two-class problems when
referring to the distribution parameter.
From gbm help re: the distribution parameter:
Currently available options are "gaussian" (squared error),
"laplace" (absolute loss), "tdist" (t-distribution
2010 Apr 26
3
R.GBM package
Hi Greg,
I am new to the gbm package. Can boosted decision trees be implemented in
the 'gbm' package, or can 'gbm' only be used for regression?
If they can, do I need to combine the rpart and gbm commands?
Thanks so much!
--
Sincerely,
Changbin
--
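For reference, gbm grows its own trees internally, so rpart is not needed; a two-class classification is fit with distribution = "bernoulli" on a 0/1 response. A minimal sketch on a built-in data set (the derived response is hypothetical):

library(gbm)
data(iris)

# hypothetical two-class problem: is the species "versicolor"?
iris$is_versicolor <- as.numeric(iris$Species == "versicolor")

fit <- gbm(is_versicolor ~ . - Species, data = iris,
           distribution = "bernoulli",
           n.trees = 500, shrinkage = 0.05, interaction.depth = 2)

p <- predict(fit, iris, n.trees = 500, type = "response")  # class probabilities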
2007 May 10
3
how to control the sampling to make each sample unique
I have a dataset of 10000 records which I want to use to compare two
prediction models.
I split the records into test dataset (size = ntest) and training dataset
(size = ntrain). Then I run the two models.
Now I want to shuffle the data and rerun the models. I want many shuffles.
I know that the following command
sample(1:10000, ntrain)
can pick ntrain numbers from 1 to 10000. Then I just
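A minimal sketch of one way to generate many shuffles and keep only distinct train/test splits (the sizes and the number of shuffles are illustrative):

set.seed(1)
n        <- 10000
ntrain   <- 7000
nshuffle <- 100

splits <- replicate(nshuffle, sample(1:n, ntrain), simplify = FALSE)

# two splits are identical only if they contain the same indices,
# so compare them after sorting
keys   <- vapply(splits, function(s) paste(sort(s), collapse = ","), "")
splits <- splits[!duplicated(keys)]

length(splits)   # number of unique training sets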
2008 Oct 15
1
Forecasting using ARIMAX
Dear R-helpers,
I would appreciate it if someone could help me with the transfer function parameter in ARIMAX and also check whether what I am doing is correct.
I am using ARIMAX with 2 exogenous variables, and 10 years of data are as follows:
DepVar Period, depVar, IndepVar1 Period, indepVar1, IndepVar2 Period, indepVar2
Jan 1998,708,Jan 1998,495,Jan 1998,245.490
Feb 1998,670,Feb 1998,421.25,Feb 1998,288.170
Mar
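A minimal sketch of the usual way an "ARIMAX"-style model is fitted in R, i.e. a regression with ARIMA errors via an xreg matrix; the series objects depVar/indepVar1/indepVar2, the future_xreg matrix, and the (1,0,1)(1,0,1)[12] order are all hypothetical:

library(forecast)

dep <- ts(depVar, start = c(1998, 1), frequency = 12)
xr  <- cbind(indepVar1, indepVar2)

fit <- Arima(dep, order = c(1, 0, 1),
             seasonal = list(order = c(1, 0, 1), period = 12),
             xreg = xr)

# forecasting requires future values of the regressors (hypothetical matrix)
fc <- forecast(fit, xreg = future_xreg)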
2005 Jan 18
1
Interpretation of randomForest results
> From: luk
>
> I got the following results when I ran randomForest with the commands
> below:
>
> qair <- read.table("train10.dat", header = T)
> oz.rf <- randomForest(LESION ~ ., data = qair, ntree = 220,
> importance = TRUE)
> print(oz.rf)
>
> Call:
> randomForest.formula(x = LESION ~ ., data = qair, ntree =
> 220, importance =
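For reference, the usual follow-ups after such a fit, using the object name from the quoted post (a sketch, not advice specific to this data set):

library(randomForest)
print(oz.rf)        # OOB error rate and class-wise confusion matrix
importance(oz.rf)   # per-variable importance (importance = TRUE was set in the fit)
varImpPlot(oz.rf)   # plot the importance measures
plot(oz.rf)         # OOB error versus number of trees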
2009 Oct 30
1
possible memory leak in predict.gbm(), package gbm ?
Dear gbm users,
When running predict.gbm() on a "large" dataset (150,000 rows, 300 columns,
500 trees), I notice that the memory used by R grows beyond reasonable
limits. My 14 GB of RAM is often not sufficient. I am interpreting this as a
memory leak, since there should be no reason for memory needs to grow once the
data are loaded and passed to predict.gbm()?
Running R version 2.9.2 on
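A common workaround while the cause is investigated is to score the data in chunks so the temporaries predict.gbm() allocates stay bounded. A minimal sketch; gbm_fit and newdata are hypothetical names:

chunk_size <- 10000
idx   <- split(seq_len(nrow(newdata)),
               ceiling(seq_len(nrow(newdata)) / chunk_size))
preds <- unlist(lapply(idx, function(i)
  predict(gbm_fit, newdata[i, ], n.trees = 500)))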
2010 Sep 21
1
package gbm, predict.gbm with offset
Dear all,
the help file for predict.gbm states that "The predictions from gbm do not
include the offset term. The user may add the value of the offset to the
predicted value if desired." I am just not sure how exactly, especially for
a Poisson model, where I believe the offset is multiplicative?
For example:
library(MASS)
fit1 <- glm(Claims ~ District + Group + Age +
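A minimal sketch for the Insurance example: gbm predictions are on the log (link) scale and exclude the offset, so add log(Holders) back before exponentiating, which is exactly the multiplicative adjustment on the response scale (the tree settings are illustrative):

library(MASS)
library(gbm)
data(Insurance)

fit2 <- gbm(Claims ~ District + Group + Age + offset(log(Holders)),
            data = Insurance, distribution = "poisson",
            n.trees = 1000, shrinkage = 0.01)

link <- predict(fit2, Insurance, n.trees = 1000)        # log scale, no offset
expected_claims <- exp(link + log(Insurance$Holders))   # = exp(link) * Holders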
2009 Apr 07
0
gbm for multi-class problems
Dear List,
I'm working on a classification problem. My response has 60 levels.
I'm very interested in boosted trees like AdaBoost or the gradient boosting machine as implemented in the "gbm" package. Unfortunately, gbm is only applicable to 2-class problems.
Is there anybody out there who can help me? Is there a way to use gbm() for multi-class problems? Maybe there is a way to transform my
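A minimal sketch of one workaround: fit a one-vs-rest bernoulli gbm per class and predict the class with the highest score. With 60 levels this means 60 fits, so it can be slow; the data frame d and its factor response y are hypothetical:

library(gbm)

classes <- levels(d$y)
fits <- lapply(classes, function(cl) {
  d2 <- transform(d, y01 = as.numeric(y == cl))
  gbm(y01 ~ . - y, data = d2, distribution = "bernoulli",
      n.trees = 500, shrinkage = 0.05, interaction.depth = 2)
})

scores <- sapply(fits, predict, newdata = d, n.trees = 500, type = "response")
pred   <- classes[max.col(scores)]   # predicted class per observation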
2009 Jul 29
1
gbm package: relationship between interaction.depth and number of features?
Hello. I'm currently stuck with the same "what does interaction.depth really
mean" stuff. Did you find out what the right answer is?
Best regards,
Boris Yangel.
[[alternative HTML version deleted]]
2009 Dec 14
0
GBM package: Extract coefficients
I am using the gbm package for generalized boosted regression models,
and would like to be able to extract the coefficients produced for
storage in a database.
I am already using R to automatically generate formulas that I can
export to a database and store. For example, I have been using Dr.
Harrell's lrm function (from his rms/Design package) to perform logistic regression, e.g.:
output <-
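For reference, a boosted ensemble has no coefficient vector to extract; what can be stored is the trees themselves or the whole fitted object. A minimal sketch; gbm_fit is a hypothetical fitted model:

library(gbm)

pretty.gbm.tree(gbm_fit, i.tree = 1)   # data frame describing one tree
saveRDS(gbm_fit, "gbm_fit.rds")        # or store the full model object instead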
2010 Jun 15
1
output from the gbm package
Hi Greg and R community,
I have one question about the output of the gbm package. The output of boosting
should be f(x); from it, how do I calculate the probability for each
observation in the data set?
Since it is stochastic, how can I guarantee that each observation in the training
data is selected at least once? If some observations are not selected, how is
the training error calculated?
Thanks.
--
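For distribution = "bernoulli" the boosted score f(x) is on the log-odds scale, so the probability is its inverse logit; predict.gbm can also do the conversion directly. A minimal sketch with hypothetical object names:

f  <- predict(gbm_fit, newdata, n.trees = best_iter)                     # log-odds scale
p1 <- 1 / (1 + exp(-f))                                                  # probability by hand
p2 <- predict(gbm_fit, newdata, n.trees = best_iter, type = "response")  # same thing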
2012 Jul 23
1
mboost vs gbm
I'm attempting to fit boosted regression trees to a censored response using
IPCW weighting. I've implemented this through two libraries, mboost and
gbm, which I believe should yield models that would perform comparably.
This, however, is not the case - mboost performs much better. This seems
odd. This issue is meaningful since the output of this regression needs to
be implemented in a