thr3ads.net - similar to: "R.GBM package"

Displaying 20 results from an estimated 2000 matches similar to: "R.GBM package"

gbm for cost-sensitive binary classification?

2009 Jun 17

I recently used gbm for a binary classification problem. As expected, it gets very good results, based on area under the ROC curve with 7-fold cross-validation. However, the application (malware detection) is cost-sensitive: a false positive (FP, classifying a clean sample as dirty) is much worse than a false negative (FN, missing a dirty sample). I would like to tune the gbm model toward a very low FP rate. For this
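
One common way to bias a probabilistic classifier toward few false positives is to move the decision threshold rather than refit the model. The sketch below is only an illustration under assumptions that are not in the post: the symbols c_FP and c_FN are hypothetical misclassification costs, and p(x) is assumed to be a reasonably calibrated predicted probability that a sample is dirty.

```latex
% Expected cost of each decision for a sample with predicted probability p(x):
%   flag as dirty:  (1 - p(x)) c_{FP}
%   pass as clean:  p(x) c_{FN}
% Flagging is cheaper exactly when p(x) c_{FN} > (1 - p(x)) c_{FP}, which gives
\[
  \text{flag as dirty} \iff p(x) > t^{*}, \qquad
  t^{*} = \frac{c_{FP}}{c_{FP} + c_{FN}} .
\]
% Example: c_{FP} = 99, c_{FN} = 1 gives t^{*} = 0.99.
```

When the costs are hard to quantify, the threshold can instead be read off the cross-validated ROC curve at the largest FP rate the application tolerates.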

bag.fraction in gbm package

2010 May 01

Hi, dear Greg, sorry to bother you again. I have several questions about the 'gbm' package. If train.fraction is less than 1 (e.g. 0.5), then the *first* 50% of the data will be used to fit the model, and the other 50% can be used to estimate the performance. If bag.fraction is 0.5, then gbm uses a *random* 50% of the data to fit the model, and the other 50% of the data is used to estimate the

boosting - second posting

2006 May 27

Hi, I am using boosting for a classification and prediction problem. For some reason the predictions it gives me do not fall between 0 and 1. I have tried type="response" but it made no difference. Can anyone see what I am doing wrong? Screen output shown below: > boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula +

Gradient Boosting Trees with correlated predictors in gbm

2010 Feb 28

Dear R users, I’m trying to understand how correlated predictors impact the Relative Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman described: “…with single decision trees (referring to Breiman’s CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others

Which is the final model for a Boosted Regression Trees (GBM)?

2013 Jun 23

Hi R users, I was trying to find the final model in the following example using boosted regression trees (GBM). The program gives the fitted values, but I wanted to calculate the fitted values by hand to understand the method in depth. Would you give me some hints on what the final model is for this example? Thanks KG ------- The following script I used #----------------------- library(dismo)

output from the gbm package

2010 Jun 15

Hi, dear Greg and R community, I have one question about the output of the gbm package. The output of boosting should be f(x); from it, how do I calculate the probability for each observation in the data set? Since it is stochastic, how can we guarantee that each observation in the training data is selected at least once? If some observations are not selected, how is the training error calculated? Thanks --
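
For the first part of this question, the usual relationship can be written down directly. The sketch below assumes the model was fitted with distribution="bernoulli", in which case f(x) is on the log-odds scale; the post itself does not say which distribution was used.

```latex
% Bernoulli (logistic) loss: f(x) is the log-odds of y = 1.
\[
  f(x) = \log\frac{p(x)}{1 - p(x)}
  \quad\Longleftrightarrow\quad
  p(x) = \Pr(y = 1 \mid x) = \frac{1}{1 + e^{-f(x)}} .
\]
% Example: f(x) = 0 gives p(x) = 0.5.
```

In R this is plogis(f); predict.gbm with type = "response" applies the same transformation for a bernoulli model.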

Parallelizing GBM

2013 Mar 24

Dear All, I am far from being a guru at parallel programming. Most of the time, I rely on randomForest for data mining large datasets. I would also like to give the gradient boosted methods in GBM a try, but I need parallelization. I normally rely on gbm.fit for speed reasons, and I usually call it this way: gbm_model <- gbm.fit(trainRF,prices_train, offset = NULL, misc =

caret package: arguments passed to the classification or regression routine

2008 Sep 18

Hi, I am having problems passing arguments to method="gbm" using the train() function. I would like to train gbm using the Laplace distribution or the quantile distribution. Here is the code I used and the error: gbm.test <- train(x.enet, y.matrix[,7], method="gbm", distribution=list(name="quantile",alpha=0.5), verbose=FALSE,

help! Error in using Boosting...

2009 Jul 10

Here is my code: mygbm<-gbm.fit(y=mytraindata[, 1], x=mytraindata[, -1], interaction.depth=4, shrinkage=0.001, n.trees=20000, bag.fraction=1, distribution="bernoulli") Here is the error: Error in gbm.fit(y = mytraindata[, 1], x = mytraindata[, -1], interaction.depth = 4, : The dataset size is too small or subsampling rate is too large: cRows*train.fraction*bag.fraction <=

gbm error

2008 Sep 22

Good afternoon. Has anyone tried using Dr. Elith's BRT script? I cannot seem to run gbm.step from the installed gbm package. Is it something external to gbm? When I run the script itself <- gbm.step(data=model.data, gbm.x = colx:coly, gbm.y = colz, family = "bernoulli", tree.complexity = 5, learning.rate = 0.01, bag.fraction = 0.5) ... I

gbm.step for non-binary classification

2018 Feb 19

Thanks, Carlos. As far as I understand, yes, there are. The family argument can be: "gaussian" (for minimizing squared error), so the response has to be numeric; "bernoulli" (logistic regression for 0-1 outcomes), which is necessarily binary; "poisson" (count outcomes; requires the response to be a positive integer), so numeric as well. The only one could be

possible memory leak in predict.gbm(), package gbm ?

2009 Oct 30

Dear gbm users, When running predict.gbm() on a "large" dataset (150,000 rows, 300 columns, 500 trees), I notice that the memory used by R grows beyond reasonable limits. My 14GB of RAM are often not sufficient. I am interpreting this as a memory leak since there should be no reason to expand memory needs once the data are loaded and passed to predict.gbm() ? Running R version 2.9.2 on

package gbm, predict.gbm with offset

2010 Sep 21

Dear all, the help file for predict.gbm states that "The predictions from gbm do not include the offset term. The user may add the value of the offset to the predicted value if desired." I am just not sure how exactly, especially for a Poisson model, where I believe the offset is multiplicative ? For example: library(MASS) fit1 <- glm(Claims ~ District + Group + Age +
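
The multiplicative reading in this question follows from the log link. The sketch below is an assumption-labeled illustration rather than anything from the post: o_i denotes the offset for observation i, and e_i is a hypothetical exposure with o_i = log(e_i).

```latex
% Poisson model with log link: the offset is additive on the link scale,
% hence multiplicative on the response scale.
\[
  \log \mathbb{E}[y_i \mid x_i] = f(x_i) + o_i
  \quad\Longrightarrow\quad
  \mathbb{E}[y_i \mid x_i] = e^{o_i}\, e^{f(x_i)} = e_i \, e^{f(x_i)} .
\]
```

Under these assumptions, predictions on the link scale get the offset added, while predictions on the response scale get multiplied by exp(o_i).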

gbm

2005 Feb 18

Hi there, I keep running into the scalability limits of some R packages. This time, I am trying to use gbm to do AdaBoost on my project. Initially I tried to grow trees using rpart on a dataset with 200 variables and 30,000 observations. Now I am wondering whether I can apply AdaBoost to it. Is there anyone here who has done a similar thing before and can provide some sample code? Also any

gbm.step for non-binary classification

2018 Feb 19

Hello again. I forgot the main reason for using gbm.step from the dismo package. As you know, boosted models do overfit (unlike random forests or any other bootstrap method), but gbm.step uses cross-validation to determine the optimal number of trees and avoid this. That is essential. The option I have left, Carlos, is to do it with gbm, but many times over, and use the

gbm.step for non-binary classification

2018 Feb 19

Hello R users, do you know whether gbm.step can be used for non-binary classification? Thanks -- Dr Manuel Mendoza Department of Biogeography and Global Change National Museum of Natural History (MNCN) Spanish Scientific Council (CSIC) C/ Serrano 115bis, 28006 MADRID Spain

Reproducibility issue in gbm (32 vs 64 bit)

2011 Feb 26

Dear List, The gbm package on Win 7 produces different results for the relative importance of input variables in R 32-bit relative to R 64-bit. Any idea why? Any idea which one is correct? Based on this example, it looks like the relative importance of 2 perfectly correlated predictors is "diluted" by half in 32-bit, whereas in 64-bit, one of these predictors gets all the importance

mboost vs gbm

2012 Jul 23

I'm attempting to fit boosted regression trees to a censored response using IPCW weighting. I've implemented this through two libraries, mboost and gbm, which I believe should yield models that would perform comparably. This, however, is not the case - mboost performs much better. This seems odd. This issue is meaningful since the output of this regression needs to be implemented in a

Distributions for gbm models

2017 Dec 14

On page 409 of "Applied Predictive Modeling" by Max Kuhn, it states that the gbm function can accommodate only two-class problems when referring to the distribution parameter. From the gbm help on the distribution parameter: Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution

can not print probabilities in svm of e1071

2010 Apr 29

> x <- train[,c( 2:18, 20:21, 24, 27:31)] > y <- train$out > > svm.pr <- svm(x, y, probability = TRUE, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10) > > pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)], decision.values = TRUE, probability = TRUE) > attr(pred, "decision.values")[1:4,]
