similar to: R.GBM package

2009 Jun 17
1
gbm for cost-sensitive binary classification?
I recently used gbm for a binary classification problem. As expected, it gets very good results, based on area under the ROC curve with 7-fold cross-validation. However, the application (malware detection) is cost-sensitive: getting a FP (classifying a clean sample as a dirty one) is much worse than getting a FN (missing a dirty sample). I would like to tune the gbm model toward a very low FP rate. For this
2010 May 01
1
bag.fraction in gbm package
Hi, Dear Greg, Sorry to bother you again. I have several questions about the 'gbm' package. If train.fraction is less than 1 (e.g. 0.5), then the *first* 50% will be used to fit the model, and the other 50% can be used to estimate the performance. If bag.fraction is 0.5, then gbm uses a *random* 50% of the data to fit the model, and the other 50% of the data is used to estimate the
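The distinction asked about above can be sketched in base R. This is only an illustration of the row selection as described in the gbm help (train.fraction takes the first rows in data order; bag.fraction draws a random subsample of the training rows for each tree). It does not call gbm; the row count n, the index names, and the use of floor() for rounding are assumptions made for the example.

```r
# Hypothetical illustration in base R; gbm itself is not called here.
set.seed(1)
n <- 10                 # assumed number of rows in the data set
train.fraction <- 0.5
bag.fraction <- 0.5

# train.fraction: the FIRST rows, in the order they appear in the data
n_train <- floor(train.fraction * n)      # rounding via floor() is an assumption
train_idx <- seq_len(n_train)
holdout_idx <- setdiff(seq_len(n), train_idx)

# bag.fraction: a fresh RANDOM subsample of the training rows for each tree
n_bag <- floor(bag.fraction * n_train)
bag_idx <- sample(train_idx, n_bag)
oob_idx <- setdiff(train_idx, bag_idx)    # out-of-bag rows for this one tree
```

Because the train.fraction split depends on row order, shuffling the rows beforehand is usually advisable when the data are sorted.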
2006 May 27
2
boosting - second posting
Hi I am using boosting for a classification and prediction problem. For some reason it is giving me an outcome that doesn't fall between 0 and 1 for the predictions. I have tried type="response" but it made no difference. Can anyone see what I am doing wrong? Screen output shown below: > boost.model <- gbm(as.factor(train$simNuance) ~ ., # formula +
2010 Feb 28
1
Gradient Boosting Trees with correlated predictors in gbm
Dear R users, I’m trying to understand how correlated predictors impact the Relative Importance measure in Stochastic Boosting Trees (J. Friedman). As Friedman described, “…with single decision trees (referring to Breiman’s CART algorithm), the relative importance measure is augmented by a strategy involving surrogate splits intended to uncover the masking of influential variables by others
2013 Jun 23
1
Which is the final model for a Boosted Regression Trees (GBM)?
Hi R Users, I was trying to find the final model in the following example using boosted regression trees (GBM). The program gives the fitted values, but I wanted to calculate the fitted values by hand to understand the method in depth. Would you give me some hints on what the final model is for this example? Thanks KG ------- The following script I used #----------------------- library(dismo)
2010 Jun 15
1
output from the gbm package
Hi, Dear Greg and R community, I have one question about the output of the gbm package. The output of boosting should be f(x); from it, how do I calculate the probability for each observation in the data set? Since it is stochastic, how can one guarantee that each observation in the training data is selected at least once? If some observations are not selected, how is the training error calculated? Thanks --
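As a hedged note on the first question above: assuming the model was fit with distribution = "bernoulli", the raw output f(x) is on the log-odds scale, so a probability is obtained through the inverse logit. The symbol \hat{p}(x) is introduced here only for illustration.

```latex
\hat{p}(x) \;=\; \Pr(y = 1 \mid x) \;=\; \frac{1}{1 + e^{-f(x)}}
\qquad\Longleftrightarrow\qquad
f(x) \;=\; \log\frac{\hat{p}(x)}{1 - \hat{p}(x)}
```

In R the same transformation is available as plogis(f); predict.gbm with type = "response" should return values on this probability scale for a bernoulli model.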
2013 Mar 24
3
Parallelizing GBM
Dear All, I am far from being a guru in parallel programming. Most of the time, I rely on randomForest for data mining large datasets. I would also like to try the gradient boosted methods in GBM, but I need parallelization. I normally rely on gbm.fit for speed reasons, and I usually call it this way: gbm_model <- gbm.fit(trainRF,prices_train, offset = NULL, misc =
2008 Sep 18
1
caret package: arguments passed to the classification or regression routine
Hi, I am having problems passing arguments to method="gbm" using the train() function. I would like to train gbm using the laplace distribution or the quantile distribution. here is the code I used and the error: gbm.test <- train(x.enet, y.matrix[,7], method="gbm", distribution=list(name="quantile",alpha=0.5), verbose=FALSE,
2009 Jul 10
1
help! Error in using Boosting...
Here is my code: mygbm<-gbm.fit(y=mytraindata[, 1], x=mytraindata[, -1], interaction.depth=4, shrinkage=0.001, n.trees=20000, bag.fraction=1, distribution="bernoulli") Here is the error: Error in gbm.fit(y = mytraindata[, 1], x = mytraindata[, -1], interaction.depth = 4, : The dataset size is too small or subsampling rate is too large: cRows*train.fraction*bag.fraction <=
2008 Sep 22
1
gbm error
Good afternoon Has anyone tried using Dr. Elith's BRT script? I cannot seem to run gbm.step from the installed gbm package. Is it something external to gbm? When I run the script itself <- gbm.step(data=model.data, gbm.x = colx:coly, gbm.y = colz, family = "bernoulli", tree.complexity = 5, learning.rate = 0.01, bag.fraction = 0.5) ... I
2018 Feb 19
3
gbm.step for non-binary classification
Thanks, Carlos. As far as I understand, there are indeed: the family argument can be: "gaussian" (for minimizing squared error), so the response has to be numeric; "bernoulli" (logistic regression for 0-1 outcomes), strictly binary; "poisson" (count outcomes; requires the response to be a positive integer), so numeric as well. The only option might be
2009 Oct 30
1
possible memory leak in predict.gbm(), package gbm ?
Dear gbm users, When running predict.gbm() on a "large" dataset (150,000 rows, 300 columns, 500 trees), I notice that the memory used by R grows beyond reasonable limits. My 14GB of RAM are often not sufficient. I am interpreting this as a memory leak since there should be no reason to expand memory needs once the data are loaded and passed to predict.gbm() ? Running R version 2.9.2 on
2010 Sep 21
1
package gbm, predict.gbm with offset
Dear all, the help file for predict.gbm states that "The predictions from gbm do not include the offset term. The user may add the value of the offset to the predicted value if desired." I am just not sure how exactly, especially for a Poisson model, where I believe the offset is multiplicative ? For example: library(MASS) fit1 <- glm(Claims ~ District + Group + Age +
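A hedged sketch of the arithmetic behind the question above, assuming a Poisson model with the usual log link: the offset enters additively on the link (log) scale, which makes it multiplicative on the response scale. Here o_i denotes the offset for observation i and E_i = e^{o_i} a hypothetical exposure; both symbols are introduced only for this illustration.

```latex
\log \hat{\mu}_i \;=\; \hat{f}(x_i) + o_i
\qquad\Longrightarrow\qquad
\hat{\mu}_i \;=\; \exp\!\bigl(\hat{f}(x_i) + o_i\bigr) \;=\; E_i \,\exp\!\bigl(\hat{f}(x_i)\bigr),
\qquad o_i = \log E_i
```

Under these assumptions, one would add the offset to the link-scale prediction before exponentiating, or equivalently multiply the response-scale prediction by the exposure.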
2005 Feb 18
2
gbm
Hi there, I keep running into scalability limits with some R packages. This time, I am trying to use gbm to do AdaBoost on my project. Initially I tried to grow trees using rpart on a dataset with 200 variables and 30,000 observations. Now I am wondering whether I can apply AdaBoost to it, and whether there is anyone who has done a similar thing before and can provide some sample code. Also any
2018 Feb 19
3
gbm.step for non-binary classification
Hello again. I forgot the main reason for using gbm.step from the dismo package. As you know, boosted models do overfit (unlike random forests or any other bootstrap method), but gbm.step uses cross-validation to determine the optimal number of trees and avoid it. This is essential. The option I have left, Carlos, is to do it with gbm, but many times, and use the
2018 Feb 19
2
gbm.step for non-binary classification
Hello R users, do you know whether gbm.step can be used for non-binary classification? Thanks -- Dr Manuel Mendoza Department of Biogeography and Global Change National Museum of Natural History (MNCN) Spanish Scientific Council (CSIC) C/ Serrano 115bis, 28006 MADRID Spain
2011 Feb 26
2
Reproducibility issue in gbm (32 vs 64 bit)
Dear List, The gbm package on Win 7 produces different results for the relative importance of input variables in R 32-bit relative to R 64-bit. Any idea why? Any idea which one is correct? Based on this example, it looks like the relative importance of 2 perfectly correlated predictors is "diluted" by half in 32-bit, whereas in 64-bit, one of these predictors gets all the importance
2012 Jul 23
1
mboost vs gbm
I'm attempting to fit boosted regression trees to a censored response using IPCW weighting. I've implemented this through two libraries, mboost and gbm, which I believe should yield models that would perform comparably. This, however, is not the case - mboost performs much better. This seems odd. This issue is meaningful since the output of this regression needs to be implemented in a
2017 Dec 14
0
Distributions for gbm models
On page 409 of "Applied Predictive Modeling" by Max Kuhn, it states that the gbm function can accommodate only two-class problems when referring to the distribution parameter. From the gbm help re: the distribution parameter: Currently available options are "gaussian" (squared error), "laplace" (absolute loss), "tdist" (t-distribution
2010 Apr 29
2
can not print probabilities in svm of e1071
> x <- train[,c( 2:18, 20:21, 24, 27:31)] > y <- train$out > > svm.pr <- svm(x, y, probability = TRUE, method="C-classification", kernel="radial", cost=bestc, gamma=bestg, cross=10) > > pred <- predict(svm.pr, valid[,c( 2:18, 20:21, 24, 27:31)], decision.values = TRUE, probability = TRUE) > attr(pred, "decision.values")[1:4,]