Karteek Pradyumna Bulusu
2015-Nov-29 03:04 UTC
[R] Error in 'Contrasts<-' while using GBM.
Hey,

I was trying to implement Stochastic Gradient Boosting in R. The following is my code in RStudio:

library(caret);
library(gbm);
library(plyr);
library(survival);
library(splines);
library(mlbench);
set.seed(35);
stack = read.csv("E:/Semester 3/BDA/PROJECT/Sample_SO.csv", header = TRUE, sep = ",");
dim(stack); #displaying dimensions of the dataset

#SPLITTING TRAINING AND TESTING SET
totraining <- createDataPartition(stack$ID, p = .6, list = FALSE);
training <- stack[ totraining,]
test <- stack[-totraining,]

#PARAMETER SETTING
t_control <- trainControl(method = "cv", number = 10);

# GLM
start <- proc.time();

glm = train(ID ~ ., data = training,
            method = "gbm",
            metric = "ROC",
            trControl = t_control,
            verbose = FALSE)

When I run the last line, I get the following error:

Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) :
  contrasts can be applied only to factors with 2 or more levels

Can anyone tell me where I am going wrong and how to rectify it? I'll be grateful.

Thank you. Looking forward to it.

Regards,
Karteek Pradyumna Bulusu.
Providing a reproducible example and the results of `sessionInfo` will help get your question answered.

My only guess is that one or more of your predictors are factors and that the in-sample data (used to build the model during resampling) have different levels than the holdout samples.

Max
______________________________________________
R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
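One way to check the guess about in-sample and holdout levels is to compare, fold by fold, which values of a factor appear in the holdout rows but not in the rows used for fitting. The following is only a sketch in base R under stated assumptions: the data frame `toy`, its columns `y` and `region`, the deterministic fold assignment, and the helper `unseen_in_holdout` are all made up for illustration and are not taken from the poster's data or from caret's internals.

```r
# Hypothetical toy data: the value "c" of `region` occurs in a single row.
toy <- data.frame(
  y      = c(1.2, 0.7, 3.4, 2.2, 1.9, 2.8, 0.4, 3.1, 2.5, 1.1, 0.9, 2.0),
  region = factor(c("a", "b", "a", "b", "a", "b", "a", "b", "a", "b", "a", "c"))
)

# Deterministic 3-fold assignment (rows 1, 4, 7, 10 form fold 1, and so on).
# This stands in for whatever resampling scheme is actually used.
fold_id <- rep(1:3, length.out = nrow(toy))

# For each fold, list the values of `column` that are observed in the
# holdout rows but never in the remaining (in-sample) rows.
unseen_in_holdout <- function(data, fold_id, column) {
  lapply(split(seq_len(nrow(data)), fold_id), function(holdout_rows) {
    in_sample <- unique(as.character(data[[column]][-holdout_rows]))
    holdout   <- unique(as.character(data[[column]][holdout_rows]))
    setdiff(holdout, in_sample)
  })
}

res <- unseen_in_holdout(toy, fold_id, "region")
print(res)
```

A non-empty entry in `res` marks a fold whose holdout rows contain a value the model never saw during fitting. Running the same check over every factor column of the real training data would show whether rare levels are a plausible cause.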
On 30 Nov 2015, at 02:59, Max Kuhn <mxkuhn at gmail.com> wrote:

> Providing a reproducible example and the results of `sessionInfo` will help
> get your question answered.
>
> My only guess is that one or more of your predictors are factors and that
> the in-sample data (used to build the model during resampling) have
> different levels than the holdout samples.

Another guess is that there's a factor in your (Karteek's) data that has only one level and that "ID ~ ." is pulling more variables into the model than you actually want.

-pf
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Office: A 4.23
Email: pd.mes at cbs.dk  Priv: PDalgd at gmail.com
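The single-level-factor guess can be checked without caret or gbm, because the error is raised when the formula is expanded into a design matrix. The sketch below uses base R only and assumes a made-up data frame `toy` with hypothetical columns `ID`, `score` and `source`; it is not the poster's data. It finds categorical columns with fewer than two observed values, shows that `model.matrix()` fails on them with the same message, and shows that the formula expands once they are dropped.

```r
# Hypothetical toy data: `source` is a factor with only one level.
toy <- data.frame(
  ID     = factor(rep(c("yes", "no"), times = 5)),
  score  = c(3.1, 2.4, 5.0, 4.2, 1.9, 3.3, 2.8, 4.7, 3.9, 2.2),
  source = factor(rep("web", 10))
)

# Count distinct non-missing values in every column.
n_values <- vapply(toy, function(col) length(unique(col[!is.na(col)])), integer(1))

# Flag factor or character columns with fewer than two observed values.
is_categorical <- vapply(toy, function(col) is.factor(col) || is.character(col), logical(1))
single_level <- names(toy)[is_categorical & n_values < 2]
print(single_level)

# "ID ~ ." pulls in every other column, including the one-level factor,
# so building the design matrix fails. The message is captured, not thrown.
err <- tryCatch(
  model.matrix(ID ~ ., data = toy),
  error = function(e) conditionMessage(e)
)
print(err)

# After dropping the offending columns, the same formula expands cleanly.
cleaned <- toy[, setdiff(names(toy), single_level), drop = FALSE]
mm <- model.matrix(ID ~ ., data = cleaned)
print(colnames(mm))
```

An alternative to dropping columns is to replace `ID ~ .` with a formula that names only the intended predictors, which also addresses the concern that the dot pulls in more variables than wanted.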