Hi,

While learning how to use XGBoost in R, I came across the case below and would like to know how to handle it.

Outcome variable: continuous
Independent features: mix of categorical and continuous
nrow(train_set): 8523

Since XGBoost natively supports only numeric features, I applied one-hot encoding to the training data set:

    target <- train_set$Outlet_sales
    sparsed_train_set <- sparse.model.matrix(~.-1, data=train_set)
    nrow(sparsed_train_set)  # 4526; as expected, the row count is reduced

Note: The target variable is continuous and has as many values as train_set has rows, i.e. 8523, because it is extracted before one-hot encoding is applied.

To build the model:

    bst <- xgboost(data = sparsed_train_set, label = target, max.depth = 4,
                   eta = 1, nthread = 4, nrounds = 50, objective = "reg:linear")

The call above fails because sparsed_train_set has 4526 rows while target has 8523 values.

My questions:
- How should I handle the above mismatch between the sparse training data and the label when building the model?
- How should I use XGBoost for regression, where the outcome is continuous? Most online resources cover only classification.

I would appreciate a pointer to a source that explains this. I have gone through the documentation, but it did not make this case clear to me.

Regards,
Sandeep S. Rana
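
For anyone reproducing this, here is a minimal sketch of one way to keep the features and the label aligned. It assumes the rows are lost because sparse.model.matrix() drops rows containing NA under R's default na.action, which the post does not confirm. The toy data frame and its Item_weight and Outlet_type columns are hypothetical; only Outlet_sales comes from the post. The objective "reg:squarederror" is the name recent xgboost releases use in place of "reg:linear", and eta = 0.3 is an arbitrary choice, not a recommendation.

```r
library(Matrix)

# Hypothetical stand-in for train_set; only the name Outlet_sales is from the post.
train_set <- data.frame(
  Item_weight  = c(9.3, NA, 17.5, 19.2, NA, 10.4),
  Outlet_type  = factor(c("A", "B", "A", "C", "B", "C")),
  Outlet_sales = c(3735.1, 443.4, 2097.3, 732.4, 994.7, 556.6)
)

# Assumption: rows vanish because rows containing NA are dropped.
# Filter those rows once, then derive features and label from the same subset.
train_complete <- train_set[complete.cases(train_set), ]
target <- train_complete$Outlet_sales

# Putting the outcome on the left-hand side keeps it out of the feature matrix.
sparsed_train_set <- sparse.model.matrix(Outlet_sales ~ . - 1, data = train_complete)

# Features and label now have the same number of rows.
stopifnot(nrow(sparsed_train_set) == length(target))

# Regression on a continuous outcome; skipped if xgboost is not installed.
if (requireNamespace("xgboost", quietly = TRUE)) {
  dtrain <- xgboost::xgb.DMatrix(data = sparsed_train_set, label = target)
  bst <- xgboost::xgb.train(
    params  = list(objective = "reg:squarederror", max_depth = 4, eta = 0.3),
    data    = dtrain,
    nrounds = 50
  )
}
```

Dropping incomplete rows is the simplest way to make the two counts agree, but it discards data. If the missing values should be kept, a commonly used alternative is to set options(na.action = "na.pass") before building the matrix and let XGBoost treat NA as missing. Behaviour with NA in factor columns should be checked before relying on that.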