search for: thogiti

Displaying 9 results from an estimated 9 matches for "thogiti".

2008 Feb 25
1
Running randomForests on large datasets
Hi, I am trying to run randomForests on a dataset of size 500000 x 650, and R reports a memory allocation error. Are there any better ways to deal with large datasets in R? For example, Splus had something like the bigData library. Thank you, Nagu
2008 Mar 11
1
More digits in prediction using random forest object
I need to get more digits when predicting a test sample with a random forest object. Format or options(digits=) does nothing. Any ideas? Thank you, Nagu
2006 Jan 17
0
Cluster analysis with only nominal variables
Hi All, I am wondering if there is any literature on, or any prior implementation of, cluster analysis for only nominal (categorical) variables on a large dataset of approximately 20,000 rows with 15 variables. I came across one or two such implementations, but they seem to assume certain data distributions. Thank you, Nagu
2006 Mar 28
0
Linear regression of very dispersed data
Hi, I need some help in modeling a linear regression problem. I am trying to fit a relationship between the dependent variable y and the matrix of independent variables X. I tried different sets of models, and also did some EDA, and saw clearly that no linear relationship exists between y and X. I also tried some transformations of the variables, robust regressions, ace and avas (the variance
2008 May 21
1
How to use classwt parameter option in RandomForest
Hi, I am trying to model a dataset with the response variable Y, which has 6 levels {Great, Greater, Greatest, Weak, Weaker, Weakest}, and predictor variables X, a mix of continuous and factor variables, using random forests in R. The variable Y acts like an ordinal variable, but I recoded it as a factor variable. I ran a simulation and got an OOB estimate of the error rate of 60%. I validated against some
2008 May 30
0
Progress bar or execution plan for modeling process
Hi, I often run predictive models on large datasets with multiple combinations of parameters. I am wondering if there is any way to quickly check the execution plan, such as how much time it takes to run a model. Here is a more specific example: I have a set of datasets, S, of size 40000 x 700. I am fitting some ensemble models, like boosting and random forests. I would like to find out some
2008 Mar 07
2
error in random forest
Hi, I get the following error when I try to predict the probabilities of a test sample: Error in predict.randomForest(fit.EBA.OM.rf.50, x.OM, type = "prob") : New factor levels not present in the training data I have about 630 predictor variables in the dataset x.OM (25 factor variables and the remaining are continuous variables). Any ideas on how to trace it? Thank you, Nagu
2008 Feb 25
1
To get more digits in precision of predict function of randomForests
Hi, I am using randomForests for a classification problem. The predict function in the randomForest library, when asked to return the probabilities, gives a precision of only two digits after the decimal point. I need at least four digits of precision for the predicted probabilities. How do I achieve this? Thank you, Nagu
2006 Apr 05
2
Multivariate linear regression
Hi, I am working on a multivariate linear regression of the form y = Ax. I am seeing a great dispersion of y w.r.t. x. For example, the correlations between y and x are very small, even after using some typical transformations like log and power. I tried simple linear regression, robust regression, and the ace and avas package in R (or splus). I didn't see an improvement in the fit and