similar to: Bias correction for random forests?

Displaying 20 results from an estimated 9000 matches similar to: "Bias correction for random forests?"

2006 Jul 23
1
Iterated Data Input/Output with Random Forests
Hi, I am currently writing code to input a few thousand files, run them through the Random Forests package, and then output corresponding results. When I use the code below: zz <- textConnection("ex.lm.out", "w"); sink(zz)
2009 Jan 12
1
Loading workspaces from the command line
Hi, Is there any way to load workspaces (e.g. stuff from save.image) from the command line? I'm on Linux, and would find this very helpful. I'm guessing this functionality can be duplicated with a skillful bash script to rename the particular file to .RData (and then back once R terminates), but I'm wondering if there's a better way. Zhou Fang
2004 Jan 12
0
new version of randomForest (4.0-7)
Dear R users, I've just released a new version of randomForest (available on CRAN now). This version contains quite a number of new features and bug fixes compared to versions prior to 4.0-x (and a few more since 4.0-1). For those not familiar with randomForest, it's an ensemble classifier/regression tool. Please see http://www.math.usu.edu/~adele/forests/ for more detailed information,
2008 Mar 09
1
sampsize in Random Forests
Hi all, I have a dataset where each point is assigned to a class A, B, C, or D. Each point is also assigned to a study site. Each study site is coded with a number ranging from 1 to 100. This information is stored in the vector studySites. I want to run randomForest using stratified sampling, so I chose the option strata = factor(studySites). But I am not sure how to control the number of
2009 Apr 10
1
Random Forests: Question about R^2
Dear Random Forests gurus, I have a question about the R^2 provided by randomForest (for regression). I have not been able to find this information. In the help file for randomForest under "Value" it says: rsq: (regression only) - "pseudo R-squared": 1 - mse / Var(y). Could someone please explain in somewhat more detail how exactly R^2 is calculated? Is "mse"
2007 Sep 17
1
random Forests
Hi, I am new to R and have a specific question about the randomForest package and the saving of trees and scoring. 1) I am looking to save the trees and score at a later time. Is there a way to load the saved trees and use the predict function? Can objects be saved and loaded, i.e. the result of the randomForest function call? I don't want to have to rerun trees. Hopefully this applies to any stat type
2009 Feb 06
1
Finding a basis in a set of vectors
Hi, Okay, I have an n x p matrix X, which I know is not full rank. In particular, there may be linear dependencies amongst the columns (but not that many). What is a fast way of finding a linearly independent subset of the columns of X that will span the column space of X, in R? If it helps, I have the QR decomposition of the original X 'for free'. I know that it's possible to do this
2009 May 28
2
Replace is leaking?
Okay, someone explain this behaviour to me:
Browse[1]> replace(rep(0, 4000), temp1[12], temp2[12])[3925]
[1] 0.4462404
Browse[1]> temp1[12]
[1] 3926
Browse[1]> temp2[12]
[1] 0.4462404
Browse[1]> replace(rep(0, 4000), 3926, temp2[12])[3925]
[1] 0
For some reason, R seems to shift indices along when doing this replacement. Has anyone encountered this bug before? It seems to crop up
2008 Jan 20
1
Looping over subsets
Hi, Possibly a dumb question, but I wonder if anyone can help me with this. What I want to do, essentially, is to loop over all ordered subsets of a given size of a certain set. Ultimately, the idea is to find the subset that maximises a certain value. The set in question is likely large (the subset size is likely small, though), so things like combn don't seem to be a good solution. The
2009 Feb 15
2
Fast ave for sorted data?
Hi, This is probably really obvious, but I can't seem to find anything on it. Is there a fast version of ave for when the data is already sorted in terms of the factor, or if the breaks are already known? Basically, I have: X = 0.1, 0.2, 0.32, 0.32, 0.4, 0.56, 0.56, 0.7... Y = 223, 434, 343, 544, 231.... etc of the same, admittedly large length. Now note that some of the values of X are
2002 May 28
0
random Forests
Hi, I have a data set with 1000 observations and 260 predictors. The predictor variables are all ordinal. There are 2 classes, labeled F and T, with class proportions of 0.44 and 0.56, respectively. In a call to the function randomForest() with mtry=1, nodesize=1 and ntree=100, the resulting classifier puts all observations in class T. When I change nodesize to nodesize=5 I get the
2018 Jan 20
2
Random Forests
Thanks Carlos and Javier, ntree is the number of trees and treesize is their respective sizes (number of nodes). ntree: Number of trees to grow. This should not be set to too small ...... treesize: Size of trees (number of nodes) in and ensemble. I did set 1000 trees (ntree=1000), yes, but the treesize function gives you the number of nodes: treesize(RFfit, terminal=TRUE) gives me a vector of 1000 elements (one
2010 Apr 09
1
Question on implementing Random Forests scoring
So I've been working with Random Forests (the R library is randomForest) and I am curious whether Random Forests could be applied to classifying on a real-time basis. For instance, let's say I've scored fraud from a group of transactions. If I want to score any new incoming transactions for fraud, could Random Forests be used in that context? Linear Regression is nice in that it is very easy to
2012 Jan 27
1
Bivariate Partial Dependence Plots in Random Forests
Hello, I was wondering if anyone knew of an R function/R code to plot bivariate (3 dimensional) partial dependence plots in random forests (randomForest package). It is apparently possible using the rgl package (http://esapubs.org/archive/ecol/E088/173/appendix-C.htm) or there may be a more direct function such as the pairplot() in MART (multiple additive regression trees)? Many
2005 May 09
1
Random Forests 4.5-10 varImpPlot (PR#7844)
Full_Name: Daniel Normolle
Version: 2.0.1
OS: Linux/Fedora Core 3
Submission from: (NULL) (141.214.17.5)
varImpPlot in Random Forests 4.5-10 produces the error "incorrect number of subscripts on matrix" (and no plot) when applied to a randomForest object. This error did not occur with 4.5-4 or earlier versions.
2018 Jan 20
2
Random Forests
Yes, Carlos. I do the same, but those same numbers come out huge.
> treesize(RFfit)
[1] 4304 4302 4311 4319 4343 4298 4298 4311 4349 4327 4331 4317 4294 4321 4283 4362
[17] 4300 4330 4266 4331 4308 4352 4294 4315 4372 4349 4331 4347 4329 4348 4298 4335
[33] 4346 4396 4345 4313 4293 4276 4353 4272 4304 4325 4317 4336 4308 4351 4374 4324
[49] 4386 4359 4311 4346 4300
2012 May 08
1
Fast reading of hex data?
Hi all, Basically, I have data in the format of (up to 1 gig in size) text files containing stuff like: F34060F81000F28055F8A000F2E05EF8F000F34 (...) The data is basically strings denoting hex values (9 = 9, A = 10, B = 11, ...) organised in fixed, small blocks. What I want to do is to read in a specified segment of the string, break it up into blocks, and convert it into a vector of integers
2018 Jan 17
4
Random Forests
Good afternoon, everyone. The randomForest package has the treesize function, which gives the number of nodes. I get really high values (around 1000), and that seems strange to me. Do you know if that is right? Thanks, Manuel -- Dr Manuel Mendoza Department of Biogeography and Global Change National Museum of Natural History (MNCN) Spanish Scientific Council (CSIC) C/ Serrano 115bis, 28006 MADRID Spain
2018 Jan 22
2
Random Forests
Thank you very much, Carlos, as always. It is odd that I missed it. At the time I looked through all the arguments of RF, as I always do, but I had forgotten that one. The truth is that it was working wonderfully, but it seemed strange to me. Still, given that RFs do not overfit, there is no problem with their trees being as large as you like. I have tested it with an external dataset and it explains