Izmirlian, Grant (NIH/NCI)
2005-Nov-04 15:43 UTC
[Rd] Classification Trees and basic Random Forest pkg using tree structures in C
Hello R-devel:

I have written a package, called "woods", that does classification trees (R function CT) and, currently, only the most basic functionality of Random Forest, e.g. bagged trees with choices about sample size, sampling with or without replacement, and the size of the (random) subset of covariates drawn when nodes are split.

My reason for writing this is twofold. First, I wanted to base this development entirely in C (as others have done), but using data structures such as a node, pointer to node (for trees), and pointer to pointer to node (for forests) implemented in C. The algorithm which does bagging isn't any faster (it's 30% slower) than the one by Leo Breiman/Adele Cutler/Andy Liaw/Matt Wiener. The CT function runs about as fast as Professor Brian Ripley's.

The only interesting feature is that the tree structure has been implemented in C. It's a neater way to carry stuff around and, I am guessing, would make future implementation easier.

Because of its inherent redundancy from the user's standpoint, it isn't something to send to CRAN. However, I was wondering whether anyone is interested in a copy?

Grant Izmirlian
NCI
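The node / pointer-to-node / pointer-to-pointer-to-node layout described in the message can be illustrated with a minimal sketch in C. This is not the code of "woods", which was not posted to the list: the struct fields, the function names (new_leaf, new_split, predict_tree, predict_forest, free_tree), and the majority-vote rule with ties going to the lowest class index are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; field names are illustrative, not from "woods". */
typedef struct node {
    int split_var;        /* index of the covariate split on; -1 marks a leaf */
    double split_value;   /* go left when x[split_var] <= split_value */
    int predicted_class;  /* class label, meaningful only at a leaf */
    struct node *left;
    struct node *right;
} node;

/* A tree is a pointer to its root node. */
node *new_leaf(int predicted_class)
{
    node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->split_var = -1;
    n->split_value = 0.0;
    n->predicted_class = predicted_class;
    n->left = n->right = NULL;
    return n;
}

node *new_split(int split_var, double split_value, node *left, node *right)
{
    node *n = new_leaf(-1);
    n->split_var = split_var;
    n->split_value = split_value;
    n->left = left;
    n->right = right;
    return n;
}

/* Walk from the root to a leaf and return that leaf's class. */
int predict_tree(const node *t, const double *x)
{
    while (t->split_var >= 0)
        t = (x[t->split_var] <= t->split_value) ? t->left : t->right;
    return t->predicted_class;
}

/* A forest is a pointer to pointer to node: an array of tree roots.
   Prediction is a majority vote; ties go to the lowest class index. */
int predict_forest(node **forest, int n_trees, const double *x, int n_classes)
{
    int *votes = calloc((size_t)n_classes, sizeof *votes);
    int best = 0;
    assert(votes != NULL);
    for (int i = 0; i < n_trees; i++)
        votes[predict_tree(forest[i], x)]++;
    for (int k = 1; k < n_classes; k++)
        if (votes[k] > votes[best])
            best = k;
    free(votes);
    return best;
}

/* Release a tree by post-order traversal. */
void free_tree(node *t)
{
    if (t == NULL)
        return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}
```

With this layout a bagged ensemble is just an array of roots, so growing, predicting with, and freeing a forest are loops over the single-tree routines.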
Hin-Tak Leung
2005-Nov-04 17:43 UTC
[Rd] Classification Trees and basic Random Forest pkg using tree structures in C
Izmirlian, Grant (NIH/NCI) wrote:
<snipped>
> The only interesting feature is that the tree structure has been
> implemented in C. It's a neater way to carry stuff around and I am
> guessing would make future implementation easier.
>
> Because of its inherent redundancy from the user's standpoint, it
> isn't something to send to CRAN. However, I was wondering whether
> anyone is interested in a copy?

Hi,

Hmm, why didn't you just post a URL? Incidentally, I am actually very interested in seeing your code. I am working on a project where the data set is extremely large, but the number of distinct states in the data is extremely small. Each piece of data takes one of only 4 states, so stuffing it into an R object (which takes up 32 bytes? on 32-bit machines) or even a char vector is quite wasteful; so I have written a "strange" data.frame which internally uses only 2 bits of storage per cell. (It is still a work in progress, but I have got to the point of being able to get and set each 2-bit cell now.)

Hin-Tak Leung
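The 2-bit storage idea in this message can be sketched in C as follows. The actual data.frame code was not posted, so everything here is an assumption for illustration: the packed2 type, the function names, and the choice to put four cells in each byte with cell 0 in the lowest two bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical packed vector: each cell holds a state in 0..3 (2 bits). */
typedef struct {
    size_t n_cells;
    unsigned char *bytes;  /* four cells per byte, cell 0 in the low bits */
} packed2;

packed2 *packed2_new(size_t n_cells)
{
    packed2 *p = malloc(sizeof *p);
    assert(p != NULL && n_cells > 0);
    p->n_cells = n_cells;
    p->bytes = calloc((n_cells + 3) / 4, 1);  /* round up; all cells start at 0 */
    assert(p->bytes != NULL);
    return p;
}

/* Read cell i: shift its two bits down to the bottom and mask them off. */
unsigned packed2_get(const packed2 *p, size_t i)
{
    unsigned shift = (unsigned)(i % 4) * 2;
    assert(i < p->n_cells);
    return (p->bytes[i / 4] >> shift) & 3u;
}

/* Write cell i: clear its two bits, then OR in the new state. */
void packed2_set(packed2 *p, size_t i, unsigned state)
{
    unsigned shift = (unsigned)(i % 4) * 2;
    unsigned char *b;
    assert(i < p->n_cells && state < 4);
    b = &p->bytes[i / 4];
    *b = (unsigned char)((*b & ~(3u << shift)) | (state << shift));
}

void packed2_free(packed2 *p)
{
    free(p->bytes);
    free(p);
}
```

Under these assumptions a column of n cells occupies (n + 3) / 4 bytes, one quarter of what a char vector with one byte per cell would need.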
Izmirlian, Grant (NIH/NCI)
2005-Nov-04 18:57 UTC
[Rd] Classification Trees and basic Random Forest pkg using tree structures in C
Hello Hin-Tak: Thanks for your interest. This is just a short note to tell you and others that the URL idea is a good one. Setting it up will take a few days at our organization. When it's available, I will post again to this thread. In the meantime, I will send copies directly to those interested: so far, you and one other person. Regards, Grant