Liaw, Andy
2005-Nov-04 19:00 UTC
[Rd] Classification Trees and basic Random Forest pkg using tree structures in C
> From: Hin-Tak Leung
>
> Izmirlian, Grant (NIH/NCI) wrote:
> <snipped>
> > The only interesting feature is that the tree structure has been
> > implemented in C. It's a neater way to carry stuff around and I am
> > guessing would make future implementation easier.
> >
> > Because of its inherent redundancy from the user's standpoint, it
> > isn't something to send to CRAN. However, I was wondering whether
> > anyone is interested in a copy?
>
> Hi,
>
> Hmm, why didn't you just post a URL?

Isn't it a bit too much to assume that everyone has a personal web space somewhere?

> Incidentally I am actually very
> interested in seeing your code. I am working on a project where
> the data set is extremely large, but the permutation of the states of
> the data is extremely small. Each piece of data consists of only 4
> states, so stuffing it as an R object (which takes up 32-byte? on
> 32-bit machines) or even a char vector is quite wasteful; so I
> have written a "strange" data.frame where internally it uses only
> 2-bit for storage. (It is still a work in progress, but I have got to
> the point of being able to get and set each 2-bit cell now.)

For some of the data we encounter, all X variables are binary, so each data point can be encoded into a bitstring. There are algorithms that take advantage of that. The problem is interfacing such code with R. I know of no good solutions. As I told Grant, I thought about what he did, too, but the difficulty is how to pass such data structures to R. Actually, some time down the road I might try to use the dendrogram class that's in R, and manipulate them in C. Not sure about efficiency though.

Andy

> Hin-Tak Leung
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
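The 2-bit storage idea mentioned above (four-state data packed four cells to a byte, with get and set on each 2-bit cell) can be sketched roughly as follows. This is only an illustration in plain C, not Hin-Tak Leung's code: the type `packed2` and the functions `packed2_new`, `packed2_get`, `packed2_set` and `packed2_free` are hypothetical names, and nothing here addresses how such a structure would be exposed to R.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical container: n cells, each holding a state in 0..3,
   packed four to a byte. */
typedef struct {
    size_t   n;      /* number of 2-bit cells */
    uint8_t *bytes;  /* n / 4 + 1 bytes of storage */
} packed2;

/* Allocate storage for n cells, all initialised to state 0.
   n / 4 + 1 bytes always covers n cells and is never zero.
   Returns NULL on allocation failure. */
packed2 *packed2_new(size_t n)
{
    packed2 *p = malloc(sizeof *p);
    if (p == NULL) return NULL;
    p->n = n;
    p->bytes = calloc(n / 4 + 1, 1);
    if (p->bytes == NULL) { free(p); return NULL; }
    return p;
}

void packed2_free(packed2 *p)
{
    if (p != NULL) { free(p->bytes); free(p); }
}

/* Read cell i: pick the byte, shift the 2-bit field down, mask it. */
unsigned packed2_get(const packed2 *p, size_t i)
{
    assert(i < p->n);
    unsigned shift = (unsigned)(i % 4) * 2;
    return (p->bytes[i / 4] >> shift) & 3u;
}

/* Write cell i: clear the old 2-bit field, then OR in the new state. */
void packed2_set(packed2 *p, size_t i, unsigned state)
{
    assert(i < p->n && state < 4);
    unsigned shift = (unsigned)(i % 4) * 2;
    uint8_t *b = &p->bytes[i / 4];
    *b = (uint8_t)((*b & ~(3u << shift)) | (state << shift));
}
```

Compared with one byte per cell in a char vector, this layout uses a quarter of the memory, at the cost of a shift and a mask on every access.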
Torsten Hothorn
2005-Nov-07 08:20 UTC
[Rd] Classification Trees and basic Random Forest pkg using tree structures in C
On Fri, 4 Nov 2005, Liaw, Andy wrote:

> For some of the data we encounter, all X variables are binary, so each data
> point can be encoded into a bitstring. There are algorithms that take
> advantage of that. The problem is interfacing such code with R. I know of
> no good solutions. As I told Grant, I thought about what he did, too, but
> the difficulty is how to pass such data structures to R. Actually, some
> time down the road I might try to use the dendrogram class that's in R, and
> manipulate them in C.

I faced similar problems some time ago and ended up representing a (binary) tree as recursive lists, which can be manipulated from both the C side and the R side. The `party' package has the code (and an internal random forest function, though without an R interface yet), and the vignette explains some details.

Best,

Torsten

> Not sure about efficiency though.
>
> Andy
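For orientation, a binary classification tree held as a recursive structure in plain C might look like the sketch below. It is neither the recursive-list representation used in `party' (which lives in R lists so that both R and C can reach it) nor Grant Izmirlian's code; the `node` type and the functions `make_leaf`, `make_split`, `tree_predict` and `tree_free` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One node of a binary tree. A leaf has var == -1 and carries a label;
   an inner node carries a split and two children. */
typedef struct node {
    int          var;    /* index of the split variable; -1 marks a leaf */
    double       cut;    /* go left when x[var] <= cut */
    int          label;  /* predicted class, used at leaves only */
    struct node *left, *right;
} node;

/* Create a leaf predicting the given class; NULL on allocation failure. */
node *make_leaf(int label)
{
    node *nd = malloc(sizeof *nd);
    if (nd == NULL) return NULL;
    nd->var = -1; nd->cut = 0.0; nd->label = label;
    nd->left = nd->right = NULL;
    return nd;
}

/* Create an inner node that takes ownership of both subtrees. */
node *make_split(int var, double cut, node *left, node *right)
{
    assert(var >= 0 && left != NULL && right != NULL);
    node *nd = malloc(sizeof *nd);
    if (nd == NULL) return NULL;
    nd->var = var; nd->cut = cut; nd->label = -1;
    nd->left = left; nd->right = right;
    return nd;
}

/* Drop an observation x down the tree and return the leaf's label. */
int tree_predict(const node *nd, const double *x)
{
    while (nd->var >= 0)
        nd = (x[nd->var] <= nd->cut) ? nd->left : nd->right;
    return nd->label;
}

/* Free a whole subtree recursively. */
void tree_free(node *nd)
{
    if (nd == NULL) return;
    tree_free(nd->left);
    tree_free(nd->right);
    free(nd);
}
```

A structure like this is convenient inside C, but it is invisible to R, which is the interfacing difficulty raised in this thread; keeping the tree in R lists trades some of that convenience for access from both sides.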
Hin-Tak Leung
2005-Nov-07 10:39 UTC
[Rd] Classification Trees and basic Random Forest pkg using tree structures in C
Liaw, Andy wrote:

>> From: Hin-Tak Leung
<snipped>
>> Hmm, why didn't you just post a URL?
>
> Isn't it a bit too much to assume that everyone has a personal web space
> somewhere?

Just for the sake of argument... I did assume that nih.gov is a sizeable government organization and has official channels for such things. That's what government agencies do, and I suppose this software work is in the line of duty, for public consumption, and therefore quite appropriate to put on a *.gov web site. (The same applies to *.ac.uk and *.edu postings.)

> For some of the data we encounter, all X variables are binary, so each data
> point can be encoded into a bitstring. There are algorithms that take
> advantage of that. The problem is interfacing such code with R. I know of
> no good solutions. As I told Grant, I thought about what he did, too, but
> the difficulty is how to pass such data structures to R. Actually, some
> time down the road I might try to use the dendrogram class that's in R, and
> manipulate them in C. Not sure about efficiency though.

The best examples I have seen of manipulating foreign objects are among the Omegahat projects, like RSPerl and RSPython. Quite interesting reading.

Hin-Tak Leung
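The bitstring encoding quoted above (every X variable binary, so one data point fits in a bitstring) can be illustrated with a small sketch in plain C. It assumes at most 64 predictors so that a row fits in one `uint64_t`; the function names `encode_row`, `row_var` and `row_distance` are hypothetical, and the Hamming distance is just one example of an operation that becomes cheap on such an encoding, not a claim about which algorithms Andy Liaw had in mind.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack p binary predictors (each 0 or 1, p <= 64) into one word;
   variable j lands in bit j. */
uint64_t encode_row(const int *x, size_t p)
{
    assert(p <= 64);
    uint64_t bits = 0;
    for (size_t j = 0; j < p; j++) {
        assert(x[j] == 0 || x[j] == 1);
        bits |= (uint64_t)x[j] << j;
    }
    return bits;
}

/* Value of variable j for an encoded row: a single shift and mask. */
int row_var(uint64_t bits, unsigned j)
{
    assert(j < 64);
    return (int)((bits >> j) & 1u);
}

/* Number of variables on which two rows differ (Hamming distance),
   counting the set bits of the XOR with Kernighan's loop. */
unsigned row_distance(uint64_t a, uint64_t b)
{
    uint64_t d = a ^ b;
    unsigned count = 0;
    while (d != 0) {
        d &= d - 1;   /* clear the lowest set bit */
        count++;
    }
    return count;
}
```

With rows stored this way, testing a split on variable j is one call to `row_var`, and comparing two observations is one XOR plus a bit count, rather than a loop over p separate values.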