> From: Michael Lindgren
> Greetings R Users!
>
> I am posting to inquire about the proximity matrix in the randomForest
> R-package. I am having difficulty pushing very large data through the
> algorithm and it appears to hang on the building of the prox matrix. I
> have read on Dr. Breiman's website that in the original code a choice
> can be made between using an N x N matrix OR to increase the ability to
> compute large datasets an N x T matrix can be created. The N refers to
> the number of samples and the T refers to the number of trees in the
> forest. It is a sentence in the FORTRAN documentation and nothing else
> is stated about it... My question is, does the randomForest module in R
> allow for this choice in proximity matrices generated by the algorithm?
> If so, can someone please point me in the direction of how to implement
> it? That would be great!
The R package is based on version 3.3 of the Fortran code, with some new
features grafted on. Unfortunately, the sparse proximity matrix is one
of the features that has not been added to the R version. The truth is
that I find the way it's done in the Fortran code not terribly
satisfying, but I do not know of a better way of doing it.
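
One possible workaround, offered only as a rough sketch and not as anything
built into the package: keep the N x T matrix of terminal-node IDs and
compute proximities on demand for just the rows of interest, so that the
full N x N matrix is never stored. The helper name prox_from_nodes and the
toy matrix below are made up for illustration. With a fitted forest rf and
data x, a node matrix of this shape should be obtainable from
attr(predict(rf, x, nodes = TRUE), "nodes"), but please check
?predict.randomForest for your installed version. Note that this sketch
counts every tree for every sample (no out-of-bag restriction).

```r
# Proximity of selected rows against all samples, computed from an
# N x T matrix of terminal-node IDs (rows = samples, columns = trees).
# prox_from_nodes is a hypothetical helper, not part of randomForest.
prox_from_nodes <- function(nodes, rows) {
  stopifnot(is.matrix(nodes), all(rows >= 1), all(rows <= nrow(nodes)))
  out <- matrix(0, nrow = length(rows), ncol = nrow(nodes))
  for (k in seq_along(rows)) {
    # TRUE where a sample lands in the same terminal node as sample
    # rows[k] in a given tree; the row mean is the proximity.
    same <- sweep(nodes, 2, nodes[rows[k], ], "==")
    out[k, ] <- rowMeans(same)
  }
  out
}

# Toy node matrix: 4 samples, 3 trees (each column is one tree).
nodes <- matrix(c(1, 1, 2, 2,
                  3, 3, 3, 4,
                  5, 6, 5, 6), nrow = 4)

# Proximities of samples 1 and 2 to all four samples (a 2 x 4 matrix).
p <- prox_from_nodes(nodes, rows = c(1, 2))
print(p)
```

Storage here is N x T plus one row of proximities per requested sample,
which only helps when the number of trees is much smaller than the number
of samples.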
Andy
> Many thanks in advance and best wishes from Alaska!
>
> Michael
>
> --
> Michael Lindgren
> GIS Technician / Programmer
> EWHALE Lab - Institute of Arctic Biology
> University of Alaska
> 419 IRVING I
> Fairbanks, AK 99775-7000
>
> Email: malindgren at alaska.edu
> Phone: 907 474 7959
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help