Hi Aidan, I think you are on the right email list.
I'm not R-core, but this looks like an interesting/meaningful/significant
contribution to base R.
I'm not sure what the original dendrapply looks like in terms of code style
(variable names/white space formatting/etc) but in my experience it is
important that your code contribution makes minimal changes in that area.
Did you hear about the R project sprint 2023?
https://contributor.r-project.org/r-project-sprint-2023/ Your work falls
into the "new developments" category so I think you could apply for that
funding to participate.
Toby
On Fri, Feb 24, 2023 at 3:47 AM Lakshman, Aidan H <AHL27 at pitt.edu>
wrote:
> Hi everyone,
>
> My apologies if this isn't the right place to submit this; I'm new to the
> R-devel community and still figuring out what is where.
>
> If people want to skip my writeup and just look at the code, I've made a
> repository for it here:
> https://github.com/ahl27/new_dendrapply/tree/master. I'm not quite sure
> how to integrate it into a fork of R-devel; the package structure is
> different from what I'm used to.
>
> I had written a slightly improved version of dendrapply for one of my
> research projects, and my advisor encouraged me to submit it to the R
> project. It took me longer than I expected, but I've finally gotten my
> implementation to be a drop-in replacement for `stats::dendrapply`. The man
> page for `stats::dendrapply` says "The implementation is somewhat
> experimental and suggestions for enhancements (or nice examples of usage)
> are very welcome," so I figured this had the potential to be a worthwhile
> contribution. I wanted to send it out to R-devel to see if this was
> something worth pursuing as an enhancement to R.
>
> The implementation I have is based in C, which I understand implies an
> increased burden of maintenance over pure R code. However, it does come
> with the following benefits:
>
> - Completely eliminates recursion, so no memory overhead from function
> calls or possibility of stack overflows (this was a major issue reported on
> some of the functions in one of our Bioconductor packages that previously
> used `dendrapply`).
> - Modest runtime improvement, around 2x on my computer (2021 MBP, 32GB
> RAM). I'm relatively confident this could be optimized more.
> - Seemingly significant reduction in memory usage; I'm still working on a
> robust benchmark. Suggestions for the best way to do that are welcome.
> - Support for applying functions with an inorder traversal (as in
> `stats::dendrapply`) as well as using a postorder traversal.
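
A rough sketch of how a tree can be walked in postorder without recursion, using an explicit stack. This is only an illustration in R on plain nested lists; it is not the C code from the repository, and `postorder_paths` is a hypothetical name.

```r
# Hypothetical helper: return the index path of every node of a nested
# list in postorder (children before parents) without recursive calls.
postorder_paths <- function(tree) {
  stack <- list(integer(0))            # integer(0) is the path of the root
  visited <- list()
  while (length(stack) > 0) {
    path <- stack[[length(stack)]]     # pop the most recently pushed path
    stack[[length(stack)]] <- NULL
    visited[[length(visited) + 1L]] <- path
    node <- if (length(path)) tree[[path]] else tree
    if (is.list(node)) {
      # push children left to right, so they are popped right to left
      for (i in seq_along(node)) stack[[length(stack) + 1L]] <- c(path, i)
    }
  }
  # the visit order is root, right, left; reversing it gives postorder
  rev(visited)
}

tree <- list(list("a", "b"), "c")
paths <- postorder_paths(tree)
labels <- vapply(paths, function(p) paste(p, collapse = "."), character(1))
print(labels)
```

Each path can be used directly as `tree[[path]]`, so a function could then be applied to the nodes in that order, with the empty path standing for the root.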
>
> This implementation was tested manually and against all the unit tests in
> `dendextend`, which makes heavy use of `dendrapply`.
>
> The postorder traversal would be a significant new functionality to
> dendrapply, as it would allow for functions that use the child nodes to
> correctly execute. A toy example of this is something like:
> ```
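> # Any dendrogram works here; this one is just an assumed example input.
> dend <- as.dendrogram(hclust(dist(USArrests[1:5, ])))
>
> exFunc <- function(x){
>   attr(x, 'newA') <- 'a'
>   # Internal nodes have no 'leaf' attribute; print the children's 'newA'.
>   if(is.null(attr(x, 'leaf'))){
>     cat(attr(x[[1]], 'newA'), attr(x[[2]], 'newA'))
>     cat('\n')
>   }
>   x
> }
>
> dendrapply(dend, exFunc)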
> ```
>
> With the current version of dendrapply, this prints nothing, but the
> postorder traversal version will print "a" twice for each internal branch.
> If this would be a worthwhile addition, I can refactor the code for brevity
> and add a `how=c("in.order", "post.order")` argument, with the default value
> "in.order" to maintain backwards compatibility. A preorder traversal
> version should also be possible; I just haven't gotten to it yet.
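
A minimal sketch of what such a `how` argument could look like, assuming `match.arg` picks the default. This is a small recursive R illustration, not the proposed C implementation; `dendrapply_sketch` and `mark` are hypothetical names, and "in.order" here means applying the function to a node before its children, which is what the toy example above implies.

```r
# Hypothetical sketch: a recursive R function showing what a `how`
# argument could select between. Assumes FUN returns a dendrogram node.
dendrapply_sketch <- function(X, FUN, ..., how = c("in.order", "post.order")) {
  how <- match.arg(how)  # falls back to "in.order" when not supplied
  walk <- function(node) {
    # "in.order": apply FUN before visiting the children
    if (how == "in.order") node <- FUN(node, ...)
    if (!is.leaf(node)) {
      for (i in seq_along(node)) node[[i]] <- walk(node[[i]])
    }
    # "post.order": apply FUN once all children have been processed
    if (how == "post.order") node <- FUN(node, ...)
    node
  }
  walk(X)
}

# Records whether the children already carry 'newA' when a node is visited
mark <- function(x) {
  kids <- if (is.leaf(x)) list() else unclass(x)
  attr(x, "kidsDone") <- length(kids) > 0 &&
    all(vapply(kids, function(k) !is.null(attr(k, "newA")), logical(1)))
  attr(x, "newA") <- "a"
  x
}

dend <- as.dendrogram(hclust(dist(c(1, 2, 4, 8))))
in_order   <- attr(dendrapply_sketch(dend, mark), "kidsDone")
post_order <- attr(dendrapply_sketch(dend, mark, how = "post.order"), "kidsDone")
print(c(in.order = in_order, post.order = post_order))
```

Keeping "in.order" as the first element of the default vector means existing calls that never pass `how` would behave as before.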
>
> I think the runtime could be optimized more as well.
>
> Thank you in advance for looking at my code and offering feedback; I'm
> excited at the possibility of helping contribute to the R project! I'm
> happy to discuss more either here, on GitHub, or on the R Contributors
> Slack.
>
> Sincerely,
> Aidan Lakshman
>
> -----------------------
> Aidan Lakshman (he/him)<https://www.ahl27.com/>
> Doctoral Candidate, Wright Lab<https://www.wrightlabscience.com/>
> University of Pittsburgh School of Medicine
> Department of Biomedical Informatics
> ahl27 at pitt.edu
> (724) 612-9940
>
>
> ______________________________________________
> R-devel at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel