thr3ads.net - R devel - [Rd] Do rowMeans and colMeans of complex vars need adjusting following r88444? [Aug 2025]

If this information is useful, please help other people find it:
Share via:

Dirk Eddelbuettel

2025-Aug-25 10:46 UTC

[Rd] Do rowMeans and colMeans of complex vars need adjusting following r88444?

On 25 August 2025 at 11:54, Martin Maechler wrote:
| I still agree we should address this:  We do have a discrepancy
| with mean() i.e.,  mean.default()  which does "exactly" what you
| do "by hand" above,  and hence using is.na() for the full
| complex vector, and not *separately* for Re() and Im() parts;
| ... and I do tend to agree that  colMeans(*, na.rm=TRUE) etc
| probably should be adapted to *not* work coordinate-wise but
| drop all "complex NAs" both for Re and Im.

I agree with that.

The culprit is the per-component call (here from colMeans, same for rowMeans)

    z <- if(is.complex(x))
        .Internal(colMeans(Re(x), n, prod(dn), na.rm)) +
            1i * .Internal(colMeans(Im(x), n, prod(dn), na.rm))
    else .Internal(colMeans(x, n, prod(dn), na.rm))

where we may need to add a na.omit() if na.rm is TRUE.

But I didn't like that idea well enough yet yesterday to propose a patch
along those lines as I did not think about the comparison with mean.default
which is quite convincing. Having results differ even within r-devel is quite
bad indeed, and mean.default, swept by hand across rows or columns, also
return what r-release returns here for a matrix containing NAs.

So maybe na.omit() is the best we can do here?

Dirk

-- 
dirk.eddelbuettel.com | @eddelbuettel | edd at debian.org

Martin Maechler

2025-Aug-25 15:56 UTC

head link

[Rd] Do rowMeans and colMeans of complex vars need adjusting following r88444?

>>>>> Dirk Eddelbuettel 
>>>>>     on Mon, 25 Aug 2025 05:46:56 -0500 writes:
    > On 25 August 2025 at 11:54, Martin Maechler wrote:
    > | I still agree we should address this:  We do have a discrepancy
    > | with mean() i.e.,  mean.default()  which does "exactly"
what you
    > | do "by hand" above,  and hence using is.na() for the full
    > | complex vector, and not *separately* for Re() and Im() parts;
    > | ... and I do tend to agree that  colMeans(*, na.rm=TRUE) etc
    > | probably should be adapted to *not* work coordinate-wise but
    > | drop all "complex NAs" both for Re and Im.

    > I agree with that.

    > The culprit is the per-component call (here from colMeans, same for
rowMeans)

    > z <- if(is.complex(x))
    >       .Internal(colMeans(Re(x), n, prod(dn), na.rm)) +
    >  1i * .Internal(colMeans(Im(x), n, prod(dn), na.rm))
    > else  .Internal(colMeans(x, n, prod(dn), na.rm))

    > where we may need to add a na.omit() if na.rm is TRUE.

unfortunately, no:  na.omit() cannot be the solution easily for
_both_ colMeans() and rowMeans()  { colSums() & rowSums() !
which already make use of na.rm,
as indeed  na.omit() treats matrices as data matrices, i.e.,
rows treated very differently than columns,

but in any case, it would drop full rows (or columns if you try to transpose
first) and
that may be very wrong as different rows/columns may have NAs in verydifferent
locations.

I really think a solution must happen in C code.

Further, note that the

   if(is.complex(x))
	    .Internal(colMeans(Re(x), n, prod(dn), na.rm)) +
       1i * .Internal(colMeans(Im(x), n, prod(dn), na.rm))

with na.rm = TRUE  has been *wrong* always, for all cases where
is.na(Re(x))  and
is.na(Im(x))  have differed ...

So this problem is really a new one, only tangentially related to the
original PR #18918  and also only loosely related Mikael's / my
patch to fix PR #18918.

As you have unearthed it, I'm fine if you open an R bugzilla
issue for it;  otherwise, I'd do it myself.

Martin

R devel - Aug 2025 - Do rowMeans and colMeans of complex vars need adjusting following r88444?

[Rd] Do rowMeans and colMeans of complex vars need adjusting following r88444?

[Rd] Do rowMeans and colMeans of complex vars need adjusting following r88444?