Weighted mean behaves differently:

- the weight is excluded for missing x
- no warning for sum(weights) != 1

> weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
[1] 2.5
> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
[1] NA
> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
[1] 2

From: Richard O'Keefe
Sent: Monday, 12 July 2021 13:18
To: Matthias Gondan
Subject: Re: [R] density with weights missing values

Does your copy of R say that the weights must add up to 1?
?density doesn't say that in mine. But it does check.

On Mon, 12 Jul 2021 at 22:42, Matthias Gondan <matthias-gondan at gmx.de> wrote:
> Dear R users,
>
> This works as expected:
>
>   plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE))
>
> This raises an error:
>
>   plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, 1)))
>   plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1, NA)))
>
> This seems to work (it triggers a warning that the weights don't add up to 1, which makes sense*):
>
>   plot(density(c(1,2, 3, 4, 5, NA), na.rm=TRUE, weights=c(1, 1, 1, 1, 1)))
>
> Questions
>
> - But shouldn't the na.rm filter also filter the corresponding weights?
> - Extra question: In case the na.rm filter is changed to filter the weights, the check for sum(weights) == 1 might trigger false positive warnings since the weights might not add up to 1 anymore
>
> Best wishes,
>
> Matthias
>
> ______________________________________________
> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

The behavior is as documented AFAICS.

na.rm
    logical; if TRUE, missing values are removed from x. If FALSE any missing values cause an error. The default is FALSE.

weights
    numeric vector of non-negative observation weights.

NA is not a non-negative numeric.

Bert Gunter

"The trouble with having an open mind is that people keep coming along and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

>>>>> Matthias Gondan
>>>>>     on Mon, 12 Jul 2021 15:09:38 +0200 writes:

    > Weighted mean behaves differently:
    > - weight is excluded for missing x
    > - no warning for sum(weights) != 1

    >> weighted.mean(c(1, 2, 3, 4), weights=c(1, 1, 1, 1))
    > [1] 2.5
    >> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1))
    > [1] NA
    >> weighted.mean(c(1, 2, 3, NA), weights=c(1, 1, 1, 1), na.rm=TRUE)
    > [1] 2

I'm sure the 'weights' argument in weighted.mean() has been used much more often than the one in density(). Hence, it's quite "probable statistically" :-) that the weighted.mean() behavior in the NA case has been more rational and thought through.

So I agree with you, Matthias, that ideally density() should behave differently here, probably entirely analogously to weighted.mean().

Still, Bert and others are right that there is no bug formally, but something that possibly should be changed; even though it breaks back compatibility for those cases, such cases may be very rare (I'm not sure I've ever used weights in density(), but I know I've used density() very much all those 25 years ..).

https://www.r-project.org/bugs.html contains good information about determining if something may be a bug in R *and* tells you how to apply for an account on R's bugzilla for reporting it formally.

I'm hereby encouraging you, Matthias, to do that and then in your report mention both density() and weighted.mean(), i.e., a cleaned up version of the union of your first 2 e-mails.

Thank you for thinking about this and concisely reporting it.
Martin

    > From: Richard O'Keefe
    > Sent: Monday, 12 July 2021 13:18
    > To: Matthias Gondan
    > Subject: Re: [R] density with weights missing values

    > Does your copy of R say that the weights must add up to 1?
    > ?density doesn't say that in mine. But it does check.

Another small part that could be improved, indeed, thank you, Richard.

--
Martin Maechler
ETH Zurich and R Core team
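
For reference, a minimal sketch of the behaviour discussed above, i.e. density() treating the weights the way weighted.mean() does when missing values are removed. The helper name density_na_weights is hypothetical and not part of R; the sketch assumes that the weights should be rescaled to sum to 1 after the missing observations are dropped, which is one possible answer to the "extra question" and not necessarily what R itself does or will do.

```r
# Hypothetical helper (not part of R): drop missing x together with the
# matching weights, then rescale the remaining weights to sum to 1 before
# handing everything to density().
density_na_weights <- function(x, weights, ...) {
  stopifnot(length(x) == length(weights))
  keep <- !is.na(x)
  x <- x[keep]
  w <- weights[keep]
  # A missing weight for an observed x cannot be repaired here.
  if (anyNA(w)) stop("missing weight for a non-missing 'x'")
  density(x, weights = w / sum(w), ...)
}

x <- c(1, 2, 3, 4, 5, NA)

# Both calls from the original post now run: the sixth weight is dropped
# along with the missing observation, whether it is 1 or NA.
d1 <- density_na_weights(x, weights = c(1, 1, 1, 1, 1, 1))
d2 <- density_na_weights(x, weights = c(1, 1, 1, 1, 1, NA))
```

Because the weights are renormalised, the check for sum(weights) == 1 is satisfied. Depending on the R version, density() may still emit other warnings, for example about how the bandwidth is chosen when weights are supplied.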

On 2021-07-12 at 15:09, Matthias Gondan wrote:
> Weighted mean behaves differently:

One difference is that density() has a named argument 'weights' that is not present in weighted.mean(), which instead has 'w' for the weights. Annoying. So, in your examples, the argument 'weights = ' is always ignored, at least for weighted.mean.default:

    > stats:::weighted.mean.default
    function (x, w, ..., na.rm = FALSE)
    {
        if (missing(w)) {
            if (na.rm)
                x <- x[!is.na(x)]
            return(sum(x)/length(x))
        }
        if (length(w) != length(x))
            stop("'x' and 'w' must have the same length")
        if (na.rm) {
            i <- !is.na(x)
            w <- w[i]
            x <- x[i]
        }
        sum((x * w)[w != 0])/sum(w)
    }

But, using 'w' for weights, missing values in the weights will work only if na.rm = TRUE and they match missing values in x. As documented.

[...]

> - no warning for sum(weights) != 1

and no warning for sum(w) != 1. That's because the weights w are normalized (after removing weights corresponding to missing values in x).

G,
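
A short illustration of the point about 'w' versus 'weights', using deliberately unequal weights so that the difference becomes visible. The vectors x and wts are made up for this example; only weighted.mean() from base R is used.

```r
x   <- c(1, 2, 3, 4)
wts <- c(4, 3, 2, 1)   # illustrative, deliberately unequal weights

# 'weights' is not a formal argument of weighted.mean(); it is absorbed by
# '...' and the plain, unweighted mean is returned.
m_ignored <- weighted.mean(x, weights = wts)    # 2.5

# 'w' is the actual weights argument.
m_weighted <- weighted.mean(x, w = wts)         # (4 + 6 + 6 + 4) / 10 = 2

# With na.rm = TRUE the weight belonging to the missing x is dropped, and
# the result is divided by sum(w), so the weights need not sum to 1.
m_na <- weighted.mean(c(1, 2, 3, NA), w = c(1, 1, 2, 5), na.rm = TRUE)  # 9 / 4 = 2.25

# A missing weight that does not line up with a missing x gives NA.
m_bad <- weighted.mean(x, w = c(1, 1, 1, NA), na.rm = TRUE)

print(c(ignored = m_ignored, weighted = m_weighted, na_removed = m_na, bad = m_bad))
```

With equal weights, as in the examples from the original post, the two spellings happen to give the same number, which is why the ignored argument went unnoticed.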