Hi

I noticed a small inconsistency when using sum() vs cumsum().

I have a char-based series:

> tryjpy$long
 [1] "0.0022"  "-0.0002" "-0.0149" "-0.0023" "-0.0342" "-0.0245" "-0.0022"
 [8] "0.0003"  "-0.0001" "-0.0004" "-0.0036" "-0.001"  "-0.0011" "-0.0012"
[15] "-0.0006" "0.0016"  "0.0006"

When I run sum() vs cumsum(), sum fails but cumsum converts the series to numeric before summing:

> sum(tryjpy$long)
Error in sum(tryjpy$long) : invalid 'type' (character) of argument

> cumsum(tryjpy$long)
 [1]  0.0022  0.0020 -0.0129 -0.0152 -0.0494 -0.0739 -0.0761 -0.0758 -0.0759
[10] -0.0763 -0.0799 -0.0809 -0.0820 -0.0832 -0.0838 -0.0822 -0.0816

Which I guess is due to the following line in do_cum():

    PROTECT(t = coerceVector(CAR(args), REALSXP));

This might be fine and there may be very good reasons why there is no coercion in sum - just seems a little inconsistent in usage.

Cheers
-- Rory
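For anyone wanting to reproduce the report without the original data, here is a minimal sketch. The vector `x` is a hypothetical stand-in holding the first three values of `tryjpy$long`; `tryCatch()` is used only so that the script keeps running after `sum()` signals its error.

```r
# Hypothetical stand-in for the first three values of tryjpy$long
x <- c("0.0022", "-0.0002", "-0.0149")

# sum() rejects character input; capture the error message instead of stopping
sum_result <- tryCatch(sum(x), error = function(e) conditionMessage(e))
print(sum_result)

# cumsum() coerces the character vector to double before accumulating
cum_result <- cumsum(x)
print(cum_result)
```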
On 8/23/20 5:02 PM, Rory Winston wrote:
> I noticed a small inconsistency when using sum() vs cumsum()
> [...]
> This might be fine and there may be very good reasons why there is no
> coercion in sum - just seems a little inconsistent in usage

Yes. I don't know the reason for this design, but please note it is documented in ?sum and in ?cumsum, which would also make it harder to change. One can always stick to a consistent subset (that is, not rely on the coercion, e.g. from characters).

Best
Tomas
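The advice above, not relying on the implicit coercion, might look like the following sketch. The vector `x` is again a hypothetical stand-in for `tryjpy$long`; the conversion is done once with `as.numeric()`, after which both functions receive the same numeric input.

```r
# Hypothetical stand-in for the character series
x <- c("0.0022", "-0.0002", "-0.0149")

# Convert explicitly, once, instead of relying on cumsum()'s coercion
x_num <- as.numeric(x)

total   <- sum(x_num)     # works, because the input is now numeric
running <- cumsum(x_num)  # same input type, so the two calls are consistent

# The last running total agrees with the overall sum
print(all.equal(running[length(running)], total))
```

Note that `as.numeric()` yields `NA` with a warning for strings that do not parse as numbers, so malformed input surfaces at the conversion step.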
>>>>> Tomas Kalibera
>>>>>     on Tue, 25 Aug 2020 09:29:05 +0200 writes:

    > On 8/23/20 5:02 PM, Rory Winston wrote:
    >> [...]
    >> This might be fine and there may be very good reasons why there is no
    >> coercion in sum - just seems a little inconsistent in usage

    > Yes. I don't know the reason for this design, but please note it is
    > documented in ?sum and in ?cumsum, which would also make it harder to
    > change. One can always use a consistent subset (not rely on the coercion
    > e.g. from characters).

    > Best
    > Tomas

Indeed. Further note that most arithmetic/math *fails* on character vectors, so if a change were to be made, it should rather be such that cumsum() also rejects character input.

We would have consistency then, but potentially break user code, even package code, which has hitherto assumed cumsum() to coerce to numeric first.

If a majority of commentators and R core think we should make such a change, I'd agree to consider it. Otherwise, we save (ourselves and others) a bit of time.

Martin
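Code that wants the stricter behaviour today, without any change to base R, could use a small wrapper along these lines. This is only a sketch: `strict_cumsum` is a hypothetical name, not an existing R function, and its error message merely imitates the one from `sum()`.

```r
# Hypothetical wrapper: refuse character input as sum() does, otherwise defer to cumsum()
strict_cumsum <- function(x) {
  if (is.character(x)) {
    stop("invalid 'type' (character) of argument")
  }
  cumsum(x)
}

# Numeric input passes through unchanged
ok <- strict_cumsum(c(1, 2, 3))
print(ok)

# Character input now fails; record whether an error was signalled
failed <- inherits(try(strict_cumsum(c("1", "2")), silent = TRUE), "try-error")
print(failed)
```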