Displaying 4 results from an estimated 4 matches for "smlald".
2014 Jun 20
2
Alleged bug in Silk codec
...32 or 16 bit processor. If you
>> would find the opposite to be true (ie that a 64 bit implementation is
>> faster on, say, a 32 bit ARM CPU) then perhaps we should reconsider.
>>
>
> Doesn't ARMv6 have a dual signed 16x16->32 multiply with a 64-bit
> accumulator (SMLALD)? Even v5E should have a single 16x16->32 with a 64-bit
> accumulator (SMLALBB). I would think a 64-bit version could be made pretty
> fast on 32-bit ARM, without even resorting to SIMD.
>
2014 Jun 20
2
Alleged bug in Silk codec
Right, there shouldn't be a problem with undefined behavior.
That said, a 64 bit implementation will work very well - in fact that's how
it was done originally.
The reason for the current implementation is to minimize 64-bit operations
in order to improve performance on limited-width architectures. This
function gets used extensively, and I think the current implementation is
faster on
2014 Jun 20
0
Alleged bug in Silk codec
...implementation is faster on a 32 or 16 bit processor. If you
> would find the opposite to be true (ie that a 64 bit implementation is
> faster on, say, a 32 bit ARM CPU) then perhaps we should reconsider.
Doesn't ARMv6 have a dual signed 16x16->32 multiply with a 64-bit
accumulator (SMLALD)? Even v5E should have a single 16x16->32 with a
64-bit accumulator (SMLALBB). I would think a 64-bit version could be
made pretty fast on 32-bit ARM, without even resorting to SIMD.
2014 Jun 25
0
Alleged bug in Silk codec
...e:
current implementation is faster on a 32 or 16 bit processor. If you
would find the opposite to be true (ie that a 64 bit implementation is
faster on, say, a 32 bit ARM CPU) then perhaps we should reconsider.
Doesn't ARMv6 have a dual signed 16x16->32 multiply with a 64-bit accumulator (SMLALD)? Even v5E should have a single 16x16->32 with a 64-bit accumulator (SMLALBB). I would think a 64-bit version could be made pretty fast on 32-bit ARM, without even resorting to SIMD.
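As a rough illustration of the point above (this is not code from the Silk sources), here is a portable C sketch of an inner product over 16-bit samples that keeps a 64-bit accumulator. The names dual_mac16 and inner_prod16_64 are hypothetical. Each call to dual_mac16 performs two signed 16x16->32 multiplies and adds both products into a 64-bit sum, which is the arithmetic the message attributes to a dual multiply-accumulate such as SMLALD. Whether a given compiler actually emits that instruction for this code is an assumption and would need to be checked in the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: two signed 16x16->32 products added into a
 * 64-bit accumulator. Each product fits in 32 bits (at most 2^30),
 * so only the running sum needs 64-bit precision. */
static int64_t dual_mac16(int64_t acc,
                          int16_t a0, int16_t b0,
                          int16_t a1, int16_t b1)
{
    acc += (int32_t)a0 * (int32_t)b0;
    acc += (int32_t)a1 * (int32_t)b1;
    return acc;
}

/* Hypothetical inner product: consumes the samples two at a time and
 * handles a trailing odd sample with a single multiply-accumulate. */
int64_t inner_prod16_64(const int16_t *x, const int16_t *y, size_t n)
{
    int64_t acc = 0;
    size_t i = 0;

    for (; i + 1 < n; i += 2) {
        acc = dual_mac16(acc, x[i], y[i], x[i + 1], y[i + 1]);
    }
    if (i < n) {
        acc += (int32_t)x[i] * (int32_t)y[i];
    }
    return acc;
}
```

With four samples of -32768 in each input, the sum is 2^32, which no longer fits in 32 bits. That is the case where the 64-bit accumulator matters.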