You do not mention the SSE4.1 version in your benchmark.
Have you tried this use case?
Thanks!
On 04/07/2022 at 19:23, Martijn van Beurden wrote:
> On Mon, 4 Jul 2022 at 15:06, olivier tristan <o.tristan at uvi.net> wrote:
>
>     While I can understand the rationale for removing manual assembly, as
>     32-bit x86 is dead, it seems a much bigger deal to remove all
>     optimizations, including the intrinsic ones.
>
>
> Yes, it does seem a great deal to remove all optimization, but it 
> really isn't. See the pull request associated with that change for 
> more information: https://github.com/xiph/flac/pull/347
> I did quite a 
> bit of testing before merging this change, on two different CPUs, each 
> with 3 different compilers, each with 4 variants of the 
> non-intrinsics-accelerated functions. It turns out that there is no 
> performance loss at all, and in many cases this change makes flac 
> actually faster, not slower as one would expect.
>
>     Maybe there should be an opt-in if you don't want them to be included
>     by default, but some people, including me, don't want to see those
>     optimizations being removed?
>
>
> There would be no advantage of that over keeping the original code: it 
> still needs to be maintained and tested, even if it is hidden behind 
> some configuration option. The only case where this patch could be 
> problematic in terms of speed is when one compiles flac to be used on 
> CPUs that do not support SSE2.
-- 
Olivier Tristan
Research & Development
www.uvi.net