similar to: Why is sha256-generic preferred over sha256-ssse3?

2020 May 18
3
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
What do you base this on? Per https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html : "For the x86-32 compiler, you must use -march=cpu-type, -msse or -msse2 switches to enable SSE extensions and make this option effective. For the x86-64 compiler, these extensions are enabled by default." That reads to me like we're fine for SSE2. As stated in my comments, SSSE3 support must be
2020 May 18
2
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I don't disagree that MD5 could (or even should) be replaced so it is no longer the bottleneck in several real-world cases (including mine). However this patch is not for MD5 performance, rather for the rolling checksum rsync uses to match blocks on existing files on both ends to reduce transfer size. On Mon, May 18, 2020 at 5:44 PM Filipe Maia via rsync <rsync at lists.samba.org>
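For readers unfamiliar with the term, a rolling checksum of the kind mentioned in this thread can be sketched roughly as follows. This is a plain scalar illustration in C, not the patch itself and not rsync's actual get_checksum1(). The names weak_checksum and weak_roll are hypothetical, and treating bytes as unsigned is an assumption made for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: two running sums packed into one 32-bit value.
 * s1 is the sum of the bytes, s2 is the sum of the successive s1 values.
 * Bytes are treated as unsigned here (an assumption of this sketch). */
uint32_t weak_checksum(const unsigned char *buf, size_t len)
{
    uint32_t s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 += buf[i];
        s2 += s1;
    }
    return (s1 & 0xffffu) | (s2 << 16);
}

/* Hypothetical helper: slide a window of `len` bytes forward by one byte,
 * dropping `out` and appending `in`, without rescanning the window. */
uint32_t weak_roll(uint32_t sum, size_t len, unsigned char out, unsigned char in)
{
    uint32_t s1 = sum & 0xffffu;
    uint32_t s2 = sum >> 16;
    s1 = (s1 - out + in) & 0xffffu;
    s2 = (s2 - (uint32_t)len * out + s1) & 0xffffu;
    return s1 | (s2 << 16);
}
```

The point of the "rolling" property is that moving the window by one byte costs a handful of operations instead of a full pass over the block, which is what makes block matching against an existing file affordable.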
2020 May 18
6
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
This drop-in patch increases the performance of the get_checksum1() function on x86-64. On the target slow CPU, performance of the function increased by nearly 50% in the x86-64 default SSE2 mode, and by nearly 100% if the compiler was told to enable SSSE3 support. The increase was over 200% on the fastest CPU tested in SSSE3 mode. Transfer time improvement with large files existing on both ends
2020 May 19
5
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I've read up some more on the subject, and it seems the proper way to do this with GCC is g++ and target attributes. I've refactored the patch that way, and it indeed uses SSSE3 automatically on supporting CPUs, regardless of the build host, so this should be ideal both for home builders and distros. Getting the code to build right in c++ mode (checksum_sse2.cpp only) was a bit of an
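One way to select an SSSE3 code path at run time, independent of the build host, is manual dispatch using the GCC target attribute together with __builtin_cpu_supports(). The sketch below is plain C and is only an illustration of that general idea; it is not the g++ approach the patch describes. The function names are hypothetical, and the loop body is a placeholder byte sum, not the rsync checksum.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical baseline path, built with the default compiler flags. */
static uint32_t sum_bytes_generic(const unsigned char *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Same logic, but the compiler may use SSSE3 instructions in this
 * function only, even when the rest of the file is built without -mssse3. */
__attribute__((target("ssse3")))
static uint32_t sum_bytes_ssse3(const unsigned char *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}
#endif

/* Hypothetical entry point: pick the path supported by the running CPU. */
uint32_t sum_bytes(const unsigned char *buf, size_t len)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    if (__builtin_cpu_supports("ssse3"))
        return sum_bytes_ssse3(buf, len);
#endif
    return sum_bytes_generic(buf, len);
}
```

On non-x86 targets or other compilers the preprocessor guard leaves only the generic path, so the same source still builds.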
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I think this is a great patch but, in my view, an even better way to tackle the fundamental problem (the performance limitations) is to use a much faster checksum like xxhash, as has been suggested before: https://lists.samba.org/archive/rsync/2019-October/031975.html Cheers, Filipe On Mon, 18 May 2020 at 17:08, Jorrit Jongma via rsync <rsync at lists.samba.org> wrote: > This drop-in
2020 May 20
0
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
Would it perhaps make sense to have a "--disable-sse2/3" command-line switch in rsync, too, at least for some time until this is considered "rock solid"? I dislike having automatic CPU feature switching code in a tool which needs to be reliable for me. This new optimization may have issues, and without such a switch it can't easily be worked around without replacing
2008 Dec 11
6
Any way to reduce CPU use of OpenSSH?
On my CentOS v5.2 server (dual Pentium4) the OpenSSH daemon stands out as being the most CPU-intensive of the applications running. It has used 176 minutes of CPU time in the last 2 days alone. Is there any way to lower the CPU utilization without compromising security? (I.e. without using a less processor-intensive encrypt/decrypt algorithm?) I'm getting the CPU use figures from top,
2009 Apr 01
6
Where's v5.3 source RPMs?
It seems that the mirrors are now all sync'd with the binary RPMs, but where are the source packages? The source RPMs are available for the few packages updated since the upstream 5.3 release, but the SRPMS for the release itself are missing. Example: http://mirror.centos.org/centos-5/5.3/os/ has directories containing the x86 and x86_64 binaries, but should also have a SRPMS directory.
2016 May 15
3
How to disable audio in CentOS7?
How can I completely disable audio drivers and services in my CentOS7 system? This system is a server that will never run any audio applications. The problem is, I can't disable the audio device in my BIOS, so the system finds it in the PCI device list and configures it on each boot. Yes, I know I can blacklist specific device driver modules, but I've got a minimum of 11
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
On 2020-05-18 17:55:58 [+0200], Jorrit Jongma via rsync wrote: > I don't disagree that MD5 could (or even should) be replaced so it is > no longer the bottleneck in several real-world cases (including mine). > > However this patch is not for MD5 performance, rather for the rolling > checksum rsync uses to match blocks on existing files on both ends to > reduce transfer size.
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
On 2020-05-18 21:55:13 [+0200], Jorrit Jongma wrote: > What do you base this on? So my memory was wrong. SSE2 is supported by all x86-64bit CPUs. Sorry for that. > would imply that SSSE3 is enabled out of the box on builds on machines > that support it, this is not the case (it certainly isn't on my Ubuntu > box). It would be preferred to detect this at runtime but getting that
2020 May 21
0
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
On Tue, May 19, 2020 at 7:29 AM Jorrit Jongma via rsync < rsync at lists.samba.org> wrote: > I've read up some more on the subject, and it seems the proper way to do > this with GCC is g++ and target attributes. I've refactored the patch that > way, and it indeed uses SSSE3 automatically on supporting CPUs, regardless > of the build host, so this should be ideal both for
2009 Mar 23
2
[LLVMdev] X86InstrFormats.td Question
I'm looking at the instruction formats and I can't grok the comments. For example: // SSSE3 Instruction Templates: // // SS38I - SSSE3 instructions with T8 prefix. // SS3AI - SSSE3 instructions with TA prefix. // Where are these prefix names coming from? I can't find any mention of them in the Intel literature. Also, there's this curious table: // Prefix byte classes
2009 Mar 23
0
[LLVMdev] X86InstrFormats.td Question
On Mar 23, 2009, at 12:57 PM, David A. Greene wrote: > I'm looking at the instruction formats and I can't grok the > comments. For > example: > > // SSSE3 Instruction Templates: > // > // SS38I - SSSE3 instructions with T8 prefix. > // SS3AI - SSSE3 instructions with TA prefix. > // > > Where are these prefix names coming from? I can't find any
2013 Sep 28
4
PATCH: modify/add intrinsics code
The patch does the following: 1. splits lpc_x86intrin.c to lpc_intrin_sse.c and lpc_intrin_sse2.c 2. adds FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2() function to lpc_intrin_sse2.c 3. adds lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions (useful for 24-bit en-/decoding) 4. adds precompute_partition_info_sums_intrin_sse2() / ...ssse3() and disables
2010 May 25
2
Samba3x daily logged errors with Win7 clients
In the course of upgrading from CentOS 5.4 to CentOS 5.5 I changed from using the samba (v3.0.x) packages to the samba3x (v3.3.8) packages, mostly because the newer version was said to better support Win7. The Samba server services Linux, WinXP, and Win7 clients. Now I get many, many errors logged to the Samba logs shortly after 3:00 AM, but only from the Win7 clients. I get roughly 430
2014 May 13
1
Performance tests of the current version (git-b1b6caf)
Current sources (git-b1b6caf) were compiled with GCC 4.8.2 and GCC 4.9.0 with various -msseN options (the default is -msse2). Then I took two WAV files (one is 16-bit and the other is 24-bit) and compressed them using best compression mode. The results are in the table below. (please remember that the resulting value is an encoding time, not encoding speed) CPU: Intel Core i7 950 (up to SSE4.2)
2015 Feb 04
2
CPU model and missing AES-NI extension
Hi, today I tried to configure a guest using Virt-Manager and used the "copy host cpu configuration" option, which resulted in a "Sandy Bridge" model. What I noticed is that, for example, the "aes" extension is not available in the guest even though it is available on the host cpu. This is what the host cpu looks like: model name : Intel(R) Xeon(R) CPU E5-2650 v3 @
2012 Sep 05
0
[LLVMdev] branch on vector compare?
On 05.09.2012 00:24, Stephen wrote: > Roland Scheidegger <sroland <at> vmware.com> writes: >> This looks quite similar to something I filed a bug on (12312). Michael >> Liao submitted fixes for this, so I think >> if you change it to >> %16 = fcmp ogt <4 x float> %15, %cr >> %17 = sext <4 x i1> %16 to <4 x i32> >> %18 =
2010 Sep 08
4
[LLVMdev] MMX vs SSE
I'm working on changing the MMX implementation to use intrinsics in all cases, which should stop various optimization passes from creating MMX instructions that screw up the x87 stack. Right now the MMX instructions are split between X86InstrMMX.td and X86InstrSSE.td, presumably on the historical grounds that some of them weren't introduced until SSE or SSSE3, and require