Displaying 20 results from an estimated 4000 matches similar to: "Why is sha256-generic preferred over sha256-ssse3?"
2020 May 18
3
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
What do you base this on?
Per https://gcc.gnu.org/onlinedocs/gcc/x86-Options.html :
"For the x86-32 compiler, you must use -march=cpu-type, -msse or
-msse2 switches to enable SSE extensions and make this option
effective. For the x86-64 compiler, these extensions are enabled by
default."
That reads to me like we're fine for SSE2. As stated in my comments,
SSSE3 support must be
2020 May 18
2
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I don't disagree that MD5 could (or even should) be replaced so it is
no longer the bottleneck in several real-world cases (including mine).
However this patch is not for MD5 performance, rather for the rolling
checksum rsync uses to match blocks on existing files on both ends to
reduce transfer size.
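For readers unfamiliar with the rolling checksum mentioned above, here is a minimal sketch of the general idea: two running sums over a window that can be updated in constant time when the window slides by one byte. This is not rsync's get_checksum1() source. The names weak_sum, weak_sum_init, weak_sum_roll and weak_sum_value are hypothetical, and bytes are treated as unsigned for simplicity.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state for an rsync-style weak checksum (illustration only). */
typedef struct {
    uint32_t s1;   /* sum of the bytes in the window */
    uint32_t s2;   /* sum of the running s1 values (position-weighted sum) */
    size_t   len;  /* window length in bytes */
} weak_sum;

/* Compute both sums from scratch over buf[0..len). */
static weak_sum weak_sum_init(const uint8_t *buf, size_t len)
{
    weak_sum w = { 0, 0, len };
    for (size_t i = 0; i < len; i++) {
        w.s1 += buf[i];
        w.s2 += w.s1;
    }
    return w;
}

/* Slide the window by one byte: `out` leaves at the front, `in` enters at the back. */
static void weak_sum_roll(weak_sum *w, uint8_t out, uint8_t in)
{
    w->s1 += (uint32_t)in - out;                  /* unsigned wrap-around is intended */
    w->s2 += w->s1 - (uint32_t)w->len * out;
}

/* Pack the two 16-bit halves into one 32-bit value. */
static uint32_t weak_sum_value(const weak_sum *w)
{
    return (w->s1 & 0xffff) | (w->s2 << 16);
}
```

Rolling the window forward by one byte should give the same value as recomputing the sums from scratch at the new offset, which is what makes block matching at every byte position affordable.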
On Mon, May 18, 2020 at 5:44 PM Filipe Maia via rsync
<rsync at lists.samba.org>
2020 May 18
6
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
This drop-in patch increases the performance of the get_checksum1()
function on x86-64.
On the target slow CPU performance of the function increased by nearly
50% in the x86-64 default SSE2 mode, and by nearly 100% if the
compiler was told to enable SSSE3 support. The increase was over 200%
on the fastest CPU tested in SSSE3 mode.
Transfer time improvement with large files existing on both ends
2020 May 19
5
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I've read up some more on the subject, and it seems the proper way to
do this with GCC is g++ and target attributes. I've refactored the
patch that way, and it indeed uses SSSE3 automatically on supporting
CPUs, regardless of the build host, so this should be ideal both for
home builders and distros.
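As a rough illustration of target attributes combined with a runtime CPU check, the sketch below uses plain C with GCC/Clang extensions rather than the g++ route described above, and it is not the code from the patch. The function names byte_sum, byte_sum_generic and byte_sum_ssse3 are hypothetical, and a simple byte sum stands in for the real checksum. It assumes a GCC or Clang recent enough to allow intrinsics inside target-attributed functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Portable fallback: plain byte sum (stand-in for a real checksum). */
static uint32_t byte_sum_generic(const uint8_t *buf, size_t len)
{
    uint32_t s = 0;
    for (size_t i = 0; i < len; i++)
        s += buf[i];
    return s;
}

#if defined(__GNUC__) && defined(__x86_64__)
#include <tmmintrin.h>  /* SSSE3 intrinsics */
#define HAVE_SSSE3_VARIANT 1

/* Compiled for SSSE3 even when the rest of the file targets baseline x86-64. */
__attribute__((target("ssse3")))
static uint32_t byte_sum_ssse3(const uint8_t *buf, size_t len)
{
    const __m128i ones8  = _mm_set1_epi8(1);
    const __m128i ones16 = _mm_set1_epi16(1);
    __m128i acc = _mm_setzero_si128();
    size_t i = 0;

    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(buf + i));
        /* SSSE3: unsigned bytes times 1, adjacent pairs added into 16-bit lanes */
        __m128i pairs = _mm_maddubs_epi16(v, ones8);
        /* widen pairs of 16-bit lanes to 32-bit and accumulate */
        acc = _mm_add_epi32(acc, _mm_madd_epi16(pairs, ones16));
    }

    uint32_t lanes[4];
    _mm_storeu_si128((__m128i *)lanes, acc);
    uint32_t s = lanes[0] + lanes[1] + lanes[2] + lanes[3];

    for (; i < len; i++)  /* leftover tail bytes */
        s += buf[i];
    return s;
}
#endif

/* Pick the implementation at run time, independent of the build host's CPU. */
uint32_t byte_sum(const uint8_t *buf, size_t len)
{
#ifdef HAVE_SSSE3_VARIANT
    if (__builtin_cpu_supports("ssse3"))
        return byte_sum_ssse3(buf, len);
#endif
    return byte_sum_generic(buf, len);
}
```

Because the check happens at run time, a binary built this way should not need -mssse3 on the command line and should still fall back to the generic path on CPUs without SSSE3.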
Getting the code to build right in c++ mode (checksum_sse2.cpp only)
was a bit of an
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
I think this is a great patch but, in my view, an even better way to tackle
the fundamental problem (the performance limitations) is to use a much
faster checksum like xxhash, as has been suggested before:
https://lists.samba.org/archive/rsync/2019-October/031975.html
Cheers,
Filipe
On Mon, 18 May 2020 at 17:08, Jorrit Jongma via rsync <rsync at lists.samba.org>
wrote:
> This drop-in
2020 May 20
0
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
would it perhaps make sense to have a "--disable-sse2/3" commandline
switch in rsync, too - at least for some timeframe until this is
considered "rock solid" ?
i dislike having automatic cpu feature switching code in a tool which
needs to be reliable for me, this new optimization may have issues - and
without such a switch it can't easily be worked around without replacing
2008 Dec 11
6
Any way to reduce CPU use of OpenSSH?
On my CentOS v5.2 server (dual Pentium4) the OpenSSH daemon stands out
as being the most CPU-intensive of the applications running. It's used
176 minutes of CPU time in the last 2 days alone.
Is there any way to lower the CPU utilization without compromising
security? (I.e. without using a less processor-intensive
encrypt/decrypt algorithm?)
I'm getting the CPU use figures from top,
2009 Apr 01
6
Where's v5.3 source RPMs?
It seems that the mirrors are now all sync'd with the binary RPMs, but
where are the source packages?
The source RPMs are available for the few packages updated since the
upstream 5.3 release, but the SRPMS for the release itself are
missing.
Example: http://mirror.centos.org/centos-5/5.3/os/ has directories
containing the x86 and x86_64 binaries, but should also have a SRPMS
directory.
2016 May 15
3
How to disable audio in CentOS7?
How can I completely disable audio drivers and services in my CentOS7
system?
This system is a server that will never run any audio applications. The
problem is, I can't disable the audio device in my BIOS, so the system
finds it in the PCI device list and configures it on each boot.
Yes, I know I can blacklist specific device driver modules, but I've got
a minimum of 11
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
On 2020-05-18 17:55:58 [+0200], Jorrit Jongma via rsync wrote:
> I don't disagree that MD5 could (or even should) be replaced so it is
> no longer the bottleneck in several real-world cases (including mine).
>
> However this patch is not for MD5 performance, rather for the rolling
> checksum rsync uses to match blocks on existing files on both ends to
> reduce transfer size.
2020 May 18
0
[PATCH] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
On 2020-05-18 21:55:13 [+0200], Jorrit Jongma wrote:
> What do you base this on?
So my memory was wrong. SSE2 is supported by all x86-64bit CPUs. Sorry
for that.
> would imply that SSSE3 is enabled out of the box on builds on machines
> that support it, this is not the case (it certainly isn't on my Ubuntu
> box). It would be preferred to detect this at runtime but getting that
2020 May 21
0
[PATCHv2] SSE2/SSSE3 optimized version of get_checksum1() for x86-64
On Tue, May 19, 2020 at 7:29 AM Jorrit Jongma via rsync <
rsync at lists.samba.org> wrote:
> I've read up some more on the subject, and it seems the proper way to do
> this with GCC is g++ and target attributes. I've refactored the patch that
> way, and it indeed uses SSSE3 automatically on supporting CPUs, regardless
> of the build host, so this should be ideal both for
2009 Mar 23
2
[LLVMdev] X86InstrFormats.td Question
I'm looking at the instruction formats and I can't grok the comments. For
example:
// SSSE3 Instruction Templates:
//
// SS38I - SSSE3 instructions with T8 prefix.
// SS3AI - SSSE3 instructions with TA prefix.
//
Where are these prefix names coming from? I can't find any mention of them in
the Intel literature.
Also, there's this curious table:
// Prefix byte classes
2009 Mar 23
0
[LLVMdev] X86InstrFormats.td Question
On Mar 23, 2009, at 12:57 PM, David A. Greene wrote:
> I'm looking at the instruction formats and I can't grok the
> comments. For
> example:
>
> // SSSE3 Instruction Templates:
> //
> // SS38I - SSSE3 instructions with T8 prefix.
> // SS3AI - SSSE3 instructions with TA prefix.
> //
>
> Where are these prefix names coming from? I can't find any
2013 Sep 28
4
PATCH: modify/add intrinsics code
The patch does the following:
1. splits lpc_x86intrin.c to lpc_intrin_sse.c and lpc_intrin_sse2.c
2. adds FLAC__lpc_compute_residual_from_qlp_coefficients_intrin_sse2()
function to lpc_intrin_sse2.c
3. adds lpc_intrin_sse41.c with two ..._wide_intrin_sse41() functions
(useful for 24-bit en-/decoding)
4. adds precompute_partition_info_sums_intrin_sse2() / ...ssse3() and
disables
2010 May 25
2
Samba3x daily logged errors with Win7 clients
In the course of upgrading from CentOS 5.4 to CentOS 5.5 I changed from
using the samba (v3.0.x) packages to the samba3x (v3.3.8) packages,
mostly because the newer version was said to better support Win7. The
Samba server services Linux, WinXP, and Win7 clients.
Now I get many, many errors logged to the Samba logs shortly after 3:00
AM, but only from the Win7 clients. I get roughly 430
2014 May 13
1
Performance tests of the current version (git-b1b6caf)
Current sources (git-b1b6caf) were compiled with GCC 4.8.2 and GCC 4.9.0
with various -msseN options (the default is -msse2). Then I took two WAV
files (one is 16-bit and the other is 24-bit) and compressed them using
best compression mode. The results are in the table below.
(please remember that the resulting value is an encoding time, not encoding speed)
CPU: Intel Core i7 950 (up to SSE4.2)
2015 Feb 04
2
CPU model and missing AES-NI extension
Hi,
today I tried to configure a guest using Virt-Manager and used the "copy
host cpu configuration" option which resulted in a "Sandy Bridge" model.
What I noticed is that for example the "aes" extension is not available
in the guest even though it is available on the host cpu.
This is what the host cpu looks like:
model name : Intel(R) Xeon(R) CPU E5-2650 v3 @
2012 Sep 05
0
[LLVMdev] branch on vector compare?
Am 05.09.2012 00:24, schrieb Stephen:
> Roland Scheidegger <sroland <at> vmware.com> writes:
>> This looks quite similar to something I filed a bug on (12312). Michael
>> Liao submitted fixes for this, so I think
>> if you change it to
>> %16 = fcmp ogt <4 x float> %15, %cr
>> %17 = sext <4 x i1> %16 to <4 x i32>
>> %18 =
2010 Sep 08
4
[LLVMdev] MMX vs SSE
I'm working on changing the MMX implementation to use intrinsics in
all cases, which should stop various optimization passes from creating
MMX instructions that screw up the x87 stack. Right now the MMX
instructions are split between X86InstrMMX.td and X86InstrSSE.td,
presumably on the historical grounds that some of them weren't
introduced until SSE or SSSE3, and require