thr3ads.net - similar to: "Restrict global constructors to base ISA"

Displaying 20 results from an estimated 700 matches similar to: "Restrict global constructors to base ISA"

2016 Oct 11

Landing Pad bug?

Hi, When compiling the open-source software cryptopp (https://www.cryptopp.com/#download) version 5.6.4 I found a strange issue with the IR generated. The issue only appears when compiling with -O2 optimisation in the integer.cpp file (the function is _ZN8CryptoPPrsERNSt3__113basic_istreamIcNS0_11char_traitsIcEEEERNS_7IntegerE ->

2017 Dec 16

Clang 5, UBsan, runtime error: addition of unsigned offset to X overflowed to Y

We have code that processes a buffer in the forward or backwards direction. It looks similar to the following (https://github.com/weidai11/cryptopp/blob/master/adv-simd.h#L1138):

    uint8_t * ptr = ...
    size_t len = ...
    size_t inc = 16;
    if (flags & REVERSE_DIRECTION) {
        ptr += len - inc;
        inc = 0-inc;
    }
    while (len > 16) {
        // process blocks
        ptr += inc;
        len -= 16;
    }

Clang
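The excerpt above is cut off, so the following is only a sketch of one common way to avoid the reported diagnostic, not the fix adopted in the thread. It assumes the report comes from adding a wrapped `size_t` value (`0-inc`) to a pointer, and replaces it with a signed `std::ptrdiff_t` step. The function name `visit_blocks`, the flag value, and the returned offset list are hypothetical and exist only for illustration. Unlike the excerpt's `len > 16`, this sketch also handles a final full block (`len >= 16`) and never steps the pointer outside the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr unsigned REVERSE_DIRECTION = 1u;  // hypothetical flag value
constexpr std::size_t BLOCK = 16;           // block size from the excerpt

// Returns the byte offsets of the blocks visited, in visiting order.
std::vector<std::size_t> visit_blocks(const std::uint8_t* buf,
                                      std::size_t len, unsigned flags)
{
    assert(buf != nullptr);
    std::vector<std::size_t> offsets;
    if (len < BLOCK) return offsets;

    const bool reverse = (flags & REVERSE_DIRECTION) != 0;
    const std::uint8_t* ptr = reverse ? buf + (len - BLOCK) : buf;

    // Signed step: -16 is a genuine negative offset rather than a huge
    // unsigned value that relies on wraparound.
    const std::ptrdiff_t inc = reverse
        ? -static_cast<std::ptrdiff_t>(BLOCK)
        :  static_cast<std::ptrdiff_t>(BLOCK);

    while (len >= BLOCK) {
        // "Process" the block: here we only record where it starts.
        offsets.push_back(static_cast<std::size_t>(ptr - buf));
        len -= BLOCK;
        if (len >= BLOCK) ptr += inc;  // step only while a block remains
    }
    return offsets;
}
```

With a 48-byte buffer, the forward walk visits offsets 0, 16, 32 and the reverse walk visits 32, 16, 0.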

2018 Jun 30

Determine reason for failure at -O1

Hi Everyone, We caught a report for a failed self test when using Clang 5.0 and 6.0 with -DDEBUG and -O1 (i.e., a "debug build"). The code in question is located at https://github.com/weidai11/cryptopp/blob/master/cham-simd.cpp . It is the SSSE3 code path for CHAM64. Other optimization levels are OK for Clang. GCC, ICC and MSVC are OK. The code is valgrind, Sanitizer, Coverity and

2014 Sep 05

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Hi Chandler, While doing the performance measurement on an Ivy Bridge, I ran into compile time errors. I saw a bunch of "cannot select" in the LLVM test suite with -march=core-avx-i. E.g., SingleSource/UnitTests/Vector/SSE/sse.isamax.c is failing at O3 -march=core-avx-i with: fatal error: error in backend: Cannot select: 0x7f91b99a6420: v4i32 = bitcast 0x7f91b99b0e10 [ORD=3] [ID=27]

2014 Sep 06

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

I've run the SingleSource test suite for core-avx-i and have no failures here so a preprocessed file + commandline would be very useful if this reproduces for you still. On Sat, Sep 6, 2014 at 4:07 PM, Chandler Carruth <chandlerc at gmail.com> wrote: > I'm having trouble reproducing this. I'm trying to get LNT to actually > run, but manually compiling the given source

2013 Apr 09

Problem building powerdns from EPEL

Hi, I just tried to build using http://dl.fedoraproject.org/pub/epel/6/SRPMS/pdns-3.1-2.el6.src.rpm on CentOS 6.4 final (kernel: 2.6.32-358.2.1.el6.x86_64), but it failed when looking for ldap libs: Note: I did not change anything in the original spec file. ... + ./configure --build=x86_64-redhat-linux-gnu --host=x86_64-redhat-linux-gnu --target=x86_64-redhat-linux-gnu

2014 Sep 08

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

> On Sep 7, 2014, at 8:49 PM, Quentin Colombet <qcolombet at apple.com> wrote: > > Sure, > > Here is the command line: > clang -cc1 -triple x86_64-apple-macosx -S -disable-free -disable-llvm-verifier -main-file-name tmp.i -mrelocation-model pic -pic-level 2 -mdisable-fp-elim -masm-verbose -munwind-tables -target-cpu core-avx-i -O3 -ferror-limit 19 -fmessage-length 114

2013 Apr 09

[LLVMdev] inefficient code generation for 128-bit->256-bit typecast intrinsics

Hello, LLVM generates two additional instructions for 128->256 bit typecasts (e.g. _mm256_castsi128_si256()) to clear out the upper 128 bits of YMM register corresponding to source XMM register. vxorps xmm2,xmm2,xmm2 vinsertf128 ymm0,ymm2,xmm0,0x0 Most of the industry-standard C/C++ compilers (GCC, Intel's compiler, Visual Studio compiler) don't generate any extra moves

2012 Apr 15

Legacy MACs and Ciphers: Why?

Why are legacy MACs (like md5-96), and legacy Ciphers (anything in cbc-mode, arcfour*(?)) enabled by default? My proposal would be to change the defaults for ssh_config and sshd_config to contain: MACs hmac-sha2-256,hmac-sha2-512,hmac-sha1 Ciphers aes128-ctr,aes192-ctr,aes256-ctr ...removing md5, truncated versions of sha1, umac64 (for which I can find barely any review), any cipher in cbc

2014 Sep 09

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

> On Sep 9, 2014, at 1:47 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, Sep 9, 2014 at 12:53 PM, Quentin Colombet <qcolombet at apple.com <mailto:qcolombet at apple.com>> wrote: > Hi Chandler, > > I had observed some improvements and regressions with the new lowering. > > Here are the numbers for an Ivy Bridge machine fixed at

2011 Jun 07

[LLVMdev] AVX Status?

Ralf Karrenberg <Chareos at gmx.de> writes: > This sounds great! > > For my case, I only require some basic support, so I am optimistic > that your next few patches will provide everything I need. If my evil plan works out, within the next 10 or so patches we should be in a place where pushing everything up goes pretty quickly. It's about 8 TableGen patches and then a

2015 Mar 25

[LLVMdev] Optimization puzzle...

Hi everyone, I am wondering what's stopping the LLVM optimizer (opt -O3) from eliminating the apparently useless « icmp sgt » instruction in the following piece of LLVM IR. > ; ModuleID = 'lambda-opt.bc' > target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" > target triple = "x86_64-apple-macosx10.10.0" > > ; Function

2011 Jun 03

[LLVMdev] AVX Status?

Bruno Cardoso Lopes <bruno.cardoso at gmail.com> writes: > Hi Ralf > > On Wednesday, June 1, 2011, Ralf Karrenberg <Chareos at gmx.de> wrote: >> Hi, >> >> The last time the AVX backend was mentioned on this list seems to be >> from November 2010, so I would like to ask about the current status. Is >> anybody (e.g. at Cray?) still actively working

2019 Oct 24

Failed PPC64 compile when using Power7 loads and stores?

Hi Everyone, I'm having trouble figuring out a compile failure on ppc64le. The failure is at https://travis-ci.org/noloader/cryptopp-autotools/jobs/602187190 . The message is: /bin/bash ./libtool --tag=CXX --mode=compile clang++ -DHAVE_CONFIG_H -I. -DCRYPTOPP_DISABLE_POWER8 -pipe -mcpu=power7 -mvsx -maltivec -g -O2 -MT libppc_power7_la-ppc_power7.lo -MD -MP -MF

2011 Jun 03

[LLVMdev] AVX Status?

Thanks Syoyo and Bruno for your replies. As suggested, I filed a bug under http://llvm.org/bugs/show_bug.cgi?id=10073 . I am not familiar with .td files and the LLVM backend infrastructure yet, but I might give it a try and solve it myself if I find the time. Best, Ralf Am 02.06.2011 23:55, schrieb Bruno Cardoso Lopes: > Hi Ralf > > On Wednesday, June 1, 2011, Ralf

2011 Jun 02

[LLVMdev] AVX Status?

Hi Ralf On Wednesday, June 1, 2011, Ralf Karrenberg <Chareos at gmx.de> wrote: > Hi, > > The last time the AVX backend was mentioned on this list seems to be > from November 2010, so I would like to ask about the current status. Is > anybody (e.g. at Cray?) still actively working on it? I don't think so! > I have tried both LLVM 2.9 final and the latest trunk, and it

2014 Sep 09

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Hi Chandler, Thanks for fixing the problem with the insertps mask. Generally the new shuffle lowering looks promising, however there are some cases where the codegen is now worse causing runtime performance regressions in some of our internal codebase. You have already mentioned how the new shuffle lowering is missing some features; for example, you explicitly said that we currently lack of

2011 Jun 01

[LLVMdev] AVX Status?

Hi, The last time the AVX backend was mentioned on this list seems to be from November 2010, so I would like to ask about the current status. Is anybody (e.g. at Cray?) still actively working on it? I have tried both LLVM 2.9 final and the latest trunk, and it seems like some trivial stuff is already working and produces nice code for code using <8 x float>. Unfortunately, the backend

2015 Jul 14

[LLVMdev] Poor register allocation (constants causing spilling)

Hi, While investigating a performance issue with an internal codebase I came across what looks to be poor register allocation. I have constructed a small(ish) reproducible which demonstrates the issue (see test.ll attached). I have spent some time going through the register allocator to understand what is happening. I have also experimented with some small changes to try and improve the

2014 Sep 19

[LLVMdev] Please benchmark new x86 vector shuffle lowering, planning to make it the default very soon!

Hi Chandler, I have tested the new shuffle lowering on an AMD Jaguar CPU (which is AVX but not AVX2). On this particular target, there is a delay when output data from an execution unit is used as input to another execution unit of a different cluster. For example, there are 6 execution units which are divided into 3 execution clusters of Float (FPM, FPA), Vector Integer (MMXA, MMXB, IMM), and Store