thr3ads.net - similar to: "Vectorization in LLVM x86 backend"


2017 Aug 21

Vectorization in LLVM x86 backend

I isolated the LLVM IR and the X86 instructions emitted for the function; both are attached herewith, and the generated code clearly contains vector instructions. I am having a hard time figuring out where the vector instructions are formed. For sure, the SLP and loop vectorizers are not doing anything.
On Mon, Aug 21, 2017 at 11:56 AM, Craig Topper <craig.topper at gmail.com> wrote:
> The X86 backend

2017 Oct 03

Changing Alignment of global variables in LLVM

If I know for sure that I am accessing 32-byte chunks at a time, how can I go about changing the alignment of @u? Should I use DataLayout's reset method? I couldn't find a method to change the alignment of a single global variable. Thanks
On Tue, Oct 3, 2017 at 6:34 PM, Matthias Braun <mbraun at apple.com> wrote:
> The effective alignment is part of the load and store operations. Updating

2017 Oct 03

Changing Alignment of global variables in LLVM

What is the best way to change the alignment of global variables and allocated structures in LLVM during one of its optimization passes? For example, I want to change
@u = internal unnamed_addr global [5 x [65 x [65 x [65 x double]]]] zeroinitializer, align 16
so that it is aligned to 32 bytes. How can this be accomplished so that all other references in the code accessing this structure are also

2016 Oct 13

Loop Unrolling Fail in Simple Vectorized loop

If count > MAX_UINT-4 your loop loops indefinitely with an increment of 4, I think.
On Thu, Oct 13, 2016 at 4:42 PM, Charith Mendis via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> So, I tried unrolling the following simple loop.
>
> int unroll(unsigned * a, unsigned * b, unsigned *c, unsigned count){
>   for(unsigned i=0; i<count; i++){
>     a[i] =
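To illustrate the wraparound concern raised in the reply above, here is a minimal C++ sketch. The helper name wrapsBeforeExit and the use of std::uint32_t in place of unsigned are assumptions made for illustration; they do not come from the thread. The helper checks whether the induction variable of a loop of the form for (i = 0; i < count; i += step) wraps past zero before the exit test i < count can fail.

```cpp
#include <cstdint>

// Hypothetical helper (not from the thread). Precondition: step > 0.
// Returns true if `i` in `for (i = 0; i < count; i += step)` wraps around
// zero before the exit test `i < count` can become false.
bool wrapsBeforeExit(std::uint32_t count, std::uint32_t step) {
    if (count == 0) {
        return false;  // The loop body never runs.
    }
    // Largest multiple of `step` for which `i < count` still holds.
    const std::uint32_t last = ((count - 1) / step) * step;
    // Unsigned arithmetic is modulo 2^32, so this addition may wrap.
    const std::uint32_t next = last + step;
    // After a wrap, `next` is smaller than `last`, hence still below `count`,
    // and the loop keeps running.
    return next < last;
}
```

With a step of 4, a count near the top of the 32-bit range wraps, while a small count such as 100 does not. This may be why a vectorizer has to either prove a bound on count or emit a runtime check.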

2016 Oct 13

Loop Unrolling Fail in Simple Vectorized loop

Thanks for the explanation. But I am a little confused by the following. Can't LLVM keep vectorizable_elements as a symbolic value and convert the loop to, say,
for(unsigned i = 0; i < vectorizable_elements; i += 2){ //main loop }
for(unsigned i = 0; i < vectorizable_elements % 2; i++){ //fix up }
Why does it have to reason about the range of vectorizable_elements? Even
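The main-loop-plus-fix-up split described in this message can be sketched in C++ as follows. This is only an illustration under stated assumptions: the function name addWithRemainder and the loop body (element-wise addition) are hypothetical, since the original loop body is truncated in the snippet. The main loop bound is computed once as n - n % 2, so adding 2 to the index can never step past n.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical example: element-wise sum processed two elements at a time,
// followed by a fix-up loop for the leftover element (if any).
std::vector<unsigned> addWithRemainder(const std::vector<unsigned>& b,
                                       const std::vector<unsigned>& c) {
    const std::size_t n = b.size() < c.size() ? b.size() : c.size();
    std::vector<unsigned> a(n);

    // Number of elements covered by the stride-2 main loop.
    const std::size_t mainEnd = n - (n % 2);

    std::size_t i = 0;
    for (; i < mainEnd; i += 2) {  // main loop
        a[i] = b[i] + c[i];
        a[i + 1] = b[i + 1] + c[i + 1];
    }
    for (; i < n; ++i) {  // fix-up loop, runs at most once here
        a[i] = b[i] + c[i];
    }
    return a;
}
```

Because mainEnd never exceeds n, the index stays within range and no reasoning about overflow of the induction variable is needed in this hand-written form.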

2016 Oct 04

Getting the symbolic expression for an address calculation

How do you generate a SCEVAddRecExpr from a SCEV? I tried dyn_cast, but it seems that the SCEV returned by getSCEV is not a SCEVAddRecExpr. Thanks
On Fri, Sep 30, 2016 at 4:16 PM, Friedman, Eli <efriedma at codeaurora.org> wrote:
> On 9/30/2016 12:16 PM, Charith Mendis via llvm-dev wrote:
>> Hi all,
>> What is the best way to get the symbolic

2016 Sep 30

Getting the symbolic expression for an address calculation

Hi all, What is the best way to get the symbolic expression for an address calculation in LLVM, especially when memory addresses are calculated within a loop? Use case: I want to know which loop induction variables are used for a particular address calculation and in what symbolic context. Thereby, I want to identify which stores and loads will be contiguous in memory if I unroll each of the
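As a rough illustration of the use case described here, the sketch below models an address as an affine function of a single induction variable and tests whether two accesses are adjacent on every iteration. This is a toy stand-in and not LLVM's ScalarEvolution API: the type AffineAddress and the function adjacentEveryIteration are hypothetical names introduced only for this example.

```cpp
#include <cstdint>

// Toy model (not an LLVM type): address(i) = base + stride * i,
// where i is the loop induction variable and all values are in bytes.
struct AffineAddress {
    std::int64_t base;
    std::int64_t stride;

    std::int64_t at(std::int64_t i) const { return base + stride * i; }
};

// True if, on every iteration, the access through `second` starts exactly
// where a `bytes`-wide access through `first` ends.
bool adjacentEveryIteration(const AffineAddress& first,
                            const AffineAddress& second,
                            std::int64_t bytes) {
    return first.stride == second.stride && second.base - first.base == bytes;
}
```

For instance, with 4-byte elements, the accesses a[2*i] and a[2*i + 1] correspond to {base 0, stride 8} and {base 4, stride 8}, which this check reports as adjacent.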

2016 Oct 28

Vector Shuffle chain lowering to X86 instructions simplification inconsistencies

Hi all, Attached herewith is a fairly simple LLVM file (shuffle.ll) with lots of vector shuffles. When I use llc with -O3 -mcpu=core-avx2, the first shuffle sequence, which uses 128-bit-wide types, gets reduced to a single shuffle, whereas the second shuffle sequence, which uses 256-bit-wide types, does not get reduced to a single shuffle instruction in the resulting X86 code (Shuffle.s attached).

2016 Oct 12

Loop Unrolling Fail in Simple Vectorized loop

Hi all, Attached herewith is a simple vectorized function with loops performing a simple shuffle. I want all loops (inner and outer) to be unrolled by 2, and as such I used -unroll-count=2. The inner loops (with k as the induction variable and constant trip counts) unroll fully, but the outer loop (with j) fails to unroll. The LLVM code is also attached with the inner loops fully unrolled. To

2016 Sep 03

llc error

I updated to the latest revision, and now LLVM does not build; CMake quits with:
CMake Error at cmake/modules/LLVMProcessSources.cmake:83 (message):
  Found unknown source file ../llvm-revec/lib/CodeGen/MachineFunctionAnalysis.cpp
  Please update ../llvm-revec/lib/CodeGen/CMakeLists.txt
Thanks
On Sat, Sep 3, 2016 at 2:09 AM, Craig Topper <craig.topper at gmail.com> wrote:
>

2016 Sep 03

llc error

Hi all, The attached LLVM assembly file fails to generate x86 code when compiled using llc.
Compilation command: ../llvm-build/bin/llc -filetype=asm -march=x86-64 -mcpu=core-avx2 ex4.ll
The error message is:
LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17
  t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, undef:i64
    t16: i64 = add t2,

2016 Aug 20

LLVM flags for Vectorization

Hi, I have been analyzing the LLVM vectorizer by running some benchmarks. For vectorization, I have used the following flags: -O3 -ffast-math -mavx2. Am I missing any other flags which will improve vectorizer performance? Thanks, Santanu Das, IIT Hyd

2016 Oct 27

Bug with auto-vectorization of logf

Hi, I intended to file this bug on Bugzilla, but I've received no response from llvm-admin in the 10 days since asking for a Bugzilla account. I've written 2 test functions in C that take in a float array x of size n and output float array f(x), where f is either fabsf or logf. The LLVM 3.9 auto-vectorization docs claim that both functions will be vectorized:
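The two test functions themselves are cut off in this snippet, so the following is only a guess at their general shape, written in C++ rather than C. The names absArray and logArray and their signatures are hypothetical. Each one applies a single math function element-wise in a simple counted loop, which is the kind of loop an auto-vectorizer would be expected to consider.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-in for the fabsf test: out[i] = |x[i]|.
void absArray(const float* x, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::fabs(x[i]);
    }
}

// Hypothetical stand-in for the logf test: out[i] = log(x[i]).
void logArray(const float* x, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::log(x[i]);
    }
}
```

Whether either loop is actually vectorized depends on the compiler version, target, and flags, so it is worth checking the emitted assembly or the vectorizer's optimization remarks rather than assuming it.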

2014 Nov 13

rsync prevent destination only new folders but need new files

Good day to all! I'm working on a git file replication task and need to sync source to destination using the sync command below.
rsync -atnvv --existing --exclude '.git' --progress source/ destination/;
Here I 1) need to sync only directories that *exist on destination*; no new folders should be copied from source [--existing did the job, but it does not copy new files either]. 2) need to copy

2018 Nov 20

Question on fast-math optimizations

Dear LLVM developers, I have a question on the fast-math floating-point optimizations applied by LLVM: Judging by the documentation at https://llvm.org/docs/LangRef.html#fast-math-flags I understood that rewriting with associativity and using reciprocal computations are possible optimizations. As the folklore description of fast-math is that it "applies real-valued identities", I
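To make the associativity point concrete, here is a small C++ sketch (the function names sumLeftToRight and sumRightToLeft are hypothetical and chosen only for this example). It shows that floating-point addition is not associative under default IEEE semantics, which is why reassociation is treated as a fast-math rewrite and not applied by default.

```cpp
// Evaluates (a + b) + c using ordinary IEEE single-precision arithmetic.
float sumLeftToRight(float a, float b, float c) {
    return (a + b) + c;
}

// Evaluates a + (b + c); mathematically equal over the reals, but the
// rounding of the intermediate result can differ.
float sumRightToLeft(float a, float b, float c) {
    return a + (b + c);
}
```

For a = 1e20f, b = -1e20f, c = 1.0f, the first form yields 1.0f because the large terms cancel first, while the second yields 0.0f because 1.0f is absorbed when added to -1e20f. A compiler permitted to reassociate may pick either form.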

2016 Oct 27

Bug with auto-vectorization of logf

+Tanya for the account issue.
> On Oct 27, 2016, at 11:36 AM, Eric Martin via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Hi,
> I intended to file this bug on Bugzilla, but I've received no response from llvm-admin in the 10 days since asking for a Bugzilla account.
>
> I've written 2 test functions in C that take in a float array x of size n and output

2016 Oct 28

Bug with auto-vectorization of logf

Eric, I apologize for any delay or confusion. From my records/list archives, I saw that Anton had created an account for you on Oct 17th and responded to your email to llvm-admin. I am not sure what happened after that point as I thought the account was done. I just confirmed there is an account for you in Bugzilla, so you should be good to go if you reset your password.
-Tanya
> On Oct 27,

2018 Nov 22

Question on fast-math optimizations

On 11/21/18 12:41 PM, Nicolai Hähnle wrote:
> On 20.11.18 16:38, Stephen Canon via llvm-dev wrote:
>> Distribution doesn’t seem to be used by many transforms at present.
>> My vague recollection is that the fast math flags didn’t do a great
>> job of characterizing when it would be allowed, and using it
>> aggressively broke a lot of code in practice (code which was

2019 Apr 05

wbinfo isn't working on domain member

Hi Rowland, I made the change you suggested to auto-refresh Kerberos. Unfortunately, it didn't seem to fix the issue, even after a machine restart. Following your line of reasoning that it is a Kerberos issue, I then tried to grab a new Kerberos ticket on the server in question, which appears to fail. Perhaps this gives some further insight?
pi at fs1:~ $ kinit administrator at

2016 Jun 05

inconsistent DNS information, windows domain member issues..

I joined a Windows 10 Pro system to my (still experimental) domain. The Windows system actually hosts DC2 as a VM, and another Windows system (Server 2008 R2) at another location hosts DC1, also as a VM. The two locations are connected via a VPN, and both systems run only when needed. The Windows system does not directly use DC2 for DNS but instead talks to a DNS resolver that delegates the Samba domain to
