similar to: Vectorization in LLVM x86 backend

Displaying 20 results from an estimated 1000 matches similar to: "Vectorization in LLVM x86 backend"

2017 Aug 21
2
Vectorization in LLVM x86 backend
I isolated the LLVM IR and the X86 instructions emitted for the function (both attached herewith), and it is clearly emitting vector instructions. I am having a hard time figuring out where the vector instructions are formed. The SLP and loop vectorizers are definitely not doing anything. On Mon, Aug 21, 2017 at 11:56 AM, Craig Topper <craig.topper at gmail.com> wrote: > The X86 backend
2017 Oct 03
2
Changing Alignment of global variables in LLVM
If I know for sure I am accessing 32 byte chunks at a time, how can I go about changing the alignment of @u? Should I use DataLayout's reset method? I couldn't find a method to change alignment of one global variable. Thanks On Tue, Oct 3, 2017 at 6:34 PM, Matthias Braun <mbraun at apple.com> wrote: > The effective alignment is part of the load and store operations. Updating
2017 Oct 03
2
Changing Alignment of global variables in LLVM
What is the best way to change the alignment of global variables and allocated structures in LLVM during one of its optimization passes? For example, I want to change, @u = internal unnamed_addr global [5 x [65 x [65 x [65 x double]]]] zeroinitializer, align 16 to align to 32 bytes. How can this be accomplished so that all other references in the code accessing this structure are also
2016 Oct 13
2
Loop Unrolling Fail in Simple Vectorized loop
If count > MAX_UINT-4 your loop loops indefinitely with an increment of 4, I think. On Thu, Oct 13, 2016 at 4:42 PM, Charith Mendis via llvm-dev < llvm-dev at lists.llvm.org> wrote: > So, I tried unrolling the following simple loop. > > int unroll(unsigned * a, unsigned * b, unsigned *c, unsigned count){ > > for(unsigned i=0; i<count; i++){ > > a[i] =
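The wraparound concern raised in this reply can be checked directly. The following is a minimal sketch in C, not taken from the thread: the helper name stride_loop_wraps is hypothetical, and it assumes a loop of the form for (unsigned i = 0; i < count; i += step). It reports whether i would wrap past UINT_MAX before the exit test i < count fails. For a step of 4, wrapping brings i back to exactly 0, so such a loop never exits.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper (not from the thread): returns 1 if the induction
 * variable of
 *     for (unsigned i = 0; i < count; i += step)
 * would wrap past UINT_MAX before the exit test fails, 0 otherwise. */
static int stride_loop_wraps(unsigned count, unsigned step) {
    assert(step > 0);              /* a zero step is outside this sketch */
    if (count == 0) {
        return 0;                  /* the loop body never runs */
    }
    /* Largest value of i that still satisfies i < count. */
    unsigned last = ((count - 1) / step) * step;
    /* last + step overflows exactly when last > UINT_MAX - step. */
    return last > UINT_MAX - step;
}
```

For example, stride_loop_wraps(UINT_MAX, 4) returns 1, while stride_loop_wraps(100, 4) returns 0.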
2016 Oct 13
2
Loop Unrolling Fail in Simple Vectorized loop
Thanks for the explanation. But I am a little confused with the following fact. Can't LLVM keep vectorizable_elements as a symbolic value and convert the loop to say; for(unsigned i = 0; i < vectorizable_elements ; i += 2){ //main loop } for(unsigned i=0 ; i < vectorizable_elements % 2; i++){ //fix up } Why does it have to reason about the range of vectorizable_elements? Even
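The main-loop-plus-fix-up shape asked about here can be written out as follows. This is a minimal sketch in C under stated assumptions: the function name add_unrolled_by_2 and the loop body a[i] = b[i] + c[i] are hypothetical, since the body of the original loop is cut off in the listing. Unlike the snippet in the question, the fix-up loop continues from where the main loop stopped rather than restarting at 0. Because main_end is rounded down to an even value no larger than count, i += 2 cannot wrap in this form.

```c
/* Hypothetical example body: a[i] = b[i] + c[i]. */
static void add_unrolled_by_2(unsigned *a, const unsigned *b,
                              const unsigned *c, unsigned count) {
    /* Round count down to a multiple of 2; main_end <= count. */
    unsigned main_end = count - (count % 2);
    unsigned i = 0;

    /* Main loop: two elements per iteration. */
    for (; i < main_end; i += 2) {
        a[i]     = b[i]     + c[i];
        a[i + 1] = b[i + 1] + c[i + 1];
    }
    /* Fix-up loop: at most one leftover element. */
    for (; i < count; i++) {
        a[i] = b[i] + c[i];
    }
}
```

Whether a compiler can derive this form on its own still depends on what it can prove about the trip count, which is the question the thread is discussing.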
2016 Oct 04
2
Getting the symbolic expression for an address calculation
How do you generate a SCEVAddRecExpr from a SCEV? I tried dyn_casting and it seems that the SCEV returned by getSCEV is not a SCEVAddRecExpr. Thanks On Fri, Sep 30, 2016 at 4:16 PM, Friedman, Eli <efriedma at codeaurora.org> wrote: > On 9/30/2016 12:16 PM, Charith Mendis via llvm-dev wrote: > >> >> Hi all, >> >> What is the best way to get the symbolic
2016 Sep 30
2
Getting the symbolic expression for an address calculation
Hi all, What is the best way to get the symbolic expression for an address calculation in LLVM, especially when memory addresses are calculated within a loop? Use case: I want to know which loop induction variables are used for a particular address calculation and in what symbolic context. Thereby, I want to identify which stores and loads will be contiguous in memory if I unroll each of the
2016 Oct 28
1
Vector Shuffle chain lowering to X86 instructions simplification inconsistencies
Hi all, Attached herewith is a fairly simple LLVM file (shuffle.ll) with lots of vector shuffles. When I use llc with -O3 -mcpu=core-avx2, the first shuffle sequence, containing types that are 128 wide, gets reduced to a single shuffle, whereas the second shuffle sequence, containing types that are 256 wide, doesn't get reduced to a single shuffle instruction in the resulting X86 code (Shuffle.s attached).
2016 Oct 12
2
Loop Unrolling Fail in Simple Vectorized loop
Hi all, Attached herewith is a simple vectorized function with loops performing a simple shuffle. I want all loops (inner and outer) to be unrolled by 2 and as such used -unroll-count=2. The inner loops (with k as the induction variable and having constant trip counts) unroll fully, but the outer loop (with j) fails to unroll. The LLVM code is also attached with inner loops fully unrolled. To
2016 Sep 03
2
llc error
I updated to the latest revision and now llvm does not build and quits cmake with CMake Error at cmake/modules/LLVMProcessSources.cmake:83 (message): Found unknown source file ../llvm-revec/lib/CodeGen/MachineFunctionAnalysis.cpp Please update ../llvm-revec/lib/CodeGen/CMakeLists.txt Thanks On Sat, Sep 3, 2016 at 2:09 AM, Craig Topper <craig.topper at gmail.com> wrote: >
2016 Sep 03
4
llc error
Hi all, The attached LLVM assembly file fails to generate x86 code when compiled using llc. compilation command - ../llvm-build/bin/llc -filetype=asm -march=x86-64 -mcpu=core-avx2 ex4.ll The error message is, LLVM ERROR: Cannot select: t95: v8f32 = X86ISD::SUBV_BROADCAST t17 t17: v4f32,ch = load<LD16[%scevgep](tbaa=<0x4dbcd98>)> t0, t16, undef:i64 t16: i64 = add t2,
2016 Aug 20
2
LLVM flags for Vectorization
Hi, I have been analyzing the LLVM vectorizer by running some benchmarks. For vectorization, I have used the following flags: -O3 -ffast-math -mavx2 Am I missing any other flags which will improve vectorizer performance? Thanks, Santanu Das IIT Hyd -------------- next part -------------- An HTML attachment was scrubbed... URL:
2016 Oct 27
2
Bug with auto-vectorization of logf
Hi, I intended to file this bug on Bugzilla, but I've received no response from llvm-admin in the 10 days since asking for a Bugzilla account. I've written 2 test functions in C that take in a float array x of size n and output float array f(x), where f is either fabsf or logf. The LLVM 3.9 auto-vectorization docs claim that both functions will be vectorized:
2014 Nov 13
1
rsync prevent destination only new folders but need new files
Good day to all! I'm doing a git file replication task and need to sync source to destination, using the sync command below. rsync -atnvv --existing --exclude '.git' --progress source/ destination/; Here I, 1) need to sync only directories that *exist on destination*; no new folders should be copied from source - [--existing did the job, but it does not copy new files either.] 2) need to copy
2018 Nov 20
2
Question on fast-math optimizations
Dear LLVM developers, I have a question on the fast-math floating-point optimizations applied by LLVM: Judging by the documentation at https://llvm.org/docs/LangRef.html#fast-math-flags I understood that rewriting with associativity and using reciprocal computations are possible optimizations. As the folklore description of fast-math is that it "applies real-valued identities", I
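To illustrate why rewriting with associativity is gated behind fast-math flags, here is a minimal sketch in C. The function names sum_left and sum_right are hypothetical, and the sketch assumes IEEE 754 double arithmetic compiled without fast-math. The two functions compute the same value over the reals, yet they can round differently in floating point.

```c
/* (a + b) + c, evaluated left to right. */
static double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

/* a + (b + c): the same real-valued expression, reassociated. */
static double sum_right(double a, double b, double c) {
    return a + (b + c);
}
```

For inputs such as a = 1e20, b = -1e20, c = 1.0, sum_left yields 1.0 whereas sum_right yields 0.0, because the 1.0 is rounded away when it is added to -1e20 first.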
2016 Oct 27
0
Bug with auto-vectorization of logf
+Tanya for the account issue. > On Oct 27, 2016, at 11:36 AM, Eric Martin via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Hi, > I intended to file this bug on Bugzilla, but I've received no response from llvm-admin in the 10 days since asking for a Bugzilla account. > > I've written 2 test functions in C that take in a float array x of size n and output
2016 Oct 28
1
Bug with auto-vectorization of logf
Eric, I apologize for any delay or confusion. From my records/list archives, I saw that Anton had created an account for you on Oct 17th and responded to your email to llvm-admin. I am not sure what happened after that point as I thought the account was done. I just confirmed there is an account for you in bugzilla, so you should be good to go if you reset your password. -Tanya > On Oct 27,
2018 Nov 22
2
Question on fast-math optimizations
On 11/21/18 12:41 PM, Nicolai Hähnle wrote: > On 20.11.18 16:38, Stephen Canon via llvm-dev wrote: >> Distribution doesn’t seem to be used by many transforms at present. >> My vague recollection is that the fast math flags didn’t do a great >> job of characterizing when it would be allowed, and using it >> aggressively broke a lot of code in practice (code which was
2019 Apr 05
2
wbinfo isn't working on domain member
Hi Rowland, I made the change you suggested to auto refresh kerberos. It didn't seem to fix the issue unfortunately, even after a machine restart. Following your line of reasoning that it is a Kerberos issue, I then tried to grab a new kerberos ticket on the server in question which appears to fail though. Perhaps this gives some further insight? pi at fs1:~ $ kinit administrator at
2016 Jun 05
2
inconsistent DNS information, windows domain member issues..
I joined a Windows 10 Pro system to my (still experimental) domain. The windows system actually hosts DC2 as a VM, and another Windows (Server 2008 R2) at another location hosts DC1 also as a VM. The two locations are connected via a VPN, both systems run only when needed. The windows system does not directly use DC2 for DNS but instead talks to a DNS resolver that delegates the samba Domain to