thr3ads.net - similar to: "[LLVMdev] (no subject)"

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] (no subject)"

[LLVMdev] How to locate the start of an address mode in an X86 MachineInstr?

2012 Sep 20

My team is interested in doing some post-RA optimizations on X86 instructions, which would require identifying memory reference instructions. In X86 back-end instructions, each memory address consists of a set of five operands. The offset to the start of the five operands depends on the format of the instruction. For instance, the instructions ADC32rm, ADD32rm, AND32rm, ANDN32rm, CMOVA32rm,

[LLVMdev] Question about CriticalAntiDepBreaker.cpp

2012 Apr 09

In the course of implementing the instruction scheduler for the Intel Atom in LLVM, I have run across a problem with the critical anti-dependence breaker, whereby the CriticalAntiDepBreaker.cpp code changes some XMM0 references to XMM9 references. This would be all well and good, were it not for the fact that the result of the expression needs to be in XMM0 because it is being returned as the

[LLVMdev] Question about ExpandPostRAPseudos.cpp

2012 Jul 26

When trying to run test/CodeGen/X86/liveness-local-regalloc.ll with the command line options "-optimize-regalloc=0 -verify-machineinstrs -mcpu=atom", the test fails right after the Post-RA pseudo instruction pass with the messages
*** Bad machine code: Using an undefined physical register ***
- function: autogen_SD24657
- basic block: BB 0x2662d60 (BB#0)
- instruction:

[LLVMdev] State of Loop Unrolling and Vectorization in LLVM

2013 Apr 15

Hi, I have a test case (and a micro benchmark made out of the test case) to check whether loop unrolling and loop vectorization are done efficiently in LLVM. Here is the test case (credits: Tyler Nowicki)
{code}
extern float * array;
extern int array_size;
float g() {
  int i;
  float total = 0;
  for(i = 0; i < array_size; i++) {
    total += array[i];
  }
  return total;
}
{code}
When

[LLVMdev] Improving the usability of LNT

2013 May 02

Wow, that sounds great! Thanks for working on this, and yes, please, send the patches!
--renato
On 30 April 2013 16:23, Murali, Sriram <sriram.murali at intel.com> wrote:
> Hi Daniel,
>
> I made some changes to the LNT perf reporting tool to make it more user
> friendly by adding some features:
>
> 1. Make the sidebar and the navigation bar stationary,

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hi, I am trying to get a small loop to *not vectorize* for cases where it doesn't make sense. For instance, this loop:
  void foo(int a[4][8], int n) {
    int b[4][8];
    for(int i = 0; i < 4; i++) {
      for(int j = 0; j < n; j++) {
        a[i][j] = b[i][j];
      }
    }
  }
* Has a maximum of 8 ints to copy. LLVM tries to use Memcpy for the inner loop. It is not helpful to perform

[LLVMdev] Improving the usability of LNT

2013 Apr 30

Hi Daniel, I made some changes to the LNT perf reporting tool to make it more user friendly by adding some features:
1. Make the sidebar and the navigation bar stationary, so that it is easy to navigate the site
2. Have the pop-down menu for the items in the navigation bar activate upon hovering the mouse, rather than clicking the item
3. Add a nav-link in the sidebar for the

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hi Sriram, Thanks for performing this analysis. The problem here, both for memcpy and the vectorizer, is that we can’t predict the size of “n”, even though the only use of ’n’ is for the loop bound for the alloca [4 x [8 x i32]]. If you change the unroll condition to TC >= 0 then you will disable loop unrolling for all loops because getSmallConstantTripCount returns an unsigned number. You

[LLVMdev] TargetSpec

2013 Feb 13

This is about the target specification proposal described in http://nondot.org/sabre/LLVMNotes/TargetSpec.txt At the end of the year I spent a while on this, partly as a foot-wetting exercise for parts of LLVM I wouldn't otherwise look at. I did a partial implementation; enough to understand most of the issues (I hope) and get a clear idea of what would need to be done to phase it in. I

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Hi Nadav, Thanks for the response. I forgot to mention that there is an upper limit of 16 for the Trip Count check: TinyTripCountVectorThreshold = 16; if (TC > 0u && TC < TinyTripCountVectorThreshold). So right now, for any loop with a Trip Count of 0, or with a value >= 16, LV will unroll. With the change to the lower bound, the check will also include loops with a 0 trip count. SCEV returns 0
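The check quoted in this message can be looked at in isolation. The following is a minimal sketch, not the vectorizer's actual code: the constant and the condition come from the snippet, while the helper names `isTinyTripCount` and `isTinyTripCountRelaxed` are hypothetical. It only illustrates that, because the trip count is unsigned, relaxing the lower bound to `TC >= 0` adds exactly the value 0 to the range the check accepts; what the vectorizer does once the check passes is not modeled here.

```cpp
#include <cassert>

// Threshold value as quoted in the message.
constexpr unsigned TinyTripCountVectorThreshold = 16;

// Hypothetical helper: the check as quoted, true for trip counts 1..15.
constexpr bool isTinyTripCount(unsigned TC) {
  return TC > 0u && TC < TinyTripCountVectorThreshold;
}

// Hypothetical helper: the same check with the lower bound relaxed to
// TC >= 0. For an unsigned value that lower bound is always true, so it
// is omitted and only the upper bound remains (accepts 0..15).
constexpr bool isTinyTripCountRelaxed(unsigned TC) {
  return TC < TinyTripCountVectorThreshold;
}
```

With these definitions, a trip count of 0 is rejected by the first helper and accepted by the second, while values of 16 and above are rejected by both.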

[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests

2012 Dec 04

Nuno, Inspired by this email thread, I spent a bit of time today looking through the implementation of BoundsChecking::instrument(..). Based on my reading of prior work, it should be possible to do these checks in two comparisons, or possibly even one if the right assumptions could be made. Could you provide a bit of background of the expected domains of Size and Offset? In particular,

[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests

2012 Nov 26

I am investigating changing BoundsChecking to use address-based rather than size- & offset-based tests. To explain, here is a short code sample cribbed from one of the tests:
  %mem = tail call i8* @calloc(i64 1, i64 %elements)
  %memobj = bitcast i8* %mem to i64*
  %ptr = getelementptr inbounds i64* %memobj, i64 %index
  %4 = load i64* %ptr, align 8
Currently, the IR for bounds checking

[LLVMdev] Trip count and Loop Vectorizer

2013 Sep 27

Sriram, The problem is that you want to unroll/vectorize many loops with non-constant loop count - it is a trade-off of which case you estimate as more likely. int foo(int *ptr, int n) { for ( .. i <n) ptr[i] = ... } The question is: is it more likely to have “n” such that unrolling is beneficial or not. Now, you could probably write an analysis that bounds the loop count (for the

[LLVMdev] RFC: change BoundsChecking.cpp to use address-based tests

2012 Nov 26

Hi Kevin, Thanks for your interest and for your deep analysis. Unfortunately, your approach doesn't catch all bugs and is vulnerable to an attack. Consider the following case:
  ......................
  | ----- obj --- |    |
            end ^ ptr ^ ^ end-of-memory
The scenario is as follows:
- an object is allocated in the last page of the address space
- obj is byte
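The message is cut off before the scenario is complete, so the following is only a sketch of one way an address-based test can go wrong near the end of the address space; it is not necessarily the exact attack the author goes on to describe. The function names `naiveAddressCheck` and `sizeOffsetCheck`, the 64-bit address model, and the concrete values in the usage note are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical address-based test: accept the access when it starts at or
// after the object's base and ptr + need does not exceed the object's end.
// The sum ptr + need is computed modulo 2^64 and can wrap around to a
// small value, which then compares as "before end".
bool naiveAddressCheck(std::uint64_t base, std::uint64_t end,
                       std::uint64_t ptr, std::uint64_t need) {
  return ptr >= base && ptr + need <= end;
}

// Hypothetical size- and offset-based test: accept the access when the
// offset lies inside the object and the bytes remaining after the offset
// cover the access. No pointer sum is formed, so nothing can wrap.
bool sizeOffsetCheck(std::uint64_t size, std::uint64_t offset,
                     std::uint64_t need) {
  return offset <= size && size - offset >= need;
}
```

As an illustration, take an 8-byte object at base `UINT64_MAX - 15` (so its end is `UINT64_MAX - 7`) and a 16-byte access at offset 4. The pointer sum wraps to 4, so `naiveAddressCheck` accepts the access, whereas `sizeOffsetCheck(8, 4, 16)` rejects it because only 4 bytes remain.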

[LLVMdev] Public SmallVectorImpl constructor?

2012 Jan 20

I've had the same thought but never got around to trying to implement it. Does everything compile for you if it's protected? If so, then a patch would probably be happily accepted
------------------------------
From: Vane, Edwin
Sent: 1/20/2012 7:13 AM
To: llvmdev at cs.uiuc.edu
Subject: [LLVMdev] Public SmallVectorImpl constructor?
Hi all, Just finished debugging a memory

[LLVMdev] Calling with register indirect reference instead of memory indirect reference.

2013 Feb 28

Hi, I am working on a small optimization feature to replace calls that use a memory-indirect reference with calls that use a register-indirect reference. The purpose of this feature is to improve the performance of calls to functions referred to by function pointers. The motivation behind this work is that gcc does this optimization. Here is a small test case that will generate an indirect call

[LLVMdev] Problem with PostRASchedulerList.cpp - advice wanted

2012 Oct 17

When you compile the attached file using
  llc -march=x86 -mcpu=atom sched-bug.ll -o -
the Post-RA scheduler changes the sequence
  movl %ecx, (%esp)
  bsfl (%esp),%eax # this came from inline assembly code
to read
  bsfl (%esp),%eax # this came from inline assembly code
  movl %ecx, (%esp)
This is an incorrect schedule, because it seems the scheduler is not aware that the memory

[LLVMdev] Inlining sqrt library function in X86

2013 May 17

Using the following example program
  #include <math.h>
  double f(double d){ return sqrt(d); }
and compiling it with "clang -O3 ...", I was trying to determine what it would take to get the X86 code generator to replace the call to sqrt with a sqrtsd instruction inline. It turns out that it could do exactly that, were it not for the fact that in the function

[LLVMdev] How to vectorize a vector type cast?

2012 Feb 28

Since Clang does not seem to allow type casts, such as uchar4 to float4, between vector types, it seems it is necessary to write them as element by element conversions, such as
  typedef float float4 __attribute__((ext_vector_type(4)));
  typedef unsigned char uchar4 __attribute__((ext_vector_type(4)));
  float4 to_float4(uchar4 in) {
    float4 out = {in.x, in.y, in.z, in.w};
    return out;
  }
Running

[LLVMdev] Public SmallVectorImpl constructor?

2012 Jan 20

Hi all, Just finished debugging a memory clobbering bug resulting from using SmallVectorImpl directly without realizing this is a bad idea (aside: I was using it directly because llvm::sys::path::append()'s first argument is a SmallVectorImpl<char>). A note in the docs about not using SmallVectorImpl directly would be nice but could we go further and make SmallVectorImpl's
