similar to: [LLVMdev] Incorrect loop optimization when building the Linux kernel

Displaying 20 results from an estimated 900 matches similar to: "[LLVMdev] Incorrect loop optimization when building the Linux kernel"

2014 Dec 08
4
[LLVMdev] Incorrect loop optimization when building the Linux kernel
> It's difficult to say without a full example, but I'm very suspicious > of those global declarations. I think the compiler would be entirely > justified in assuming you could *never* get from __start_builtin_fw to > __end_builtin_fw, let alone on the first iteration: they're distinct > array objects and by definition (within C99) can't overlap. I think this should
2014 Dec 08
3
[LLVMdev] Incorrect loop optimization when building the Linux kernel
On Sun, Dec 7, 2014 at 11:16 PM, Nick Lewycky <nicholas at mxc.ca> wrote: > Chengyu Song wrote: > >> It's difficult to say without a full example, but I'm very suspicious >>> of those global declarations. I think the compiler would be entirely >>> justified in assuming you could *never* get from __start_builtin_fw to >>> __end_builtin_fw, let
2017 Oct 12
2
[PATCH v1 15/27] compiler: Option to default to hidden symbols
On Wed, Oct 11, 2017 at 01:30:15PM -0700, Thomas Garnier wrote: > Provide an option to default visibility to hidden except for key > symbols. This option is disabled by default and will be used by x86_64 > PIE support to remove errors between compilation units. > > The default visibility is also enabled for external symbols that are > compared as they may be equal (start/end of
2017 Oct 11
0
[PATCH v1 15/27] compiler: Option to default to hidden symbols
Provide an option to default visibility to hidden except for key symbols. This option is disabled by default and will be used by x86_64 PIE support to remove errors between compilation units. The default visibility is also enabled for external symbols that are compared as they may be equal (start/end of sections). In this case, older versions of GCC will remove the comparison if the symbols are
2014 Dec 08
2
[LLVMdev] Incorrect loop optimization when building the Linux kernel
Nick, consider: char a[0]; char b[0]; at run-time: (gdb) p &a $1 = (char (*)[]) 0x60103c (gdb) p &b $2 = (char (*)[]) 0x60103c Even if this is not safe at the C or C++ level, this comparison could return equal or not equal depending on what the linker chooses to do. Do we have a bug in the constant folder? On Sun, Dec 7, 2014 at 11:16 PM, Nick Lewycky <nicholas at mxc.ca>
2017 Oct 18
0
[PATCH v1 15/27] compiler: Option to default to hidden symbols
On Thu, Oct 12, 2017 at 1:02 PM, Luis R. Rodriguez <mcgrof at kernel.org> wrote: > On Wed, Oct 11, 2017 at 01:30:15PM -0700, Thomas Garnier wrote: >> Provide an option to default visibility to hidden except for key >> symbols. This option is disabled by default and will be used by x86_64 >> PIE support to remove errors between compilation units. >> >> The
2017 Nov 29
4
CodeExtractor buggy?
Hi All, I’m currently working on a simple task which needs to transform loops into tail-recursive functions. I found the CodeExtractor class a handy helper to use, but later encountered a problem. Consider the following CU: struct S { int a, b; }; int foo(struct S *s, unsigned n) { struct S *next = s; unsigned i; for (i = 0; i < n; ++i) { if (!s[i].a)
2015 Jul 16
2
[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable
----- Original Message ----- > From: "Chandler Carruth" <chandlerc at google.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Hyojin Sung" <hsung at us.ibm.com>, llvmdev at cs.uiuc.edu > Sent: Thursday, July 16, 2015 1:06:03 AM > Subject: Re: [LLVMdev] Improving loop vectorizer support for loops > with a volatile iteration
2015 Jul 16
2
[LLVMdev] Improving loop vectorizer support for loops with a volatile iteration variable
----- Original Message ----- > From: "Chandler Carruth" <chandlerc at google.com> > To: "Hyojin Sung" <hsung at us.ibm.com>, llvmdev at cs.uiuc.edu > Sent: Wednesday, July 15, 2015 7:34:54 PM > Subject: Re: [LLVMdev] Improving loop vectorizer support for loops > with a volatile iteration variable > On Wed, Jul 15, 2015 at 12:55 PM Hyojin Sung
2005 Feb 22
0
[LLVMdev] Area for improvement
When I increased COLS to the point where the loop could no longer be unrolled, the selection dag code generator generated effectively the same code as the default X86 code generator. Lots of redundant imul/movl/addl sequences. It can't clean it up either. Only unrolling all nested loops permits it to be optimized away, regardless of code generator. Jeff Cohen wrote: > I noticed
2019 Aug 26
2
SCEV related question
Here is original C code: void topup(int a[], unsigned long i) { for (; i < 16; i++) { a[i] = 1; } } Here is the IR before the pass where I expect SCEV to return trip-count value ; Function Attrs: nofree norecurse nounwind uwtable writeonly define dso_local void @topup(i32* nocapture %a, i64 %i) local_unnamed_addr #0 { entry: %cmp3 = icmp ult i64 %i, 16 br i1
2009 Jan 06
2
[LLVMdev] LLVM Optmizer
The following C code : #include <stdio.h> #include <stdlib.h> int TESTE2( int parami , int paraml ,double paramd ) { int varx=0,vary; int nI =0; //varx= parami; if( parami > 0 ) { varx = parami; vary = varx + 1; } else { varx = vary + 1; vary = paraml; } varx = varx + parami + paraml; for( nI = 1 ; nI <= paraml; nI++) { varx =
2015 Sep 03
2
[RFC] New pass: LoopExitValues
On Wed, Sep 2, 2015 at 5:36 AM, James Molloy <james at jamesmolloy.co.uk> wrote: > Hi, > > Coremark really isn't a good enough test - have you run the LLVM test suite > with this patch, and what were the performance differences? For the test suite single source benches, 235 tests improved performance, 2 regressed and 705 were unchanged. That seems very optimistic.
2016 May 24
1
BitcodeReader non explicit error
Hi, I'm working on OpenCL and I'm using clang as the compiler (based on clang 3.7.0). I have an issue: I'm generating a bitcode file (that I can print before the generation). But when I try to read it again with clang, I get this error: "error: Invalid record" How can I find out where it comes from? Thank you, Romaric Here is what is printed before the
2014 Dec 26
3
[LLVMdev] Correct usage of `llvm.assume` for loop vectorization alignment?
Using LLVM ToT and Hal's helpful slide deck [1], I've been trying to use `llvm.assume` to communicate pointer alignment guarantees to vector load and store instructions. For example, in [2] %5 and %9 are guaranteed to be 32-byte aligned. However, if I run this IR through `opt -O3 -datalayout -S`, the vectorized loads and stores are still 1-byte aligned [3]. What's going wrong? Do I
2005 Feb 22
5
[LLVMdev] Area for improvement
I noticed that fourinarow is one of the programs in which LLVM is much slower than GCC, so I decided to take a look and see why that is so. The program has many loops that look like this: #define ROWS 6 #define COLS 7 void init_board(char b[COLS][ROWS+1]) { int i,j; for (i=0;i<COLS;i++) for (j=0;j<ROWS;j++) b[i][j]='.';
2015 Apr 25
3
[LLVMdev] alias analysis on llvm internal globals
Hi, I have this program in which fooBuf can only take on NULL or the address of local_fooBuf, and fooBuf and local_fooBuf have the scope of the foo function. Therefore there is no way for the fooPtr argument to alias with fooBuf. However, LLVM basicaa and globalsmodref-aa say the two pointers may alias. I am wondering whether I should implement a limited form of points-to alias on the fooBuf pointer in
2016 Mar 22
3
Instrumented BB in PGO
Hello, I have a question regarding PGO instrumented BBs (I use IR-level instrumentation). It seems that instrumented BBs do not match between the two compilations for profile-gen and profile-use for some cases. Here is an example from SPECcpu 2006 lbm (a simple case consisting of just two modules). In the first compilation, we have 5 instrumentation points for the main function as follows: $
2016 May 05
2
No remapping of clone instruction in CloneBasicBlock
Hi, I found that the CloneBasicBlock utility only does the cloning without any remapping. Consider the example below: Input block: sw.epilog: ; preds = %sw.bb20, %sw.bb15, %sw.bb10, %sw.bb6, %sw.bb2, %sw.bb, %while.body, %if.end29 %no_final.1 = phi i32 [ %no_final.055, %while.body ], [ 1, %if.end29 ], [ %no_final.055, %sw.bb20 ], [ %no_final.055, %sw.bb15 ], [