similar to: [LLVMdev] Loads not hoisted out of the loop

Displaying 20 results from an estimated 1000 matches similar to: "[LLVMdev] Loads not hoisted out of the loop"

2013 Jan 18
0
[LLVMdev] Loads not hoisted out of the loop
On Jan 17, 2013, at 9:27 PM, Dimitri Tcaciuc <dtcaciuc at gmail.com> wrote: > Hey everyone, > > I'm looking at the following two C functions: > > http://pastebin.com/mYWCj6d8 > > Second one is about 50% slower than the first one because in the second case d->data[i * 3 + {X, Y, Z}] loads are not moved out of the second loop and are computed every time j
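A minimal sketch of the effect described in this post, assuming a hypothetical pairwise kernel. The pastebin code is not reproduced here, so the function names `pair_sum_naive` and `pair_sum_hoisted`, the loop body, and the `struct Array` layout (modeled on the struct quoted in the related codegenprepare post) are all illustrative assumptions. The second function does by hand what one would hope the optimizer does: it reads the `i`-indexed values once per outer iteration instead of once per `j`.

```c
#include <assert.h>

/* Hypothetical layout, modeled on the struct quoted elsewhere in this thread. */
struct Array { double *data; long n; };

enum { X = 0, Y = 1, Z = 2 };

/* Naive form: the d->data[i * 3 + ...] loads sit inside the j loop.
   The store through out->data might alias d->data, so the compiler
   may have to redo those loads on every j iteration. */
void pair_sum_naive(struct Array *d, struct Array *out, long n)
{
    assert(d && out);
    for (long i = 0; i < n; i++)
        for (long j = 0; j < n; j++)
            out->data[i] += d->data[i * 3 + X] * d->data[j * 3 + X]
                          + d->data[i * 3 + Y] * d->data[j * 3 + Y]
                          + d->data[i * 3 + Z] * d->data[j * 3 + Z];
}

/* Hand-hoisted form: equivalent only if out->data and d->data
   do not overlap (an assumption the caller must guarantee). */
void pair_sum_hoisted(struct Array *d, struct Array *out, long n)
{
    assert(d && out);
    const double *in = d->data;   /* pointer loaded once */
    double *res = out->data;
    for (long i = 0; i < n; i++) {
        const double xi = in[i * 3 + X];
        const double yi = in[i * 3 + Y];
        const double zi = in[i * 3 + Z];
        double acc = res[i];
        for (long j = 0; j < n; j++)
            acc += xi * in[j * 3 + X]
                 + yi * in[j * 3 + Y]
                 + zi * in[j * 3 + Z];
        res[i] = acc;             /* single store per i */
    }
}
```

Both functions give the same result when the input and output buffers are distinct; the hoisted version simply makes the no-overlap assumption explicit in the source.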
2013 Jan 18
3
[LLVMdev] Loads not hoisted out of the loop
On 1/17/2013 11:33 PM, Owen Anderson wrote: > > Almost certainly, the compiler can't tell that the load from the struct > (to get the array pointer) doesn't alias with the stores to the array > itself. This is exactly the kind of thing that type-based alias > analysis (-fstrict-aliasing) can help with. Use with caution, as it > can break some type-unsafe idioms. The
2013 Jan 18
0
[LLVMdev] Loads not hoisted out of the loop
On Fri, Jan 18, 2013 at 6:36 AM, Krzysztof Parzyszek < kparzysz at codeaurora.org> wrote: > Since "d->data" and "out->data" are both of type double, the > type-based alias analysis won't help. Right, I see. Is there any other way to solve this? As the last resort, I was considering silently transforming each Array argument into separate data and
2013 Jan 18
2
[LLVMdev] Loads not hoisted out of the loop
On 1/18/2013 11:11 AM, Dimitri Tcaciuc wrote: > > Right, I see. Is there any other way to solve this? As the last resort, > I was considering silently transforming each Array argument into > separate data and metadata arguments, but that would certainly add other > complications I'd rather avoid. Depends on what information you have available. If all you have is the LLVM IR,
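The "separate data and metadata arguments" idea mentioned above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `struct Array` layout is a guess modeled on the `{ nd, dims, data }` type quoted in this thread, and `offset_kernel` and `offset_by_first` are hypothetical names. The wrapper unpacks the descriptor once, and the kernel receives plain `restrict`-qualified pointers, so the loop-invariant load of `in[0]` can legally be hoisted.

```c
#include <assert.h>

/* Hypothetical array descriptor: rank, extents, and payload. */
struct Array { int nd; const int *dims; double *data; };

/* Kernel works on raw pointers. restrict is the caller's promise that
   in and out do not overlap, so in[0] is loop-invariant here. */
static void offset_kernel(const double *restrict in,
                          double *restrict out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = in[i] - in[0];
}

/* Wrapper: reads the metadata once and forwards data and length
   as separate arguments. Only 1-D, non-empty arrays are handled. */
void offset_by_first(const struct Array *src, struct Array *dst)
{
    assert(src->nd == 1 && dst->nd == 1);
    assert(src->dims[0] == dst->dims[0]);
    assert(src->data != dst->data);   /* restrict contract */
    offset_kernel(src->data, dst->data, src->dims[0]);
}
```

The cost, as the post notes, is extra complexity at every call boundary: the front end has to keep the descriptor and the unpacked arguments consistent.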
2013 Jan 18
0
[LLVMdev] Loads not hoisted out of the loop
On Fri, Jan 18, 2013 at 10:00 AM, Krzysztof Parzyszek < kparzysz at codeaurora.org> wrote: > f90_array_type = type { i32 size, double* data }; I am not certain how Fortran implements multi-dimensional arrays, but in my case I'm doing something like type { i32 nd, i32* dims, double* data }; Perhaps we could add !tbaa.pointer? !1 = metadata !{ metadata !"int",
2013 Jan 18
2
[LLVMdev] Loads not hoisted out of the loop
On 1/18/2013 11:34 AM, Hal Finkel wrote: > > I agree. FWIW, I'm currently working on making the LLVM-based Fortran compiler non-hypothetical, and so for several reasons, I'd like to have a solution to this. If we can't think of anything better, we could always fall back to the N^2 metadata solution (explicit mark as no-alias all pairs that don't alias), but I'd rather we
2013 Jan 27
4
[LLVMdev] SIMD trigonometry/logarithms?
I'm wondering if it makes sense to instead supply a bc math library. I would think it would be easier to maintain and debug, and should still give you all of the benefits. You could just link with it early in the optimization pipeline to ensure inlining. This may also make it easier to maintain SIMD functions for multiple backends. On Sun, Jan 27, 2013 at 8:49 AM, Hal Finkel <hfinkel
2013 Jan 18
0
[LLVMdev] Loads not hoisted out of the loop
----- Original Message ----- > From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org> > To: "Dimitri Tcaciuc" <dtcaciuc at gmail.com> > Cc: llvmdev at cs.uiuc.edu > Sent: Friday, January 18, 2013 11:22:26 AM > Subject: Re: [LLVMdev] Loads not hoisted out of the loop > > On 1/18/2013 11:11 AM, Dimitri Tcaciuc wrote: > > > > Right,
2013 Jan 27
5
[LLVMdev] SIMD trigonometry/logarithms?
Hi everyone, I was looking at loop vectorizer code and wondered if there was any current or planned effort to introduce SIMD implementations of sin/cos/exp/log intrinsics (in particular for x86-64 backend)? Cheers, Dimitri.
2013 Jan 27
0
[LLVMdev] SIMD trigonometry/logarithms?
----- Original Message ----- > From: "Dimitri Tcaciuc" <dtcaciuc at gmail.com> > To: llvmdev at cs.uiuc.edu > Sent: Sunday, January 27, 2013 3:42:42 AM > Subject: [LLVMdev] SIMD trigonometry/logarithms? > > > > Hi everyone, > > > I was looking at loop vectorizer code and wondered if there was any > current or planned effort to introduce SIMD
2013 Jan 27
3
[LLVMdev] SIMD trigonometry/logarithms?
----- Original Message ----- > From: "Dmitry Mikushin" <dmitry at kernelgen.org> > To: "Justin Holewinski" <justin.holewinski at gmail.com> > Cc: "Hal Finkel" <hfinkel at anl.gov>, "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Sunday, January 27, 2013 10:19:42 AM > Subject: Re: [LLVMdev] SIMD
2013 Jan 27
0
[LLVMdev] SIMD trigonometry/logarithms?
Hi Justin, I think having .bc math libraries for different backends makes perfect sense! For example, in the case of the NVPTX backend we have the following problem: many math functions that are only available as CUDA C++ headers cannot easily be used in, for instance, a GPU program written in Fortran. On our end we are currently doing exactly what you proposed: generating a math.bc module and then link
2013 Jan 17
1
[LLVMdev] Regarding codegenprepare transformations
Hello everyone, For the context of question, I have a small loop written in a custom front-end which can be fairly accurately expressed with the following C program: struct Array { double * data; long n; }; #define X 0 #define Y 1 #define Z 2 void f(struct Array * restrict d, struct Array * restrict out, const long n) { for (long i = 0;
2013 Jan 28
1
[LLVMdev] SIMD trigonometry/logarithms?
First let me say that I really like the notion of being able to plug in .bc libraries into the compiler and I think that there are many potential uses (e.g. vector saturation operations and the like). But even so it is important to realize the limitations of this approach. Generally implementations of transcendental functions require platform-specific optimizations to get the best performance and
2013 Feb 14
0
[LLVMdev] SIMD trigonometry/logarithms?
Hi all. In fact, this is how we have implemented it in our compiler (Intel's OpenCL). We have created a .bc file for every architecture. Each file contains all the SIMD versions for the functions to be vectorized. To cope with the massive amount of code to be produced, we implemented a dedicated tblgen BE for that purpose. We are willing to share that code with the LLVM community, in case this
2013 Feb 14
1
[LLVMdev] SIMD trigonometry/logarithms?
----- Original Message ----- > From: "Elior Malul" <elior.malul at intel.com> > To: "Michael Gottesman" <mgottesman at apple.com>, "Hal Finkel" <hfinkel at anl.gov> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, February 14, 2013 8:33:42 AM > Subject: RE: [LLVMdev] SIMD
2010 Oct 29
3
[LLVMdev] strict aliasing and LLVM
On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote: > Xinliang David Li wrote: > >> As simple as >> >> void foo (int n, double *p, int *q) >> { >> for (int i = 0; i < n; i++) >> *p += *q; >> } >> >> clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c >> llc -enable-tbaa -O2
2010 Oct 30
0
[LLVMdev] strict aliasing and LLVM
Xinliang David Li wrote: > > > On Fri, Oct 29, 2010 at 12:26 AM, Nick Lewycky <nicholas at mxc.ca> wrote: > > Xinliang David Li wrote: > > As simple as > > void foo (int n, double *p, int *q) > { > for (int i = 0; i < n; i++) > *p += *q; > } > >
2010 Oct 29
5
[LLVMdev] strict aliasing and LLVM
As simple as void foo (int n, double *p, int *q) { for (int i = 0; i < n; i++) *p += *q; } clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc Memory accesses remain in the loop. The following works fine: void foo(int n, double *restrict p, int *restrict q) { ... } By the way, is there a performance category in the LLVM
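For reference, the three shapes under discussion can be written out side by side. This is a sketch, not the poster's full test case: `foo` is the function quoted above, `foo_restrict` spells out the `restrict` variant the post says works, and `foo_local` is a hypothetical hand-written version (not from the thread) showing what "memory accesses moved out of the loop" looks like in source form.

```c
#include <assert.h>

/* Quoted shape: p and q point to different types, but unless alias
   information reaches the optimizer, the load of *q and the
   load/store of *p may stay inside the loop. */
void foo(int n, double *p, int *q)
{
    for (int i = 0; i < n; i++)
        *p += *q;
}

/* restrict variant: the caller promises p and q do not alias. */
void foo_restrict(int n, double *restrict p, int *restrict q)
{
    for (int i = 0; i < n; i++)
        *p += *q;
}

/* Hypothetical manual version: one load of each operand before the
   loop, accumulation in a local, one store afterwards. */
void foo_local(int n, double *p, const int *q)
{
    assert(p && q);
    const int step = *q;
    double acc = *p;
    for (int i = 0; i < n; i++)
        acc += step;
    *p = acc;
}
```

All three compute the same value when `p` and `q` refer to distinct objects; they differ only in how much the compiler has to prove before it can keep `*p` in a register.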
2010 Oct 29
0
[LLVMdev] strict aliasing and LLVM
Xinliang David Li wrote: > As simple as > > void foo (int n, double *p, int *q) > { > for (int i = 0; i < n; i++) > *p += *q; > } > > clang -O2 -fstrict-aliasing -emit-llvm -o foo.bc -c foo.c > llc -enable-tbaa -O2 -filetype=asm -o foo.s foo.bc There are a couple of things interacting here: * clang -fstrict-aliasing -O2 does generate the TBAA info, but