thr3ads.net - similar to: "[LLVMdev] LLVM vector code generation for standard functions"

Displaying 20 results from an estimated 30000 matches similar to: "[LLVMdev] LLVM vector code generation for standard functions"

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 5 Jul 2013, at 17:23, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > > On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote: > >> >> On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: >> >>> On 07/04/2013 01:39 PM, Stéphane Letz wrote: >>>> Hi, >>>>

[LLVMdev] Vectorized LLVM IR

2010 May 29

On Sat, May 29, 2010 at 1:23 AM, Stéphane Letz <letz at grame.fr> wrote: >> >> <32 x float> takes up 8 SSE registers; you're likely running into >> issues with register pressure. Does it work better if you use >> something smaller like <4 x float>? >> >> Besides that, I don't see any obvious issues. >> >> -Eli > >
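
The register-pressure remark above follows from simple arithmetic: a <32 x float> value is 32 × 32 = 1024 bits, and one SSE (XMM) register holds 128 bits, so a single such value spans 8 registers. x86-64 without AVX-512 exposes 16 XMM registers, so one value of that type already occupies half of them. A minimal C++ sketch of the calculation, assuming the backend splits a wide vector into 128-bit pieces (the helper name xmm_registers_needed is hypothetical and is not an LLVM API):

```cpp
#include <cstddef>

// Width in bits of one SSE (XMM) register.
constexpr std::size_t kXmmBits = 128;

// Number of 128-bit XMM registers needed to hold one value of an LLVM
// vector type <lanes x element>. Assumption: the backend legalizes a wide
// vector by splitting it into 128-bit pieces.
constexpr std::size_t xmm_registers_needed(std::size_t lanes,
                                           std::size_t element_bits) {
    // Round up so a partially filled register still counts as one.
    return (lanes * element_bits + kXmmBits - 1) / kXmmBits;
}

// The three vector types mentioned in the thread snippets.
static_assert(xmm_registers_needed(32, 32) == 8, "<32 x float> spans 8 XMM registers");
static_assert(xmm_registers_needed(8, 32) == 2, "<8 x float> spans 2 XMM registers");
static_assert(xmm_registers_needed(4, 32) == 1, "<4 x float> fits in 1 XMM register");
```

Under this assumption <4 x float> is the only one of the three types that maps to a single SSE register, which is consistent with the suggestion to try the smaller type.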

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On Jul 5, 2013, at 9:50 AM, Stéphane Letz <letz at grame.fr> wrote: > > On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: > >> On 07/04/2013 01:39 PM, Stéphane Letz wrote: >>> Hi, >>> >>> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 5 Jul 2013, at 17:48, Arnold Schwaighofer <aschwaighofer at apple.com> wrote: > > On Jul 5, 2013, at 10:43 AM, Stéphane Letz <letz at grame.fr> wrote: >> >> 1) "entry" block is the first block of the function right? > > Yes. OK > >> >> 2) do you mean *all* "alloca" in a function always have to be in the first entry

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On Jul 5, 2013, at 10:43 AM, Stéphane Letz <letz at grame.fr> wrote: > > 1) "entry" block is the first block of the function right? Yes. > > 2) do you mean *all* "alloca" in a function always have to be in the first entry block? If you want them converted into SSA variables early on, yes.

[LLVMdev] Vectorized LLVM IR

2010 May 29

> > <32 x float> takes up 8 SSE registers; you're likely running into > issues with register pressure. Does it work better if you use > something smaller like <4 x float>? > > Besides that, I don't see any obvious issues. > > -Eli You are right, yes. The code runs faster with <4 x float> types, which still runs a bit slower than the scalar

[LLVMdev] General strategy to optimize LLVM IR

2013 Jul 16

On Tue, Jul 16, 2013 at 8:16 AM, Stéphane Letz <letz at grame.fr> wrote: > Hi, > > Our DSL emits sub-optimal LLVM IR that we optimize later on (LLVM IR ==> LLVM IR) before dynamically compiling it with the JIT. We would like to simply follow what clang/clang++ does when compiling with -O1/-O2/-O3 options. Our strategy up to now was to look at the opt.cpp code and take part of it

[LLVMdev] Vectorized LLVM IR

2010 May 28

Hi Stéphane, The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? -bw On May 28, 2010, at 12:13 PM, Stéphane Letz wrote: > Hi, > > We are experimenting with directly generating vectorized LLVM IR (using types such as <8 x float>), then compiling the code

[LLVMdev] Vectorized LLVM IR

2010 May 29

On Sat, May 29, 2010 at 12:42 AM, Stéphane Letz <letz at grame.fr> wrote: > > On 29 May 2010, at 01:08, Bill Wendling wrote: > >> Hi Stéphane, >> >> The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? >> >> -bw >>

[LLVMdev] Generating Floating point constants

2010 Jun 03

> ------------------------------ > > Message: 4 > Date: Wed, 2 Jun 2010 11:07:39 -0700 > From: Dale Johannesen <dalej at apple.com> > Subject: Re: [LLVMdev] Generating Floating point constants > To: Stéphane Letz <letz at free.fr> > Cc: llvmdev at cs.uiuc.edu > Message-ID: <AEC895CC-E887-4329-8743-FA606BD401F6 at apple.com> > Content-Type:

[LLVMdev] Enabling vectorization with LLVM 3.3 for a DSL emitting LLVM IR

2013 Jul 05

On 5 Jul 2013, at 04:11, Tobias Grosser <tobias at grosser.es> wrote: > On 07/04/2013 01:39 PM, Stéphane Letz wrote: >> Hi, >> >> Our DSL can generate C or directly generate LLVM IR. With LLVM 3.3, we can vectorize the generated C code using clang with -O3, or clang with -O1 then opt -O3 -vectorize-loops. But the same program generating LLVM IR version cannot be

[LLVMdev] LLVM 3.3 JIT code speed

2013 Jul 18

On 18 Jul 2013, at 19:07, Eli Friedman <eli.friedman at gmail.com> wrote: > On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr> wrote: >> Hi, >> >> Our DSL LLVM IR emitted code (optimized with -O3 kind of IR ==> IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason? >>

[LLVMdev] Vectorized LLVM IR

2010 May 29

On Sat, May 29, 2010 at 1:18 AM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Sat, May 29, 2010 at 12:42 AM, Stéphane Letz <letz at grame.fr> wrote: >> >> On 29 May 2010, at 01:08, Bill Wendling wrote: >> >>> Hi Stéphane, >>> >>> The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some

[LLVMdev] LLVM 3.3 JIT code speed

2013 Jul 18

I understand you to mean that you have isolated the actual execution time as your point of comparison, as opposed to including runtime loading and so on. Is this correct? One thing that changed between 3.1 and 3.3 is that MCJIT no longer compiles the module during the engine creation process but instead waits until either a function pointer is requested or finalizeObject is called. I would

[LLVMdev] LLVM 3.3 JIT code speed

2013 Jul 18

On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr> wrote: > Hi, > > Our DSL LLVM IR emitted code (optimized with -O3 kind of IR ==> IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason? > > I tried to play with TargetOptions without any success… > > Here is the kind of code we use to

[LLVMdev] Vectorized LLVM IR

2010 May 28

Hi, We are experimenting with directly generating vectorized LLVM IR (using types such as <8 x float>), then compiling the code to SSE on a 64-bit machine. Right now the equivalent scalar code still outperforms the SSE version. What is the quality of the SSE support in the X86 LLVM backend? Are there any specific things to be aware of to improve the speed? Thanks Stéphane Letz

[LLVMdev] LLVM 3.3 JIT code speed

2013 Jul 19

On 18 Jul 2013, at 23:51, Stéphane Letz <letz at grame.fr> wrote: > > On 18 Jul 2013, at 21:05, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote: > >> I understand you to mean that you have isolated the actual execution time as your point of comparison, as opposed to including runtime loading and so on. Is this correct? > > We are testing

[LLVMdev] Vectorized LLVM IR

2010 May 29

On 29 May 2010, at 01:08, Bill Wendling wrote: > Hi Stéphane, > > The SSE support in the LLVM backend is fine. What is the code that's generated? Do you have some short examples of where LLVM doesn't do as well as the equivalent scalar code? > > -bw > > On May 28, 2010, at 12:13 PM, Stéphane Letz wrote: We are actually testing LLVM for the Faust language

[LLVMdev] Generating Floating point constants

2010 Jun 02

On 2 Jun 2010, at 19:48, Dale Johannesen wrote: > > On Jun 2, 2010, at 3:28 AM PDT, Stéphane Letz wrote: > >> >> On 2 Jun 2010, at 12:21, Eli Friedman wrote: >> >>> On Wed, Jun 2, 2010 at 2:59 AM, Stéphane Letz <letz at free.fr> wrote: >>>> Hi, >>>> >>>> We need to generate "Floating point constants"

[LLVMdev] LLVM 3.3 JIT code speed

2013 Jul 18

Hi, Our DSL LLVM IR emitted code (optimized with -O3 kind of IR ==> IR passes) runs slower when executed with the LLVM 3.3 JIT, compared to what we had with LLVM 3.1. What could be the reason? I tried to play with TargetOptions without any success… Here is the kind of code we use to allocate the JIT: EngineBuilder builder(fResult->fModule);
