I understand you to mean that you have isolated the actual execution time as
your point of comparison, as opposed to including runtime loading and so on. Is
this correct?
One thing that changed between 3.1 and 3.3 is that MCJIT no longer compiles the
module during the engine creation process but instead waits until either a
function pointer is requested or finalizeObject is called. I would guess that
you have taken that into account in your measurement technique, but it seemed
worth mentioning.
What architecture/OS are you testing?
With LLVM 3.3 you can register a JIT event listener (using
ExecutionEngine::RegisterJITEventListener) that MCJIT will call with a copy of
the actual object image that gets generated. You could then write that image to
a file as a basis for comparing the generated code. You can find a reference
implementation of the interface in
lib/ExecutionEngine/IntelJITEvents/IntelJITEventListener.cpp.
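Here is a minimal sketch of such a listener, assuming the LLVM 3.3 interface
(NotifyObjectEmitted receives an ObjectImage whose raw bytes are exposed via
getData()); the output file name is just illustrative:

#include "llvm/ExecutionEngine/JITEventListener.h"
#include "llvm/ExecutionEngine/ObjectImage.h"
#include <fstream>

// Dumps each object image MCJIT emits so the JITed code can be inspected
// offline (e.g. with objdump/otool) and diffed against the static build.
class ObjectDumpListener : public llvm::JITEventListener {
public:
    virtual void NotifyObjectEmitted(const llvm::ObjectImage &Obj) {
        std::ofstream Out("jit_object.o", std::ios::binary);
        Out.write(Obj.getData().data(), Obj.getData().size());
    }
};

// Register it before compilation is triggered:
//   ObjectDumpListener Listener;
//   fJIT->RegisterJITEventListener(&Listener);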
-Andy
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu On Behalf Of Stéphane Letz
Sent: Thursday, July 18, 2013 11:20 AM
To: Eli Friedman
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] LLVM 3.3 JIT code speed
On 18 Jul 2013, at 19:07, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr>
wrote:
>> Hi,
>>
>> The LLVM IR emitted by our DSL (optimized with -O3-style IR ==> IR passes)
>> runs slower when executed with the LLVM 3.3 JIT than it did with LLVM 3.1.
>> What could be the reason?
>>
>> I tried to play with TargetOptions without any success.
>>
>> Here is the kind of code we use to allocate the JIT:
>>
>> EngineBuilder builder(fResult->fModule);
>> builder.setOptLevel(CodeGenOpt::Aggressive);
>> builder.setEngineKind(EngineKind::JIT);
>> builder.setUseMCJIT(true);
>> builder.setCodeModel(CodeModel::JITDefault);
>> builder.setMCPU(llvm::sys::getHostCPUName());
>>
>> TargetOptions targetOptions;
>> targetOptions.NoFramePointerElim = true;
>> targetOptions.LessPreciseFPMADOption = true;
>> targetOptions.UnsafeFPMath = true;
>> targetOptions.NoInfsFPMath = true;
>> targetOptions.NoNaNsFPMath = true;
>> targetOptions.GuaranteedTailCallOpt = true;
>>
>> builder.setTargetOptions(targetOptions);
>>
>> TargetMachine* tm = builder.selectTarget();
>>
>> fJIT = builder.create(tm);
>> if (!fJIT) {
>>     return false;
>> }
>> ...
>>
>> Any ideas?
>
> It's hard to say much without seeing the specific IR and the code
> generated from that IR.
>
> -Eli
Our language can do either:
1) DSL ==> C/C++ ==> clang/gcc ==> exec code
or
2) DSL ==> LLVM IR ==> (optimization passes) ==> LLVM IR ==> LLVM JIT ==>
exec code
1) and 2) were running at the same speed with LLVM 3.1, but 2) is now slower
with LLVM 3.3.
I compared the LLVM IR generated by chain 2) *after* the optimization passes
with the IR generated by chain 1), using clang -emit-llvm -O3 on the pure C
input. The two are identical. So my conclusion was that either the way we are
activating the JIT is no longer correct in 3.3, or we are missing new steps
that now have to be done for the JIT?
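For reference, here is a minimal sketch of how the chain 2) IR can be written
out for that comparison (fResult->fModule as in the code above; the file name
is just illustrative):

#include "llvm/IR/Module.h"
#include "llvm/Support/raw_ostream.h"
#include <string>

// Writes the post-pass IR to a .ll file so it can be diffed against the
// output of clang -emit-llvm -O3 on the pure C input.
void dumpOptimizedIR(llvm::Module *M) {
    std::string ErrorInfo;
    llvm::raw_fd_ostream Out("dsl_after_passes.ll", ErrorInfo);
    if (ErrorInfo.empty())
        M->print(Out, 0); // 0: no AssemblyAnnotationWriter
}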
Stéphane Letz
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev