I understand you to mean that you have isolated the actual execution time as
your point of comparison, as opposed to including runtime loading and so on.  Is
this correct?
One thing that changed between 3.1 and 3.3 is that MCJIT no longer compiles the
module during the engine creation process but instead waits until either a
function pointer is requested or finalizeObject is called.  I would guess that
you have taken that into account in your measurement technique, but it seemed
worth mentioning.
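
To illustrate why this matters for timing, here is a minimal, LLVM-free sketch.
LazyEngine is a hypothetical stand-in for an engine that defers compilation
until a function pointer is first requested; it is not an LLVM class, and
timeCall/timeExecutionOnly are illustrative names. The idea is simply to request
the pointer (or finalize the object) before starting the timer, so the compile
cost is not charged to execution.

```cpp
#include <chrono>

// Hypothetical stand-in for a lazily compiling engine (NOT an LLVM class).
struct LazyEngine {
    int compileCount = 0;
    using Fn = long (*)(long);

    // Stand-in for the JIT-compiled function: sums 0 .. n-1.
    static long kernel(long n) {
        long sum = 0;
        for (long i = 0; i < n; ++i) sum += i;
        return sum;
    }

    // "Compilation" happens on the first pointer request only.
    Fn getPointerToFunction() {
        if (compileCount == 0) ++compileCount;  // pretend code generation runs here
        return &kernel;
    }
};

// Returns elapsed wall-clock seconds for one call of f(arg).
template <typename F>
double timeCall(F f, long arg, long* result) {
    auto start = std::chrono::steady_clock::now();
    *result = f(arg);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}

// Forces compilation first, then times execution only.
double timeExecutionOnly(LazyEngine& engine, long arg, long* result) {
    LazyEngine::Fn fn = engine.getPointerToFunction();  // compile cost paid outside the timer
    return timeCall(fn, arg, result);
}
```

If the pointer request were moved inside the timed region, the first measurement
would include code generation, which could make 3.3 look slower than 3.1 even if
the generated code were identical.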
What architecture/OS are you testing on?
With LLVM 3.3 you can register a JIT event listener (using
ExecutionEngine::RegisterJITEventListener) that MCJIT will call with a copy of
the actual object image that gets generated.  You could then write that image to
a file as a basis for comparing the generated code.  You can find a reference
implementation of the interface in
lib/ExecutionEngine/IntelJITEvents/IntelJITEventListener.cpp.
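
As a rough sketch of the "write that image to a file" step, the helper below
dumps a raw byte buffer to disk using only the standard library. The name
dumpObjectImage and the output path are hypothetical. How you obtain the buffer
pointer and size from the object handed to your listener depends on the 3.3
interface, so please check the reference implementation for that part rather
than relying on this sketch.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes `size` bytes starting at `data` to `path` in
// binary mode. Returns true on success. Intended to be called from a JIT event
// listener callback once it has the emitted object image's bytes.
bool dumpObjectImage(const char* data, std::size_t size, const std::string& path) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(data, static_cast<std::streamsize>(size));
    return out.good();
}
```

The resulting file can then be inspected with a disassembler such as objdump -d
to see what code was actually generated.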
-Andy
-----Original Message-----
From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On
Behalf Of Stéphane Letz
Sent: Thursday, July 18, 2013 11:20 AM
To: Eli Friedman
Cc: llvmdev at cs.uiuc.edu
Subject: Re: [LLVMdev] LLVM 3.3 JIT code speed
On 18 Jul 2013, at 19:07, Eli Friedman <eli.friedman at gmail.com> wrote:
> On Thu, Jul 18, 2013 at 9:07 AM, Stéphane Letz <letz at grame.fr> wrote:
>> Hi,
>> 
>> Our DSL LLVM IR emitted code (optimized with -O3 kind of IR ==> IR
>> passes) runs slower when executed with the LLVM 3.3 JIT, compared to what
>> we had with LLVM 3.1. What could be the reason?
>> 
>> I tried to play with TargetOptions without any success.
>> 
>> Here is the kind of code we use to allocate the JIT:
>> 
>>    EngineBuilder builder(fResult->fModule);
>>    builder.setOptLevel(CodeGenOpt::Aggressive);
>>    builder.setEngineKind(EngineKind::JIT);
>>    builder.setUseMCJIT(true);
>>    builder.setCodeModel(CodeModel::JITDefault);
>>    builder.setMCPU(llvm::sys::getHostCPUName());
>> 
>>    TargetOptions targetOptions;
>>    targetOptions.NoFramePointerElim = true;
>>    targetOptions.LessPreciseFPMADOption = true;
>>    targetOptions.UnsafeFPMath = true;
>>    targetOptions.NoInfsFPMath = true;
>>    targetOptions.NoNaNsFPMath = true;
>>    targetOptions.GuaranteedTailCallOpt = true;
>> 
>>    builder.setTargetOptions(targetOptions);
>> 
>>    TargetMachine* tm = builder.selectTarget();
>> 
>>    fJIT = builder.create(tm);
>>    if (!fJIT) {
>>        return false;
>>    }
>>    ..
>> 
>> Any idea?
> 
> It's hard to say much without seeing the specific IR and the code 
> generated from that IR.
> 
> -Eli
Our language can do either:
1) DSL  ==> C/C++  ===> clang/gcc ===> exec  code
or
2) DSL  ==> LLVM IR  ===> (optimisation passes) ==>  LLVM  IR  ==>
LLVM JIT ==> exec code
1) and 2) were running at the same speed with LLVM 3.1, but 2) is now slower
with LLVM 3.3.
I compared the LLVM IR that is generated by the 2) chain *after* the
optimization passes with the IR generated by 1) using clang -emit-llvm -O3 on
the pure C input. The two are the same. So my conclusion was that the way we
are activating the JIT is no longer correct in 3.3, or that we are missing new
steps that have to be done in the JIT?
Stéphane Letz
_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev