thr3ads.net - llvm dev - [llvm-dev] Performance of JIT execution [Sep 2020]

Haoran Xu via llvm-dev

2020-Sep-04 02:00 UTC

[llvm-dev] Performance of JIT execution

Hello,

I recently noticed a performance gap between JIT execution and native code
for the following simple function, which computes Fibonacci numbers:

uint64_t fib(int n) {
    if (n <= 2) {
        return 1;
    } else {
        return fib(n-1) + fib(n-2);
    }
}
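
For anyone reproducing the native side of this comparison, here is a minimal sketch of a timing harness. The helper name time_fib_seconds and the choice of std::chrono::steady_clock are assumptions for illustration; the thread does not say how the timings were taken.

```cpp
#include <chrono>
#include <cstdint>

// Same recurrence as in the message: fib(1) == fib(2) == 1.
uint64_t fib(int n) {
    if (n <= 2) {
        return 1;
    }
    return fib(n - 1) + fib(n - 2);
}

// Hypothetical helper (not from the thread): runs fib(n) once, stores the
// value in *result, and returns the elapsed wall-clock time in seconds.
double time_fib_seconds(int n, uint64_t *result) {
    auto start = std::chrono::steady_clock::now();
    *result = fib(n);
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count();
}
```

Calling time_fib_seconds(40, &value) from main and printing the returned duration gives a number to compare against the JIT run. A single run is noisy, so repeating the call a few times and taking the minimum is usually more informative.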

When compiled natively with clang++ -O3, computing fib(40) took 0.17s.
However, when executed through LLJIT, fed with the IR output of
"clang++ -emit-llvm -O3", it took 0.26s.

I don't know much about the internals of LLJIT, but since the IR is the
same, my guess is that LLJIT uses a cheaper, lower-quality instruction
selection pass, which results in the slower runtime. Could someone working
on LLJIT clarify how the lowering passes differ between LLJIT and clang++?
And if I wanted to change this behavior, which APIs should I look at first?

Thanks for your time!

Best regards,
Haoran

Lang Hames via llvm-dev

2020-Sep-07 06:03 UTC

Hi Haoran,

LLJIT uses CodeGenOpt::Default by default, whereas I suspect -O3 uses
CodeGenOpt::Aggressive. You can configure this by setting/modifying the
JITTargetMachineBuilder member of your LLJITBuilder before calling create.

You can also try attaching a DumpObjects instance to your JIT to dump the
JIT'd objects to disk: sometimes comparing objects can offer useful
insights. You can find an example of this in
llvm/examples/OrcV2Examples/LLJITDumpObjects.

Regards,
Lang.


Haoran Xu via llvm-dev

2020-Sep-08 04:59 UTC

Thanks for the clarification, Lang! I didn't know about CodeGenOpt before.
I'll give it a try and see whether it fixes the issue.

Thanks again!
Haoran

