thr3ads.net - llvm dev - [LLVMdev] Using LLVM to serialize object state -- and performance [Nov 2012]

Paul J. Lucas

2012-Nov-07 15:12 UTC

[LLVMdev] Using LLVM to serialize object state -- and performance

On Nov 6, 2012, at 11:49 AM, "Kaylor, Andrew" <andrew.kaylor at intel.com> wrote:

> I think you may have gone beyond what I understand in how the legacy JIT
> code works.  It looks like the call to addGlobalMapping should short-circuit
> the named function look up that I described ...

Well, I first look for the function by name and, if I don't find it, then I
call addGlobalMapping().  But that's not where the time is going.  Here:

	https://dl.dropbox.com/u/46791180/callgraph.pdf

is a call graph generated by kcachegrind.  I still don't understand all the
numbers (and this PDF seems not to include commas where it should), but if you
look at the left fork, the bottom two ovals, "Schedule..." is called
16K times and "setHeightToAtLeas..." is called 37K times.  On the
right fork, RAGreed... is called 35K times.

Those are far too many calls to *anything* for a simple sequence of
"call" LLVM instructions.  Something seems horribly wrong.

- Paul

Kaylor, Andrew

2012-Nov-12 21:52 UTC

Hi Paul,

This is definitely outside the area where I know the particulars of what's
going on.  However, one idea that might be worth trying is setting the JIT
optimization level to 'CodeGenOpt::None'.  This should trigger the use
of the FastISel instruction selector.  Normally, you wouldn't want that for
anything other than generating debug code, but since your routines are just
making calls, it might work for you.

-Andy

Paul J. Lucas

2012-Nov-13 19:27 UTC

Switching to CodeGenOpt::None reduced the execution time from 5.74s to 0.84s.
Then, just by trying settings at random, I found that changing to
CodeModel::Small reduced it further to 0.22s.

We have some old, ugly, pure C++ code that we're trying to replace (both
because it's ugly and because it's slow).  Its execution time is
about 0.089s, so that's the time to beat.

Hence, I'd like to reduce the 0.22s time even further to below 0.089s.  Any
ideas?

- Paul
