thr3ads.net - llvm dev - [LLVMdev] asmwriting times (was Re: LLVMContext: Suggestions for API Changes) [Aug 2009]


Dan Gohman

2009-Aug-24 20:32 UTC

[LLVMdev] asmwriting times (was Re: LLVMContext: Suggestions for API Changes)

Before we get too far into this, I'd like to point out that there's a ready
solution for the problem of the AsmPrinter being slow: Bitcode. If you want IR
reading and writing to be fast, you should consider bitcode files rather than
assembly (text) files anyway. Bitcode is smaller and faster. And the API is
similar, so it's usually easy to change from assembly to bitcode.

That said, I've done testing of the AsmPrinter performance myself and seen
only moderate slowdowns due to the formatting changes; nothing of the
magnitude you're describing. I'm hoping to try out Pure to see if I can
reproduce what you're seeing.

As a first step, would it be possible for you to use strace -e trace=write
to determine if buffering is somehow not happening?
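
To make the buffering check concrete, here is a minimal Python sketch that summarizes a log produced by strace -e trace=write. It assumes the log was saved to a file (for example with strace's -o option) and that lines have the usual `write(fd, "...", n) = ret` shape. The function name summarize_writes and the sample lines are made up for illustration and are not from the thread. Many small writes on the output file's descriptor would suggest buffering is not taking effect; a few large ones would suggest it is.

```python
import re
from collections import Counter

# Matches lines such as:  write(3, "define i32 ..."..., 4096) = 4096
# An optional "[pid N] " or bare pid prefix (strace -f) is tolerated.
WRITE_RE = re.compile(
    r"^(?:\[pid\s+\d+\]\s+|\d+\s+)?write\((\d+),.*\)\s*=\s*(-?\d+)"
)

def summarize_writes(lines):
    """Return {fd: (call_count, total_bytes)} for successful write() calls."""
    calls = Counter()
    total = Counter()
    for line in lines:
        m = WRITE_RE.match(line)
        if not m:
            continue  # not a write() line
        fd, ret = int(m.group(1)), int(m.group(2))
        if ret < 0:
            continue  # failed call, nothing was written
        calls[fd] += 1
        total[fd] += ret
    return {fd: (calls[fd], total[fd]) for fd in calls}

# Hypothetical sample log lines, for illustration only.
sample = [
    'write(3, "define i32 @f() {\\n"..., 4096) = 4096',
    'write(3, "  ret i32 0\\n}\\n"..., 4096) = 4096',
    'write(1, "Hello, world!\\n", 14) = 14',
    'write(3, "x", 1) = -1 EAGAIN (Resource temporarily unavailable)',
]

for fd, (n, nbytes) in sorted(summarize_writes(sample).items()):
    print(f"fd {fd}: {n} calls, {nbytes} bytes, {nbytes / n:.0f} bytes/call")
```

To run it on a real log, pass an open file object instead of the sample list, e.g. summarize_writes(open("trace.log")), where trace.log is a placeholder name.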

One other question that occurs to me: is Pure dumping the whole Module
at once, or is it manually writing out the IR in pieces?

Dan

On Aug 24, 2009, at 12:36 PM, Albert Graef wrote:
> Albert Graef wrote:
>> One thing I noticed is that writing LLVM assembler code (print()
>> methods) seems to be horribly slow now (some 4-5 times slower than in
>> LLVM 2.5). This is a real bummer for me, since Pure's batch compiler
>> uses those methods to produce output code which then gets fed into
>> llvmc.
>
> Let me follow up with some concrete figures. Unfortunately, I don't have
> a minimal C++ example, but the effect is easy to reproduce with Pure
> 0.31 from http://pure-lang.googlecode.com/ and the attached little Pure
> script. (I'm quite sure that this is not a bug in the Pure interpreter,
> as exactly the same code runs more than an order of magnitude faster with
> LLVM 2.5 than with LLVM 2.6/2.7svn.)
>
> The given figures are user CPU times in seconds, as given by time(1), so
> they are not really that accurate, but the effect is so prominent that
> this really doesn't matter. (See below for the details on how I obtained
> these figures.)
>
>                          LLVM 2.5    LLVM 2.6    LLVM 2.7(svn)
>
> execute                  1.752s      2.272s      2.256s
> compile                  2.316s      24.458s     24.834s
> codegen                  0.564s      22.186s     22.578s
>   (= compile - execute)
>
> To measure the asmwriting times, I first ran the script without
> generating LLVM assembler output code ("execute", pure -x hello.pure)
> and then again with LLVM assembler output enabled ("compile", pure -o
> hello.ll -c -x hello.pure). The difference between the two figures
> ("codegen") gives a rough estimate of the net asmwriting times. (That's
> really all that pure -c does; at the end of script execution it takes
> the IR that's already in memory and just spits it out by iterating over
> the global variables and the functions of the IR module and using the
> appropriate print() methods.)
>
> The resulting LLVM assembler file hello.ll was some 5.3 MB for LLVM
> 2.6/2.7svn (4.4 MB for LLVM 2.5; the assembler programs are exactly the
> same, though, the size differences are apparently due to formatting
> changes in LLVM 2.6/2.7svn). Note that the code size is quite large
> because the function definitions compiled from Pure's prelude are all
> included.
>
> The tests were performed on an AMD Phenom 9950 4x2.6GHz with 4GB RAM
> running Linux x86-64 2.6.27. The following configure options were used
> to compile LLVM (all versions) and Pure 0.31, respectively:
>
> LLVM: --enable-optimized --disable-assertions --disable-expensive-checks
> --enable-targets=host-only --enable-pic
>
> Pure: --enable-release
>
> So the effect is actually much *more* prominent than I first made it out
> to be. This is just one data point, of course, but I get an easily
> noticeable slowdown with every Pure script I tried. In fact it's so much
> slower that I consider it unusable.
>
> I'm at a loss here. I'd have to debug the LLVM asmwriter code to see
> where exactly the bottleneck is. I haven't done that yet, but I ruled
> out an issue with the raw_ostream buffer sizes by trying different sizes
> from 256 bytes up to 64K; it doesn't change the results very much.
>
> So my question to fellow frontend developers is: Has anyone else seen
> this? Does anyone know a workaround?
>
> Any help will be greatly appreciated.
>
> TIA,
> Albert
>
> -- 
> Dr. Albert Gr"af
> Dept. of Music-Informatics, University of Mainz, Germany
> Email:  Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
> WWW:    http://www.musikinformatik.uni-mainz.de/ag
> using system;
>
> fact n = if n>0 then n*fact (n-1) else 1;
>
> main n = do puts ["Hello, world!", str (map fact (1..n))];
>
> const n = if argc>1 then sscanf (argv!1) "%d" else 10;
> main n;
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
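
As a quick check of the arithmetic in the quoted timing table, the following Python sketch recomputes the "codegen" row as compile minus execute and expresses each figure relative to LLVM 2.5. The numbers are copied from the table; the variable and function names are illustrative only.

```python
# User CPU times in seconds, copied from the quoted table:
# version -> (execute, compile)
timings = {
    "LLVM 2.5": (1.752, 2.316),
    "LLVM 2.6": (2.272, 24.458),
    "LLVM 2.7svn": (2.256, 24.834),
}

def codegen_times(table):
    """Estimate asmwriting time per version as compile minus execute."""
    return {
        version: round(compile_s - execute_s, 3)
        for version, (execute_s, compile_s) in table.items()
    }

codegen = codegen_times(timings)
baseline = codegen["LLVM 2.5"]

for version, secs in codegen.items():
    print(f"{version}: {secs:.3f}s ({secs / baseline:.1f}x LLVM 2.5)")
```

By these figures the asmwriting step comes out at roughly 39 to 40 times the LLVM 2.5 time, which matches the "more than a magnitude" remark rather than the initial estimate of 4-5 times.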

Albert Graef

2009-Aug-24 21:27 UTC


Dan Gohman wrote:
> One other question that occurs to me: is Pure dumping the whole Module
> at once, or is it manually writing out the IR in pieces?

Well, you hit the nail on the head with that one. ;-) In fact, I just
had the same idea. So, instead of selecting and emitting individual
globals and functions on the fly, I rewrote the .ll writer in Pure so
that it just erases unwanted stuff from the module and then emits the
entire module at once. Well, you guessed it, it runs a lot faster now. I
also see the minor slowdowns compared to LLVM 2.5 you mentioned, but
those I can easily live with.

Problem solved, thanks! (And sorry for wasting bandwidth with this.)

Thank you also for the hint about bitcode reading/writing. I'm aware of
this, but I actually prefer the .ll output because it's human-readable,
which is great for debugging purposes. I might add a bitcode writer some
time, but it's not a high priority for me right now.

What I'm really looking forward to is the .o writer, though
(http://wiki.llvm.org/Direct_Object_Code_Emission). That will make
things *much* easier for Pure users, as they won't have to install the
entire LLVM suite any more if all they want to do is batch-compile Pure
programs.

Albert

-- 
Dr. Albert Gr"af
Dept. of Music-Informatics, University of Mainz, Germany
Email:  Dr.Graef at t-online.de, ag at muwiinfa.geschichte.uni-mainz.de
WWW:    http://www.musikinformatik.uni-mainz.de/ag

Ivo

2009-Aug-24 22:05 UTC


On Monday 24 August 2009 23:27, Albert Graef wrote:
> Thank you also for the hint about bitcode reading/writing. I'm aware of
> this, but I actually prefer the .ll output because it's human-readable,
> which is great for debugging purposes. I might add a bitcode writer some
> time, but it's not a high priority for me right now.

Isn't there a tool to turn bitcode into human readable text? I have been 
reading about the internals of LLVM lately (I'm planning on making a new 
backend) and the whole infrastructure reminds me a lot of the Amsterdam 
Compiler Kit (ACK). They had a program to turn binary EM code into ASCII, 
making it easy to dump the output to |more or less during debugging.

--Ivo
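
On the question of a bitcode-to-text tool: LLVM ships llvm-dis, which converts a bitcode file into the human-readable .ll form. The Python sketch below only assembles and runs that command line. The helper names llvm_dis_command and disassemble and the file names are hypothetical, and the sketch assumes llvm-dis is installed and on PATH.

```python
import shutil
import subprocess
from pathlib import Path

def llvm_dis_command(bitcode_path, output_path=None):
    """Build the argument list for: llvm-dis <input.bc> -o <output.ll>."""
    bitcode = Path(bitcode_path)
    # Default output name: same path with the suffix changed to .ll
    out = Path(output_path) if output_path else bitcode.with_suffix(".ll")
    return ["llvm-dis", str(bitcode), "-o", str(out)]

def disassemble(bitcode_path, output_path=None):
    """Run llvm-dis and return the path of the generated .ll file."""
    if shutil.which("llvm-dis") is None:
        raise FileNotFoundError("llvm-dis not found on PATH")
    cmd = llvm_dis_command(bitcode_path, output_path)
    subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
    return Path(cmd[-1])

# Show the command that would be run for a hypothetical hello.bc.
print(" ".join(llvm_dis_command("hello.bc")))
```

This keeps bitcode as the fast on-disk format while still allowing a readable dump on demand for debugging.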
