thr3ads.net - llvm dev - [LLVMdev] Tool for run-time code generation? [Jul 2010]

David Piepgrass

2010-Jul-16 15:47 UTC

[LLVMdev] Tool for run-time code generation?

Using C++ code, I would like to generate code at run-time (the same way .NET
code can use dynamic methods or compiled expressions) in order to obtain very
high performance code (to read binary data records whose formats are only known
at run-time.) I need to target x86 (Win32) and ARM (WinCE).

Can LLVM be used for this purpose, or would something else work better? Are
there any open-source projects that have done this, that I could look to as an
example?

David Piepgrass, E.I.T.
Software Developer
__________________________________________

Mentor Engineering Inc.<http://www.mentoreng.com/>
10, 2175 - 29th Street NE
Calgary, AB, Canada  T1Y 7H8

Ph: (403) 777-3760 ext. 490  Fax: (403) 777-3769


Óscar Fuentes

2010-Jul-16 16:16 UTC

David Piepgrass <dpiepgrass at mentoreng.com> writes:
> Using C++ code, I would like to generate code at run-time (the same
> way .NET code can use dynamic methods or compiled expressions) in
> order to obtain very high performance code (to read binary data
> records whose formats are only known at run-time.)
>
> I need to target x86 (Win32) and ARM (WinCE).
>
> Can LLVM be used for this purpose, or would something else work
> better?
LLVM has a JIT for this purpose. You generate LLVM IR code (a sort of
generic assembler) and it produces optimized native code ready to be
executed.

x86-win32 is fine. I doubt the same is true of arm-wince.
> Are there any open-source projects that have done this, that I
> could look to as an example?
This is a list of some of the projects that use LLVM:

http://www.llvm.org/Users.html

The LLVM tutorial is worth reading too:

http://www.llvm.org/docs/tutorial/

David Piepgrass

2010-Jul-16 17:45 UTC

> LLVM has a JIT for this purpose. You generate LLVM IR code (a sort of
> generic assembler) and it produces optimized native code ready to be
> executed.
> 
> x86-win32 is fine. I don't think so about arm-wince.
What's wrong with running LLVM on ARM? It's supposed to support ARM as a
target, and since it's written in C++ it should theoretically compile for ARM.
CMake doesn't support Visual Studio 9 for Smart Devices, so I would probably
have to go to quite a bit of trouble to create a project file. Still, if I did
so, shouldn't it theoretically work?

Vlad

2010-Jul-16 18:36 UTC

I happen to be using LLVM for just this reason. I process large volumes of data
records with schemas that are only known at runtime and/or can change
dynamically as various transforms are applied to such records at various stages.

To this end, I auto-generate C99 syntax at run time, parse it using clang, do
some AST transformations, compile using LLVM JIT, and then execute within the
same (C++) process. As a point of comparison, I've done similar things with
Java bytecode and while the JVM approach was [much] easier learning curve- and
documentation-wise, it is hard to complain about the level of control you get
with LLVM. It is like having a WISS (When I Say So) dynamic compiler producing
-O3-level native code. In my case, I target x86-64 and had only minor trouble
supporting the same toolchain on Linux and Darwin.

So, the approach is definitely workable, but I must warn about the non-trivial
amount of effort required to figure things out in both clang and LLVM codebases.
For example, how to traverse or mutate ASTs produced by clang is pretty much a
FAQ on the clang list but there is no good documentation addressing this very
common use case.

HTH,
Vlad


Nick Lewycky

2010-Jul-17 02:30 UTC

Vlad wrote:
> I happen to be using LLVM for just this reason. I process large volumes
> of data records with schemas that are only known at runtime and/or can
> change dynamically as various transforms are applied to such records at
> various stages.
>
> To this end, I auto-generate C99 syntax at run time, parse it using
> clang, do some AST transformations, compile using LLVM JIT, and then
> execute within the same (C++) process. As a point of comparison, I've
> done similar things with Java bytecode and while the JVM approach was
> [much] easier learning curve- and documentation-wise, it is hard to
> complain about the level of control you get with LLVM. It is like having
> a WISS (When I Say So) dynamic compiler producing -O3-level native code.
> In my case, I target x86-64 and had only minor trouble supporting the
> same toolchain on Linux and Darwin.
I strongly recommend that anyone doing this sort of specialization not 
write a system that generates C code as strings and then parses it, 
unless you happen to be starting with a system that already prints C.

Instead, break the chunks of C you would generate into functions and 
compile those ahead-of-time. At run time you use llvm only (no clang) to 
generate a series of function calls into those functions.

Then you can play tricks with that. Instead of fully compiling those 
functions ahead of time (ie. to .o files), you can compile them into .bc 
and create an llvm Module out of it, either by loading it from a file or 
by using 'llc -march=cpp' to create C++ code using the LLVM API that 
produces said module when run. With your run-time generated code and the 
library code in the same Module, you can run the inliner before the 
other optimizers.

Alternately, if your chunks of C are very small you may find it 
easy to just produce LLVM IR in memory using the LLVM API directly. See 
the LLVM language at llvm.org/docs/LangRef.html and the IRBuilder at 
http://llvm.org/doxygen/classllvm_1_1IRBuilder.html .

Either of these techniques avoids the need to use clang at run-time, or 
spend time generating large strings just to re-parse them. Since the 
optimizers are all in LLVM proper, you should get the exact same 
assembly out.
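
A minimal sketch of the first suggestion, in plain C++ with no LLVM calls: the small operations are ordinary functions compiled ahead of time, and the run-time schema only decides which of them to call and in what order. The field kinds, `Primitive`, `build_plan`, and `read_record` are hypothetical names for illustration. A JIT built on LLVM would emit direct calls to the same primitives (and could then inline them) instead of looping over function pointers as this interpreted stand-in does.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Primitives compiled ahead of time. Each reads one field at `p`,
// widens it to double, and returns the number of bytes consumed.
static std::size_t read_u8(const unsigned char* p, double& out) {
    out = static_cast<double>(p[0]);
    return 1;
}
static std::size_t read_i32(const unsigned char* p, double& out) {
    std::int32_t v;
    std::memcpy(&v, p, sizeof v);  // memcpy avoids alignment and aliasing problems
    out = static_cast<double>(v);
    return sizeof v;
}
static std::size_t read_f64(const unsigned char* p, double& out) {
    std::memcpy(&out, p, sizeof out);
    return sizeof out;
}

enum class FieldKind { U8, I32, F64 };  // hypothetical schema vocabulary
using Primitive = std::size_t (*)(const unsigned char*, double&);

// Run-time step: turn a schema into a sequence of calls to the primitives.
std::vector<Primitive> build_plan(const std::vector<FieldKind>& schema) {
    std::vector<Primitive> plan;
    for (FieldKind k : schema) {
        switch (k) {
            case FieldKind::U8:  plan.push_back(read_u8);  break;
            case FieldKind::I32: plan.push_back(read_i32); break;
            case FieldKind::F64: plan.push_back(read_f64); break;
        }
    }
    return plan;
}

// Execute the plan against one record (host byte order is assumed).
std::vector<double> read_record(const std::vector<Primitive>& plan,
                                const unsigned char* record) {
    std::vector<double> fields;
    for (Primitive step : plan) {
        double value = 0.0;
        record += step(record, value);
        fields.push_back(value);
    }
    return fields;
}
```

The plan is built once per schema and reused for every record, which is the same split between setup cost and per-record cost that a JIT-compiled reader would have.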

Nick

Nick Lewycky

2010-Jul-17 16:38 UTC

Martin C. Martin wrote:
> On 7/16/2010 10:30 PM, Nick Lewycky wrote:
>> Vlad wrote:
>>
>> Instead, break the chunks of C you would generate into functions and
>> compile those ahead-of-time. At run time you use llvm only (no clang) to
>> generate a series of function calls into those functions.
>
> Compelling. I hadn't considered that.
>
> In our application, we have a tree of primitive operations, where each
> one calls into its children and returns to its parent. There are various
> types of nodes, and we don't know the topology or types of nodes until
> runtime (i.e. Clang/LLVM invocation time). Each operation is pretty
> simple, so we'd like to inline the children's operations into the parent
> where a C compiler would do it.
>
> Could your technique be extended to that? e.g. precompile to LLVM IR
> with calls to a non-existent "node_do_foo()" call, and then replace it
> with the specific "childtype_do_foo()" call when we know the type of the
> child?
Will you know the prototype of the function call in advance? If so, you 
can do something really simple where you write the C functions with a 
function pointer parameter. Then at run-time, use 
llvm::CloneAndPruneFunctionInto to produce the specialized function by 
giving it a valuemap that maps the Argument* for the fptr to the 
concrete Function* you want it to call.
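
To make the source-level half of this concrete, here is a sketch of a function written with a function-pointer parameter, alongside the specialized form that cloning with the pointer mapped to a concrete function is meant to produce. It contains no LLVM calls. `Node`, `sum_children`, `leaf_value`, and `sum_children_leaf` are hypothetical names, and the second function is written by hand only to show the shape of the result.

```cpp
#include <cstddef>

struct Node { int payload; };

// A concrete child operation, known only once the tree's node types are known.
static int leaf_value(const Node* n) { return n->payload; }

// Generic parent operation compiled ahead of time. The child's operation
// arrives as a function pointer, so the call below is indirect.
int sum_children(const Node* children, std::size_t count,
                 int (*child_op)(const Node*)) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += child_op(&children[i]);
    return total;
}

// Hand-written equivalent of the specialized clone: the pointer argument
// has been replaced by leaf_value, so the call is direct and inlinable.
int sum_children_leaf(const Node* children, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i)
        total += leaf_value(&children[i]);
    return total;
}
```

Both functions compute the same result; the difference is only that the second gives the inliner a known callee.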

If you don't know the type of the call you intend to place, the question 
becomes "why not?" What arguments were you planning to pass it, if you
don't know how many arguments it takes in advance? I don't see any 
reason to doubt that it's possible to do, but I would need more details 
before I could suggest an implementation.
>> Since the
>> optimizers are all in LLVM proper, you should get the exact same
>> assembly out.
>
> This is something I've been wondering. Since Clang has different
> information than the LLVM IR, it seems there should be some
> optimizations that would be easy to do in Clang, but
> difficult/impossible to do in LLVM. No? Not even for C++?
Yes. Copy constructor elimination is permitted by C++ even though it may 
change the visible behaviour of the program. I think NRVO is also done 
in the frontend.
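
A small sketch of why this elision counts as a behaviour change: the copy constructor below has a visible side effect (it bumps a counter), yet the compiler is allowed to skip the call. `Tracker`, `make_tracker`, and `copies_for_one_call` are hypothetical names. The number of copies is deliberately not stated as a single value, since it depends on the language standard and compiler flags in use.

```cpp
struct Tracker {
    static int copies;  // counts copy-constructor calls
    int value;
    explicit Tracker(int v) : value(v) {}
    Tracker(const Tracker& other) : value(other.value) { ++copies; }
};
int Tracker::copies = 0;

// Returning a named local is the NRVO case: the compiler may construct
// `t` directly in the caller's storage and never run the copy constructor.
Tracker make_tracker(int v) {
    Tracker t(v);
    return t;
}

// Returns how many copies one call actually performed.
int copies_for_one_call() {
    Tracker::copies = 0;
    Tracker result = make_tracker(42);
    (void)result;
    return Tracker::copies;  // 0 when the copies are elided, more otherwise
}
```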

The sterling example is type-based alias analysis (which most people 
know of as -fstrict-aliasing in GCC). C++ has rules which state that 
sometimes pointers can't alias depending on their types. The LLVM type 
system is not the C++ type system so we can't just apply those rules, 
and Clang doesn't have the sort of low-level optimizations that would 
benefit from this information. The eventual plan is to make clang tag 
loads and stores with metadata indicating what 'aliasing group' they 
belong to, then provide an alias analysis pass in LLVM which uses that 
information.
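
A sketch of the kind of code these rules affect. Under the C++ aliasing rules an `int*` and a `float*` cannot refer to the same object, so an optimizer with access to that information may assume the store through `f` leaves `*i` untouched and return the constant directly. The function name is hypothetical, and the example is only meant to be called with distinct objects, which is well defined either way.

```cpp
// With type-based alias information the compiler may treat `*i` and `*f`
// as non-overlapping and fold the final load of `*i` into the constant 1.
// Without it, `*i` must be reloaded after the store through `f`.
int store_then_reload(int* i, float* f) {
    *i = 1;
    *f = 2.0f;
    return *i;
}
```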

Nick

Martin C. Martin

2010-Jul-18 16:42 UTC

On 7/17/2010 12:38 PM, Nick Lewycky wrote:
> Martin C. Martin wrote:
>> On 7/16/2010 10:30 PM, Nick Lewycky wrote:
>>> Vlad wrote:
>>>
>>> Instead, break the chunks of C you would generate into functions and
>>> compile those ahead-of-time. At run time you use llvm only (no clang) to
>>> generate a series of function calls into those functions.
>>
>> Compelling. I hadn't considered that.
>>
>> In our application, we have a tree of primitive operations, where each
>> one calls into its children and returns to its parent. There are various
>> types of nodes, and we don't know the topology or types of nodes until
>> runtime (i.e. Clang/LLVM invocation time). Each operation is pretty
>> simple, so we'd like to inline the children's operations into the parent
>> where a C compiler would do it.
>>
>> Could your technique be extended to that? e.g. precompile to LLVM IR
>> with calls to a non-existent "node_do_foo()" call, and then replace it
>> with the specific "childtype_do_foo()" call when we know the type of the
>> child?
>
> Will you know the prototype of the function call in advance? If so, you
> can do something really simple where you write the C functions with a
> function pointer parameter. Then at run-time, use
> llvm::CloneAndPruneFunctionInto to produce the specialized function by
> giving it a valuemap that maps the Argument* for the fptr to the
> concrete Function* you want it to call.
Great!  I wasn't aware of that, so that's really helpful.
> If you don't know the type of the call you intend to place, the question
> becomes "why not?" What arguments were you planning to pass it, if you
> don't know how many arguments it takes in advance? I don't see any
> reason to doubt that it's possible to do, but I would need more details
> before I could suggest an implementation.
We're processing large amounts of data, so imagine something like a SQL 
implementation.  In one query I might want to JOIN on an int field.  In 
another query, I'm JOINing on a pair of fields of type unsigned & 
double.  For those two queries, I'd generate:

rightChildType_seekTo(rightChild, left.getIntColumn3());

vs.

rightChildType_seekTo(rightChild, left.getUIntColumn9(), 
left.getDoubleColumn23());

Is that possible in LLVM?

Best,
Martin
