thr3ads.net - llvm dev - [LLVMdev] Some additions to the C bindings [Oct 2009]

Erick Tryzelaar

2009-Oct-08 07:39 UTC

[LLVMdev] Some additions to the C bindings

On Tue, Oct 6, 2009 at 5:47 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
> On Tue, Oct 6, 2009 at 2:13 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
>
> LLVMGetAttribute had a bug in it.  Here's the revised version of the patch

Hi Kenneth!

I wouldn't say that I'm the best reviewer, but I've been doing some
work with the c bindings recently so hopefully I have some idea of
what I'm talking about :) Comments are inlined:

+/** See the llvm::Use class. */
+typedef struct LLVMOpaqueUse *LLVMUseRef;
+
...
+void LLVMReplaceAllUsesWith(LLVMValueRef OldVal, LLVMValueRef NewVal);
...
+/* Operations on Uses */
+LLVMUseRef LLVMGetFirstUse(LLVMValueRef Val);
+LLVMUseRef LLVMGetNextUse(LLVMUseRef U);
+LLVMValueRef LLVMGetUser(LLVMUseRef U);
+LLVMValueRef LLVMGetUsedValue(LLVMUseRef U);

These seem okay to me, but I don't have too much experience with using
the Use classes. The impression I've gotten from the other developers
is that the C bindings are really designed to just get data into llvm,
and any complex manipulations should really be done in C++ passes.
What's your use case for exposing these more complex manipulations?

+/* Operations on Users */
+LLVMValueRef LLVMGetOperand(LLVMValueRef Val, unsigned Index);

So how are you using this, since you aren't exposing any of the other
operand functionality?

+unsigned long long LLVMConstIntGetZExtValue(LLVMValueRef ConstantVal);
+long long LLVMConstIntGetSExtValue(LLVMValueRef ConstantVal);

I'm not sure about these functions. There isn't currently a way to get
at the value of any other kind of constant, so why do you need this?

 /* Operations on composite constants */
@@ -464,6 +479,7 @@
 LLVMValueRef LLVMConstVector(LLVMValueRef *ScalarConstantVals, unsigned Size);

 /* Constant expressions */
+unsigned LLVMGetConstOpcode(LLVMValueRef ConstantVal);

This seems okay to me, but there really should be an LLVMInstruction
enum defined instead of a raw unsigned value. Could you also add an
LLVMConstExpr that wraps ConstantExpr::get?

+int LLVMHasInitializer(LLVMValueRef GlobalVar);

Seems fine to me. I can commit this now.

+LLVMAttribute LLVMGetFunctionAttr(LLVMValueRef Fn);
+LLVMAttribute LLVMGetAttribute(LLVMValueRef Arg);

I've never really done much with attributes. What are you using this for?

Kenneth Uildriks

2009-Oct-08 12:20 UTC

[LLVMdev] Some additions to the C bindings

On Thu, Oct 8, 2009 at 2:39 AM, Erick Tryzelaar <idadesub at users.sourceforge.net> wrote:
> On Tue, Oct 6, 2009 at 5:47 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
>> On Tue, Oct 6, 2009 at 2:13 PM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
>>
>> LLVMGetAttribute had a bug in it.  Here's the revised version of the patch
>
> Hi Kenneth!
>
> I wouldn't say that I'm the best reviewer, but I've been doing some
> work with the c bindings recently so hopefully I have some idea of
> what I'm talking about :) Comments are inlined:

Thanks.  Let me start by talking a bit about my project.

I'm working on a compiler/language that supports run-time code
generation and compile-time code execution.  Besides the obvious
benefits of easier JITting, I also get the benefits of C++ templates
and metaprogramming without all of the headaches.

To make this work, the compiler actually compiles functions down into
function generators, outputting calls to the LLVM C-bindings that
generate a "regular" function.  The programmer can then either leave
them in that form for run-time JITting, or have the compiler JIT and
execute those function generators in order to get "regular" functions.
Either or both can be exposed as public functions and left in place
by the optimizer.  The function generator gets its own set of
parameters, and multiple functions with variations can be generated at
compile time or runtime.

The programmer can also put compile-time expressions inside the body of
functions, so that when the function generator runs, the compile-time
expressions are evaluated and used for function generation.  Those
compile-time expressions can use global variables and/or the function
generator parameters.

Anyway, this scheme means that extensive LLVM capability needs to be
available to generated code, since it's the generated code that
creates all of the "regular" functions.  Generated code has a much
easier time calling the C bindings than the C++ API.
>
>
> +/** See the llvm::Use class. */
> +typedef struct LLVMOpaqueUse *LLVMUseRef;
> +
> ...
> +void LLVMReplaceAllUsesWith(LLVMValueRef OldVal, LLVMValueRef NewVal);
> ...
> +/* Operations on Uses */
> +LLVMUseRef LLVMGetFirstUse(LLVMValueRef Val);
> +LLVMUseRef LLVMGetNextUse(LLVMUseRef U);
> +LLVMValueRef LLVMGetUser(LLVMUseRef U);
> +LLVMValueRef LLVMGetUsedValue(LLVMUseRef U);
>
>
> These seem okay to me, but I don't have too much experience with using
> the Use classes. The impression I've gotten from the other developers
> is that the C bindings is really designed to just get data into llvm,
> and any complex manipulations should really be done in C++ passes.
> What's your use case for exposing these more complex manipulations?
I'm using it to support renaming functions and still allowing
generated code to look up those functions by name; basically searching
for all global strings containing the function name, and replacing all
uses of them with uses of the new function name.

I would like to do away with that, but I haven't quite managed
to get rid of all cases where LLVMGetNamedFunction is called by
generated code.

Also, I've gotten the impression from other developers that the
C-bindings are considered incomplete and that there is a general
desire to expose more functionality, and eventually all LLVM
functionality, through them.
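
To make the use-replacement idea concrete, here is a minimal sketch in plain C. It does not call LLVM at all: ToyValue, ToyUse and the toy_* functions are hypothetical stand-ins for llvm::Value, llvm::Use and the LLVMGetFirstUse / LLVMGetNextUse / LLVMReplaceAllUsesWith entry points from the patch, and the real use lists are laid out differently. It only shows the shape of the walk: each value heads a list of its uses, and replacing all uses moves every list node from the old value to the new one.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for llvm::Value and llvm::Use; all names are hypothetical. */
typedef struct ToyUse ToyUse;

typedef struct ToyValue {
    const char *name;
    ToyUse *first_use;      /* head of the list of uses of this value */
} ToyValue;

struct ToyUse {
    ToyValue *used;         /* the value being referenced */
    ToyValue *user;         /* the value that holds the reference */
    ToyUse *next;           /* next use of the same value */
};

/* Record that `user` references `used` through the slot `u`. */
void toy_add_use(ToyUse *u, ToyValue *user, ToyValue *used) {
    assert(u != NULL && user != NULL && used != NULL);
    u->used = used;
    u->user = user;
    u->next = used->first_use;
    used->first_use = u;
}

/* Count the uses of a value by walking its use list. */
unsigned toy_count_uses(const ToyValue *v) {
    unsigned n = 0;
    for (const ToyUse *u = v->first_use; u != NULL; u = u->next)
        n++;
    return n;
}

/* Move every use of `old_val` over to `new_val`. */
void toy_replace_all_uses_with(ToyValue *old_val, ToyValue *new_val) {
    assert(old_val != new_val);
    while (old_val->first_use != NULL) {
        ToyUse *u = old_val->first_use;
        old_val->first_use = u->next;   /* unlink from the old list */
        u->used = new_val;
        u->next = new_val->first_use;   /* push onto the new list */
        new_val->first_use = u;
    }
}
```

In the renaming case described above, old_val would play the role of the global string holding the old function name and new_val the string holding the new one.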
>
>
> +/* Operations on Users */
> +LLVMValueRef LLVMGetOperand(LLVMValueRef Val, unsigned Index);
>
>
> So how are you using this, since you aren't exposing any of the other
> operand functionality?
This supports the "address-of" operator.  Any Value that is a LoadInst
can have its address taken.  I need the pointer operand of the
LoadInst to get the address Value.

I figured GetOperand was a good starting point, and could support most
of the operand use cases out there.
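
As a rough illustration of the address-of case, here is a toy model in plain C. ToyInst, ToyOpcode and the toy_* functions are hypothetical and are not llvm-c types; the only LLVM fact assumed is that a load's pointer is its operand 0. The point is just that, given a value produced by a load, an operand accessor is enough to recover the address it was loaded from.

```c
#include <assert.h>
#include <stddef.h>

/* Toy instruction model; the names are hypothetical, not llvm-c types. */
typedef enum { TOY_ALLOCA, TOY_LOAD, TOY_ADD } ToyOpcode;

typedef struct ToyInst {
    ToyOpcode opcode;
    struct ToyInst *operands[2];
    unsigned num_operands;
} ToyInst;

/* Analogue of LLVMGetOperand: fetch operand `index` of `inst`. */
ToyInst *toy_get_operand(const ToyInst *inst, unsigned index) {
    assert(inst != NULL && index < inst->num_operands);
    return inst->operands[index];
}

/* "Address-of": if the value came from a load, its address is the load's
   pointer operand (operand 0). Anything else has no address here, so the
   function returns NULL. */
ToyInst *toy_address_of(const ToyInst *value) {
    if (value == NULL || value->opcode != TOY_LOAD)
        return NULL;
    return toy_get_operand(value, 0);
}
```

A front end that only keeps the loaded value on its evaluation stack can use this kind of lookup instead of tracking the address separately.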
>
>
> +unsigned long long LLVMConstIntGetZExtValue(LLVMValueRef ConstantVal);
> +long long LLVMConstIntGetSExtValue(LLVMValueRef ConstantVal);
>
>
> I'm not sure about these functions. There really isn't any other way
> to get to the value of any other constant, so why do you need this?
When I've parsed an int literal and put it on my evaluation stack as a
Value, there's a case where I need to get it back as an int.
Specifically, the LLVMBuildExtractValue function requires an int, not
a Constant, to represent the member.  I believe that GEP does as well
when applied to a struct.
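
For anyone unsure why there are two getters, here is a small sketch in plain C of the difference between the zero-extended and sign-extended readings of the same bits. The toy_* helpers are hypothetical and not part of llvm-c; they take the raw bits and the integer width explicitly, whereas the proposed functions would read both from the constant. The signed conversion assumes the usual two's-complement behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not part of llvm-c: interpret the low `width` bits
   of `bits` as an unsigned (zero-extended) or signed (sign-extended) value. */
uint64_t toy_zext_value(uint64_t bits, unsigned width) {
    assert(width >= 1 && width <= 64);
    if (width == 64)
        return bits;
    return bits & ((UINT64_C(1) << width) - 1);
}

int64_t toy_sext_value(uint64_t bits, unsigned width) {
    uint64_t low = toy_zext_value(bits, width);
    uint64_t sign = UINT64_C(1) << (width - 1);
    /* (low ^ sign) - sign copies the sign bit into all the upper bits. */
    return (int64_t)((low ^ sign) - sign);
}
```

An i8 with all bits set reads as 255 through the first helper and as -1 through the second, which is why a struct member index would normally go through the zero-extended form.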
>
>
>  /* Operations on composite constants */
> @@ -464,6 +479,7 @@
>  LLVMValueRef LLVMConstVector(LLVMValueRef *ScalarConstantVals, unsigned Size);
>
>  /* Constant expressions */
> +unsigned LLVMGetConstOpcode(LLVMValueRef ConstantVal);
>
>
> This seems okay with me, but there really should be an LLVMInstruction
> enum defined instead of a raw unsigned value. Could you also add a
> LLVMConstExpr that wraps ConstantExpr::get?
That shouldn't be a problem.
>
>
> +int LLVMHasInitializer(LLVMValueRef GlobalVar);
>
>
> Seems fine to me. I can commit this now.
>
>
> +LLVMAttribute LLVMGetFunctionAttr(LLVMValueRef Fn);
> +LLVMAttribute LLVMGetAttribute(LLVMValueRef Arg);
>
>
> I've never really done much with attributes. What are you using this for?
>
In order to do away with include files, I'm supporting importing
modules in bitcode form.  To call a function from an imported module,
I need to put an external into the compiled module, and it really
ought to have the same function and argument attributes as the
original.  And I want to be able to do that while JITting at runtime
as well.

Erick Tryzelaar

2009-Oct-09 23:56 UTC

[LLVMdev] Some additions to the C bindings

On Thu, Oct 8, 2009 at 5:20 AM, Kenneth Uildriks <kennethuil at gmail.com> wrote:
>
> Thanks.  Let me start by talking a bit about my project.
>
> I'm working on a compiler/language that supports run-time code
> generation and compile-time code execution.  Besides the obvious
> benefits of easier JITting, I also get the benefits of C++ templates
> and metaprogramming without all of the headaches.
>
> To make this work, the compiler actually compiles functions down into
> function generators, outputting calls to the LLVM C-bindings that
> generate a "regular" function.  The programmer can then either leave
> them in that form for run-time JITting, or have the compiler JIT and
> execute those function generators in order to get "regular" functions.
> Either or both can be exposed as public functions and left in place
> by the optimizer.  The function generator gets its own set of
> parameters, and multiple functions with variations can be generated at
> compile time or runtime.
>
> He can also put compile-time expressions inside the body of functions,
> so that when the function generator runs, the compile-time expressions
> are evaluated and used for function generation.  Those compile-time
> expressions can use global variables and/or the function generator
> parameters.
>
> Anyway, this scheme means that extensive LLVM capability needs to be
> available to generated code, since it's the generated code that
> creates all of the "regular" functions.  Generated code has a much
> easier time calling the C bindings than the C++ API.

You're already doing something a bit more complicated than me :) This
does seem a bit more advanced than what llvm-c is intended for,
though. Is there a reason why you can't make a C++ library to do all
this advanced stuff, and just expose some C hooks for your generated
code?

> I'm using it to support renaming functions and still allowing
> generated code to look up those functions by name; basically searching
> for all global strings containing the function name, and replacing all
> uses of them with uses of the new function name.
>
> I would like to do away with that, though, but I haven't quite managed
> to get rid of all cases where LLVMGetNamedFunction is called by
> generated code.
>
> Also, I've gotten the impression from other developers that the
> C-bindings are considered incomplete and that there is a general
> desire to expose more functionality, and eventually all LLVM
> functionality, through them.

While it's lacking in some areas, it's intentional that not all of
llvm is exposed through llvm-c. I learned that after my patches to
expose APInt/APFloat were turned down :) LLVM is a large object-oriented
project, and maintaining a mapping between the C and C++ APIs
would be pretty challenging, especially since llvm promises to never
remove anything from llvm-c until 3.0. To ease development, llvm-c is
really designed to provide just the minimum interface for getting
data into llvm. If you want to do something advanced like modify the
bitcode, you really should be writing against the C++ API.

> This supports the "address-of" operator.  Any Value that is a LoadInst
> can have its address taken.  I need the pointer operand of the
> LoadInst to get the address Value.
>
> I figured GetOperand was a good starting point, and could support most
> of the operand use cases out there.

I'm not sure I understand. The load instruction takes an address as
an argument and puts the value into a register, so you must
already have the address. Or am I misinterpreting what you're
saying?

> When I've parsed an int literal and put it on my evaluation stack as a
> Value, there's a case where I need to get it back as an int.
> Specifically, the LLVMBuildExtractValue function requires an int, not
> a Constant, to represent the member.  I believe that GEP does as well
> when applied to a struct.

GEP doesn't need a constant index to work, at least when it indexes an
array (struct field indices do have to be constants):

%0 = alloca [2 x i32]
%1 = alloca i32
store i32 0, i32* %1
%2 = load i32* %1
%3 = getelementptr [2 x i32]* %0, i32 0, i32 %2
%4 = load i32* %3

extractvalue should only be used if you're using value arrays or
structs, and you need to statically know the indexes. If you don't,
then you really should be using GEPs and let the optimizations do
their thing.

> In order to do away with include files, I'm supporting importing
> modules in bitcode form.  To call a function from an imported module,
> I need to put an external into the compiled module, and it really
> ought to have the same function and argument attributes as the
> original.  And I want to be able to do that while JITting at runtime
> as well.

If I understand correctly, why aren't the functions already marked
external? If they aren't then an optimizer could theoretically
optimize them away. It may also be more appropriate to pass the
function information through some different channel by the frontend,
rather than directly processing the bitcode. Anyone else have any
experience with doing this?
