Hi all,

I'm authoring a C interface to the LLVM IR type system. Since this is Really Quite Tedious, I would like to solicit opinions before I get too far down any paths that seem offensive. I've attached the header, where I've mapped a portion of Module and most of Type and its subclasses. This is working, and I've built ocaml bindings on top of it.[1] My intent is to extend this work (only) far enough to author a language front-end. The C bindings should help other languages which want to have self-hosting front-ends, and probably a C interface to the JIT would be well-received.

My naming conventions are similar to the Carbon interfaces in OS X. (Should I prefer a Unixy flavor instead?) Naming prefix is LLVM, which may be a bit long. (Would LL be better?) Pointers are opaque, obviously. I find myself copying enums, which is mildly scary.

I'm using C strings instead of const char*, size_t tuples. This avoids having to write things like "tmp", strlen("tmp") in C, and is well-supported for language bindings. Nevertheless, most languages other than C have binary-safe string types, so I'm certainly willing to have my mind changed if we want to prefer correctness over inconvenience to the C programmer. (Providing overloads is silly, though.)

I'm putting the headers in include/llvm-c. I created a new library called Interop to house the C bindings, but it might make more sense to implement the C bindings in each library instead. They're just glue which the linker will trivially DCE, so that approach may have merit.
-- Gordon

[1]
$ cat emit_bc.ml
    open Llvm

    let emit_bc filename =
      let m = create_module filename in

      let big_fn_ty = make_pointer_type
        (make_function_type (void_type ())
          [| make_vector_type (float_type ()) 4;
             make_pointer_type
               (make_struct_type [| double_type ();
                                    x86fp80_type ();
                                    fp128_type ();
                                    ppc_fp128_type ()
                                  |] true);
             make_pointer_type
               (make_struct_type [| make_integer_type 1;
                                    make_integer_type 3;
                                    i8_type ();
                                    i32_type () |] false);
             make_pointer_type
               (make_array_type (make_opaque_type ()) 4) |]
          false) in

      (* string_of_lltype is implemented in ocaml, so the info on stdout
         shows that make_*_type isn't a write-once/read-never interface. *)
      print_endline ("big_fn_ty = " ^ (string_of_lltype big_fn_ty));

      ignore(add_type_name m "big_fn_ty" big_fn_ty);

      if not (write_bitcode_file m filename)
      then print_endline ("write failed: " ^ filename);

      dispose_module m

    let _ =
      if 2 = Array.length Sys.argv
      then emit_bc Sys.argv.(1)
      else print_endline "Usage: emit_bc FILE"

$ make emit_bc
ocamlc -cc g++ -I ../llvm/Release/lib/ocaml llvm_ml.cma -o emit_bc emit_bc.ml
$ ./emit_bc test.bc
big_fn_ty = void (< 4 x float >, { double, x86fp80, fp128, ppc_fp128 } *, { i1, i3, i8, i32 }*, [ 4 x opaque ]*)*
$ llvm-dis -o - test.bc
; ModuleID = 'test.bc'
%big_fn_ty = type void (<4 x float>, <{ double, x86_fp80, fp128, ppc_fp128 }>*, { i1, i3, i8, i32 }*, [4 x opaque]*)*

-------------- next part --------------
A non-text attachment was scrubbed...
Name: VMCore.h
Type: application/octet-stream
Size: 5377 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070912/6384e755/attachment.obj>
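To make the "opaque pointers plus C strings" design concrete, here is a minimal sketch in C. It is not taken from the attached VMCore.h: every Toy* name is hypothetical, and the module is a stand-in struct rather than a real LLVM Module. It only illustrates how a header can expose a reference type without revealing its layout, and how a NUL-terminated name keeps call sites short.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Public side: roughly what a header in include/llvm-c could expose.
   All Toy* names are hypothetical and exist only for this sketch. */
typedef struct ToyOpaqueModule *ToyModuleRef;

ToyModuleRef ToyCreateModule(const char *Name);
const char *ToyGetModuleName(ToyModuleRef M);
void ToyDisposeModule(ToyModuleRef M);

/* Private side: only the implementation file sees the layout. */
struct ToyOpaqueModule {
  char *Name; /* owned, NUL-terminated copy of the caller's string */
};

/* Takes a plain C string, so callers write ToyCreateModule("tmp")
   rather than passing a pointer and strlen("tmp"). */
ToyModuleRef ToyCreateModule(const char *Name) {
  assert(Name != NULL);
  ToyModuleRef M = malloc(sizeof *M);
  if (M == NULL)
    return NULL;
  size_t Len = strlen(Name);
  M->Name = malloc(Len + 1);
  if (M->Name == NULL) {
    free(M);
    return NULL;
  }
  memcpy(M->Name, Name, Len + 1); /* copies the terminator too */
  return M;
}

const char *ToyGetModuleName(ToyModuleRef M) {
  assert(M != NULL);
  return M->Name;
}

void ToyDisposeModule(ToyModuleRef M) {
  if (M == NULL)
    return;
  free(M->Name);
  free(M);
}
```

A caller would create a module with ToyCreateModule("test.bc"), read the name back with ToyGetModuleName, and release it with ToyDisposeModule. The trade-off named in the message still applies: a name containing an embedded '\0' cannot be passed through this interface.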
On Sep 11, 2007, at 10:01 PM, Gordon Henriksen wrote:

> Hi all,
>
> I'm authoring a C interface to the LLVM IR type system. Since this
> is Really Quite Tedious, I would like to solicit opinions before I
> get too far down any paths that seem offensive.

Sounds good.

> I've attached the header, where I've mapped a portion of Module and
> most of Type and its subclasses. This is working, and I've built
> ocaml bindings on top of it.[1]

Oooh, look at the long doubles ;-)

> My intent is to extend this work (only) far enough to author a
> language front-end. The C bindings should help other languages
> which want to have self-hosting front-ends, and probably a C
> interface to the JIT would be well-received.

Sounds good, it seems like anyone who wants more can extend it on demand :)

> My naming conventions are similar to the Carbon interfaces in OS X.
> (Should I prefer a Unixy flavor instead?) Naming prefix is LLVM,
> which may be a bit long. (Would LL be better?)

LLVM seems fine to me, and the naming convention seems ok (using lowercase + underscores makes the name longer). I do find things like this slightly strange:

    /* Same as Module::addTypeName. */
    int AddTypeNameToModule(LLVMModuleRef M, const char *Name,
                            LLVMTypeRef Ty);

I'd expect it to be named something like "LLVMModuleAddTypeName" or something, using NamespaceClassMethod uniformly.

> Pointers are opaque, obviously. I find myself copying enums, which
> is mildly scary.

Copying the enums does seem scary. Is there any way around this? Is LLVMTypeKind that useful?

> I'm using C strings instead of const char*, size_t tuples. This
> avoids having to write things like "tmp", strlen("tmp") in C, and
> is well-supported for language bindings. Nevertheless, most
> languages other than C have binary-safe string types, so I'm
> certainly willing to have my mind changed if we want to prefer
> correctness over inconvenience to the C programmer. (Providing
> overloads is silly, though.)

I think this makes sense. In order to support arbitrary strings, you could have a:

    void LLVMValueSetName(LLVMValueRef, const char *, unsigned len);

... function that works with arbitrary strings.

> I'm putting the headers in include/llvm-c. I created a new library
> called Interop to house the C bindings, but it might make more sense
> to implement the C bindings in each library instead. They're just
> glue which the linker will trivially DCE, so that approach may have
> merit.

Nice! You'll make a lot of friends with this :). Adding the bindings to the libraries in question makes sense.

-Chris
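The pointer-plus-length setter suggested above could look roughly like the following. This is a sketch under assumptions: the Toy* names and the struct are invented for illustration and are not the proposed LLVMValueSetName, and size_t is used for the length where the message shows unsigned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical opaque value with a binary-safe name. */
typedef struct ToyOpaqueValue *ToyValueRef;

struct ToyOpaqueValue {
  char *Name;     /* owned copy; may contain embedded '\0' bytes */
  size_t NameLen; /* byte count, not counting the extra terminator */
};

ToyValueRef ToyCreateValue(void) {
  return calloc(1, sizeof(struct ToyOpaqueValue));
}

/* Stores exactly Len bytes, so the name is not cut at the first '\0'.
   Returns 0 on success and -1 if allocation fails. */
int ToySetValueName(ToyValueRef V, const char *Name, size_t Len) {
  assert(V != NULL && Name != NULL);
  char *Copy = malloc(Len + 1);
  if (Copy == NULL)
    return -1;
  memcpy(Copy, Name, Len);
  Copy[Len] = '\0'; /* convenience terminator for C callers */
  free(V->Name);
  V->Name = Copy;
  V->NameLen = Len;
  return 0;
}

/* Returns the stored bytes and writes their count to *Len. */
const char *ToyGetValueName(ToyValueRef V, size_t *Len) {
  assert(V != NULL && Len != NULL);
  *Len = V->NameLen;
  return V->Name;
}

void ToyDisposeValue(ToyValueRef V) {
  if (V == NULL)
    return;
  free(V->Name);
  free(V);
}
```

A binding for a language with binary-safe strings would pass its own buffer and length directly, while a C caller with an ordinary string would pass strlen(Name) as the length.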
On 2007-09-12, at 18:34, Chris Lattner wrote:

> On Sep 11, 2007, at 10:01 PM, Gordon Henriksen wrote:
>
>> I've attached the header, where I've mapped a portion of Module
>> and most of Type and its subclasses. This is working, and I've
>> built ocaml bindings on top of it.[1]
>
> Oooh, look at the long doubles ;-)

Oh, that's what this comment means:

    PackedStructTyID,///< 10: Packed Structure. This is for bitcode only

Maybe we can hide that better? ocaml mappings really (really really) like sequential enums with no dead values. Oh well. I'll remap the values.

>> My naming conventions are similar to the Carbon interfaces in OS
>> X. (Should I prefer a Unixy flavor instead?) Naming prefix is
>> LLVM, which may be a bit long. (Would LL be better?)
>
> LLVM seems fine to me, and the naming convention seems ok (using
> lowercase + underscores makes the name longer). I do find things
> like this slightly strange:
>
> /* Same as Module::addTypeName. */
> int AddTypeNameToModule(LLVMModuleRef M, const char *Name,
>                         LLVMTypeRef Ty);
>
> I'd expect it to be named something like "LLVMModuleAddTypeName" or
> something, using NamespaceClassMethod uniformly.

I tried that at first; I do like to do "method completion for C" in Xcode by typing something like LLVMModule esc. Unfortunately, the names got bizarre and unreadable. I can go back to that, but it wasn't "doing it" for me.

>> Pointers are opaque, obviously. I find myself copying enums, which
>> is mildly scary.
>
> Copying the enums does seems scary. Is LLVMTypeKind that useful?

Uhm. Just a little bit important? :) I'll need to do the same thing with instruction kinds, too.

> Is there any way around this?

Well, we could move the enums into the C interfaces and include the C interfaces from the C++ code. That moves the values to the global namespace, though; neither Type::FooTyID nor llvm::Type::FooTyID would be valid. The types themselves can be typedef'd back where they belong.

>> I'm using C strings instead of const char*, size_t tuples. This
>> avoids having to write things like "tmp", strlen("tmp") in C, and
>> is well-supported for language bindings. Nevertheless, most
>> languages other than C have binary-safe string types, so I'm
>> certainly willing to have my mind changed if we want to prefer
>> correctness over inconvenience to the C programmer. (Providing
>> overloads is silly, though.)
>
> I think this makes sense. In order to support arbitrary strings,
> you could have a:
>
> void LLVMValueSetName(LLVMValueRef, const char *, unsigned len);
>
> ... function that works with arbitrary strings.

That's true for this case, but I'm not sure there's always a backdoor like that available. For some things it doesn't matter, of course; valid filenames can't contain '\0' on anything notable except Mac OS X, for one. I guess it's a case-by-case decision. While I'm sure someone, somewhere, would appreciate consistency, that person is not me.

>> I'm putting the headers in include/llvm-c. I created a new library
>> called Interop to house the C bindings, but it might make more
>> sense to implement the C bindings in each library instead. They're
>> just glue which the linker will trivially DCE, so that approach
>> may have merit.
>
> Nice! You'll make a lot of friends with this :)

:) Of course, you realize they won't be happy until they don't have to link using g++...

> adding the bindings to the libraries in question make sense.

I'll do that.

-- Gordon
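The "I'll remap the values" idea can be sketched as a small translation function. Everything here is an assumption made for illustration: the internal enum is a shortened stand-in with invented numbering (the real PackedStructTyID is 10 per the comment quoted above), and folding the packed case into the struct kind is just one possible choice.

```c
#include <assert.h>

/* Stand-in for an internal enum that contains a value the public
   interface should not expose. Names and numbers are hypothetical. */
enum InternalTypeID {
  InternalVoidTyID = 0,
  InternalFloatTyID = 1,
  InternalStructTyID = 2,
  InternalPackedStructTyID = 3, /* serialization-only, kept hidden */
  InternalPointerTyID = 4
};

/* Public enum: dense and sequential, with no dead values, which is
   what a variant-style mapping in another language wants. */
typedef enum {
  ToyVoidTypeKind,
  ToyFloatTypeKind,
  ToyStructTypeKind,
  ToyPointerTypeKind
} ToyTypeKind;

/* Translates an internal ID to the public kind. Listing every case
   lets compilers that warn on unhandled enumerators flag a newly
   added internal value. */
ToyTypeKind ToyRemapTypeKind(enum InternalTypeID ID) {
  switch (ID) {
  case InternalVoidTyID:
    return ToyVoidTypeKind;
  case InternalFloatTyID:
    return ToyFloatTypeKind;
  case InternalStructTyID:
  case InternalPackedStructTyID: /* packedness is not a separate kind here */
    return ToyStructTypeKind;
  case InternalPointerTyID:
    return ToyPointerTypeKind;
  }
  assert(0 && "unknown internal type ID");
  return ToyVoidTypeKind;
}
```

The cost of this approach is the one already called scary in the thread: the public enum is still a hand-maintained copy, so the switch has to be updated whenever the internal enum changes.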
On 2007-09-12, at 18:34, Chris Lattner wrote:

> /* Same as Module::addTypeName. */
> int AddTypeNameToModule(LLVMModuleRef M, const char *Name,
>                         LLVMTypeRef Ty);
>
> I'd expect it to be named something like "LLVMModuleAddTypeName" or
> something, using NamespaceClassMethod uniformly.

This obviously should've had its prefix, at the very least. That was just an oversight.

-- Gordon
Hello Gordon,

I'm part of the felix dev team, and I've been interested in making a backend for felix in llvm. It's very exciting to hear that you're making an ocaml interface to llvm. Do you have any of the libraries exposed to the public yet?

Also, what license do you plan on using for the code? Felix is bsd, like llvm, so if there's any chance that you'll use a bsd-compatible license, we'd be very thankful.

-e
Hi Erick,

On 2007-09-13, at 23:35, Erick Tryzelaar wrote:

> Gordon Henriksen wrote:
>
>> I'm authoring a C interface to the LLVM IR type system. I've
>> attached the header, where I've mapped a portion of Module and
>> most of Type and its subclasses. This is working, and I've built
>> ocaml bindings on top of it.
>
> I'm part of the felix dev team, and I've been interested in making
> a backend for felix in llvm. It's very exciting to hear that you're
> making an ocaml interface to llvm.

:)

> Do you have any of the libraries exposed to the public yet?

I've not published it since I haven't yet wrapped enough of the API to do anything useful. The snippet included every function I had mapped at the time I sent the message! Stay tuned.

> Also, what license do you plan on using for the code? Felix is bsd,
> like llvm, so if there's any chance that you'll use a
> bsd-compatible license, we'd be very thankful.

My intent is to contribute this work to the LLVM project, so you won't have any licensing problems. In fact, I've simply integrated the ocaml bindings into LLVM's source tree so that they are built and installed if configure can find ocamlc.

-- Gordon
On Sep 12, 2007, at 01:01, Gordon Henriksen wrote:

> I'm authoring a C interface to the LLVM IR type system. Since this
> is Really Quite Tedious, I would like to solicit opinions before I
> get too far down any paths that seem offensive. I've attached the
> header, where I've mapped a portion of Module and most of Type and
> its subclasses. This is working, and I've built ocaml bindings on
> top of it.

Now with constants and global variables. Functions and basic blocks next, then on to LLVMBuilder.

-- Gordon

//===-- c-bindings.patch (+730) -------------------------------===//

  include/llvm/CHelpers.h (+94)
  include/llvm-c/BitWriter.h (+42)
  include/llvm-c/Core.h (+221)
  lib/Bitcode/Writer/BitWriter.cpp (+51)
  lib/VMCore/Core.cpp (+322)

Tedious C bindings for libLLVMCore.a and libLLVMBitWriter.a!

 - The naming prefix is LLVM.

 - All types are represented using opaque references.

 - Functions are not named LLVM{Type}{Method}; the names became
   unreadable goop. Instead, they are named LLVM{ImperativeSentence}.

 - Where an attribute only appears once in the class hierarchy (e.g.,
   linkage only applies to values; parameter types only apply to
   function types), the class is omitted from identifiers for
   brevity. Tastes like methods.

 - Strings are C strings or string/length tuples on a case-by-case
   basis.

 - APIs which give the caller ownership of an object are not mapped
   (removeFromParent, certain constructor overloads). This keeps
   memory management as simple as possible.

For each library with bindings:

  llvm-c/<LIB>.h      - Declares the bindings.
  lib/<LIB>/<LIB>.cpp - Implements the bindings.

So just link with the library of your choice and use the C header instead of the C++ one.

This patch is independent.

//===-- ocaml-make.patch (+380 -27) ---------------------------===//

  configure (+111 -25)
  Makefile.config.in (+2)
  bindings/ocaml/Makefile.ocaml (+263)
  Makefile (+2 -2)
  autoconf/configure.ac (+2)

I add a generic ocaml Makefile which will be used by the ocaml language bindings. configure is taught how to detect ocamlc and ocamlopt.

This patch is independent.

//===-- ocaml-bindings.patch (+936) ---------------------------===//

  bindings/ocaml/llvm
  bindings/ocaml/llvm/llvm.ml (+226)
  bindings/ocaml/llvm/llvm_ocaml.c (+394)
  bindings/ocaml/llvm/llvm.mli (+168)
  bindings/ocaml/llvm/Makefile (+24)
  bindings/ocaml/bitwriter
  bindings/ocaml/bitwriter/llvm_bitwriter.mli (+18)
  bindings/ocaml/bitwriter/bitwriter_ocaml.c (+31)
  bindings/ocaml/bitwriter/llvm_bitwriter.ml (+18)
  bindings/ocaml/bitwriter/Makefile (+23)
  bindings/ocaml/Makefile (+13)
  bindings/README.txt (+3)
  bindings/Makefile (+18)

Adds ocaml language bindings to LLVM. They are built automatically if configure detects the ocamlc compiler and are installed to the ocaml standard library.

This patch depends on c-bindings and ocaml-make.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: c-bindings.patch
Type: application/octet-stream
Size: 27943 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070916/86097efa/attachment.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ocaml-bindings.patch
Type: application/octet-stream
Size: 39129 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070916/86097efa/attachment-0001.obj>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: ocaml-make.patch
Type: application/octet-stream
Size: 21981 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20070916/86097efa/attachment-0002.obj>
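The ownership rule in the c-bindings summary (no API hands an object's ownership to the caller) can be illustrated with a toy container in C. This is only a sketch under assumptions: the Toy* names and the linked-list layout are invented and do not come from the patch. The point is that children are created inside their parent and released only when the parent is disposed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical opaque references for a parent and its children. */
typedef struct ToyOpaqueModule *ToyModuleRef;
typedef struct ToyOpaqueGlobal *ToyGlobalRef;

struct ToyOpaqueGlobal {
  int Id;
  ToyGlobalRef Next; /* singly linked list owned by the module */
};

struct ToyOpaqueModule {
  ToyGlobalRef First;
  unsigned Count;
};

ToyModuleRef ToyCreateModule(void) {
  return calloc(1, sizeof(struct ToyOpaqueModule));
}

/* Creates a global inside M. The returned reference is borrowed:
   the module keeps ownership, and there is deliberately no function
   that detaches the global and hands it to the caller. */
ToyGlobalRef ToyAddGlobal(ToyModuleRef M, int Id) {
  assert(M != NULL);
  ToyGlobalRef G = malloc(sizeof *G);
  if (G == NULL)
    return NULL;
  G->Id = Id;
  G->Next = M->First;
  M->First = G;
  M->Count++;
  return G;
}

unsigned ToyCountGlobals(ToyModuleRef M) {
  assert(M != NULL);
  return M->Count;
}

/* Disposing the module releases every global it owns. */
void ToyDisposeModule(ToyModuleRef M) {
  if (M == NULL)
    return;
  ToyGlobalRef G = M->First;
  while (G != NULL) {
    ToyGlobalRef Next = G->Next;
    free(G);
    G = Next;
  }
  free(M);
}
```

With this shape, a language binding needs exactly one finalizer or dispose call per module and never has to track which child references it is responsible for freeing.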
Hi Gordon,

> > I'm authoring a C interface to the LLVM IR type system.

It's great to see a C interface being added. A minor niggle:

> +typedef enum {
> +  LLVMVoidTypeKind = 0,      /* type with no size */
> ...
> +typedef enum {
> +  LLVMExternalLinkage = 0,   /* Externally visible function */
> ...
> +typedef enum {
> +  LLVMDefaultVisibility = 0, /* The GV is visible */

The language defines the first enumerator to be zero, so the initialisation is redundant. As someone reading the source, it would make me halt and wonder why it's been done: "What am I missing?" It is similar to seeing a `static int foo = 0'.

Cheers, Ralph.
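The point about implicit enumerator values can be checked at compile time. The sketch below uses a hypothetical Toy* enum rather than the one in the patch, and relies on C11's _Static_assert.

```c
#include <assert.h>

/* No explicit initialisers: C gives the first enumerator the value 0
   and each following enumerator the previous value plus one. */
typedef enum {
  ToyVoidTypeKind,
  ToyFloatTypeKind,
  ToyDoubleTypeKind
} ToyTypeKind;

/* Compile-time checks (C11): the build fails if these do not hold. */
_Static_assert(ToyVoidTypeKind == 0, "first enumerator is zero");
_Static_assert(ToyFloatTypeKind == 1, "enumerators count up by one");
_Static_assert(ToyDoubleTypeKind == 2, "enumerators count up by one");

/* Small helper so the values can also be inspected at run time. */
int ToyTypeKindAsInt(ToyTypeKind K) {
  return (int)K;
}
```

An explicit "= 0" is therefore only documentation; whether that documentation helps or distracts is the style question raised in the message.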