Hongbo Zhang
2013-Apr-15 22:32 UTC
[LLVMdev] The most efficient way to compile to LLVM IR?
Hi all,

I am trying to compile my toy language to the LLVM back end. (I am new to LLVM, so my questions may sound naive.)

I have been looking at some LLVM tutorials. Most are about how to use the LLVM IRBuilder. However, I find the IRBuilder API quite imperative and verbose, and it changes so fast that most of the tutorials are out of date.

So I am wondering what the benefit is of emitting LLVM IR with IRBuilder, compared with designing my own abstract syntax in a high-level programming language (e.g. Haskell or OCaml) and unparsing it to LLVM IR. Is there some benefit of IRBuilder that I have overlooked?

I also have some follow-up questions:

1. How stable is the IR format?
2. Is the binary representation of the IR format (*.bc) stable and the same across different platforms?
3. Is there any previous work on building a declarative interface instead of using IRBuilder?

Thank you in advance!
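[Editor's sketch] To make the "own abstract syntax, then unparse" idea concrete, here is a minimal illustration in Python rather than Haskell or OCaml. The node types (`Const`, `Param`, `Add`, `Function`) and `emit_function` are hypothetical names invented for this example. It handles only `i32` additions and is not taken from any existing project.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical mini-AST: only i32 constants, parameters, and additions.
@dataclass
class Const:
    value: int


@dataclass
class Param:
    name: str


@dataclass
class Add:
    lhs: "Expr"
    rhs: "Expr"


Expr = Union[Const, Param, Add]


@dataclass
class Function:
    name: str
    params: List[str]
    body: Expr


def emit_function(fn: Function) -> str:
    """Unparse a Function into textual LLVM IR (i32 only)."""
    lines: List[str] = []
    counter = 0

    def emit(e: Expr) -> str:
        # Returns the operand text for `e`, appending instructions as needed.
        nonlocal counter
        if isinstance(e, Const):
            return str(e.value)
        if isinstance(e, Param):
            return f"%{e.name}"
        lhs = emit(e.lhs)
        rhs = emit(e.rhs)
        counter += 1
        tmp = f"%t{counter}"
        lines.append(f"  {tmp} = add i32 {lhs}, {rhs}")
        return tmp

    result = emit(fn.body)
    args = ", ".join(f"i32 %{p}" for p in fn.params)
    header = f"define i32 @{fn.name}({args}) {{"
    return "\n".join([header, "entry:", *lines, f"  ret i32 {result}", "}"])


# Build (x + y) + 3 and print the resulting IR text.
ir = emit_function(
    Function("add3", ["x", "y"], Add(Add(Param("x"), Param("y")), Const(3)))
)
print(ir)
```

The resulting text could be saved to a `.ll` file and handed to the LLVM tools. Every additional IR feature (other types, control flow, metadata) would need its own node type and printing rule.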
David Blaikie
2013-Apr-16 01:15 UTC
[LLVMdev] The most efficient way to compile to LLVM IR?
On Mon, Apr 15, 2013 at 3:32 PM, Hongbo Zhang <bobzhang1988 at gmail.com> wrote:
> Hi all,
> I am trying to compile my toy language to LLVM back end. (I am new to
> LLVM, so my questions may sound naive)
> I am looking at some tutorials about LLVM, most are about how to use LLVM
> IRBuilder, however, I find the API provided by IRBuilder is quite imperative
> and verbose, and the API changes so fast that most of the tutorials are out
> of date.
> So I am wondering what's the benefit of emitting LLVM IR using IRBuilder
> compared with designing my own abstract syntax in a high-level programming
> language (e.g. Haskell or OCaml) and unparsing it to LLVM IR.

To be clear, you're suggesting having your frontend (say, for argument's sake, written in C++) parse your toy language and then emit a (say) Haskell representation of IR? Using some Haskell APIs you'll write that will emit LLVM bitcode? And then running the resulting Haskell program to produce your bitcode, which you'll load back into LLVM to optimize/compile?

> Is there some
> benefit of using IRBuilder I ignored here?

It'll be more efficient to keep the IR in memory rather than to go out to a source file, run that file to produce bitcode, then load that bitcode into LLVM.

Also, I'm not sure I see quite how that scheme would be less verbose.

> And I have some follow-up
> questions: 1. How stable is the IR format?

There's built-in autoupgrade so that you can load old IR in newer versions of LLVM (any 3.* series should be compatible, I believe).

> 2. Is the binary representation
> of IR format (*.bc) stable and the same across different platforms?

It's the same format, but the actual bitcode isn't retargetable, as such (i.e., don't expect to be able to produce bitcode that you can compile for different architectures).

> 3. Is
> there any previous work of building a declarative interface instead of using
> IRBuilder?

Not that I know of. The frontends tend to use IRBuilder.

> Thank you in advance!
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
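[Editor's sketch] One way to read the "not retargetable" remark: a frontend usually bakes target-specific decisions into the IR it emits. The Python sketch below illustrates this under stated assumptions. The `LONG_BITS` table and `emit_module` are hypothetical. The only facts relied on are that an IR module can carry a `target triple` line and that a C-like `long` is 64 bits on x86-64 Linux and 32 bits on i386 Linux.

```python
# Hypothetical table: width in bits of a C-like `long` for two target triples.
LONG_BITS = {
    "x86_64-unknown-linux-gnu": 64,
    "i386-unknown-linux-gnu": 32,
}


def emit_module(triple: str) -> str:
    """Emit a tiny module whose function signature depends on the target."""
    long_ty = f"i{LONG_BITS[triple]}"
    return "\n".join([
        f'target triple = "{triple}"',
        "",
        f"define {long_ty} @identity({long_ty} %x) {{",
        "entry:",
        f"  ret {long_ty} %x",
        "}",
    ])


for triple in LONG_BITS:
    print(emit_module(triple))
    print()
```

The two modules differ in more than the triple line: the integer width chosen by the frontend is already fixed in the function signature, so the module emitted for one target is not simply reusable for the other.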
Caldarale, Charles R
2013-Apr-16 02:18 UTC
[LLVMdev] The most efficient way to compile to LLVM IR?
> From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu]
> On Behalf Of Hongbo Zhang
> Subject: [LLVMdev] The most efficient way to compile to LLVM IR?

> I find the API provided by IRBuilder is quite imperative and verbose,
> and the API changes so fast that most of the tutorials are out of date.

We've been using IRBuilder since 2.7, are now on 3.2, and upgrading with each release has been rather easy: not trivial, but not at all difficult. Using IRBuilder seems to be the simplest and most straightforward mechanism, and gives you a great deal of control over what's fed to the optimizers.

 - Chuck
Thanks for your reply.

On Mon, Apr 15, 2013 at 9:15 PM, David Blaikie <dblaikie at gmail.com> wrote:
> To be clear you're suggesting having your frontend (say, for
> argument's sake, written in C++) parse your toy language and then emit
> a (say) Haskell representation of IR? Using some Haskell APIs you'll
> write that will emit LLVM bitcode? And then running the resulting
> Haskell program to produce your bitcode that you'll load back in to
> LLVM to optimize/compile?

Yes, that's what I am doing, in OCaml though. Functional languages are excellent for program transformation and manipulation. Where is the specification for the bitcode format? Thanks.

-- Regards -- Bob
There are actually two separate things you might want to use the native C++ API for. The first, as you noted, is generating LLVM IR via IRBuilder. You might also want to have a custom opt/llc-like tool, which will make it somewhat easier to integrate LLVM plugins, such as for garbage collection or language-specific passes implementing sanity checks and/or optimizations. Shared library plugins are supported on *nix but not Windows AFAIK.

One downside of doing your own abstract syntax, beyond the up-front investment of effort, is that as you use more of LLVM's capabilities (vectors, debug metadata, custom calling conventions, etc.), you'll have to do more work to represent and unparse each new feature correctly.

On the other hand, having an explicit representation independent of IRBuilder state is useful when debugging any invalid IR your front-end generates. The problem is that invalid IR will trigger asserts in LLVM, and printing out bits and pieces of your partially-built module from gdb isn't much fun.

GHC's LLVM backend [1] takes the unparse-abstract-syntax approach, but they also use a relatively small subset of LLVM. For example, they don't make any use of LLVM's GC functionality. The Disciple language [2] has a fork of GHC's LLVM code, but I don't know offhand what the differences are. I don't know of any such projects for OCaml.

FWIW my approach has been to generate a representation (from Haskell) which is fairly close to LLVM IR but abstracts some things, like GC metadata. This representation is then parsed (via serialized Protocol Buffers) from C++, which performs the appropriate IRBuilder actions, and can integrate statically-linked plugins.

[1] https://github.com/ghc/ghc/tree/master/compiler/llvmGen
[2] http://hackage.haskell.org/package/ddc-core-llvm
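[Editor's sketch] The debugging benefit of an explicit representation can be illustrated with a small pre-emission check. The Python below is a hypothetical sketch, not the poster's actual tooling: `Block` and `check_blocks` are invented names, and instructions are modeled as plain strings. It checks one LLVM rule, namely that every basic block ends with a terminator instruction, and reports readable errors instead of tripping an assert inside LLVM.

```python
from dataclasses import dataclass, field
from typing import List

# A subset of LLVM's terminator opcodes, enough for this illustration.
TERMINATORS = ("ret", "br", "switch", "unreachable")


@dataclass
class Block:
    label: str
    instructions: List[str] = field(default_factory=list)


def opcode_of(inst: str) -> str:
    """First token of the instruction text, or '' for a blank line."""
    tokens = inst.split()
    return tokens[0] if tokens else ""


def check_blocks(blocks: List[Block]) -> List[str]:
    """Return human-readable errors for blocks with misplaced terminators."""
    errors: List[str] = []
    for b in blocks:
        if not b.instructions:
            errors.append(f"block '{b.label}' is empty")
            continue
        # No terminator may appear before the last instruction.
        for inst in b.instructions[:-1]:
            if opcode_of(inst) in TERMINATORS:
                errors.append(
                    f"block '{b.label}' has terminator '{opcode_of(inst)}' "
                    "before its last instruction"
                )
        # The last instruction must be a terminator.
        if opcode_of(b.instructions[-1]) not in TERMINATORS:
            errors.append(f"block '{b.label}' does not end with a terminator")
    return errors


blocks = [
    Block("entry", ["%t1 = add i32 %x, 1"]),  # missing terminator
    Block("exit", ["ret i32 0"]),
]
for message in check_blocks(blocks):
    print(message)
```

Because the check runs on the front-end's own data structure, the offending block can be printed in full before any LLVM API is called.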