thr3ads.net - llvm dev - [LLVMdev] Lost in the documentation [Apr 2008]


Hendrik Boom

2008-Apr-29 12:41 UTC

[LLVMdev] Lost in the documentation

On Mon, 28 Apr 2008 17:54:31 -0400, Gordon Henriksen wrote:
> On Apr 28, 2008, at 17:32, Hendrik Boom wrote:
> 
>> In http://llvm.org/docs/FAQ.html, when talking about writing a compiler
>> that uses LLVM (at least I think that's what the FAQ question is
>> asking),
>> the FAQ recommends
>>
>>> #  Call into the LLVM libraries code using your language's FFI
>>> (foreign function interface).
>>>
>>>    * for: best tracks changes to the LLVM IR, .ll syntax, and .bc
>>>           format
>>>    * for: enables running LLVM optimization passes without a
>>>           emit/parse overhead
>>>    * for: adapts well to a JIT context
>>>    * against: lots of ugly glue code to write
>>
>> Now, which particular libraries would that be
> 
> With the exception of the 'util' and 'tools' directories, the entire
> LLVM source tree consists of libraries.
Indeed, quite a lot of them.  Most of them appear to be internal.  I'm 
trying to identify the ones that are intended for use by LLVM users.

I have to say I missed the crucial paragraph:

: If you go with the first option, the C bindings in include/llvm-c
: should help a lot, since most languages have strong support for
: interfacing with C. The most common hurdle with calling C from managed
: code is interfacing with the garbage collector. The C interface was
: designed to require very little memory management, and so is
: straightforward in this regard.

Evidently I have to go look in include/llvm-c, since I strongly suspect 
you didn't go to the trouble of writing a C wrapper for anything that 
wasn't needed by an LLVM user.  Anything internal you'd have left in C++.

So the API for a C++ *user* could be described as "those parts of the 
internals API that happen to be used in implementing llvm-c".

What I found in llvm-c was Core.h.  Is that what I need to know for 
writing a compiler front-end?  Let's see.  Core.h seems to describe 
building the LLVM code.  BitWriter.h says how to write it to a file, 
should that be desired.  It's not clear what lto.h, Analysis.h, or 
ExecutionEngine.h do or why I'd need them.  Target.h looks useful if I 
have to build machine dependencies into my code generator.  Some things 
I do may depend on the size of pointers and the like.

Putting this together with the tutorial, http://llvm.org/docs/tutorial/, 
which uses CAML instead of C, I think I may be able to get a clue.

> 
>> where are their API(s) documented?
> 
> http://llvm.org/docs/
> http://llvm.org/doxygen/
> http://llvm.org/docs/tutorial/
> etc etc etc.
> 
> — Gordon
The doxygen page describes the complete internal structure of LLVM.  It 
explicitly says, 

; This documentation describes the internal software that makes up LLVM,
; not the external use of LLVM. There are no instructions here on how to
; use LLVM, only the APIs that make up the software. For usage
; instructions, please see the programmer's guide or reference manual.

I haven't yet found a "programmer's guide".

The only reference manual I've found so far was the "LLVM Language 
Reference Manual", linked from the llvm.org/docs page.  It describes a 
programming language with a syntax.  No doubt it is a textual 
representation of the information to be transmitted using the API I'm 
looking for, but it doesn't document the API.  I can probably find what 
I'm looking for by prowling the source code that implements this LLVM 
language, seeing what it calls, and then looking up those classes and 
methods in the doxygen pages.  That's another way, complementary to 
guessing the relationship between the OCaml tutorial and Core.h.

-- hendrik

Gordon Henriksen

2008-Apr-29 13:46 UTC

[LLVMdev] Lost in the documentation

On 2008-04-29, at 08:41, Hendrik Boom wrote:
> On Mon, 28 Apr 2008 17:54:31 -0400, Gordon Henriksen wrote:
>
>> On Apr 28, 2008, at 17:32, Hendrik Boom wrote:
>>
>>> In http://llvm.org/docs/FAQ.html, when talking about writing a
>>> compiler that uses LLVM (at least I think that's what the FAQ
>>> question is asking), the FAQ recommends
>>>
>>>> #  Call into the LLVM libraries code using your language's FFI
>>>> (foreign function interface).
>>>>
>>>>   * for: best tracks changes to the LLVM IR, .ll syntax, and .bc
>>>>          format
>>>>   * for: enables running LLVM optimization passes without a
>>>>          emit/parse overhead
>>>>   * for: adapts well to a JIT context
>>>>   * against: lots of ugly glue code to write
>>>
>>> Now, which particular libraries would that be
>>
>> With the exception of the 'util' and 'tools' directories, the entire
>> LLVM source tree consists of libraries.
>
> Indeed, quite a lot of them.  Most of them appear to be internal.  
> I'm trying to identify the ones that are intended for use by LLVM  
> users.
include/llvm is all public (modulo some implementation details as  
required by the nature of C++). Private includes are in lib. But  
realize that not all users are front-end compilers. A back-end code  
generator is also a user of the framework; as is an IR optimization or  
analysis. The C++ interfaces support all of these clients equally.

VMCore and BitWriter are the libraries absolutely necessary for any  
static compiler that outputs bitcode. You'll likely want Analysis for  
the verifier; and Target for memory layout information. That's the  
basics.
> I have to say I missed the crucial paragraph:
>
> : If you go with the first option, the C bindings in include/llvm-c
> : should help a lot, since most languages have strong support for
> : interfacing with C. The most common hurdle with calling C from managed
> : code is interfacing with the garbage collector. The C interface was
> : designed to require very little memory management, and so is
> : straightforward in this regard.
>
> Evidently I have to go look in include/llvm-c, since I strongly
> suspect you didn't go to the trouble of writing a C wrapper for
> anything that wasn't needed by an LLVM user.  Anything internal you'd
> have left in C++.
>
> So the API for a C++ *user* could be described as "those parts of the
> internals API that happen to be used in implementing llvm-c".
That's a rather poor definition. Only bindings for such features as  
have been required are authored. Still, if this helps you make sense  
of the framework, then that's fantastic; but remember that it is an  
imperfect rule.

Using the C bindings, it's still very important to understand the  
underlying C++ object model; otherwise, the type rules for the  
bindings will appear to be rather capricious.
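
To illustrate why the type rules can look capricious, here is a toy
model in plain C. It is not the LLVM API: every name in it (ValueRef,
make_constant, make_function, function_param_count, dispose_value) is
hypothetical. It only sketches the general technique of flattening a
C++ class hierarchy into a single opaque handle type. Because a
constant and a function share that one handle type, the C compiler
cannot reject a call that passes the wrong kind of object; the rule
lives in the hidden object model and can only be checked at run time.

```c
#include <assert.h>
#include <stdlib.h>

/* Hidden "class hierarchy": each object carries a kind tag.
   (Hypothetical names throughout; this is not LLVM's layout.) */
typedef enum { KIND_CONSTANT, KIND_FUNCTION } ValueKind;

struct Value {
    ValueKind kind;
    int payload; /* a constant's value, or a function's parameter count */
};

/* The only type a client of the binding sees: one opaque handle. */
typedef struct Value *ValueRef;

static ValueRef make_value(ValueKind kind, int payload) {
    ValueRef v = malloc(sizeof *v);
    if (v == NULL)
        abort();
    v->kind = kind;
    v->payload = payload;
    return v;
}

ValueRef make_constant(int n) { return make_value(KIND_CONSTANT, n); }

ValueRef make_function(int n_params) {
    return make_value(KIND_FUNCTION, n_params);
}

/* Meaningful only for functions. The signature cannot say so, so the
   kind is checked at run time; -1 signals "not a function". */
int function_param_count(ValueRef v) {
    assert(v != NULL);
    return v->kind == KIND_FUNCTION ? v->payload : -1;
}

void dispose_value(ValueRef v) { free(v); }
```

In this sketch, function_param_count(make_function(2)) yields 2, while
passing a handle from make_constant(42) compiles without complaint and
yields -1. Knowing which C++ class sits behind a handle is what tells
you which calls are legal.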
> Putting this together with the tutorial, http://llvm.org/docs/tutorial/,
> which uses CAML instead of C, I think I may be able to get a clue.
If you're not using ocaml, the C++ tutorial (the first one on that  
page) is probably more pertinent, even if you do intend to use the C  
bindings. Searching the implementation of the bindings  
(lib/VMCore/Core.cpp, etc.) is helpful for "going backwards" from C++  
to C once you begin to understand the object model.
>>> where are their API(s) documented?
>>
>> http://llvm.org/docs/
>> http://llvm.org/doxygen/
>> http://llvm.org/docs/tutorial/
>> etc etc etc.
>>
>> — Gordon
>
> The doxygen page describes the complete internal structure of LLVM.
> It explicitly says,
>
> ; This documentation describes the internal software that makes up LLVM,
> ; not the external use of LLVM. There are no instructions here on how to
> ; use LLVM, only the APIs that make up the software. For usage
> ; instructions, please see the programmer's guide or reference manual.
>
> I haven't yet found a "programmer's guide".
http://llvm.org/docs/ProgrammersManual.html

— Gordon

Hendrik Boom

2008-Apr-29 17:59 UTC

[LLVMdev] Lost in the documentation

On Tue, 29 Apr 2008 09:46:35 -0400, Gordon Henriksen wrote:
> On 2008-04-29, at 08:41, Hendrik Boom wrote:
> 
>> On Mon, 28 Apr 2008 17:54:31 -0400, Gordon Henriksen wrote:
>>
>>> On Apr 28, 2008, at 17:32, Hendrik Boom wrote:
>>>
>>>> In http://llvm.org/docs/FAQ.html, when talking about writing a
>>>> compiler that uses LLVM (at least I think that's what the FAQ
>>>> question is asking), the FAQ recommends
>>>>
>>>>> #  Call into the LLVM libraries code using your language's FFI
>>>>> (foreign function interface).
>>>>>
>>>>>   * for: best tracks changes to the LLVM IR, .ll syntax, and .bc
>>>>>          format
>>>>>   * for: enables running LLVM optimization passes without a
>>>>>          emit/parse overhead
>>>>>   * for: adapts well to a JIT context
>>>>>   * against: lots of ugly glue code to write
>>>>
>>>> Now, which particular libraries would that be
>>>
>>> With the exception of the 'util' and 'tools' directories, the entire
>>> LLVM source tree consists of libraries.
>>
>> Indeed, quite a lot of them.  Most of them appear to be internal.  I'm
>> trying to identify the ones that are intended for use by LLVM users.
> 
> include/llvm is all public (modulo some implementation details as
> required by the nature of C++). Private includes are in lib. But realize
> that not all users are front-end compilers. A back-end code generator is
> also a user of the framework; as is an IR optimization or analysis. The
> C++ interfaces support all of these clients equally.
> 
> VMCore and BitWriter are the libraries absolutely necessary for any
> static compiler that outputs bitcode. You'll likely want Analysis for
> the verifier; and Target for memory layout information. That's the
> basics.
> 
>> I have to say I missed the crucial paragraph:
>>
>> : If you go with the first option, the C bindings in include/llvm-c
>> : should help a lot, since most languages have strong support for
>> : interfacing with C. The most common hurdle with calling C from managed
>> : code is interfacing with the garbage collector. The C interface was
>> : designed to require very little memory management, and so is
>> : straightforward in this regard.
>>
>> Evidently I have to go look in include/llvm-c, since I strongly
>> suspect you didn't go to the trouble of writing a C wrapper for
>> anything that wasn't needed by an LLVM user.  Anything internal you'd
>> have left in C++.
>>
>> So the API for a C++ *user* could be described as "those parts of the
>> internals API that happen to be used in implementing llvm-c".
> 
> That's a rather poor definition. Only bindings for such features as
> have been required are authored. Still, if this helps you make sense of
> the framework, then that's fantastic; but remember that it is an
> imperfect rule.
> 
> Using the C bindings, it's still very important to understand the
> underlying C++ object model; otherwise, the type rules for the bindings
> will appear to be rather capricious.
> 
>> Putting this together with the tutorial, http://llvm.org/docs/tutorial/,
>> which uses CAML instead of C, I think I may be able to get a clue.
> 
> If you're not using ocaml, the C++ tutorial (the first one on that page)
> is probably more pertinent, even if you do intend to use the C bindings.
> Searching the implementation of the bindings (lib/VMCore/Core.cpp,
> etc.) is helpful for "going backwards" from C++ to C once you begin to
> understand the object model.
> 
>>>> where are their API(s) documented?
>>>
>>> http://llvm.org/docs/
>>> http://llvm.org/doxygen/
>>> http://llvm.org/docs/tutorial/
>>> etc etc etc.
>>>
>>> — Gordon
>>
>> The doxygen page describes the complete internal structure of LLVM. It
>> explicitly says,
>>
>> ; This documentation describes the internal software that makes up LLVM,
>> ; not the external use of LLVM. There are no instructions here on how to
>> ; use LLVM, only the APIs that make up the software. For usage
>> ; instructions, please see the programmer's guide or reference manual.
>>
>> I haven't yet found a "programmer's guide".
> 
> http://llvm.org/docs/ProgrammersManual.html
Thanks.  Here's what I have in mind to do with LLVM.  I have a few 
languages to compile; all of them require garbage collection.  I'll be 
looking at the OCaml experience with some interest.  How far I get into 
implementing them depends on the available time and the state of my 
enthusiasm.  It has been known to go missing, and it often gets diverted 
to so-called real life.

One of these languages, Algol 68, I was working on about 35 years ago.  
It was not finished mainly because at some point the machinery I was 
developing it on became unavailable.  It correctly ran over half of a 
demanding test suite when the project stopped.  It's now something I'd 
like to finish more for old time's sake than any serious use.  35 years 
ago, this compiler would run in about 900K memory.  That was a dream 
machine back then.  Using an overlay linker, it could be crammed into 
400K.  It was written in Algol W, and could use a new portable code 
generator.  It used garbage collection at compile time, but on today's 
machines I could probably get away with wholesale memory leakage.

To get it working, of course I need something that implements Algol W.  
I've tinkered with translating Algol W to C or something similar.  I 
originally intended to translate the Algol 68 compiler into Algol 68, to 
make it self-supporting, but I never got that far.  I have an Algol W 
parser, and at least one ancient attribute grammar that (too slowly) 
translates it to something else.  Since I'll only be using it to develop 
Algol 68, which runs in 900K, I can probably dispense with garbage 
collection and just use my 4 gigabyte RAM instead.

I also have a self-implementing program-transformation tool.  It consists 
of a recursive-descent parser generator, a tree-rewriting system, and an 
unparser.  In principle, it needs garbage collection.  In practice, well, 
I've said it before: memories are large these days.

-- hendrik
> 
> — Gordon
