thr3ads.net - llvm dev - [LLVMdev] "Mapping High-Level Constructs to LLVM IR" [Nov 2013]

Mikael Lyngvig

2013-Nov-23 06:18 UTC

[LLVMdev] "Mapping High-Level Constructs to LLVM IR"

Thanks, you have a lot of valid points there.  I have myself long ago
abandoned the path of using C as a backend language due to the very factors
you mention.

However, as I said, the document was put together in 30 minutes.  Not
exactly ready for prime time :-)

I do agree that all of the things you mention should be described,
including Lambdas, closures, and generators, but I must admit up front that
I don't know how to implement half of them.  But I suppose I could do a lot
of research and perhaps occasionally ask you guys for specifics.

We are not going to find much common ground on the issue of "calling
propagated return values for exception handling", I think :-)  See
https://www.lyngvig.org/Teknik/A-Proposal-for-Exception-Handling-in-C for
the details.

I started out with C++ as the example language because a lot of people know
that language - and most certainly the majority of the LLVM user base.
Obviously, you'd have to add source code from other languages than C++
when C++ does not provide features to illustrate the process.

I now agree that the lowering into C is not such a good idea after all.  So
I'll go straight from source language to LLVM IR, which is not that
difficult after all, and won't be very different for the reader.  In fact,
I think it will be much better than my original approach.

Thanks again for your valid objections.


-- Mikael




2013/11/23 Joshua Cranmer 🐧 <Pidgeot18 at gmail.com>
> On 11/22/2013 9:25 PM, Mikael Lyngvig wrote:
>
>> Hi guys,
>>
>> I have begun writing on a new document, named "Mapping High-Level
>> Constructs to LLVM IR", in which I hope to eventually explain how to
>> map pretty much every contemporary high-level imperative and/or OOP
>> language construct to LLVM IR.
>>
>> I write it for two reasons:
>>
>> 1. I need to know this stuff myself to be able to continue on my own
>> language project.
>> 2. I feel that this needs to be documented once and for all, to save
>> tons of time for everybody out there, especially for the language
>> inventors who just want to use LLVM as a backend.
>>
>> So my plan is to write this document and continue to revise and enhance
>> it as I understand more and helpful people on the list and elsewhere
>> explain to me how these things are done.
>>
>> Basically, I just want to know if there is any interest in such a
>> document or if I should put it on my own website.  If you know of any
>> books or articles that already do this, then please let me know about
>> them.
>>
>> I've attached the result of 30 minutes work, just so that you can see
>> what I mean.  Please don't review the document as it is still in its
>> very early infancy.
>>
>
> There is a strong bias towards C++ in the document, which isn't a
> particularly strong slice of higher-level constructs. For example, C++'s
> RTTI constructs serve three distinct purposes: exception handling, dynamic
> casts, and reflection (although C++'s reflection capabilities are
> extremely weak). You'll need to talk about inheritance in the three
> cases: single, multiple, and virtual (to use C++'s terminology) (note that
> Java's interfaces can be implemented as virtual inheritance). Boxing is
> another important topic. Lambdas, closures, and generators (yield keyword)
> are becoming increasingly common in modern programming languages, and
> should not be ignored.
>
> Finally, calling propagated return values "exception handling" does an
> extreme disservice to your readers. LLVM IR explicitly models exception
> handling, and attempting to describe it lowered as return values is not how
> anyone should implement it. If you badly want to describe it in C terms,
> you could at least use C's setjmp/longjmp to describe it; the truth is,
> this is a feature which doesn't exist cleanly in C.
>
> Trying to describe mapping higher-level languages to C and then C to IR is
> a poor idea. C is in some ways an extremely limited language (no native
> exception handling constructs, e.g.). If you want to be a guide to how to
> lower languages to LLVM IR, you need to also explain how to take advantage
> of features in the IR to optimize code better (e.g., TBAA). Cfront-like C++
> compilers are extremely rare-to-nonexistent (in part because it is
> difficult to map some features, most notably exception handling, cleanly
> and efficiently into C); if your guide is describing such an approach, it
> reads like an implicit endorsement. It is possible to describe some aspects
> of the IR in C, but if the goal is to lower to IR, then the description
> should be lowering to IR, not lowering to C.
>
> --
> Joshua Cranmer
> Thunderbird and DXR developer
> Source code archæologist
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

Joshua Cranmer

2013-Nov-23 19:35 UTC


[LLVMdev] "Mapping High-Level Constructs to LLVM IR"

On 11/23/2013 12:18 AM, Mikael Lyngvig wrote:
> Thanks, you have a lot of valid points there.  I have myself long ago
> abandoned the path of using C as a backend language due to the very 
> factors you mention.
>
> However, as I said, the document was put together in 30 minutes.  Not 
> exactly ready for prime time :-)
>
> I do agree that all of the things you mention should be described, 
> including Lambdas, closures, and generators, but I must admit up front 
> that I don't know how to implement half of them.  But I suppose I 
> could do a lot of research and perhaps occasionally ask you guys for 
> specifics.
>
> We are not going to find much common ground on the issue of "calling 
> propagated return values for exception handling", I think :-)  See 
> https://www.lyngvig.org/Teknik/A-Proposal-for-Exception-Handling-in-C 
> for the details.
That appears to be comparing return value propagation to setjmp/longjmp, 
which is not considered a good exception handling model. Most low-cost 
(or no-cost!) exception handling mechanisms are based on an unwind 
approach, where a thrown exception causes the stack trace to be 
inspected to find the appropriate catch block. If exceptions are not 
thrown, there is no execution cost to setting up a try/catch block, 
although this is paid for by a relatively expensive throw 
implementation. Details of such an implementation can be found in the 
Itanium ABI <http://mentorembedded.github.io/cxx-abi/abi-eh.html> 
(which, despite its name, is the ABI used by gcc and clang).
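To make the contrast concrete, here is a minimal C++ sketch of the two styles under discussion. The function names are made up for illustration and do not come from Mikael's document. In the first style every caller tests a status code, so the error path costs something even when nothing fails. In the second style the intermediate caller contains no checks at all; with gcc or clang on an Itanium-ABI target the throw is typically handled by the unwinder using tables emitted alongside the code, and in LLVM IR a call made inside a try region would normally appear as an `invoke` with a `landingpad` rather than a plain `call`.

```cpp
#include <stdexcept>

// Style 1: errors propagated through return values.
// Every function on the call path must test and forward the status.
enum class Status { Ok, Negative };

Status checked_half_rv(int x, int& out) {
    if (x < 0) return Status::Negative;
    out = x / 2;
    return Status::Ok;
}

Status quarter_rv(int x, int& out) {
    int half = 0;
    Status s = checked_half_rv(x, half);
    if (s != Status::Ok) return s;   // explicit propagation on every call
    return checked_half_rv(half, out);
}

// Style 2: C++ exceptions.
// The intermediate caller has no error-handling code of its own.
int checked_half_eh(int x) {
    if (x < 0) throw std::invalid_argument("negative input");
    return x / 2;
}

int quarter_eh(int x) {
    return checked_half_eh(checked_half_eh(x));
}

// Only the function that wants to handle the error mentions it.
// Returns -1 if the computation threw.
int quarter_or_minus_one(int x) {
    try {
        return quarter_eh(x);
    } catch (const std::invalid_argument&) {
        return -1;
    }
}
```

Calling `quarter_rv(20, out)` and `quarter_or_minus_one(20)` both yield 5; the difference lies in where the failure checks live and who pays for them.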

-- 
Joshua Cranmer
News submodule owner
DXR coauthor

Yaron Keren

2013-Nov-23 20:41 UTC


[LLVMdev] "Mapping High-Level Constructs to LLVM IR"

There are overviews of Dwarf EH in the gcc wiki

 http://gcc.gnu.org/wiki/Dwarf2EHNewbiesHowto

and in Mortoray blog

 http://mortoray.com/2013/09/12/the-true-cost-of-zero-cost-exceptions/
 http://mortoray.com/2012/03/08/the-necessity-of-exceptions/
 http://mortoray.com/2012/04/02/everything-wrong-with-exceptions/




David Tweed

2013-Nov-25 09:04 UTC


[LLVMdev] "Mapping High-Level Constructs to LLVM IR"

Hi, documentation is always good and this is a great idea. It'll be
particularly useful as a place where additional examples of constructs from
non-C-family languages could be added (since most compiler tutorials
inevitably focus on languages that are a lot like C/C++/Obj-C).

I'd imagine you've already thought of this, but it might be something where
using pseudo-LLVM-IR is of the most pedagogical use. (For example, writing
code that accesses complex structured memory using multiple levels of GEPs
and then loads is quite tricky, but for a lot of the constructs it's only a
detail, so you could probably express those bits using some pseudo
instruction, going another step into fully explicit LLVM IR if necessary.)

Cheers,
Dave


-- 
cheers, dave tweed__________________________
high-performance computing and machine vision expert: david.tweed at gmail.com
"while having code so boring anyone can maintain it, use Python." --
attempted insult seen on slashdot

Mikael Lyngvig

2013-Nov-25 09:46 UTC


[LLVMdev] "Mapping High-Level Constructs to LLVM IR"

Hi David,

I'm glad you like the idea :-)

I've been busy and so far only lack about three or four things in the C++
area of features: Proper exception handling (which I need to understand
myself first), closures (which I don't think I've ever used), and
generators (which I always wondered how they were implemented).  I think
you are absolutely right - some day, this document will routinely be
extended with new and wonderful language constructs and this will help
these features to propagate much faster into existing and new languages as
people will quickly come to realize how simple it is to do this or that
feature.  Besides, I think it is an awesome way of teaching people how to
use LLVM IR: They can use their knowledge of familiar structures and
constructs to see how it can be done in LLVM IR.
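As one illustration of how a closure can be reduced to constructs that map directly onto LLVM IR, here is a minimal C++ sketch of closure conversion: the captured variable moves into an environment struct, and the lambda body becomes an ordinary function that takes a pointer to that struct. All names (`AdderEnv`, `adder_body`, and so on) are hypothetical, and this is only one common lowering strategy, not necessarily the one the document will end up describing.

```cpp
#include <functional>

// What the programmer writes: a lambda that captures `step` by value.
std::function<int(int)> make_adder(int step) {
    return [step](int x) { return x + step; };
}

// A hand-lowered equivalent using only structs and plain functions.

// The captured variables live in an environment struct.
struct AdderEnv {
    int step;
};

// The lambda body becomes a free function with an explicit environment.
int adder_body(const AdderEnv* env, int x) {
    return x + env->step;
}

// The closure value pairs the environment with a pointer to the body.
struct AdderClosure {
    AdderEnv env;
    int (*fn)(const AdderEnv*, int);
};

AdderClosure make_adder_lowered(int step) {
    return AdderClosure{AdderEnv{step}, &adder_body};
}

// Calling the closure is an indirect call that passes the environment.
int call_adder(const AdderClosure& c, int x) {
    return c.fn(&c.env, x);
}
```

Both `make_adder(3)(4)` and `call_adder(make_adder_lowered(3), 4)` evaluate to 7; the second form uses nothing beyond a struct type, a function, and an indirect call, all of which have direct LLVM IR counterparts.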

So far, it hasn't been too bad with GEPs and loads.  The most advanced
example uses like two or three GEPs/loads in a row and I have already
explained at the top of the document that the user should not worry - most
commonly two or three LLVM IR instructions will be coalesced into a single
instruction after optimization.
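For readers who have not met GEPs yet, the following C++ sketch shows the kind of source expressions involved. The types and function names are invented for illustration. The comments describe what a frontend would typically emit; the exact IR depends on the compiler version and optimization level.

```cpp
#include <cstdint>

struct Point {
    std::int32_t x;
    std::int32_t y;
};

struct Shape {
    std::int32_t id;
    Point corners[4];
};

// s->corners[i].y is a single source expression. It typically becomes one
// getelementptr that computes the field address (index 0 to step through
// the pointer, 1 to select `corners`, i to select the element, 1 to select
// `y`), followed by one load of the 32-bit value.
std::int32_t corner_y(const Shape* s, std::int32_t i) {
    return s->corners[i].y;
}

struct Node {
    std::int32_t value;
    Node* next;
};

// n->next->value crosses a pointer, so the address cannot be computed in
// one step: an address computation and a load fetch `next`, then a second
// address computation and a load fetch `value`.
std::int32_t second_value(const Node* n) {
    return n->next->value;
}
```

The key point is that a getelementptr only computes an address and never touches memory, so each pointer that has to be followed costs a separate load.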

I'll let your idea simmer a bit and see what comes up.  For now, I have had
the joy of coding up quite a few snippets in LLVM IR.  I am learning to
swim as I am sinking into all the intricacies of LLVM IR.  So far it has
only been pleasant working with LLVM IR, albeit I am a tad tired of typing
types because LLVM IR demands explicit types on most expressions and
arguments.  But I do understand that LLVM IR is meant to be automatically
generated by a compiler, not hand-crafted by a tech writer.

I've attached my current draft.  If you are busy, don't care, or prefer to
wait for the final result, please don't waste time on looking at it.
There's still quite a bit to do.  Comments are more than welcome,
though.  And feel free to suggest new language features and so on: Even if
I cannot document them, I can always ask the list for help and together we
can make up a great document, which I personally suspect will one day be
almost as popular as the Language Reference.


-- Mikael


-------------- next part --------------
A non-text attachment was scrubbed...
Name: MappingHighLevelConstructsToLLVMIR.rst
Type: application/octet-stream
Size: 39013 bytes
Desc: not available
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20131125/bceb55bf/attachment.obj>
