I recently subscribed to the LLVM and Clang developer mailing lists, and earlier today read a message requesting advice on "Contributing to LLVM Projects". I'm hoping to eventually get involved as well, but since my situation is very different from that described in the referenced thread, I got the idea of soliciting advice separately. If this is not the right place to ask, though, then please let me know and ignore the remainder of this message.

Right now, I only 'really' know Java for my current job, save for some Prolog, Haskell and Coq that I picked up during my linguistics studies (specializing in categorial grammar, supplemented with courses from CS and AI on logic and type theory). I thus still have a long road ahead, even for the bare theory, let alone the implementation. The following is a detailed study plan for my spare time that I think will last me the next 1.5 years or so (naturally to be supplemented with the LLVM documentation), and I would be very happy to get any feedback on it.

Instead of beginning with C++, I have started studying C from K&R, to be followed (once finished) by the book by Fraser and Hanson on LCC. After that, I have a modest reading list prepared of articles on SSA (see below), which I hope to combine with a study of the LLVM IR. Based on these readings, as a matter of exercise I can take what I learnt from LCC to write a small compiler in C targeting the LLVM IR for, say, a subset of Prolog. Ideally, I should reach this stage by the end of this year.

While far from 'done' with C, I expect that at that point I can also start picking up some C++ (I have in mind at least Accelerated C++ and the Effective (Modern) C++ books, coupled perhaps with Modern C++ Design for template metaprogramming), after which I should be prepared to start following the developments around Clang and digging into its code. All this, in turn, should keep me busy during the first half of next year at least. I hope, however, to also come to thoroughly understand the LLVM backend, and I would suppose that, besides following the literature on compiler optimizations, going through a book or two on computer architecture should give me a start for digging deeper into its source (I'm currently thinking of Computer Systems: A Programmer's Perspective).

There will surely never be any shortage of good books to read (in particular, I hope to also squeeze in Sedgewick's Algorithms in C and a book on OS architecture somewhere), but all this should at least give me a good base from which to plan further, and hopefully eventually get involved somehow. Given my current state of knowledge and being limited to my spare time, however, I'm guessing I'll be looking at at least five years or so to reach a lower-intermediate understanding of most of the important concepts involved.

As for the articles I had in mind for coming to grips with the basics of SSA, I came up with the following list, planning to first gradually work my way up to an understanding of control and data dependence.

1) Prosser '59: Applications of Boolean matrices to the analysis of flow diagrams <Introduces the concept of dominators, I believe.>
2) Lowry & Medlock '69: Object code optimization <For further discussion of the dominator concept.>
3) Allen '70: Control flow analysis
4) Allen & Cocke '76: A program data flow analysis procedure
5) Ferrante et al. '87: The program dependence graph and its use in optimization
6) Rosen et al. '88: Global value numbers and redundant computations
7) Alpern et al. '88: Detecting equality of variables in programs
8) Cytron et al. '91: Efficiently computing static single assignment form and the control dependence graph

Many thanks for taking the time to read through all this. I would be very happy to receive any feedback on these plans, and again, if this was not the right place to ask, then my apologies.

Kind regards,

Arno Bastenhof
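For a concrete picture of what "a small compiler in C targeting the LLVM IR" can boil down to, here is a minimal sketch using the LLVM-C API from llvm-c/Core.h. The module and function names are invented for illustration, and a real Prolog front end would of course build its IR from an AST rather than by hand; note that the IR printed by LLVMDumpModule is already in the SSA form covered by the reading list above.

/* Minimal sketch: emit LLVM IR from C via the LLVM-C API (llvm-c/Core.h).
 * Builds a module containing i32 add2(i32, i32) and dumps its textual,
 * SSA-form IR to stderr.
 * Build (roughly): cc demo.c `llvm-config --cflags --ldflags --libs core`
 */
#include <llvm-c/Core.h>

int main(void) {
    LLVMModuleRef mod = LLVMModuleCreateWithName("prolog_subset_demo");

    /* Declare: i32 add2(i32, i32) */
    LLVMTypeRef params[] = { LLVMInt32Type(), LLVMInt32Type() };
    LLVMTypeRef fnty = LLVMFunctionType(LLVMInt32Type(), params, 2, 0);
    LLVMValueRef fn = LLVMAddFunction(mod, "add2", fnty);

    /* One basic block, one add, one ret: each value is defined exactly once (SSA). */
    LLVMBasicBlockRef entry = LLVMAppendBasicBlock(fn, "entry");
    LLVMBuilderRef b = LLVMCreateBuilder();
    LLVMPositionBuilderAtEnd(b, entry);
    LLVMValueRef sum = LLVMBuildAdd(b, LLVMGetParam(fn, 0), LLVMGetParam(fn, 1), "sum");
    LLVMBuildRet(b, sum);

    LLVMDumpModule(mod);   /* prints the textual IR */

    LLVMDisposeBuilder(b);
    LLVMDisposeModule(mod);
    return 0;
}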
On 3 August 2015 at 16:29, Arno Bastenhof <arnobastenhof at gmail.com> wrote:
> Based on these readings, as a matter of exercise I can take what I learnt
> from LCC to write a small compiler in C targeting the LLVM IR for, say,
> a subset of Prolog.

Hi Arno,

This is not a monumental task. There are many tutorials in the LLVM tree, and producing IR from a simple AST is actually simple and straightforward.

So many languages can target LLVM IR that it's unlikely you'll find something that cannot be represented in IR, at least not something that other implementations haven't already found and worked around.

With your broad knowledge in CS, I wouldn't expect you to take more than a month or two to get a front-end working with the JIT compiler on a reasonable subset of Prolog. Of course, the deeper you go, the more time it takes to progress the same 1%, but you can always stop when productivity drops as fast as enjoyment. :)

But you won't need to understand the back-end at all, modulo some quirks on your platform of choice. The hint is to write the same logic in Prolog and in C, compile the C with Clang, and see what they do. If you spot hackery, the path of least resistance is to replicate the hackery. Asking on the list is also a good way to understand why the hackery is there, and may even give you an entry point into changing the middle/back end to support a proper construct in IR for better codegen.

From there, I'd recommend you try different targets. Get some board, device or phone you have at your disposal (the Raspberry Pi 2 is a great inexpensive kit) and make it work. It will take a lot less effort than you're imagining...

cheers,
--renato
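The "write the same logic in C, compile it with Clang, and see what it does" step can be as simple as the following sketch; the file name, struct layout and function are made up here purely to have something to inspect, and stand in for whatever Prolog construct is being studied.

/* append.c -- a C stand-in for Prolog list append, used only to look at
 * the IR Clang produces. Emit textual IR with:
 *   clang -O1 -S -emit-llvm append.c -o append.ll
 * then read append.ll to see how the recursion-free loop and the struct
 * accesses are lowered.
 */
#include <stddef.h>

struct cons {
    int head;
    struct cons *tail;
};

/* Destructively appends list b to list a and returns the new head. */
struct cons *append(struct cons *a, struct cons *b) {
    if (a == NULL)
        return b;
    struct cons *p = a;
    while (p->tail != NULL)
        p = p->tail;
    p->tail = b;
    return a;
}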
When I set out on my "Pascal Compiler project", I had never written a compiler before - I had SOME experience in writing interpreters and other parser-like projects. That was about 14 months ago: I think I started around the end of February, and by the beginning of March, when I pushed the first commit to GitHub, I already had small programs working. I only spend (some of my) evenings and weekends on this project, and there have been periods where I didn't do any work for several weeks for one reason or another.

I'm now at the stage where the parser and semantic analysis are handling iso7185pat.pas ("pat" = Pascal Acceptance Test). It's not yet passing the code-gen phase - the current stumbling block is trampolining for nested function pointer calls, and I think there are going to be a few other problems along the way. Given that this file is "trying to do everything the ISO 7185 standard allows", it's not surprising it finds a few interesting issues...

I have a few decent-sized home-grown tests, such as a sudoku solver that "works" - and because LLVM provides good optimisation, my compiler beats the "regular" Free Pascal compiler on all like-for-like comparisons I've made so far. In particular, the sudoku solver is about 2x faster with my compiler than with Free Pascal (OK, I cheat: some 5% of that comes from having implemented a "popcount" intrinsic that takes a "set" and gives the number of elements in the set, which standard Pascal or Free Pascal doesn't have).

Of course, like Renato says, every small step towards "full functionality" gets harder - and the good thing with a hobby project is the ability to choose how much time you spend on which features, and how you implement them.

I thoroughly recommend "try it, see how it goes". I've found it surprisingly easy... [Mostly because other people have done the REALLY hard part of making the LLVM IR to machine code phase! - Thanks guys!]

--
Mats

On 3 August 2015 at 18:07, Renato Golin <renato.golin at linaro.org> wrote:
> With your broad knowledge in CS, I wouldn't expect you to take more than
> a month or two to get a front-end working with the JIT compiler on a
> reasonable subset of Prolog. Of course, the deeper you go, the more time
> it takes to progress the same 1%, but you can always stop when
> productivity drops as fast as enjoyment. :)
> [snip]
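The "popcount" intrinsic Mats mentions maps naturally onto LLVM's llvm.ctpop. A rough C analogue, assuming the Pascal set is represented as a 64-bit bitmask (the representation and names are illustrative, not taken from his compiler):

/* Rough C analogue of a Pascal set-cardinality ("popcount") intrinsic,
 * assuming the set is stored as a 64-bit bitmask, one bit per member.
 * Clang lowers __builtin_popcountll to the llvm.ctpop.i64 intrinsic,
 * which becomes a single POPCNT instruction on targets that have one.
 */
#include <stdint.h>

typedef uint64_t pascal_set;   /* illustrative representation */

static inline int set_card(pascal_set s) {
    return __builtin_popcountll(s);
}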
Good recommendations on the thread already, but let me add that you don't _need_ to learn C or C++ to use LLVM; of course, contributions would require familiarity with C++. You can use a library that talks to LLVM from your favorite language, and if that's Java, this seems to be the most up-to-date project with bindings: https://github.com/robovm/robovm/tree/master/llvm

Let me also critique your reading list, if your goal is writing a first compiler. Understanding SSA, optimizations, etc. is ten times more lucid (and you appreciate it more) when you have seen some IR or assembly and the impact a certain optimization or concept can have on it. To be clear, your reading list is bang on for some topics, but I feel you'll miss some "aha" moments if you haven't actually written some code first-hand, or at least gone through some simple code generation steps (even if it's just reading code).

In college I took a course (and did well) on compiler optimizations (program analysis, to be specific), and we went through pretty much all the theory needed to build an optimizing compiler: CFG construction, dataflow frameworks, SSA form, fixed points, lattices and abstract interpretation, etc. But a few years later, without having used any of it, this past year or so has been somewhat of a re-learning. The formal education was absolutely critical for the concepts, but I didn't get something tangible at the end of it.

If optimizations are really what you want to understand or implement, might I suggest starting with one of the simplest: constant folding. It's simple to implement, yet there is a feeling of accomplishment.

Finally, to be honest, since learning is an individual experience my suggestions may not be helpful at all, but if you learn like I do - by doing - they just might.

P.S. - I have no idea what your level of CS understanding is, so please don't be offended if what I've suggested is too naive or simple for you.

On Mon, Aug 3, 2015 at 10:07 AM, Renato Golin <renato.golin at linaro.org> wrote:
> This is not a monumental task. There are many tutorials in the LLVM tree,
> and producing IR from a simple AST is actually simple and straightforward.
> [snip]
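A minimal sketch of the suggested constant-folding exercise, over a toy expression AST rather than LLVM IR; the types and names here are invented for illustration and are not LLVM's own folding machinery.

/* Toy constant folding over a tiny expression AST: (1 + 2) * x -> 3 * x. */
#include <stdlib.h>

typedef enum { EXPR_CONST, EXPR_ADD, EXPR_MUL, EXPR_VAR } ExprKind;

typedef struct Expr {
    ExprKind kind;
    int value;              /* EXPR_CONST */
    const char *name;       /* EXPR_VAR   */
    struct Expr *lhs, *rhs; /* EXPR_ADD, EXPR_MUL */
} Expr;

static Expr *make_const(int v) {
    Expr *e = calloc(1, sizeof *e);
    e->kind = EXPR_CONST;
    e->value = v;
    return e;
}

/* Recursively replaces any operator node whose operands fold to constants. */
static Expr *fold(Expr *e) {
    if (e->kind == EXPR_ADD || e->kind == EXPR_MUL) {
        e->lhs = fold(e->lhs);
        e->rhs = fold(e->rhs);
        if (e->lhs->kind == EXPR_CONST && e->rhs->kind == EXPR_CONST) {
            int v = (e->kind == EXPR_ADD) ? e->lhs->value + e->rhs->value
                                          : e->lhs->value * e->rhs->value;
            return make_const(v);   /* leaks the old nodes; fine for a sketch */
        }
    }
    return e;
}

The same idea scaled up to IR is what a real folding pass does; doing it by hand on an AST first makes the later reading (value numbering, SSA-based redundancy elimination) far more tangible.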