thr3ads.net - llvm dev - [LLVMdev] Re: LLVM and PyPy [Nov 2003]

Chris Lattner

2003-Oct-31 10:42 UTC

[LLVMdev] Re: LLVM and PyPy

On Fri, 31 Oct 2003, Armin Rigo wrote:
> Hello Chris,
>
> We have been investigating your project and the good documentation
> and are very impressed. If we understood your goals correctly
> this seems like a good match for our ongoing and active PyPy project,
> a reimplementation of the Python language in Python.
Cool.  We are all big fans of Python here.  :)
> We'll definitely try using llvm as our low-level backend.
> But actually we contact you now to ask you if you'd be interested
> in some bidirectional collaboration.
I've read up a bit on PyPy, and it looks like LLVM could be a nice way to
get the JIT type interface that you would like.  Also, making use of the
LLVM optimizer can make your statically generated code nice and fast.  :)
> Maybe a bit more background what we did or are:
>
> - we are an open international group of individuals collaborating on
>   our free time mostly. We are very involved with research, open source
>   communities and especially the Python communities.
>
> - during the course of four one-week meetings (which we call development
>   "sprints") we have done a rather complete interpreter and can translate
>   parts of it to C or Lisp code already (using a control-flow representation
>   which is actually very similar to LLVM code, hence our enthusiasm!)
>
> - we very recently submitted a funding proposal to the European Union:
>
>     http://codespeak.net/svn/pypy/trunk/doc/funding/proposal/part_b.pdf
>
>   and you may find these two chapters particularly interesting:
>
>     http://codespeak.net/pypy/index.cgi?doc/funding/B1.0_objectives
>     http://codespeak.net/pypy/index.cgi?doc/funding/B6.0_detailed
It sounds like LLVM could be a good implementation strategy for your
goals.  You're right that developing a JIT from scratch is a lot of work
:)
> On the technical level we are interested to know about and maybe collaborate
> with efforts to support very-high-level language features in LLVM (e.g.
> walking the stack, for garbage collection), fine-grained runtime code generation
> (generating code only one basic block at a time),
These are definitely features that we plan to add, but just haven't gotten
to yet.  In particular, Alkis is working on a Java front-end, which will
require similar features.  In the beginning, we will probably just use a
conservative collector, eventually adding support for precise GC.

We already have the capability of doing function-at-a-time code
generation: what is basic-block at a time generation used for?  How do you
do global optimizations like register allocation?
> and possibly also contribute
> a PowerPC back-end, and Python bindings for LLVM.
That would be great!  We've tossed around the idea of creating C bindings
for LLVM, which would make interfacing from other languages easier than
going directly to the C++ API, but we just haven't had a chance to yet.
Maybe you guys would be interested in helping with that project?
> On another level, some of the PyPy core developers are actually also
> involved with the 'codespeak' site which aims at connecting interesting
> open source projects and provide new collaborative development services.
> The PyPy project is extensively using subversion which is a very interesting
> (and stable) alternative to cvs.  So if you need any help with setting
> up some publicly accessible infrastructure the codespeak guys will
> certainly welcome you.
At this point, we're working like crazy to get important features
implemented in LLVM.  We certainly acknowledge that CVS has severe
deficiencies, but in the near future we'll probably stay with it.
Perhaps after SVN 1.0 comes out... :)
> Feel free to forward this mail to the LLVM mailing list, btw.  We are
> just interested in getting some first contact and enter a productive
> discussion and - who knows - some interesting collaboration!
Done.  In general, that's a good place to discuss all kinds of issues like
this.  Please let us know what your plans are and what the next step is.
Perhaps C bindings would be the most logical starting place?

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

holger krekel

2003-Oct-31 13:36 UTC

[LLVMdev] Re: LLVM and PyPy

Hi Chris,

[Chris Lattner Fri, Oct 31, 2003 at 10:58:45AM -0600]
> On Fri, 31 Oct 2003, Armin Rigo & Holger Krekel wrote:
> 
> > Hello Chris,
> >
> > We have been investigating your project and the good documentation
> > and are very impressed. If we understood your goals correctly
> > this seems like a good match for our ongoing and active PyPy project,
> > a reimplementation of the Python language in Python.
> 
> Cool.  We are all big fans of Python here.  :)
That's good because we might want to recode some LLVM functionalities
in Python :-)
> > We'll definitely try using LLVM as our low-level backend.
> > But actually we contact you now to ask you if you'd be interested
> > in some bidirectional collaboration.
> 
> I've read up a bit on PyPy, and it looks like LLVM could be a nice way to
> get the JIT type interface that you would like.  Also, making use of the
> LLVM optimizer can make your statically generated code nice and fast.  :)
Yes, but we also would want to dynamically emit and execute LLVM code. 
But a static translation is indeed our first goal :-)
> We've tossed around the idea of creating C bindings
> for LLVM, which would make interfacing from other languages easier than
> going directly to the C++ API, but we just haven't had a chance to yet.
> Maybe you guys would be interested in helping with that project?
Thinking some more about it, we would probably try to translate our PyPy
implementation into LLVM-code and also generate some glue-LLVM-code
which allows us to programmatically drive LLVM from Python.  Is LLVM
able to "drive" itself? I mean can the LLVM-low-level object code
generate LLVM-low-level object code and then execute it? 

This would fit nicely with PyPy because we are running ourselves (in
'abstract interpretation' mode) in order to generate a low-level
representation of ourselves.  This low-level representation is already
close to LLVM's low-level view. So if the LLVM-code gets executed
(being a python interpreter) it should be able to just-in-time-compile
new LLVM code and execute it.  With our architecture, for such a JIT we
could reuse a good part of the code we already have for generating our
low-level representation.   It's a rather self-referential thing (also
see our logo: http://codespeak.net/pypy/ :-). 
> > On another level, some of the PyPy core developers are actually also
> > involved with the 'codespeak' site which aims at connecting interesting
> > open source projects and provide new collaborative development services.
> > The PyPy project is extensively using subversion which is a very interesting
> > (and stable) alternative to cvs.  So if you need any help with setting
> > up some publicly accessible infrastructure the codespeak guys will
> > certainly welcome you.
> 
> At this point, we're working like crazy to get important features
> implemented in LLVM.  We certainly acknowledge that CVS has severe
> deficiencies, but in the near future we'll probably stay with it.
> Perhaps after SVN 1.0 comes out... :)
then we may want to mirror your cvs repo to subversion :-) 
The reason is that we want to provide consistent versions of all 
the libraries/modules/projects we use.  And subversion makes
this rather easy (if the other project is svn-controlled, too),
e.g. you can say 'i want to follow the HEAD version of LLVM
in this branch' or 'i want to use this stable version of LLVM
for my own stable-release'. Then you can just issue 'svn up' and
you will have the desired versions on your working-copy. 
However, i can understand that you don't want to consider 
subversion right now and will stop advertising now :-)

cheers,

    holger

Chris Lattner

2003-Oct-31 13:47 UTC

[LLVMdev] Re: LLVM and PyPy

> > Cool.  We are all big fans of Python here.  :)
>
> That's good because we might want to recode some LLVM functionalities
> in Python :-)
As long as it makes sense.  Needless duplication of effort is never a good
idea...
> > I've read up a bit on PyPy, and it looks like LLVM could be a nice way to
> > get the JIT type interface that you would like.  Also, making use of the
> > LLVM optimizer can make your statically generated code nice and fast.  :)
>
> Yes, but we also would want to dynamically emit and execute LLVM code.
> But a static translation is indeed our first goal :-)
Of course.  We can do both.  In fact, we can even emit C code, which will
be useful initially if you're working on PowerPC machines.
> Thinking some more about it, we would probably try to translate our PyPy
> implementation into LLVM-code and also generate some glue-LLVM-code
> which allows us to programmatically drive LLVM from Python.  Is LLVM
> able to "drive" itself? I mean can the LLVM-low-level object code
> generate LLVM-low-level object code and then execute it?
Yes, this should certainly be possible.  Kind of like what the
Jalapeno/Jikes JVM does with Java.  The point about the C bindings is that
they will allow a nice interface between the parts written in Python and
the parts written in C++.  It doesn't make sense for you to rewrite all of
LLVM in Python, especially since the interface to build the LLVM is pretty
clean.
> This would fit nicely with PyPy because we are running ourselves (in
> 'abstract interpretation' mode) in order to generate a low-level
> representation of ourselves.  This low-level representation is already
> close to LLVM's low-level view. So if the LLVM-code gets executed
> (being a python interpreter) it should be able to just-in-time-compile
> new LLVM code and execute it.  With our architecture, for such a JIT we
Makes a lot of sense.
> > At this point, we're working like crazy to get important features
> > implemented in LLVM.  We certainly acknowledge that CVS has severe
> > deficiencies, but in the near future we'll probably stay with it.
> > Perhaps after SVN 1.0 comes out... :)
>
> then we may want to mirror your cvs repo to subversion :-)
That is obviously no problem.  :)
> The reason is that we want to provide consistent versions of all
> the libraries/modules/projects we use.  And subversion makes
Makes sense.  If it is publicly accessible and stable, perhaps we can
add information about it on the LLVM pages for others who would prefer to
work with SVN...

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Armin Rigo

2003-Oct-31 14:56 UTC

[LLVMdev] Re: LLVM and PyPy

Hello Chris,

On Fri, Oct 31, 2003 at 10:58:45AM -0600, Chris Lattner wrote:
> These are definitely features that we plan to add, but just haven't gotten
> to yet.  In particular, Alkis is working on a Java front-end, which will
> require similar features.  In the beginning, we will probably just use a
> conservative collector, eventually adding support for precise GC.
Great!
> We already have the capability of doing function-at-a-time code
> generation: what is basic-block at a time generation used for?  How do you
> do global optimizations like register allocation?
It is central to Psyco, the Python just-in-time specializer
(http://psyco.sourceforge.net) whose techniques we plan to integrate with
PyPy.  Unlike other environments like Self, which collect execution profiles
during interpretation and use them to recompile whole functions, Psyco has no
interpretation stage: it directly emits a basic block and runs it; the values
found at run-time trigger the compilation of more basic blocks, which are run,
and so on.  So each function's machine code is a dynamic network of basic
blocks which are various specialized versions of a bit of the original
function.  This network is not statically known, in particular because basic
blocks often have a "switch" exit based on some value or type collected at
run-time.  Every new value encountered at this point triggers the compilation
of a new switch case jumping to a new basic block.
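
As a rough illustration of the "switch" exit described above, here is a
minimal Python sketch.  It is not Psyco's code: the names SwitchExit,
compile_double and double are hypothetical, "compiling" a block is stood in
for by returning a Python closure, and the switch key is assumed to be the
run-time type of the argument.

```python
# Hypothetical sketch of a switch exit that grows one case per new run-time
# key.  Nothing here is taken from Psyco; all names are illustrative.

class SwitchExit:
    """Dispatch on a run-time key, compiling a new case the first time a key is seen."""

    def __init__(self, compile_case):
        self.compile_case = compile_case  # called once per previously unseen key
        self.cases = {}                   # key -> specialized block (a callable)
        self.compilations = 0             # how many blocks were "compiled" so far

    def run(self, key, value):
        block = self.cases.get(key)
        if block is None:
            # Unseen key: build a specialized block and add it as a new case.
            block = self.compile_case(key)
            self.cases[key] = block
            self.compilations += 1
        return block(value)


def compile_double(type_key):
    """Stand-in for code generation: return a block specialized for one type."""
    if type_key is int:
        return lambda v: v << 1   # integer-only path
    if type_key is str:
        return lambda v: v + v    # string concatenation path
    return lambda v: v * 2        # generic fallback


double_exit = SwitchExit(compile_double)


def double(value):
    # The switch key is the run-time type of the argument (an assumption).
    return double_exit.run(type(value), value)
```

Calling double(21) and then double(3) compiles the int case only once; a
later double("ab") adds a second case to the same switch, so the set of
blocks is only known as the program runs.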

We will also certainly consider Self-style recompilations, as they allow more
aggressive optimizations.  (Register allocation in Psyco is done using a simple
round-robin scheme; code generation is very fast.)
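
For readers unfamiliar with the idea, one plausible reading of a round-robin
scheme is sketched below in Python.  This is an assumption-laden
illustration, not Psyco's allocator: the class name, the register names and
the policy of spilling whatever the next register held are all hypothetical.

```python
# Hypothetical sketch of round-robin register allocation.  The spill policy
# and all names are assumptions for illustration, not Psyco's implementation.

class RoundRobinAllocator:
    def __init__(self, registers):
        self.registers = list(registers)
        self.owner = {reg: None for reg in self.registers}  # register -> variable
        self.location = {}                                   # variable -> register
        self.next_index = 0                                  # next register in rotation
        self.spilled = []                                    # variables evicted to memory

    def allocate(self, variable):
        """Return a register for `variable`, taking the next one in rotation."""
        if variable in self.location:
            return self.location[variable]
        reg = self.registers[self.next_index]
        self.next_index = (self.next_index + 1) % len(self.registers)
        previous = self.owner[reg]
        if previous is not None:
            # No liveness analysis: whoever held the register is spilled.
            del self.location[previous]
            self.spilled.append(previous)
        self.owner[reg] = variable
        self.location[variable] = reg
        return reg
```

The point of such a scheme is that each allocation is a constant-time
decision with no global analysis, which trades code quality for very fast
code generation.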
> That would be great!  We've tossed around the idea of creating C bindings
> for LLVM, which would make interfacing from other languages easier than
> going directly to the C++ API, but we just haven't had a chance to yet.
> Maybe you guys would be interested in helping with that project?
Well, as the C++ API is nice and clean it is probably simpler to bind it
directly to Python.  We would probably go for Boost-Python, which makes C++
objects directly accessible to Python.  But nothing is sure about this; maybe
driving LLVM from LLVM code is closer to our needs.  Is there a specific
interface to do that?  Is it possible to extract from LLVM the required code
only, and link it with the final executable?  In my experience, there are a
few limitations of C that require explicit assembly code, like building calls
dynamically (i.e. the caller's equivalent of varargs).


A bientot,

Armin.

Vikram Adve

2003-Oct-31 23:07 UTC

[LLVMdev] Re: LLVM and PyPy

Hi,

I've been following these messages and just thought I would mention
a couple of our near-term goals which may be related to what you all are
interested in:

(1) Alkis is really working on building a "toolkit" for implementing virtual
    machines on top of LLVM.  This means that different VMs (like JVM, CLI,
    and PyPy) only need to implement their specific runtime requirements,
    and a fast, simple (online or offline) translator to LLVM.
    All the native code generation and runtime optimization would happen
    in the LLVM framework.  Is this more or less what you have in mind
    for using LLVM as a back end in PyPy?

    Note that in this view, *all* the decisions about whether or when to
    recompile some unit (e.g., hot functions as in Self) would happen in
    the LLVM framework, independent of what language is being compiled.
    Does that make sense for Python (and for PyPy)?

    Supporting a Psyco-style basic-block-at-a-time compilation model 
    (as described by Armin below) on top of this toolkit is not
    something we had considered so far.  It would be interesting to see
    how that could be done. 

(2) One difficult part in building such a toolkit is to abstract the
    interfaces between code generation and the runtime components
    implemented in the language VM (like GC, exception handling, etc.). 
    We have been assuming that these runtime components must be controlled
    by the language VM (e.g., JVM), since their semantics and performance
    constraints are language-specific.  The toolkit would only provide
    some common primitives, to interface with the code generator and 
    to make these more efficient.

(3) Patrick Meredith is going to be working on a CAML (and perhaps later,
    OCAML) front end to LLVM.

Note that these are all at a very early stage of work.

> From: Armin Rigo <arigo at tunes.org>
> Subject: [LLVMdev] Re: LLVM and PyPy
> Sender: llvmdev-admin at cs.uiuc.edu
> Date: Fri, 31 Oct 2003 20:48:40 +0000
> 
> Hello Chris,
> 
> On Fri, Oct 31, 2003 at 10:58:45AM -0600, Chris Lattner wrote:
> > These are definitely features that we plan to add, but just haven't gotten
> > to yet.  In particular, Alkis is working on a Java front-end, which will
> > require similar features.  In the beginning, we will probably just use a
> > conservative collector, eventually adding support for precise GC.
> 
> Great!
> 
> > We already have the capability of doing function-at-a-time code
> > generation: what is basic-block at a time generation used for?  How do you
> > do global optimizations like register allocation?
> 
> It is central to Psyco, the Python just-in-time specializer
> (http://psyco.sourceforge.net) whose techniques we plan to integrate with
> PyPy.  Unlike other environments like Self, which collect execution profiles
> during interpretation and use them to recompile whole functions, Psyco has no
> interpretation stage: it directly emits a basic block and runs it; the values
> found at run-time trigger the compilation of more basic blocks, which are run,
> and so on.  So each function's machine code is a dynamic network of basic
> blocks which are various specialized versions of a bit of the original
> function.  This network is not statically known, in particular because basic
> blocks often have a "switch" exit based on some value or type collected at
> run-time.  Every new value encountered at this point triggers the compilation
> of a new switch case jumping to a new basic block.
> 
> We will also certainly consider Self-style recompilations, as they allow more
> aggressive optimizations.  (Register allocation in Psyco is done using a simple
> round-robin scheme; code generation is very fast.)
> 
> > That would be great!  We've tossed around the idea of creating C bindings
> > for LLVM, which would make interfacing from other languages easier than
> > going directly to the C++ API, but we just haven't had a chance to yet.
> > Maybe you guys would be interested in helping with that project?
> 
> Well, as the C++ API is nice and clean it is probably simpler to bind it
> directly to Python.  We would probably go for Boost-Python, which makes C++
> objects directly accessible to Python.  But nothing is sure about this; maybe
> driving LLVM from LLVM code is closer to our needs.  Is there a specific
> interface to do that?  Is it possible to extract from LLVM the required code
> only, and link it with the final executable?  In my experience, there are a
> few limitations of C that require explicit assembly code, like building calls
> dynamically (i.e. the caller's equivalent of varargs).
> 
> 
> A bientot,
> 
> Armin.


Regards,

--Vikram

---------------------------------------------------------------------
 VIKRAM S. ADVE
 Assistant Professor                        E-MAIL: vadve at cs.uiuc.edu
 Department of Computer Science             PHONE:  (217) 244-2016
 Univ. of Illinois at Urbana-Champaign      FAX:    (217) 244-6869
 1304 W. Springfield Ave.               http://www.cs.uiuc.edu/~vadve
 Urbana IL 61801-2987.                  http://llvm.cs.uiuc.edu/
---------------------------------------------------------------------

Chris Lattner

2003-Nov-02 10:39 UTC

[LLVMdev] Re: LLVM and PyPy

> > We already have the capability of doing function-at-a-time code
> > generation: what is basic-block at a time generation used for?  How do you
> > do global optimizations like register allocation?
>
> It is central to Psyco, the Python just-in-time specializer
> (http://psyco.sourceforge.net) whose techniques we plan to integrate with
> PyPy.  Unlike other environments like Self, which collects execution profiles
Ok, makes sense.
> > That would be great!  We've tossed around the idea of creating C bindings
> > for LLVM, which would make interfacing from other languages easier than
> Well, as the C++ API is nice and clean it is probably simpler to bind it
> directly to Python.  We would probably go for Boost-Python, which makes C++
> objects directly accessible to Python.  But nothing is sure about this; maybe
Ok, I didn't know the boost bindings allowed calling C++ code from python.
In retrospect, that makes a lot of sense.  :)
> driving LLVM from LLVM code is closer to our needs.  Is there a specific
> interface to do that?
Sure, what exactly do you mean by driving LLVM code from LLVM?  The main
interface for executing LLVM code is the ExecutionEngine interface:
http://llvm.cs.uiuc.edu/doxygen/classExecutionEngine.html

There are concrete implementations of this interface for the JIT and for
the interpreter.  Note that we will probably need to add some additional
methods to this class to enable all of the functionality that you need
(that's not a problem though :).
> Is it possible to extract from LLVM the required code
> only, and link it with the final executable?  In my experience, there are a
> few limitations of C that require explicit assembly code, like building calls
> dynamically (i.e. the caller's equivalent of varargs).
What do you mean by the "required code only"?  LLVM itself is very
modular, you only have to link the libraries in that you use.  It's also
very easy to slice and dice LLVM code from programs or functions, etc.
For example, the simple 'extract' tool rips a function out of a module
(this is typically useful only when debugging though)...

-Chris

-- 
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/
