On Fri, 31 Oct 2003, Armin Rigo wrote:

> Hello Chris,
>
> We have been investigating your project and the good documentation
> and are very impressed. If we understood your goals correctly
> this seems like a good match for our ongoing and active PyPy project,
> a reimplementation of the Python language in Python.

Cool. We are all big fans of Python here. :)

> We'll definitely try using LLVM as our low-level backend.
> But actually we contact you now to ask you if you'd be interested
> in some bidirectional collaboration.

I've read up a bit on PyPy, and it looks like LLVM could be a nice way to
get the JIT type interface that you would like. Also, making use of the
LLVM optimizer can make your statically generated code nice and fast. :)

> Maybe a bit more background what we did or are:
>
> - we are an open international group of individuals collaborating on
>   our free time mostly. We are very involved with research, open source
>   communities and especially the Python communities.
>
> - during the course of four one-week meetings (which we call development
>   "sprints") we have done a rather complete interpreter and can translate
>   parts of it to C or Lisp code already (using a control-flow representation
>   which is actually very similar to LLVM code, hence our enthusiasm!)
>
> - we very recently submitted a funding proposal to the European Union:
>
>   http://codespeak.net/svn/pypy/trunk/doc/funding/proposal/part_b.pdf
>
>   and you may find these two chapters particularly interesting:
>
>   http://codespeak.net/pypy/index.cgi?doc/funding/B1.0_objectives
>   http://codespeak.net/pypy/index.cgi?doc/funding/B6.0_detailed

It sounds like LLVM could be a good implementation strategy for your
goals. You're right that developing a JIT from scratch is a lot of work :)

> On the technical level we are interested to know about and maybe collaborate
> with efforts to support very-high-level language features in LLVM (e.g.
> walking the stack, for garbage collection), fine-grained runtime code generation
> (generating code only one basic block at a time),

These are definitely features that we plan to add, but just haven't gotten
to yet. In particular, Alkis is working on a Java front-end, which will
require similar features. In the beginning, we will probably just use a
conservative collector, eventually adding support for precise GC.

We already have the capability of doing function-at-a-time code
generation: what is basic-block-at-a-time generation used for? How do you
do global optimizations like register allocation?

> and possibly also contribute
> a PowerPC back-end, and Python bindings for LLVM.

That would be great! We've tossed around the idea of creating C bindings
for LLVM, which would make interfacing from other languages easier than
going directly to the C++ API, but we just haven't had a chance to yet.
Maybe you guys would be interested in helping with that project?

> On another level, some of the PyPy core developers are actually also
> involved with the 'codespeak' site which aims at connecting interesting
> open source projects and provide new collaborative development services.
> The PyPy project is extensively using subversion which is a very interesting
> (and stable) alternative to cvs. So if you need any help with setting
> up some publicly accessible infrastructure the codespeak guys will
> certainly welcome you.

At this point, we're working like crazy to get important features
implemented in LLVM. We certainly acknowledge that CVS has severe
deficiencies, but in the near future we'll probably stay with it.
Perhaps after SVN 1.0 comes out... :)

> Feel free to forward this mail to the LLVM mailing list, btw. We are
> just interested in getting some first contact and enter a productive
> discussion and - who knows - some interesting collaboration!

Done. In general, that's a good place to discuss all kinds of issues like
this. Please let us know what your plans are and what the next step is.
Perhaps C bindings would be the most logical starting place?

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Hi Chris,

[Chris Lattner Fri, Oct 31, 2003 at 10:58:45AM -0600]
> On Fri, 31 Oct 2003, Armin Rigo & Holger Krekel wrote:
>
> > Hello Chris,
> >
> > We have been investigating your project and the good documentation
> > and are very impressed. If we understood your goals correctly
> > this seems like a good match for our ongoing and active PyPy project,
> > a reimplementation of the Python language in Python.
>
> Cool. We are all big fans of Python here. :)

That's good because we might want to recode some LLVM functionalities
in Python :-)

> > We'll definitely try using LLVM as our low-level backend.
> > But actually we contact you now to ask you if you'd be interested
> > in some bidirectional collaboration.
>
> I've read up a bit on PyPy, and it looks like LLVM could be a nice way to
> get the JIT type interface that you would like. Also, making use of the
> LLVM optimizer can make your statically generated code nice and fast. :)

Yes, but we would also want to dynamically emit and execute LLVM code.
But a static translation is indeed our first goal :-)

> We've tossed around the idea of creating C bindings
> for LLVM, which would make interfacing from other languages easier than
> going directly to the C++ API, but we just haven't had a chance to yet.
> Maybe you guys would be interested in helping with that project?

Thinking some more about it, we would probably try to translate our PyPy
implementation into LLVM code and also generate some glue LLVM code
which allows us to programmatically drive LLVM from Python. Is LLVM
able to "drive" itself? I mean, can the LLVM low-level object code
generate LLVM low-level object code and then execute it?

This would fit nicely with PyPy because we are running ourselves (in
'abstract interpretation' mode) in order to generate a low-level
representation of ourselves. This low-level representation is already
close to LLVM's low-level view. So if the LLVM code gets executed
(being a Python interpreter) it should be able to just-in-time-compile
new LLVM code and execute it. With our architecture, for such a JIT we
could reuse a good part of the code we already have for generating our
low-level representation. It's a rather self-referential thing (also
see our logo: http://codespeak.net/pypy/ :-).

> > On another level, some of the PyPy core developers are actually also
> > involved with the 'codespeak' site which aims at connecting interesting
> > open source projects and provide new collaborative development services.
> > The PyPy project is extensively using subversion which is a very interesting
> > (and stable) alternative to cvs. So if you need any help with setting
> > up some publicly accessible infrastructure the codespeak guys will
> > certainly welcome you.
>
> At this point, we're working like crazy to get important features
> implemented in LLVM. We certainly acknowledge that CVS has severe
> deficiencies, but in the near future we'll probably stay with it.
> Perhaps after SVN 1.0 comes out... :)

Then we may want to mirror your cvs repo to subversion :-)

The reason is that we want to provide consistent versions of all
the libraries/modules/projects we use. And subversion makes
this rather easy (if the other project is svn-controlled, too), e.g.
you can say 'I want to follow the HEAD version of LLVM in this branch'
or 'I want to use this stable version of LLVM for my own stable release'.
Then you can just issue 'svn up' and you will have the desired versions
in your working copy.

However, I can understand that you don't want to consider subversion
right now, and I will stop advertising now :-)

cheers,

holger

> > Cool. We are all big fans of Python here. :)
>
> That's good because we might want to recode some LLVM functionalities
> in Python :-)

As long as it makes sense. Needless duplication of effort is never a good
idea...

> > I've read up a bit on PyPy, and it looks like LLVM could be a nice way to
> > get the JIT type interface that you would like. Also, making use of the
> > LLVM optimizer can make your statically generated code nice and fast. :)
>
> Yes, but we would also want to dynamically emit and execute LLVM code.
> But a static translation is indeed our first goal :-)

Of course. We can do both. In fact, we can even emit C code, which will
be useful initially if you're working on PowerPC machines.

> Thinking some more about it, we would probably try to translate our PyPy
> implementation into LLVM code and also generate some glue LLVM code
> which allows us to programmatically drive LLVM from Python. Is LLVM
> able to "drive" itself? I mean, can the LLVM low-level object code
> generate LLVM low-level object code and then execute it?

Yes, this should certainly be possible. Kind of like what the
Jalapeno/Jikes JVM does with Java. The point about the C bindings is that
they will allow a nice interface between the parts written in Python and
the parts written in C++. It doesn't make sense for you to rewrite all of
LLVM in Python, especially since the interface to build the LLVM is
pretty clean.

> This would fit nicely with PyPy because we are running ourselves (in
> 'abstract interpretation' mode) in order to generate a low-level
> representation of ourselves. This low-level representation is already
> close to LLVM's low-level view. So if the LLVM code gets executed
> (being a Python interpreter) it should be able to just-in-time-compile
> new LLVM code and execute it. With our architecture, for such a JIT we

Makes a lot of sense.

> > At this point, we're working like crazy to get important features
> > implemented in LLVM. We certainly acknowledge that CVS has severe
> > deficiencies, but in the near future we'll probably stay with it.
> > Perhaps after SVN 1.0 comes out... :)
>
> Then we may want to mirror your cvs repo to subversion :-)

That is obviously no problem. :)

> The reason is that we want to provide consistent versions of all
> the libraries/modules/projects we use. And subversion makes

Makes sense. If it is publicly accessible and stable, perhaps we can add
information about it on the LLVM pages for others who would prefer to work
with SVN...

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/

Hello Chris,

On Fri, Oct 31, 2003 at 10:58:45AM -0600, Chris Lattner wrote:
> These are definitely features that we plan to add, but just haven't gotten
> to yet. In particular, Alkis is working on a Java front-end, which will
> require similar features. In the beginning, we will probably just use a
> conservative collector, eventually adding support for precise GC.

Great!

> We already have the capability of doing function-at-a-time code
> generation: what is basic-block-at-a-time generation used for? How do you
> do global optimizations like register allocation?

It is central to Psyco, the Python just-in-time specializer
(http://psyco.sourceforge.net) whose techniques we plan to integrate with
PyPy. Unlike other environments like Self, which collect execution profiles
during interpretation and use them to recompile whole functions, Psyco has no
interpretation stage: it directly emits a basic block and runs it; the values
found at run-time trigger the compilation of more basic blocks, which are run,
and so on. So each function's machine code is a dynamic network of basic
blocks which are various specialized versions of a bit of the original
function. This network is not statically known, in particular because basic
blocks often have a "switch" exit based on some value or type collected at
run-time. Every new value encountered at this point triggers the compilation
of a new switch case jumping to a new basic block.

We will also certainly consider Self-style recompilations, as they allow more
aggressive optimizations. (Register allocation in Psyco is done using a simple
round-robin scheme; code generation is very fast.)

> That would be great! We've tossed around the idea of creating C bindings
> for LLVM, which would make interfacing from other languages easier than
> going directly to the C++ API, but we just haven't had a chance to yet.
> Maybe you guys would be interested in helping with that project?

Well, as the C++ API is nice and clean it is probably simpler to bind it
directly to Python. We would probably go for Boost-Python, which makes C++
objects directly accessible to Python. But nothing is sure about this; maybe
driving LLVM from LLVM code is closer to our needs. Is there a specific
interface to do that? Is it possible to extract from LLVM the required code
only, and link it with the final executable? In my experience, there are a
few limitations of C that require explicit assembly code, like building calls
dynamically (i.e. the caller's equivalent of varargs).


A bientot,

Armin.

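The "switch" exit described in this message can be illustrated with a small Python sketch. This is only a toy model under stated assumptions, not Psyco's code: the names SwitchExit, compile_add and add are hypothetical, and "compiling a basic block" is stood in for by building a Python closure, where Psyco would emit machine code. The switch key here is the pair of operand types seen at run-time.

```python
class SwitchExit:
    """A block exit that dispatches on a run-time key.

    New cases are "compiled" lazily, the first time a key is seen
    (hypothetical stand-in for emitting a new basic block).
    """

    def __init__(self, compile_case):
        self.compile_case = compile_case  # key -> callable block
        self.cases = {}                   # cases compiled so far
        self.compilations = 0             # how many blocks were emitted

    def __call__(self, key, *args):
        block = self.cases.get(key)
        if block is None:
            # First occurrence of this key: add a new switch case.
            block = self.compile_case(key)
            self.cases[key] = block
            self.compilations += 1
        return block(*args)


def compile_add(operand_types):
    """Return a block specialized for one pair of operand types."""
    if operand_types == (int, int):
        return lambda a, b: a + b            # integer addition only
    if operand_types == (str, str):
        return lambda a, b: "".join((a, b))  # string concatenation only
    return lambda a, b: a + b                # generic fallback


add_exit = SwitchExit(compile_add)


def add(a, b):
    # The run-time types select (and, if needed, create) the switch case.
    return add_exit((type(a), type(b)), a, b)
```

Calling add(1, 2) and then add(3, 4) compiles one case and reuses it; a later add("a", "b") adds a second case. The set of cases is therefore only known at run-time, which is the property that makes function-at-a-time generation a poor fit for this model.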
Hi,

I've been following these messages and just thought I would mention a
couple of our near-term goals which may be related to what you all are
interested in:

(1) Alkis is really working on building a "toolkit" for implementing
    virtual machines on top of LLVM. This means that different VMs (like
    JVM, CLI, and PyPy) only need to implement their specific runtime
    requirements, and a fast, simple (online or offline) translator to
    LLVM. All the native code generation and runtime optimization would
    happen in the LLVM framework. Is this more or less what you have in
    mind for using LLVM as a back end in PyPy?

    Note that in this view, *all* the decisions about whether or when to
    recompile some unit (e.g., hot functions as in Self) would happen in
    the LLVM framework, independent of what language is being compiled.
    Does that make sense for Python (and for PyPy)?

    Supporting a Psyco-style basic-block-at-a-time compilation model (as
    described by Armin in his message) on top of this toolkit is not
    something we had considered so far. It would be interesting to see
    how that could be done.

(2) One difficult part in building such a toolkit is to abstract the
    interfaces between code generation and the runtime components
    implemented in the language VM (like GC, exception handling, etc.).
    We have been assuming that these runtime components must be controlled
    by the language VM (e.g., JVM), since their semantics and performance
    constraints are language-specific. The toolkit would only provide
    some common primitives, to interface with the code generator and to
    make these more efficient.

(3) Patrick Meredith is going to be working on a CAML (and perhaps later,
    OCAML) front end to LLVM.

Note that these are all at a very early stage of work.

Regards,

--Vikram

---------------------------------------------------------------------
VIKRAM S. ADVE
Assistant Professor                    E-MAIL: vadve at cs.uiuc.edu
Department of Computer Science         PHONE:  (217) 244-2016
Univ. of Illinois at Urbana-Champaign  FAX:    (217) 244-6869
1304 W. Springfield Ave.               http://www.cs.uiuc.edu/~vadve
Urbana IL 61801-2987.                  http://llvm.cs.uiuc.edu/
---------------------------------------------------------------------

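The division of labour in point (1) of Vikram's message, where the framework rather than the language front end decides when to recompile, might look roughly like the following Python sketch. Everything here is an assumption made for illustration: RecompilingRuntime, register, call and the call-count threshold are hypothetical and are not part of LLVM or of the planned toolkit. A front end only hands over a baseline implementation and an optimizer; the policy lives in the shared runtime.

```python
class RecompilingRuntime:
    """Toy language-independent runtime that owns the recompilation policy."""

    def __init__(self, hot_threshold=3):
        self.hot_threshold = hot_threshold  # calls before a unit counts as hot
        self.code = {}          # name -> current implementation
        self.optimizers = {}    # name -> callable(old_impl) -> new_impl
        self.call_counts = {}   # name -> number of calls so far
        self.recompiled = []    # names recompiled, in order

    def register(self, name, baseline, optimizer):
        # Called by a front end (hypothetically a JVM, CLI or PyPy translator).
        self.code[name] = baseline
        self.optimizers[name] = optimizer
        self.call_counts[name] = 0

    def call(self, name, *args):
        self.call_counts[name] += 1
        if self.call_counts[name] == self.hot_threshold:
            # The runtime, not the front end, decides the unit is hot.
            self.code[name] = self.optimizers[name](self.code[name])
            self.recompiled.append(name)
        return self.code[name](*args)


runtime = RecompilingRuntime(hot_threshold=2)
runtime.register(
    "double",
    baseline=lambda x: x + x,               # simple, unoptimized version
    optimizer=lambda old: (lambda x: 2 * x) # stand-in for an optimized version
)
```

With a threshold of 2, the first call to runtime.call("double", 4) uses the baseline and the second call triggers the swap. A Psyco-style model would need a finer-grained hook than this per-function counter, which is the open question raised in the message.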
> > We already have the capability of doing function-at-a-time code
> > generation: what is basic-block-at-a-time generation used for? How do you
> > do global optimizations like register allocation?
>
> It is central to Psyco, the Python just-in-time specializer
> (http://psyco.sourceforge.net) whose techniques we plan to integrate with
> PyPy. Unlike other environments like Self, which collect execution profiles

Ok, makes sense.

> > That would be great! We've tossed around the idea of creating C bindings
> > for LLVM, which would make interfacing from other languages easier than

> Well, as the C++ API is nice and clean it is probably simpler to bind it
> directly to Python. We would probably go for Boost-Python, which makes C++
> objects directly accessible to Python. But nothing is sure about this; maybe

Ok, I didn't know the Boost bindings allowed calling C++ code from Python.
In retrospect, that makes a lot of sense. :)

> driving LLVM from LLVM code is closer to our needs. Is there a specific
> interface to do that?

Sure, what exactly do you mean by driving LLVM code from LLVM? The main
interface for executing LLVM code is the ExecutionEngine interface:

http://llvm.cs.uiuc.edu/doxygen/classExecutionEngine.html

There are concrete implementations of this interface for the JIT and for
the interpreter. Note that we will probably need to add some additional
methods to this class to enable all of the functionality that you need
(that's not a problem though :).

> Is it possible to extract from LLVM the required code
> only, and link it with the final executable? In my experience, there are a
> few limitations of C that require explicit assembly code, like building calls
> dynamically (i.e. the caller's equivalent of varargs).

What do you mean by the "required code only"? LLVM itself is very
modular: you only have to link in the libraries that you use. It's also
very easy to slice and dice LLVM code from programs or functions, etc.
For example, the simple 'extract' tool rips a function out of a module
(this is typically useful only when debugging though)...

-Chris

--
http://llvm.cs.uiuc.edu/
http://www.nondot.org/~sabre/Projects/