Vyacheslav Akhmechet
2005-Apr-21 14:28 UTC
[LLVMdev] Using LLVM for a dynamically typed language
Evan, The problem is that I do not know the type of a target function at compile time. If you consider my code example, I don't know the type of 'i' until runtime (in fact, I can't even know a possible range of types 'i' may assume). Thanks, - Slava. On 4/21/05, Evan Jones <ejones at uwaterloo.ca> wrote:> On Thu, 2005-21-04 at 09:31 -0400, Vyacheslav Akhmechet wrote: > > At this point I cannot know the type of 'i' at compile time. At > > runtime 'i' is a structure that contains a type and a function > > pointer. What I can't figure out is how to cast my llvm function > > pointer to an appropriate function type. I cannot know until runtime > > what the type will be. Naturally I can add runtime code that will do > > all appropriate checks (number of parameters, etc.) but how do I do > > the actual cast? > > I'm not sure that I understand. LLVM permits you to cast any pointer to > any other pointer. If you are using LLVM to generate your code, this > basically has 3 steps: > > 1. Get a Type* for the function type you want to cast to. > 2. Use a cast instruction to actually perform the type cast. > > In the code I'm currently working on, I do the following to cast a > function pointer from one type to another. Hopefully the line wrapping > won't totally destroy this. > > std::vector<const Type*> paramTypes; > > // First parameter: a void pointer > paramTypes.push_back( PointerType::get( Type::getPrimitiveType( Type::SByteTyID ) ) ); > > FunctionType* pthreadFunctionType = FunctionType::get( > PointerType::get( Type::getPrimitiveType( Type::SByteTyID ) ), > paramTypes, false ); > assert( pthreadFunctionType != NULL ); > > // Cast the function to the pthread function type > Value* castedFunction = new CastInst( function, > PointerType::get( pthreadFunctionType ), "", &terminator ); > > I hope this helps, > > Evan Jones > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://mail.cs.uiuc.edu/mailman/listinfo/llvmdev >
On Thursday 21 April 2005 18:28, Vyacheslav Akhmechet wrote:> Evan, > > The problem is that I do not know the type of a target function at > compile time. If you consider my code example, I don't know the type > of 'i' until runtime (in fact, I can't even know a possible range of > types 'i' may assume).But... in a dynamic language each variable must have an associated type information? So, why can't you use LLVM struct consisting of - integer type code (or anything else you like) - pointer to void which holds the actual data. What you're doing is rather interesting. There's IronPython which implements Python (a highly dynamic language) on top of .NET and which claims to be 2x faster than regular Python. However, I never seen any implementation details so don't know how dynamic features are dealt with. - Volodya
On Thu, 2005-21-04 at 10:28 -0400, Vyacheslav Akhmechet wrote:> The problem is that I do not know the type of a target function at > compile time. If you consider my code example, I don't know the type > of 'i' until runtime (in fact, I can't even know a possible range of > types 'i' may assume).I don't think I properly understand what the issue is here. For my perspective, if you can figure out the type of 'i' at runtime, you either have to: a) Make all functions the same type. For example, make them all return void, take a vector of parameters as the first argument, and a vector for return values as the second argument. b) Call the appropriate LLVM functions to emit the code for the cast at runtime. I don't see how this is a specific challenge with LLVM. It seems to me that this is a challenge that you will encounter when implementing a dynamic language in *any* low-level language. I would encourage you to prototype something in C, and then figure out how to translate that to use LLVM. There are lots of dynamic languages you can use as examples. Python would be appropriate here, since it has function objects. There are also likely to be some textbooks that discuss these sorts of implementation challenges. As compilers are not my research area, I can't recommend any. Good luck, Evan Jones
On Thu, 2005-21-04 at 18:51 +0400, Vladimir Prus wrote:> However, I never seen any implementation details > so don't know how dynamic features are dealt with.I am not an expert about this, but I have done a bit of reading about IronPython and Jython. As far as I am aware of, these issues are dealt with in the same way that they are handled in CPython. Basically, when they compile Python code they just emit the sequence of function calls that the interpreter would make. Thus, doing something like this: pyobject.foo += 1 Turns into something like this: PythonObject temp = pyobject.get_attr( 'foo' ); Array parameters = [ temp, new PythonInteger( 1 ) ]; PythonObject addMethod = temp.get_attr( '__add__' ); PythonObject result = addMethod.call( parameters ); pyobject.set_attr( 'foo', result ); This ends up executing much faster than having to go around an interpreter loop multiple times. You could actually use the same approach to compile Python to native code. Evan Jones
Vyacheslav Akhmechet
2005-Apr-21 15:34 UTC
[LLVMdev] Using LLVM for a dynamically typed language
> a) Make all functions the same type. For example, make them all return > void, take a vector of parameters as the first argument, and a vector > for return values as the second argument.This is something I was considering. I guess I'll end up going with this option.> I don't see how this is a specific challenge with LLVM. It seems to me > that this is a challenge that you will encounter when implementing a > dynamic language in *any* low-level language.I disagree. If I could push a bunch of arguments on a stack (or specify a list of arguments, etc.) and just use a "call" instruction with a pointer to a memory address I wouldn't run into this problem. This is a specific challenge with LLVM because it is strictly typed.
On Thu, 21 Apr 2005, Evan Jones wrote:> I don't think I properly understand what the issue is here. For my > perspective, if you can figure out the type of 'i' at runtime, you > either have to: > > a) Make all functions the same type. For example, make them all return > void, take a vector of parameters as the first argument, and a vector > for return values as the second argument. > > b) Call the appropriate LLVM functions to emit the code for the cast at > runtime. > > I don't see how this is a specific challenge with LLVM. It seems to meI agree with Evan. To be more specific, for your initial example: if(rand() > 5) i = define(x, y, z) { return x + y + z; } else i = define(x, y) { return x + y; } As you've noticed here, the result of a "define" has to have the same LLVM type. I suggest compiling this to the equivalent of this C code: void anon1(int *Result, void *Args) { RealArgs = (sometype*)Args; *Result = RealArgs->X + RealArgs->Y + RealArgs->Z; } void anon2(int *Result, void *Args) { RealArgs = (sometype2*)Args; *Result = RealArgs->X + RealArgs->Y; } Of course, the result pointer would want to be generic somehow, but I just made it int* to simplify the issue. Despite this, "i" can now just be a function pointer. If you have closures to deal with, you make each function a pair of function pointer and environment pointer, and pass them around together. -Chris -- http://nondot.org/sabre/ http://llvm.cs.uiuc.edu/