thr3ads.net - llvm dev - [LLVMdev] "Bound Methods" in LLVM Bytecode [Oct 2005]

Evan Jones

2005-Oct-28 20:34 UTC

[LLVMdev] "Bound Methods" in LLVM Bytecode

Hello,

I have been thinking about efficient implementation of dynamically typed 
languages in my spare time. Specifically, I'm working on a toy 
implementation of a tiny piece of Python using LLVM as a native code 
generating JIT. I've run into a bit of an issue, involving how Python 
deals with method calls. I'm not sure how/if I can implement this in 
LLVM. In Python, the following code:


somefunc = a.method
somefunc()

Roughly translates into:

functionObject = lookup( "method" in object a )
functionObject->functionPointer()


The challenge is that if "method" is actually a method, calling it 
magically adds "a" as the first parameter. If it is NOT a method, then
no messing with the arguments occurs. As far as I can tell, this forces an 
implementation to create BoundMethod objects that wrap the actual method 
calls. The question is, how can I implement this efficiently, ideally 
using LLVM?
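For illustration, the behaviour described above can be observed directly in CPython. This is only a sketch of the semantics in question; the class `A`, the function `plain`, and the attribute name `attr` are made-up names, not part of the toy implementation.

```python
class A:
    def method(self):
        # Receives the instance as an implicit first argument.
        return ("method called with", self)


def plain():
    # An ordinary function that takes no arguments at all.
    return "plain called with no arguments"


a = A()

# Looking up a function defined on the class produces a bound method
# object that remembers the instance it was fetched from.
somefunc = a.method
assert somefunc.__self__ is a
assert somefunc() == ("method called with", a)

# A function stored directly on the instance is returned unchanged,
# so calling it does not insert "a" as the first argument.
a.attr = plain
otherfunc = a.attr
assert otherfunc is plain
assert otherfunc() == "plain called with no arguments"
```

Both `somefunc()` and `otherfunc()` look identical at the call site, which is why the decision about passing "a" has to be carried by the looked-up object.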

My idea is to add a NULL pointer as the first parameter to all function 
calls. "Normal" functions would ignore it, but methods would look at the
first parameter to find the "this" pointer. I could then generate a tiny
stub for each bound method that would do the following:

1. Replace the first argument with the appropriate "this"
2. Jump to the real function

Is it possible to do something like this in LLVM? Will it work if I just 
create a char array and copy in the appropriate native code for the 
current platform? I would rather let LLVM do the hard work, but if that 
isn't possible, I'm looking for some acceptable hack.

An additional ugly bit is that these objects will be created and 
destroyed frequently, so integration with LLVM's memory system is 
important. The last I checked, LLVM does not keep track of code in 
memory, so this would effectively create a memory leak.

Thanks for any help,

Evan Jones

Chris Lattner

2005-Oct-29 05:04 UTC

On Fri, 28 Oct 2005, Evan Jones wrote:
> I have been thinking about efficient implementation of dynamically typed
> languages in my spare time. Specifically, I'm working on a toy implementation
> of a tiny piece of Python using LLVM as a native code generating JIT.

Cool!

> I've run into a bit of an issue, involving how Python deals with method calls.
> I'm not sure how/if I can implement this in LLVM. In Python, the following
> code:

Ok.

> somefunc = a.method
> somefunc()
>
> Roughly translates into:
>
> functionObject = lookup( "method" in object a )
> functionObject->functionPointer()
>
>
> The challenge is that if "method" is actually a method, calling it magically
> adds "a" as the first parameter. If it is NOT a method, then no messing with
> the arguments occurs. As far as I can tell, this forces an implementation to
> create BoundMethod objects that wrap the actual method calls. The question
> is, how can I implement this efficiently, ideally using LLVM?

Okay.  One simple option would be to insert code like this:

if (isamethod(functionObject))
   functionObject->functionPointer(a)
else
   functionObject->functionPointer()
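
The branch above can be modelled in Python to make the data flow concrete. This is a minimal sketch, not LLVM code: the `FunctionObject` class, its `is_method` flag (standing in for `isamethod()`), and the `call` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FunctionObject:
    function_pointer: Callable[..., Any]
    is_method: bool  # hypothetical flag playing the role of isamethod()


def call(function_object: FunctionObject, a: Any) -> Any:
    # Decide at the call site whether the receiver is passed along.
    if function_object.is_method:
        return function_object.function_pointer(a)
    return function_object.function_pointer()


receiver = object()
method = FunctionObject(lambda this: ("method", this), is_method=True)
function = FunctionObject(lambda: "function", is_method=False)

assert call(method, receiver) == ("method", receiver)
assert call(function, receiver) == "function"
```

Note that `call` needs the receiver `a` to be available wherever the call happens, which is the limitation raised later in the thread.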

> My idea is to add a NULL pointer as the first parameter to all function
> calls. "Normal" functions would ignore it, but methods would look at the
> first parameter to find the "this" pointer. I could then generate a tiny stub
> for each bound method that would do the following:
>
> 1. Replace the first argument with the appropriate "this"
> 2. Jump to the real function
>
> Is it possible to do something like this in LLVM?

Sure, you can do this.  Another simple option would be to just make every
"function" take a first pointer argument which they ignore.  This would
allow the caller to always pass a this pointer without knowing anything
about the callee.
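
A small Python model of this uniform convention, under the assumption that every callee is compiled with a leading `this` parameter. The names `method_impl`, `function_impl`, and `call_with_this` are hypothetical and only serve the illustration.

```python
def method_impl(this, *args):
    # A "method": uses the leading argument as its receiver.
    return ("method", this, args)


def function_impl(_this, *args):
    # A "normal" function: accepts the leading argument but ignores it.
    return ("function", args)


def call_with_this(callee, this, *args):
    # The caller passes `this` unconditionally, with no isamethod() check.
    return callee(this, *args)


receiver = object()
assert call_with_this(method_impl, receiver, 1) == ("method", receiver, (1,))
assert call_with_this(function_impl, receiver, 1) == ("function", (1,))
```

The call site is the same for both kinds of callee, at the cost of one extra argument per call.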

> Will it work if I just
> create a char array and copy in the appropriate native code for the current
> platform?

Hrm, sometimes, sometimes not.  Code is not always relocatable like that,
it sounds dangerous.

> I would rather let LLVM do the hard work, but if that isn't possible,
> I'm looking for some acceptable hack.

LLVM can do it, it's just a matter of picking the right solution.  To me,
adding a dummy 'this' argument to functions which is ignored seems like
the most simple and logical way to do it.

> An additional ugly bit is that these objects will be created and destroyed
> frequently, so integration with LLVM's memory system is important. The last I
> checked, LLVM does not keep track of code in memory, so this would
> effectively create a memory leak.

If possible, I would suggest avoiding creating and destroying lots of
little stubs.  Even if we teach llvm to recycle this memory (wouldn't be
that hard), it will still be much less efficient than having a dummy
argument for functions.

Besides, if the address of these functions is never taken, the standard
LLVM optimizations will remove dead arguments.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.org/

Evan Jones

2005-Oct-29 14:20 UTC

On Oct 29, 2005, at 1:04, Chris Lattner wrote:
>> The question is, how can I implement this efficiently, ideally using
>> LLVM?
> Okay.  One simple option would be to insert code like this:
>
> if (isamethod(functionObject))
>   functionObject->functionPointer(a)
> else
>   functionObject->functionPointer()

Ah yes, the good old fashioned simple approach. The only change is that
by the time I get to the function call, I may no longer have a reference
to the object (in the compiler), so I would have to stuff that into the
bound method object itself.

> Sure, you can do this.  Another simple option would be to just make
> every "function" take a first pointer argument which they ignore.
> This would allow the caller to always pass a this pointer without
> knowing anything about the callee.

Ah, of course! This is probably the best way to do it, since it is so
simple. The "FunctionObject" type would contain not only a function
pointer, but also a "this" pointer. For normal functions, "this" would
be NULL. Why didn't I think of that, since I was halfway to that
solution already? That would change the call implementation to the
following:

functionObject->functionPointer( functionObject->thisPointer )
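
As a rough Python model of that layout: the field names below mirror the pseudocode, with `None` standing in for NULL. The helper `call` and the sample callees `describe` and `hello` are hypothetical and exist only for this sketch.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class FunctionObject:
    function_pointer: Callable[[Any], Any]
    this_pointer: Optional[Any] = None  # None stands in for NULL


def call(function_object: FunctionObject) -> Any:
    # One call sequence for both methods and normal functions.
    return function_object.function_pointer(function_object.this_pointer)


def describe(this):
    # A "method": reads its receiver from the first argument.
    return "method on %s" % this["name"]


def hello(_this):
    # A "normal" function: ignores the dummy first argument.
    return "hello"


a = {"name": "a"}
bound = FunctionObject(describe, this_pointer=a)  # result of looking up a.method
unbound = FunctionObject(hello)                   # plain function, this is NULL

assert call(bound) == "method on a"
assert call(unbound) == "hello"
```

Because the receiver travels inside the function object, the call site no longer needs a reference to "a", and no per-binding stub has to be generated.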

>> Will it work if I just create a char array and copy in the
>> appropriate native code for the current platform?
> Hrm, sometimes, sometimes not.  Code is not always relocatable like
> that, it sounds dangerous.

Ah, also a good point. A copying garbage collector, for example, would
definitely make things more complicated.

Thanks for your help! I was definitely thinking the wrong way.

Evan Jones

--
Evan Jones
http://evanjones.ca/

Karl Magdsick

2005-Oct-29 16:48 UTC

On 10/28/05, Evan Jones <ejones at uwaterloo.ca> wrote:
[snip]
> Will it work if I just
> create a char array and copy in the appropriate native code for the
> current platform? I would rather let LLVM do the hard work, but if that
> isn't possible, I'm looking for some acceptable hack.

(1) The memory page/segment must be marked executable by the OS.  Under
POSIX systems, this is typically done by mmap()ing an anonymous file and
then mprotect()ing the memory.  As I remember, POSIX doesn't guarantee
that mprotect will work on memory directly allocated with malloc or
calloc.  I believe some systems allow it, but it's my understanding that
this practice is non-portable.  The Win32 API has a function similar in
name and function to mprotect ("MemProtect"?? "ProtectMem"??), but I'm
not a Win32 guy.

Note: prior to OSes setting the x86 NX/XD bit, x86 code
was able to get away with the assumption that all readable pages are
executable.  This doesn't make such code correct.

(2) As already mentioned by others, you need relocatable code for this
to work properly.
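
The mmap-then-mprotect sequence from point (1) might look roughly like the sketch below, written with Python's `ctypes` to stay in the language under discussion. It assumes a 64-bit Unix-like system where `ctypes.CDLL(None)` exposes libc and `off_t` fits in a C `long`; the helper name `make_executable_copy` is hypothetical. The copied bytes are deliberately never executed here, since that would also require valid, relocatable machine code for the host as noted in point (2). (On Win32 the corresponding call is `VirtualProtect`, which is not shown.)

```python
import ctypes
import mmap

libc = ctypes.CDLL(None, use_errno=True)
libc.mmap.restype = ctypes.c_void_p
libc.mmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int,
                      ctypes.c_int, ctypes.c_int, ctypes.c_long]
libc.mprotect.restype = ctypes.c_int
libc.mprotect.argtypes = [ctypes.c_void_p, ctypes.c_size_t, ctypes.c_int]
libc.munmap.restype = ctypes.c_int
libc.munmap.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

MAP_FAILED = ctypes.c_void_p(-1).value  # (void *) -1


def make_executable_copy(code):
    """Copy `code` into fresh pages, then flip them to read+execute."""
    # Round the size up to a whole number of pages.
    pages = (max(len(code), 1) + mmap.PAGESIZE - 1) // mmap.PAGESIZE
    size = pages * mmap.PAGESIZE
    # Step 1: anonymous, private, writable mapping (not malloc'd memory).
    addr = libc.mmap(None, size, mmap.PROT_READ | mmap.PROT_WRITE,
                     mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS, -1, 0)
    if addr is None or addr == MAP_FAILED:
        raise OSError(ctypes.get_errno(), "mmap failed")
    ctypes.memmove(addr, code, len(code))
    # Step 2: drop write permission and mark the pages executable.
    if libc.mprotect(addr, size, mmap.PROT_READ | mmap.PROT_EXEC) != 0:
        err = ctypes.get_errno()
        libc.munmap(addr, size)
        raise OSError(err, "mprotect failed")
    return addr, size


# Placeholder bytes only; nothing jumps to this address.
addr, size = make_executable_copy(b"\xc3")
copied = ctypes.string_at(addr, 1)
released = libc.munmap(addr, size)
```

Every such region has to be released explicitly with `munmap`, which is the bookkeeping burden that makes frequently created and destroyed stubs unattractive.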


-Karl
