thr3ads.net - llvm dev - [LLVMdev] PHP Zend LLVM extension (SoC) [Apr 2008]


Nuno Lopes

2008-Apr-23 18:44 UTC

[LLVMdev] PHP Zend LLVM extension (SoC)

Thank you both for your answers!
That part of type inference was my second question. PHP uses a structure 
with a union to represent a variable (because a variable can have different 
types, like a long, a double, a stream, etc.), but often a single variable
will only have one type throughout the program (e.g. iterating through $i in 
a loop). Will LLVM automagically see that we always use the same type for a 
certain variable and discard the whole union and use a single scalar (and 
also discard all the type checking done in the opcode handlers)? We can do 
some type inference on our side if we do a pass on the bytecode, but I would 
like to be sure if that's needed or if LLVM will do it on its own.
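
As a rough illustration of the representation in question, here is a minimal sketch of a tagged union and a generic "add" handler that has to check the tags on every call. The names (value_t, add_generic, add_long) are hypothetical and much simpler than the real Zend structures, and the sketch ignores PHP's overflow-to-double behaviour for integers.

```c
/* Hypothetical, simplified stand-in for a dynamically typed variable.
 * This is NOT the real Zend structure, only an illustration. */
typedef enum { T_LONG, T_DOUBLE } val_type;

typedef struct {
    val_type type;          /* runtime tag, inspected on every operation */
    union {
        long   lval;
        double dval;
    } value;
} value_t;

value_t make_long(long v) {
    value_t r;
    r.type = T_LONG;
    r.value.lval = v;
    return r;
}

value_t make_double(double v) {
    value_t r;
    r.type = T_DOUBLE;
    r.value.dval = v;
    return r;
}

/* Generic handler: branches on the tags of both operands.
 * Assumption: integer overflow is ignored in this sketch. */
value_t add_generic(value_t a, value_t b) {
    if (a.type == T_LONG && b.type == T_LONG) {
        return make_long(a.value.lval + b.value.lval);
    }
    double x = (a.type == T_LONG) ? (double)a.value.lval : a.value.dval;
    double y = (b.type == T_LONG) ? (double)b.value.lval : b.value.dval;
    return make_double(x + y);
}

/* What one would hope to end up with once a variable such as $i is
 * known to be a long everywhere: no tag, no union, no checks. */
long add_long(long a, long b) {
    return a + b;
}
```

The question above amounts to whether an optimizer can get from add_generic to add_long on its own when every caller passes T_LONG values.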

Well, about the opcode handlers, that's great news that we don't need to
inline them by hand. Now I only need to fix clang to compile PHP :P

Thanks,
Nuno


----- Original Message ----- 
From: "Gordon Henriksen" <gordonhenriksen at mac.com>
To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu>
Sent: Wednesday, April 23, 2008 12:17 AM
Subject: Re: [LLVMdev] PHP Zend LLVM extension (SoC)


Hi Nuno,

On Apr 22, 2008, at 18:44, Nuno Lopes wrote:
> PHP has a Google Summer of Code project approved to create an LLVM
> extension for PHP's VM (Zend)
> (http://code.google.com/soc/2008/php/appinfo.html?csaid=73D5F5E282F9163F).
> I'll be mentoring that project (and the student is CC'ed).
> Although I've already contributed a few patches to clang, I haven't
> hacked LLVM much, so I would like to gather some advice before
> misleading the student too much :P
This is very exciting!
> So my idea is to use the current PHP parser to produce PHP bytecode
> and then convert the PHP bytecode to LLVM's bitcode. The extra pass
> to create PHP bytecode seems necessary for now, as it makes things
> simpler in the PHP end. The first step would be to convert the PHP
> bytecode to LLVM by just producing function calls to the PHP
> interpreter opcode handlers. This has two advantages: it's a simple
> task and we can get something working fast. The disadvantage is that
> it would only bypass the opcode dispatcher, leaving not much room for
> optimizations.
As far as I know, this is exactly how Apple's OpenGL shader JIT works
in Mac OS X. Unfortunately, LLVM will rarely make dramatic changes to
your memory representation, so this probably won't be as effective as
it is in the OpenGL context. (LLVM will only do aggregate->scalar
memory reorganizations; it probably won't be able to prove this safe
for a dynamic language very often.) Your challenge in generating very
fast code would likely be one of type inference.
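
To make the "calls to opcode handlers, minus the dispatcher" idea concrete, here is a minimal sketch using a toy stack VM. Everything in it (vm_t, op_t, the h_* handlers) is hypothetical and unrelated to the real Zend opcode set. interpret() is the usual dispatch loop; compiled_example() is what a first-phase translation of (2 + 3) * 4 would amount to: the same handlers, called in straight-line order.

```c
/* Toy stack VM; all names here are made up for illustration. */
typedef struct { long stack[16]; int sp; } vm_t;

typedef enum { OP_PUSH, OP_ADD, OP_MUL, OP_END } opcode;
typedef struct { opcode code; long operand; } op_t;

/* Opcode handlers, shared by both execution strategies. */
void h_push(vm_t *vm, long v) { vm->stack[vm->sp++] = v; }

void h_add(vm_t *vm) {
    long b = vm->stack[--vm->sp];
    long a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a + b;
}

void h_mul(vm_t *vm) {
    long b = vm->stack[--vm->sp];
    long a = vm->stack[--vm->sp];
    vm->stack[vm->sp++] = a * b;
}

/* Classic interpreter: fetch, dispatch, call handler, repeat. */
long interpret(const op_t *prog) {
    vm_t vm = { {0}, 0 };
    for (const op_t *op = prog; op->code != OP_END; ++op) {
        switch (op->code) {
        case OP_PUSH: h_push(&vm, op->operand); break;
        case OP_ADD:  h_add(&vm);               break;
        case OP_MUL:  h_mul(&vm);               break;
        case OP_END:  break;
        }
    }
    return vm.stack[vm.sp - 1];
}

/* Shape of the code a first-phase translator would emit for
 * (2 + 3) * 4: no loop and no switch, but the handlers still
 * operate on the VM's in-memory stack. */
long compiled_example(void) {
    vm_t vm = { {0}, 0 };
    h_push(&vm, 2);
    h_push(&vm, 3);
    h_add(&vm);
    h_push(&vm, 4);
    h_mul(&vm);
    return vm.stack[vm.sp - 1];
}
```

Both versions still go through the same memory-based stack, which is why removing the dispatcher alone leaves limited room for further optimization.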
> In the second phase, we would start to inline some simple PHP
> bytecodes, like arithmetic operations and so on, by dumping LLVM
> assembly instead of calling the opcode handler. Eventually we could
> reach a point that no opcode handlers are necessary.
>
> So does this look like a sane thing? Any helpful advice? Another
> question: after having the LLVM assembly, how should the binary code
> be produced, loaded into memory, and then executed? I assume we can
> link directly to the LLVM code generation and optimization libs. And
> does it support dumping the code directly to memory so that we
> can run it from there without much magic (and then cache it
> somewhere)?
You can use the facilities of ExecutionEngine to run code in-memory
without ever touching the filesystem. The LLVM tutorial has
information on how to do this.

http://llvm.org/doxygen/classllvm_1_1ExecutionEngine.html
http://llvm.org/docs/tutorial/LangImpl4.html

You'll probably want to provide your opcode handlers as an LLVM IR
module. Your JIT can start up and “seed” the execution environment
with the predefined handlers, then progressively incorporate more
functions into the module as execution progresses.

Hope that helps,
Gordon

Owen Anderson

2008-Apr-24 05:49 UTC

[LLVMdev] PHP Zend LLVM extension (SoC)

On Apr 23, 2008, at 1:44 PM, Nuno Lopes wrote:
> Thank you both for your answers!
> That part of type inference was my second question. PHP uses a structure
> with a union to represent a variable (because a variable can have different
> types, like a long, a double, a stream, etc.), but often a single variable
> will only have one type throughout the program (e.g. iterating through $i in
> a loop). Will LLVM automagically see that we always use the same type for a
> certain variable and discard the whole union and use a single scalar (and
> also discard all the type checking done in the opcode handlers)? We can do
> some type inference on our side if we do a pass on the bytecode, but I would
> like to be sure if that's needed or if LLVM will do it on its own.
>
LLVM likely won't be able to do type inference for you. That kind of
high-level language information will be lost by the time you hit LLVM IR.
Take a look at how llvm-gcc or the online demo does codegen for C unions,
especially for ones of any complexity. If you want to see a real
performance win from this, some degree of type inference at a stage
where high-level information has not yet been discarded will be very
helpful.

--Owen


Chris Lattner

2008-Apr-24 05:54 UTC

[LLVMdev] PHP Zend LLVM extension (SoC)

On Apr 23, 2008, at 10:49 PM, Owen Anderson wrote:
>
> On Apr 23, 2008, at 1:44 PM, Nuno Lopes wrote:
>
>> Thank you both for your answers!
>> That part of type inference was my second question. PHP uses a structure
>> with a union to represent a variable (because a variable can have different
>> types, like a long, a double, a stream, etc.), but often a single variable
>> will only have one type throughout the program (e.g. iterating through $i in
>> a loop). Will LLVM automagically see that we always use the same type for a
>> certain variable and discard the whole union and use a single scalar (and
>> also discard all the type checking done in the opcode handlers)? We can do
>> some type inference on our side if we do a pass on the bytecode, but I would
>> like to be sure if that's needed or if LLVM will do it on its own.
>>
>
> LLVM likely won't be able to do type inference for you.
I'd put it another way: an existing LLVM pass won't do type inference
for you. The right way to tackle this is to write a language-specific
pass on LLVM IR that knows your runtime and can propagate types around.
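
For a feel of what "propagating types around" can look like, here is a minimal sketch of a flow-insensitive type propagation over a toy three-address bytecode. It is not an LLVM pass and uses no LLVM API; the instruction set, the ty lattice, and infer_types are all hypothetical, and the add rule (long + long = long, anything involving a double = double) is a simplifying assumption that ignores integer overflow.

```c
/* Small type lattice: UNKNOWN is "no information yet",
 * MIXED means the variable has been seen with more than one type. */
typedef enum { TY_UNKNOWN, TY_LONG, TY_DOUBLE, TY_MIXED } ty;

/* Toy three-address instructions over numbered variables. */
typedef enum { I_CONST_LONG, I_CONST_DOUBLE, I_ADD } ikind;
typedef struct { ikind kind; int dst, lhs, rhs; } insn;

/* Merge two facts about the same variable. */
ty join(ty a, ty b) {
    if (a == TY_UNKNOWN) return b;
    if (b == TY_UNKNOWN) return a;
    return (a == b) ? a : TY_MIXED;
}

/* Result type of an add under the simplified rules assumed here. */
ty add_result(ty a, ty b) {
    if (a == TY_UNKNOWN || b == TY_UNKNOWN) return TY_UNKNOWN;
    if (a == TY_MIXED || b == TY_MIXED) return TY_MIXED;
    if (a == TY_LONG && b == TY_LONG) return TY_LONG;
    return TY_DOUBLE;
}

/* Iterate over all instructions until no variable's type changes.
 * Types only move up the lattice, so the loop terminates. */
void infer_types(const insn *code, int n, ty *types, int nvars) {
    for (int v = 0; v < nvars; ++v) types[v] = TY_UNKNOWN;
    int changed = 1;
    while (changed) {
        changed = 0;
        for (int i = 0; i < n; ++i) {
            ty t = TY_UNKNOWN;
            switch (code[i].kind) {
            case I_CONST_LONG:   t = TY_LONG;   break;
            case I_CONST_DOUBLE: t = TY_DOUBLE; break;
            case I_ADD:
                t = add_result(types[code[i].lhs], types[code[i].rhs]);
                break;
            }
            ty merged = join(types[code[i].dst], t);
            if (merged != types[code[i].dst]) {
                types[code[i].dst] = merged;
                changed = 1;
            }
        }
    }
}
```

A variable that comes out as TY_LONG or TY_DOUBLE is a candidate for an unboxed scalar with no tag checks; one that comes out as TY_MIXED has to keep the generic representation.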

-Chris
