thr3ads.net - llvm dev - [LLVMdev] Folding vector instructions [Dec 2008]

Alex

2008-Dec-30 02:29 UTC

[LLVMdev] Folding vector instructions

Hello.

Sorry, I am not sure whether this question belongs on the llvm or the
mesa3d-dev mailing list, so I am posting it to both.

I am writing an LLVM backend for a modern graphics processor which has an ISA
very similar to that of Direct3D.

I am reading the code in the Gallium3D driver in a mesa3d branch, which
converts the shader programs (TGSI tokens) to LLVM IR.

For shader instructions that also exist in LLVM IR, the conversion is trivial:

<code>
llvm::Value * Instructions::mul(llvm::Value *in1, llvm::Value *in2) {
   return m_builder.CreateMul(in1, in2, name("mul")); // m_builder is an llvm::IRBuilder
}
</code>

However, special instructions such as "min" cannot be mapped directly to
LLVM IR. The conversion involves extracting the elements of each vector,
creating a less-than compare and a 'select' instruction per element, and
then creating 'insert-element' instructions to rebuild the vector.

<code>
llvm::Value * Instructions::min(llvm::Value *in1, llvm::Value *in2)
{
   std::vector<llvm::Value*> vec1 = extractVector(in1); // generate LLVM extract element
   std::vector<llvm::Value*> vec2 = extractVector(in2);

   Value *xcmp = m_builder.CreateFCmpOLT(vec1[0], vec2[0], name("xcmp"));
   Value *selx = m_builder.CreateSelect(xcmp, vec1[0], vec2[0], name("selx"));

   Value *ycmp = m_builder.CreateFCmpOLT(vec1[1], vec2[1], name("ycmp"));
   Value *sely = m_builder.CreateSelect(ycmp, vec1[1], vec2[1], name("sely"));

   Value *zcmp = m_builder.CreateFCmpOLT(vec1[2], vec2[2], name("zcmp"));
   Value *selz = m_builder.CreateSelect(zcmp, vec1[2], vec2[2], name("selz"));

   Value *wcmp = m_builder.CreateFCmpOLT(vec1[3], vec2[3], name("wcmp"));
   Value *selw = m_builder.CreateSelect(wcmp, vec1[3], vec2[3], name("selw"));

   return vectorFromVals(selx, sely, selz, selw); // generate LLVM 'insert-element'
}
</code>

Eventually all of this should be folded into a single 'min' instruction in
the codegen. So I wonder whether having the conversion generate just a simple
'call' to a 'min' function would make instruction selection easier (no
folding and no complicated pattern matching in the instruction selection DAG).

I don't have experience with the new vector instructions in LLVM, which is
perhaps why folding the swizzle and writemask seems complicated to me.

Thanks.

Corbin Simpson

2008-Dec-30 14:39 UTC

[LLVMdev] [Mesa3d-dev] Folding vector instructions

I hope marcheu sees this too.

Um, I was thinking that we should eventually create intrinsic functions
for some of the commands, like LIT, that might not be
single-instruction, but that can be lowered eventually, and for commands
like LG2, that might be single-instruction for shaders, but probably not
for non-shader chipsets.

Unfortunately, I'm still learning LLVM, so I might be completely and
totally off-base here.

Out of curiosity, which chipset are you working on? R600? NV50?
Something else?

~ C.

Chris Lattner

2008-Dec-30 20:30 UTC

[LLVMdev] [Mesa3d-dev] Folding vector instructions

On Dec 30, 2008, at 6:39 AM, Corbin Simpson wrote:

>> However, the special instructions cannot directly be mapped to LLVM IR,
>> like "min", the conversion involves in 'extract' the vector, create
>> less-than-compare, create 'select' instruction, and create
>> 'insert-element' instruction.

Using scalar operations obviously works, but will probably produce  
very inefficient code.  One positive thing is that all target-specific  
operations of supported vector ISAs (Altivec and SSE[1-4] currently)  
are exposed either through LLVM IR ops or through target-specific  
builtins/intrinsics.  This means that you can get access to all the  
crazy SSE instructions, but it means that your codegen would have to  
handle this target-specific code generation.

The direction we're going is to expose more and more vector operations  
in LLVM IR.  For example, compares and select are currently being  
worked on, so you can do a comparison of two vectors which returns a  
vector of bools, and use that as the compare value of a select  
instruction (selecting between two vectors).  This would allow  
implementing min and a variety of other operations and is easier for  
the codegen to reassemble into a first-class min operation etc.

I don't know what the status of this is, I think it is partially  
implemented but may not be complete yet.
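
The compare-then-select shape described above can be modelled in plain C++. This is only a sketch of the semantics, not LLVM API code: `Vec4`, `Mask4`, `fcmpOLT`, `vselect`, `vmin` and `checkVmin` are hypothetical names chosen for illustration, with `Vec4` standing in for a `<4 x float>` value and `Mask4` for the vector of bools produced by the compare.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Plain C++ model of the IR shape; none of these names are LLVM APIs.
using Vec4  = std::array<float, 4>;  // stands in for <4 x float>
using Mask4 = std::array<bool, 4>;   // stands in for a vector of bools

// Models an ordered less-than compare of two vectors, one bool per lane.
Mask4 fcmpOLT(const Vec4 &a, const Vec4 &b) {
    Mask4 m{};
    for (std::size_t i = 0; i < 4; ++i)
        m[i] = a[i] < b[i];  // C++ '<' is false when either operand is NaN
    return m;
}

// Models a vector select: per lane, take a when the mask is set, else b.
Vec4 vselect(const Mask4 &m, const Vec4 &a, const Vec4 &b) {
    Vec4 r{};
    for (std::size_t i = 0; i < 4; ++i)
        r[i] = m[i] ? a[i] : b[i];
    return r;
}

// min expressed as one whole-vector compare feeding one whole-vector select.
Vec4 vmin(const Vec4 &a, const Vec4 &b) {
    return vselect(fcmpOLT(a, b), a, b);
}

// Small self-check of the model.
void checkVmin() {
    const Vec4 a{1.0f, 5.0f, 3.0f, 8.0f};
    const Vec4 b{2.0f, 4.0f, 3.0f, -1.0f};
    const Vec4 r = vmin(a, b);
    assert(r[0] == 1.0f && r[1] == 4.0f && r[2] == 3.0f && r[3] == -1.0f);
}
```

Compared with the per-element version from the original post, the whole operation is two vector-level steps, which is the pattern a backend would need to recognise in order to emit a single min instruction.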

>> I don't have experience of the new vector instructions in LLVM, and
>> perhaps that's why it makes me feel it's complicated to fold the
>> swizzle and writemask.

We have really good support for swizzling operations already with the
shufflevector instruction.  I'm not sure about writemask.
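
As a rough illustration, here is a plain C++ model of a two-source shuffle, where result lane i is picked by index from the concatenation of the two inputs. A swizzle is then a shuffle with one source. Treating a writemask as a shuffle between the old destination and the new value is an assumption made for this sketch, not something established in this thread. The names `shuffle`, `swizzle` and `applyWriteMask` are hypothetical and are not LLVM APIs.

```cpp
#include <array>
#include <cassert>

using Vec4        = std::array<float, 4>;     // stands in for <4 x float>
using ShuffleMask = std::array<unsigned, 4>;  // one source index per result lane

// Models a two-input shuffle: indices 0..3 pick from a, 4..7 pick from b.
Vec4 shuffle(const Vec4 &a, const Vec4 &b, const ShuffleMask &mask) {
    Vec4 r{};
    for (unsigned i = 0; i < 4; ++i) {
        assert(mask[i] < 8 && "shuffle index out of range");
        r[i] = mask[i] < 4 ? a[mask[i]] : b[mask[i] - 4];
    }
    return r;
}

// Swizzle: a shuffle that reads only one source, e.g. .wzyx is {3, 2, 1, 0}.
Vec4 swizzle(const Vec4 &v, const ShuffleMask &sel) {
    for (unsigned i = 0; i < 4; ++i)
        assert(sel[i] < 4 && "swizzle index out of range");
    return shuffle(v, v, sel);
}

// Writemask (assumed encoding: bit i set means lane i is written, x = bit 0).
// Written lanes come from src, all other lanes keep the old dst value.
Vec4 applyWriteMask(const Vec4 &dst, const Vec4 &src, unsigned writeBits) {
    ShuffleMask m{};
    for (unsigned i = 0; i < 4; ++i)
        m[i] = ((writeBits >> i) & 1u) ? 4u + i : i;
    return shuffle(dst, src, m);
}
```

For example, `swizzle(v, {3, 2, 1, 0})` reverses the lanes of `v`, and `applyWriteMask(dst, src, 0x5)` overwrites only the x and z lanes of `dst`.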

> Um, I was thinking that we should eventually create intrinsic functions
> for some of the commands, like LIT, that might not be
> single-instruction, but that can be lowered eventually, and for commands
> like LG2, that might be single-instruction for shaders, but probably not
> for non-shader chipsets.

Sure, it would be very reasonable to make these target-specific  
builtins when targeting a GPU, the same way we have target-specific  
builtins for SSE.

-Chris
