thr3ads.net - llvm dev - [LLVMdev] avoid live range overlap of "vector" registers [May 2005]

Tzu-Chien Chiu

2005-May-11 02:25 UTC

[LLVMdev] avoid live range overlap of "vector" registers

On Tue May 10 2005, Chris Lattner wrote:
> On Tue, 10 May 2005, Morten Ofstad wrote:
>> Actually, I think it would be better to define the registers as a machine
>> value type for packed float x4, and providing some 'extract' and 'inject'
>> instructions to access individual components... There should also be a
>> 'shuffle' instruction (corresponding to the SSE PSHUF instruction) to change
>> the individual components around.
>
> You're right, that would be a better way to go.  To start, I would suggest
> adding extract/inject intrinsics (not instructions) because it is easier.
> If you're interested in doing this, there is documentation for this here:

quote <http://llvm.cs.uiuc.edu/docs/LangRef.html#intrinsics>:
"To do this, extend the default implementation of the
IntrinsicLowering class to handle the intrinsic. Code generators use
this class to lower intrinsics they do not understand to raw LLVM
instructions that they do."

But to which LLVM instructions should the extract/inject (or
shuffle/pack) intrinsics be lowered? LLVM instructions do not allow
access to the individual scalar values in a packed value.
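
For readers unfamiliar with the operations under discussion, here is a minimal C++ sketch of what extract and inject would mean for a packed value of four floats. It is not LLVM code and not a proposed lowering. The names `Float4`, `extract`, and `inject` are hypothetical and chosen only for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a packed <4 x float> value.
using Float4 = std::array<float, 4>;

// extract: read one scalar component out of a packed value.
inline float extract(const Float4 &v, std::size_t index) {
    assert(index < v.size() && "component index out of range");
    return v[index];
}

// inject: return a copy of v with one component replaced.
// The argument is taken by value, so the caller's vector is left
// unchanged, which mirrors the value semantics of an SSA form.
inline Float4 inject(Float4 v, std::size_t index, float scalar) {
    assert(index < v.size() && "component index out of range");
    v[index] = scalar;
    return v;
}
```

Under these assumptions, `inject(v, 2, 9.0f)` yields a new vector whose third component is 9.0 while `v` itself keeps its old value.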

Chris Lattner

2005-May-11 04:02 UTC

On Wed, 11 May 2005, Tzu-Chien Chiu wrote:
> On  Tue May 10 2005, Chris Lattner wrote:
>> On Tue, 10 May 2005, Morten Ofstad wrote:
>>> Actually, I think it would be better to define the registers as a machine
>>> value type for packed float x4, and providing some 'extract' and 'inject'
>>> instructions to access individual components... There should also be a
>>> 'shuffle' instruction (corresponding to the SSE PSHUF instruction) to change
>>> the individual components around.
>>
>> You're right, that would be a better way to go.  To start, I would suggest
>> adding extract/inject intrinsics (not instructions) because it is easier.
>> If you're interested in doing this, there is documentation for this here:
>
> quote <http://llvm.cs.uiuc.edu/docs/LangRef.html#intrinsics>:
> "To do this, extend the default implementation of the
> IntrinsicLowering class to handle the intrinsic. Code generators use
> this class to lower intrinsics they do not understand to raw LLVM
> instructions that they do."
>
> but to which llvm instructions should the extract/inject (or
> shuffle/pack) intrinsics be lowered? llvm instruction does not allow
> to access the individual scalar value in a packed value.

None, that documentation is out of date and doesn't make a ton of sense 
for your application.  I would suggest that you implement it in the 
context of the SelectionDAG framework that all of the code generators 
either currently use or are moving to.  I updated the documentation here: 
http://llvm.cs.uiuc.edu/ChrisLLVM/docs/ExtendingLLVM.html#intrinsic

This will allow you to do something like this:

%i32v4 = type <4 x uint>

%f32v4 = type <4 x float>

declare %f32v4 %swizzle(%f32v4 %In, %i32v4 %Form)

%G = external global %f32v4

void %test() {
         %A = load %f32v4* %G
         ;; splat XYZW -> YYYY
         %B = call %f32v4 %swizzle(%f32v4 %A, %i32v4 <uint 1, uint 1, uint 1, uint 1>)
         store %f32v4 %B, %f32v4* %G
         ret void
}

... Except using llvm.swizzle instead of 'swizzle'.
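
The semantics implied by the example can be sketched in plain C++. This assumes that each element of the `Form` operand selects which input component lands in the corresponding output position, which is consistent with the `splat XYZW -> YYYY` comment but is not spelled out in the thread. `Float4`, `UInt4`, and this `swizzle` function are hypothetical illustration names, not an LLVM API.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-ins for <4 x float> and <4 x uint>.
using Float4 = std::array<float, 4>;
using UInt4 = std::array<std::uint32_t, 4>;

// Assumed semantics: out[i] = in[form[i]] for each of the four lanes.
inline Float4 swizzle(const Float4 &in, const UInt4 &form) {
    Float4 out{};
    for (std::size_t i = 0; i < out.size(); ++i) {
        assert(form[i] < in.size() && "selector out of range");
        out[i] = in[form[i]];
    }
    return out;
}
```

With this reading, a selector of `{1, 1, 1, 1}` copies the second component (Y) into every lane, and `{0, 1, 2, 3}` is the identity.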

Unfortunately the code generator currently does not support packed types, 
so this will require some work.  However, this certainly is the closest 
match for your model.

-Chris

-- 
http://nondot.org/sabre/
http://llvm.cs.uiuc.edu/

Morten Ofstad

2005-May-11 07:40 UTC

Chris Lattner wrote:
> None, that documentation is out of date and doesn't make a ton of sense
> for your application.  I would suggest that you implement it in the
> context of the SelectionDAG framework that all of the code generators
> either currently use or are moving to.  I updated the documentation
> here: http://llvm.cs.uiuc.edu/ChrisLLVM/docs/ExtendingLLVM.html#intrinsic
>
> This will allow you to do something like this:
>
> %i32v4 = type <4 x uint>
>
> %f32v4 = type <4 x float>
>
> declare %f32v4 %swizzle(%f32v4 %In, %i32v4 %Form)
>
> %G = external global %f32v4
>
> void %test() {
>         %A = load %f32v4* %G
>         ;; splat XYZW -> YYYY
>         %B = call %f32v4 %swizzle(%f32v4 %A, %i32v4 <uint 1, uint 1, uint 1, uint 1>)
>         store %f32v4 %B, %f32v4* %G
>         ret void
> }
>
> ... Except using llvm.swizzle instead of 'swizzle'.

I much prefer the name chosen in the SSE instruction set: 'shuffle'.

> Unfortunately the code generator currently does not support packed
> types, so this will require some work.  However, this certainly is the
> closest match for your model.

This work needs to be done for SSE code generation, which I think would be
of interest to several people (including me). Our front-end generates code
that uses packed datatypes a lot, and I'm not entirely happy with the
current situation using the LowerPacked pass. If SSE code generation was
working, we would use LLVM for a lot more. At the moment we have a small
runtime library with SSE-optimized functions for things like trilinear
interpolation, but the LLVM optimizer can't do very much with these
functions since they are just external calls.

m.
