thr3ads.net - llvm dev - [LLVMdev] LowerPacked pass [Nov 2004]

Morten Ofstad

2004-Nov-17 15:04 UTC

[LLVMdev] LowerPacked pass

Hello,

Our software uses 4 x float vectors a lot, and I pass these to LLVM as 
packed types - but when I do the JIT compile it seems that the 
LowerPacked pass is never run so the code generation fails. I noticed 
that most other passes have a header file with a public createXXXPass() 
function so they can be added to the PassManager, but LowerPacked 
doesn't have this... What should I do?
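
For readers unfamiliar with the convention mentioned above, here is a minimal sketch of the factory-function pattern: the pass class stays private to its implementation file, and only a create function is public. Every name here (Pass, PassManagerSketch, createLowerPackedSketchPass) is a hypothetical stand-in for illustration, not LLVM's actual classes or signatures.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical miniature of a pass interface; not LLVM's real Pass class.
struct Pass {
  virtual ~Pass() = default;
  virtual std::string name() const = 0;
};

// Public factory. In a real project only this declaration would be in a header.
std::unique_ptr<Pass> createLowerPackedSketchPass();

// Hypothetical pass manager: owns passes and reports them in insertion order.
class PassManagerSketch {
  std::vector<std::unique_ptr<Pass>> passes;

public:
  void add(std::unique_ptr<Pass> p) {
    assert(p && "cannot add a null pass");
    passes.push_back(std::move(p));
  }
  std::vector<std::string> names() const {
    std::vector<std::string> out;
    for (const auto &p : passes)
      out.push_back(p->name());
    return out;
  }
};

// Implementation detail. In a real project this would live in a .cpp file,
// so clients can only reach the class through the factory above.
namespace {
struct LowerPackedSketch : Pass {
  std::string name() const override { return "lower-packed-sketch"; }
};
} // namespace

std::unique_ptr<Pass> createLowerPackedSketchPass() {
  return std::make_unique<LowerPackedSketch>();
}
```

Without the public factory, client code has no way to construct the pass, which is the situation described above.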

m.

PS. Chris, thanks for the feedback on the memory cleanup patch - I'm a 
bit busy getting LLVM integrated in our app now, but I will incorporate 
your suggestions and submit a proper patch soon...

Chris Lattner

2004-Nov-17 18:37 UTC

On Wed, 17 Nov 2004, Morten Ofstad wrote:
> Our software uses 4 x float vectors a lot, and I pass these to LLVM as
> packed types - but when I do the JIT compile it seems that the
> LowerPacked pass is never run so the code generation fails. I noticed
> that most other passes have a header file with a public createXXXPass()
> function so they can be added to the PassManager, but LowerPacked
> doesn't have this... What should I do?
I just added it.  There was no reason to not expose it, we just never got
to that point.  Note that packed support in LLVM is not complete yet.  In
particular, here are some of the big missing pieces:

1. No code generators can generate vector instructions yet (SSE or
   altivec, for example).  This should be fairly easy to add though.
2. The lowerpacked pass, which currently converts packed ops into their
   scalar counterparts, has a few limitations:
     A. It does not handle packed arguments to functions
     B. It always lowers all of the way to scalar ops, even if the target
        supports SOME packed types.  For example, it would be nice for it
        to eventually lower <16 x float> into 4 <4 x float>'s if the
        target supports them.
     C. It has never been thoroughly tested, primarily because we don't
        have a producer of packed operations yet.  I believe it should
        work reasonably well though.
3. LLVM is missing support for a bunch of important vector operations.  In
   particular, we need at least 'extract element' and 'build vector out of
   scalars' operations.  Given these, we can implement packed arguments to
   functions without a problem.  There are probably many others we
   eventually want.
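
As a rough illustration of items 2 and 2B, the sketch below models a packed value as a fixed-size array and shows both full scalarization (one scalar op per lane) and the partial lowering suggested for wide vectors (splitting <16 x float> into four <4 x float> pieces). It is plain C++ with hypothetical names (Packed, addScalarized, splitPacked) and says nothing about how the actual pass is implemented.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for a packed <N x float> value.
template <std::size_t N>
using Packed = std::array<float, N>;

// Full scalarization: a packed add becomes one scalar add per lane.
template <std::size_t N>
Packed<N> addScalarized(const Packed<N> &a, const Packed<N> &b) {
  Packed<N> r{};
  for (std::size_t i = 0; i < N; ++i)
    r[i] = a[i] + b[i];
  return r;
}

// Partial lowering: split a wide value into chunks of width W, where W is
// assumed to be a width the target handles natively.
template <std::size_t W, std::size_t N>
std::array<Packed<W>, N / W> splitPacked(const Packed<N> &v) {
  static_assert(N % W == 0, "chunk width must divide the vector length");
  std::array<Packed<W>, N / W> chunks{};
  for (std::size_t i = 0; i < N; ++i)
    chunks[i / W][i % W] = v[i];
  return chunks;
}
```

Calling splitPacked<4> on a Packed<16> yields four Packed<4> values, each of which could then be handed to a native vector operation instead of being scalarized.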

For your work, it might be most expedient to just ignore the lower packed
pass and add SSE support to the X86 backend: that will get you up and
running quickly and get you the performance you are obviously after.  If
backwards compatibility with old hardware is an issue, revisiting the
lower packed pass would make sense.

Let me know what you think.  In the very short term, the hook exposed to
create the lower packed pass can be plunked into the X86TargetMachine and
get intra function packed types working for you.
> PS. Chris, thanks for the feedback on the memory cleanup patch - I'm a
> bit busy getting LLVM integrated in our app now, but I will incorporate
> your suggestions and submit a proper patch soon...
No problem.

-Chris

-- 
http://llvm.org/
http://nondot.org/sabre/

Morten Ofstad

2004-Nov-19 11:11 UTC

Chris Lattner wrote:
> Note that packed support in LLVM is not complete yet.  In
> particular, here are some of the big missing pieces:
> 
> 1. No code generators can generate vector instructions yet (SSE or
>    altivec, for example).  This should be fairly easy to add though.
> 2. The lowerpacked pass, which currently converts packed ops into their
>    scalar counterparts, has a few limitations:
>      A. It does not handle packed arguments to functions
>      B. It always lowers all of the way to scalar ops, even if the target
>         supports SOME packed types.  For example, it would be nice for it
>         to eventually lower <16 x float> into 4 <4 x float>'s if the
>         target supports them.
>      C. It has never been thoroughly tested, primarily because we don't
>         have a producer of packed operations yet.  I believe it should
>         work reasonably well though.
It works reasonably well, quite impressive really considering it's not 
been tested ;-) B is not much of a problem for my use, but A is a bit 
annoying even though I mostly pass pointers to packed types anyway. Can 
you elaborate a bit on what is the problem with this? I have calls going 
back into our code by adding mappings to the JIT, but I'm not sure if I 
can get it to call functions with R32x4 (<4 x float>) args without 
making a wrapper that takes a pointer.
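
A minimal sketch of the pointer-taking wrapper described here, on the host side only. The struct layout of R32x4 and the function names sumLanes and sumLanesByPtr are assumptions made for illustration; the real application types are not shown in this thread.

```cpp
// Hypothetical 4-float vector type standing in for the application's R32x4.
struct R32x4 {
  float v[4];
};

// Hypothetical host function that takes the vector by value. Calling this
// directly from JIT-compiled code is the case in question.
float sumLanes(R32x4 x) {
  return x.v[0] + x.v[1] + x.v[2] + x.v[3];
}

// Wrapper with a pointer parameter. Generated code only has to pass an
// address, so no packed value crosses the call boundary.
float sumLanesByPtr(const R32x4 *x) {
  return sumLanes(*x);
}
```

The wrapper, rather than the by-value function, would be the address registered with the JIT mapping.
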
> For your work, it might be most expedient to just ignore the lower packed
> pass and add SSE support to the X86 backend: that will get you up and
> running quickly and get you the performance you are obviously after.  If
> backwards compatibility with old hardware is an issue, revisiting the
> lower packed pass would make sense.
Is it easy to add intrinsics to do things like dot product of packed 
types using SSE instructions? That's probably all I need...
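
For reference, this is roughly the instruction sequence such a dot-product intrinsic would need to produce. The sketch uses the C/C++ compiler intrinsics from <xmmintrin.h> (SSE1 only), not anything in LLVM, and falls back to a scalar loop when SSE is unavailable. The function name dot4 is hypothetical.

```cpp
#include <cstddef>
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define DOT4_USE_SSE 1
#endif

// Dot product of two 4-float vectors; a and b must each point to 4 floats.
float dot4(const float *a, const float *b) {
#ifdef DOT4_USE_SSE
  // Lane-wise multiply: (a0*b0, a1*b1, a2*b2, a3*b3).
  __m128 prod = _mm_mul_ps(_mm_loadu_ps(a), _mm_loadu_ps(b));
  // Add the upper half onto the lower half: lanes 0,1 = (p0+p2, p1+p3).
  __m128 sum2 = _mm_add_ps(prod, _mm_movehl_ps(prod, prod));
  // Broadcast lane 1 and add it to lane 0: lane 0 = p0+p1+p2+p3.
  __m128 lane1 = _mm_shuffle_ps(sum2, sum2, _MM_SHUFFLE(1, 1, 1, 1));
  return _mm_cvtss_f32(_mm_add_ss(sum2, lane1));
#else
  // Portable fallback: the fully scalarized form.
  float s = 0.0f;
  for (std::size_t i = 0; i < 4; ++i)
    s += a[i] * b[i];
  return s;
#endif
}
```

Unaligned loads are used so the sketch makes no assumption about how the application aligns its vectors.
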
> Let me know what you think.  In the very short term, the hook exposed to
> create the lower packed pass can be plunked into the X86TargetMachine and
> get intra function packed types working for you.
The patch you did was missing the actual implementation of 
createLowerPackedPass, so I'm including my own differences -- I guess 
you don't want to apply the changes to X86TargetMachine as I'm the only 
one actually generating packed types, but I include it for completeness.

m.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lowerpacked.patch.txt
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20041119/5fab9aff/attachment.txt>
