thr3ads.net - llvm dev - [LLVMdev] Transforming wide integer computations back to vector computations [Jan 2012]

Matt Pharr

2012-Jan-02 17:12 UTC

[LLVMdev] Transforming wide integer computations back to vector computations

One of the optimization passes (apparently SROA) sometimes transforms
computations on vectors of ints into computations on wide integer
types; for example, I'm seeing code like the following after
optimizations(*):

  %0 = bitcast <16 x i8> %float2uint to i128
  %1 = shl i128 %0, 8
  %ins = or i128 %1, 255
  %2 = bitcast i128 %ins to <16 x i8>

The back end I'm trying to get this code to go through (a hacked up version
of the LLVM C backend(**)) doesn't support wide integer types, but is fine
with the original vectors of integers; I'm wondering if there's a
straightforward way to avoid having these computations on wide integer types
generated in the first place or if there's pre-existing code that would
transform this back to use the original vector types.

Thanks,
-matt

(*) It seems that this is happening with vectors of i8 and i16, but not i32 and
i64; in some cases, this is leading to better code for i8/i16 vectors, in that
an unnecessary store/load round-trip is being optimized out for the i8/i16
case.  I can provide a test case/submit a bug if this would be useful.

(**) Additional CBE patches to come from this effort, pending turning
aforementioned hacks into something a little cleaner/nicer.

Duncan Sands

2012-Jan-02 17:32 UTC

Hi Matt,

> It seems that one of the optimization passes (it seems to be SROA)
> sometimes transforms computations on vectors of ints to computations on wide
> integer types; for example, I'm seeing code like the following after
> optimizations(*):
>
>   %0 = bitcast <16 x i8> %float2uint to i128
>   %1 = shl i128 %0, 8
>   %ins = or i128 %1, 255
>   %2 = bitcast i128 %ins to <16 x i8>

This would probably be better expressed as a vector shuffle.  What's the
testcase?

Ciao, Duncan.
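
To illustrate the shuffle suggestion, here is a minimal Python sketch (not
from the thread) that models the four-instruction sequence and compares it
with a lane shuffle. It assumes a little-endian target, where lane 0 of the
<16 x i8> vector is the least significant byte of the i128; on a big-endian
target the lanes would likely map the other way and the mask would differ.
The function names are made up for illustration. Under that assumption, the
sequence inserts 255 into lane 0 and moves every other lane up by one, which
corresponds to a shufflevector mask of <16, 0, 1, ..., 14> where the second
operand holds 255 in its element 0.

```python
def wide_int_version(lanes):
    """Model bitcast -> shl i128 8 -> or 255 -> bitcast (little-endian assumed)."""
    assert len(lanes) == 16
    wide = int.from_bytes(bytes(lanes), "little")  # bitcast <16 x i8> to i128
    wide = (wide << 8) & ((1 << 128) - 1)          # shl discards bits past bit 127
    wide |= 255                                    # or i128 %1, 255
    return list(wide.to_bytes(16, "little"))       # bitcast i128 to <16 x i8>


def shuffle_version(lanes):
    """Model a shufflevector: indices 0..15 read operand one, 16..31 operand two."""
    assert len(lanes) == 16
    second = [255] + [0] * 15       # only element 0 is read; the rest stand in for undef
    mask = [16] + list(range(15))   # <16, 0, 1, ..., 14>
    both = list(lanes) + second
    return [both[i] for i in mask]


lanes = list(range(1, 17))
assert wide_int_version(lanes) == shuffle_version(lanes)
```

Both functions return [255, 1, 2, ..., 15] for the input lanes 1 through 16,
so the wide-integer form and the shuffle agree on this model.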

Eli Friedman

2012-Jan-02 18:14 UTC

On Mon, Jan 2, 2012 at 9:12 AM, Matt Pharr <matt.pharr at gmail.com> wrote:
> It seems that one of the optimization passes (it seems to be SROA)
> sometimes transforms computations on vectors of ints to computations on wide
> integer types; for example, I'm seeing code like the following after
> optimizations(*):
>
>   %0 = bitcast <16 x i8> %float2uint to i128
>   %1 = shl i128 %0, 8
>   %ins = or i128 %1, 255
>   %2 = bitcast i128 %ins to <16 x i8>
>
> The back end I'm trying to get this code to go through (a hacked up
> version of the LLVM C backend(**)) doesn't support wide integer types, but
> is fine with the original vectors of integers; I'm wondering if there's
> a straightforward way to avoid having these computations on wide integer
> types generated in the first place

The simplest workaround is to skip running scalarrepl, and just use
the mem2reg pass instead.  IIRC, no other pass generates large
integers like that at the moment.  (The idea is generally that the
large integers get eliminated by instcombine, but as you're seeing
that isn't guaranteed.)

-Eli
