Displaying 8 results from an estimated 8 matches for "iszextfree".
2013 Jan 25
0
[LLVMdev] TargetLowering vs. TargetTransform
Hi Renato,
I think that we need to improve ::isTruncateFree, ::isZExtFree, etc. to include all of the free conversions, both vector and scalar.
Non-free conversions are marked with setOperationAction, so the generic parts of TTI should be able to give a reasonable cost estimate.
The cost tables should contain cases that are not handled by TTI. So, if we have a clever DAGCo...
2013 Jan 25
2
[LLVMdev] TargetLowering vs. TargetTransform
Hi all,
I'm looking for a place to put the costs of vector (and scalar) cast
operations for ARM, but I noticed the TargetTransform methods call the
TargetLowering ones when unsure.
Now, I'm not sure...
Many casts on ARM are free, and I could build a list of cases where it is
true, but should I put this on the lowering or the transform? My main
motivation is to get the costs right
2015 Sep 30
2
InstCombine wrongful (?) optimization on BinOp with SameOperands
...de *N)
In my backend's architecture truncate is free, but zext is not (and i64
is not a desirable type for xor or any binary operation in general), so
I would expect this optimization to be bypassed but because of the
following statement:
(N0.getOpcode() == ISD::TRUNCATE && (!TLI.isZExtFree(VT, Op0VT) ||
!TLI.isTruncateFree(Op0VT, VT))
it is not (as isZExtFree returns false for my architecture while
isTruncateFree returns true). The comment on binop simplification says
that binops over truncs should be optimized only if trunc is not free, so
I do not understand the point of adding !...
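To make the quoted condition concrete, here is a minimal stand-alone C++ sketch that models only the inner disjunction of the guard, not the opcode check and not the combiner itself. `FakeTLI` and `guardHolds` are hypothetical names invented for this illustration; they are not LLVM APIs, and the two booleans simply stand in for the results of the two hooks on the poster's types.

```cpp
// Toy model of the quoted guard; this is not LLVM code.
// FakeTLI and guardHolds are hypothetical names used only for illustration.
struct FakeTLI {
  bool ZExtFree;      // stands in for TLI.isZExtFree(VT, Op0VT)
  bool TruncateFree;  // stands in for TLI.isTruncateFree(Op0VT, VT)
};

// Mirrors: !TLI.isZExtFree(VT, Op0VT) || !TLI.isTruncateFree(Op0VT, VT)
constexpr bool guardHolds(FakeTLI TLI) {
  return !TLI.ZExtFree || !TLI.TruncateFree;
}

// The poster's target: truncate is free, zext is not.
// The guard still holds, so the transform is not bypassed.
static_assert(guardHolds({/*ZExtFree=*/false, /*TruncateFree=*/true}), "");

// The guard fails only when both conversions are free.
static_assert(!guardHolds({/*ZExtFree=*/true, /*TruncateFree=*/true}), "");
```

Under this reading, a free truncate alone is not enough to disable the transform, which matches the behaviour the poster describes.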
2013 Jan 25
2
[LLVMdev] TargetLowering vs. TargetTransform
On 25 January 2013 17:48, Nadav Rotem <nrotem at apple.com> wrote:
> I think that we need to improve ::isTruncateFree, ::isZExtFree, etc. to
> include all of the free conversions, both vector and scalar.
>
Hi Nadav,
Yes, and the question is: TargetLowering's isZExtFree or TargetTransform's
isZExtFree?
TargetTransform (TT) only has the free checks on types, while
TargetLowering (TL) also has them on an SDValue and a destination type...
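The difference between a type-only check and a check that also sees the value being extended can be sketched in isolation. Everything below is a toy: `ToyOpcode`, `ToyType`, `ToyValue`, `isZExtFreeByType`, and `isZExtFreeByValue` are hypothetical names, and the rule that a narrow load makes the zero extension free is an assumption chosen for illustration, not a claim about ARM or any other target.

```cpp
// Toy illustration only; none of these names are LLVM APIs.
enum class ToyOpcode { Load, Add };

struct ToyType {
  unsigned Bits;  // width of the type in bits
};

struct ToyValue {
  ToyOpcode Opcode;  // what produced the value
  ToyType Type;      // the value's current type
};

// Type-only query: it sees just the source and destination widths.
// Assumption: on this toy target only a same-width "extension" is free.
constexpr bool isZExtFreeByType(ToyType From, ToyType To) {
  return From.Bits == To.Bits;
}

// Value-aware query: it can also inspect the producer of the value.
// Assumption: narrow loads on this toy target already zero-extend.
constexpr bool isZExtFreeByValue(ToyValue V, ToyType To) {
  return (V.Opcode == ToyOpcode::Load && V.Type.Bits < To.Bits) ||
         isZExtFreeByType(V.Type, To);
}

// By type alone, i8 -> i32 is not free here.
static_assert(!isZExtFreeByType({8}, {32}), "");
// With the producer visible, the same extension of a load is free.
static_assert(isZExtFreeByValue({ToyOpcode::Load, {8}}, {32}), "");
// An i8 add result still needs a real extension.
static_assert(!isZExtFreeByValue({ToyOpcode::Add, {8}}, {32}), "");
```

The value-aware form can answer "free" in cases the type-only form cannot express, which is why the choice of where to record free casts matters for the cost model.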
2013 Sep 06
2
[LLVMdev] removing unnecessary ZEXT
Hi,
Within a basic block I can remove unnecessary register copies + zero sign extensions of unsigned-8bit-loaded values by implementing isZExtFree() for ISD::LOAD nodes.
...But not between basic blocks.
The first block does a CopyFromReg of the unsigned-8bit-loaded vreg1 into a new vreg2.
The second block then does an unnecessary zext to vreg2.
What I want is the 2nd block to use the original vreg1!
What I am getting is one extra register clo...
2013 Sep 10
0
[LLVMdev] removing unnecessary ZEXT
....uiuc.edu] on behalf of Robert Lytton [robert at xmos.com]
Sent: 06 September 2013 17:18
To: llvmdev at cs.uiuc.edu
Subject: [LLVMdev] removing unnecessary ZEXT
Hi,
Within a basic block I can remove unnecessary register copies + zero sign extensions of unsigned-8bit-loaded values by implementing isZExtFree() for ISD::LOAD nodes.
...But not between basic blocks.
The first block does a CopyFromReg of the unsigned-8bit-loaded vreg1 into a new vreg2.
The second block then does an unnecessary zext to vreg2.
What I want is the 2nd block to use the original vreg1!
What I am getting is one extra register clo...
2013 Sep 11
2
[LLVMdev] removing unnecessary ZEXT
...on [robert at xmos.com]
> Sent: 06 September 2013 17:18
> To: llvmdev at cs.uiuc.edu
> Subject: [LLVMdev] removing unnecessary ZEXT
>
> Hi,
>
> Within a basic block I can remove unnecessary register copies + zero sign extensions of unsigned-8bit-loaded values by implementing isZExtFree() for ISD::LOAD nodes.
> ...But not between basic blocks.
>
> The first block does a CopyFromReg of the unsigned-8bit-loaded vreg1 into a new vreg2.
> The second block then does an unnecessary zext to vreg2.
> What I want is the 2nd block to use the original vreg1!
> What I am get...
2013 Sep 11
0
[LLVMdev] removing unnecessary ZEXT
Hi Andrew,
Thank you for the suggestion.
I've looked at CodeGenPrepare.cpp and MoveExtToFormExtLoad() is never run.
I also notice that the ARM target produces the same additional register usage (a copy) and zero extension (of the copy).
(See the usage of r3 & r5 and also r12 & r4 in the attached file arm-strcspn.s; my understanding is that 'ldrb' is zero extending.)
Here is a