similar to: How to tell LLVM to treat Commutable library calls as such, for example multiplication?

Displaying 20 results from an estimated 10000 matches similar to: "How to tell LLVM to treat Commutable library calls as such, for example multiplication?"

2019 Jun 05
2
Strange behaviour of post-legalising optimisations(?)
I come across a situation that I am having a hard time to understand. When I compile the following code : char *tst( char *dest, const char *src, unsigned int len ) { for (int i=0 ; i<len ; i++) { dest[i] = src[i]; } return dest; } Clang generates this for the ‘for’ body: for.body: ; preds = %for.cond %arrayidx = getelementptr inbounds i8,
2019 Feb 08
2
Unfolded additions of constants after promotion of @llvm.ctlz.i16 on SystemZ
Hi, SystemZ supports @llvm.ctlz.i64() natively with a single instruction (FLOGR), and lesser bitwidth versions of the intrinsic are promoted to i64. For some reason, this leads to unfolded additions of constants as shown below: This function: define i16 @fun(i16 %arg) {   %1 = tail call i16 @llvm.ctlz.i16(i16 %arg, i1 false)   ret i16 %1 } ,gives this optimized DAG as input to instruction
2017 Jul 29
2
ISelDAGToDAG breaks node ordering
Hi, During instruction selection, I have the following code for certain LOAD instructions: const LoadSDNode *LD = cast<LoadSDNode>(N); SDNode* LDW = CurDAG->getMachineNode(AVR::LDWRdPtr, SDLoc(N), VT, PtrVT, MVT::Other, LD->getBasePtr(), LD->getChain()); // Honestly, I have no idea what this does, but other memory // accessing instructions
2016 Nov 03
2
rotl: undocumented LLVM instruction?
Is there any way to get it to delay this optimization where it goes from this: Initial selection DAG: BB#0 'bclr64:entry' SelectionDAG has 14 nodes: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1 t6: i64 = sub t4, Constant:i64<1> t7: i64 = shl Constant:i64<1>, t6
2016 Nov 03
3
rotl: undocumented LLVM instruction?
Setting the ISD::ROTL to Expand doesn't work? (via SetOperation) You could also do a Custom hook if that's what you're looking for. On Thu, Nov 3, 2016 at 5:12 PM, Phil Tomson <phil.a.tomson at gmail.com> wrote: > ... or perhaps to rephrase: > > In 3.9 it seems to be doing a smaller combine much sooner, whereas in 3.6 > it deferred that till later in the
2016 Nov 02
3
rotl: undocumented LLVM instruction?
We've recently moved our project from LLVM 3.6 to LLVM 3.9. I noticed one of our code generation tests is breaking in 3.9. The test is: ; RUN: llc < %s -march=xstg | FileCheck %s define i64 @bclr64(i64 %a, i64 %b) nounwind readnone { entry: ; CHECK: bclr r1, r0, r1, 64 %sub = sub i64 %b, 1 %shl = shl i64 1, %sub %xor = xor i64 %shl, -1 %and = and i64 %a, %xor ret i64
2019 Jun 10
2
Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL
Hi Eli, Thanks for pointing to the CTLZ_ZERO_UNDEF “LibCall” implementation. I have not it in the version that I am currently using, so it’s nice to know that it’s implemented now. Incidentally, the CTLZ… implementation is IDENTICAL to what I am proposing for the Shifts. This is not just adding support for “out-of-tree-targets”, but giving consistency to the fact that we have perfectly defined
2018 Jan 18
0
[RFC] Half-Precision Support in the Arm Backends
I would like to revive this thread, as I am struggling a lot with the FP16 implementation in the ARM backend. My implementation in https://reviews.llvm.org/D38315 is finished (except one case), but a more robust alternative implementation was suggested. One can indeed argue that my current implementation is a bit fragile, because it involves manually patching up the isel dags for a few cases. The
2019 Jun 10
2
Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL
LLVM appears to support Library functions for ISD::SRA ,ISD::SHL, and ISD::SRL, as they are properly defined in RuntimeLibCalls.def. The library functions defined in RuntimeLibCalls.def (among others) are these: HANDLE_LIBCALL(SRA_I16, "__ashrhi3") HANDLE_LIBCALL(SRA_I32, "__ashrsi3") HANDLE_LIBCALL(SRA_I64, "__ashrdi3") However, setting
2016 Nov 03
2
rotl: undocumented LLVM instruction?
One option may be to prevent the formation of ROTL, if possible, and then generating rol by hand. Marking it as "expand" would likely stop the DAG combiner from creating it. Then you could "preprocess" the selection DAG before the instruction selection and do the pattern matching yourself. -Krzysztof On 11/3/2016 4:24 PM, Phil Tomson via llvm-dev wrote: > I could try
2016 Jun 21
3
LLVM Backend Issues
Hi, I am having issues running a new backend that I created for a new architecture. I suspect these errors may have something to do with how I have the string setup in LLVMTargetMachine() below? Also - It would be great if someone could point me to a document that describes some of these error messages? For example what does t26 ..t4 mean? Thanks in advance for taking your valuable time to help
2018 Jan 18
1
[RFC] Half-Precision Support in the Arm Backends
Hi Sjoerd, For ISel, I think having a separate register class will give you less headache. I wondering if you could get away with not touching the instructions descriptions at all, instead defining external pattens for the FullFP16 case, like so: def VCVTBHS: ASuI<0b11101, 0b11, 0b0010, 0b01, 0, (outs SPR:$Sd), (ins SPR:$Sm), IIC_fpCVTSH, "vcvtb",
2016 Dec 26
2
[SDAG] Recovering pointer types
I am wondering if there is a good/easy way to recover the original type of a pointer parameter in the SDAG. Here's the problem that I am dealing with: define <4 x i32> @test(i32* nocapture readonly %a) local_unnamed_addr #0 { entry: %0 = bitcast i32* %a to <4 x i32>* %1 = load <4 x i32>, <4 x i32>* %0, align 16, !tbaa !2 ret <4 x i32> %1 } The problem is
2017 Sep 21
1
VSelect Instruction Error
Hello, I am getting this error. What instruction is required to be implemented? LLVM ERROR: Cannot select: t22: v32i32 = vselect t724, t11, t16 t724: v32i32,ch = load<LD128[FixedStack1]> t723, FrameIndex:i64<1>, undef:i64 t659: i64 = FrameIndex<1> t10: i64 = undef t11: v32i32,ch = load<LD128[%sunkaddr45](align=4)(tbaa=<0x481f1e8>)> t0, t8, undef:i64
2017 Sep 14
2
Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
Hi All, I have a question about splitting 'EXTRACT_VECTOR_ELT' with 'v2i1'. I have a llvm IR code snippet as following: llvm IR code snippet: for.body: ; preds = %entry, %for.cond %i.022 = phi i32 [ 0, %entry ], [ %inc, %for.cond ] %0 = icmp ne <2 x i32> %vecinit1, <i32 0, i32 -23> %1 = extractelement <2 x i1>
2016 Dec 26
0
[SDAG] Recovering pointer types
On 26 Dec 2016, at 14:58, Nemanja Ivanovic via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > I am wondering if there is a good/easy way to recover the original type of a pointer parameter in the SDAG. Here's the problem that I am dealing with: > > define <4 x i32> @test(i32* nocapture readonly %a) local_unnamed_addr #0 { > entry: > %0 = bitcast i32* %a to
2019 Jun 11
2
Bug: Library functions for ISD::SRA, ISD::SHL, and ISD::SRL
Hi Eli, First of all, please I would appreciate that you try to not confuse my limited use of English with stupidity or lack or criteria in other subjects. I’m not English native, so please keep that in mind. You have been significantly helpful in the recent past so please keep on. Interestingly, you made a mention of a related but not identical issue. It is true that most (or all) processors
2016 Aug 02
2
Instruction selection problems due to SelectionDAGBuilder
Hello. I'm having problems at instruction selection with my back end with the following basic-block due to a vector add with immediate constant vector (obtained by vectorizing a simple C program doing vector sum map): vector.ph: ; preds = %vector.memcheck50 %.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0
2017 Sep 15
2
Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'
> extends the elements to 8bit and stores them on stack. Store is responsible for zero-extend. This is the policy... - Elena -----Original Message----- From: jingu at codeplay.com [mailto:jingu at codeplay.com] Sent: Friday, September 15, 2017 17:45 To: llvm-dev at lists.llvm.org; Demikhovsky, Elena <elena.demikhovsky at intel.com>; daniel_l_sanders at apple.com Subject: Re: Question
2019 Jun 05
2
Optimizing Compare instruction selection
Hi Eli, Thanks again for your reply. I am unsure about implementing the getCrossCopyRegClass for my target. My target does not support or allow moves to and from the SR. The SR exists because it has implicit involvement in some instructions, but it is opaque to the assembler and to the user as a register. I mean, there are no instructions to directly move or read it, or even access it directly.