thr3ads.net - similar to: "Ensuring chain dependencies with expansion to libcalls"

Displaying 20 results from an estimated 2000 matches similar to: "Ensuring chain dependencies with expansion to libcalls"

Instruction selection problems due to SelectionDAGBuilder

2016 Aug 02

Instruction selection problems due to SelectionDAGBuilder

Hello. I'm having problems at instruction selection with my back end with the following basic-block due to a vector add with immediate constant vector (obtained by vectorizing a simple C program doing vector sum map): vector.ph: ; preds = %vector.memcheck50 %.splatinsert = insertelement <8 x i64> undef, i64 %i.07.unr, i32 0

How to constraint instructions reordering from patterns?

2018 May 04

How to constraint instructions reordering from patterns?

Hi, Is there a kind of scope mechanism in the instruction lowering pattern language in order to control where instructions are inserted or how they are later reordered during the SelectionDiag linearization? I know the glue chain that stick instructions together. But such mechanism in not provided in instruction lowering pattern. I'm facing many situations where some patterns are lowered into

How to constraint instructions reordering from patterns?

2018 May 04

How to constraint instructions reordering from patterns?

The DAG dumping will try to print some of the nodes "inline" (i.e. where they are used) to make the output more readable, so the dump of the DAG may not strictly reflect the node ordering. -Krzysztof On 5/4/2018 8:18 AM, Dominique Torette via llvm-dev wrote: > Here is a last example to illustrate my concern. > > The problem is about the lowering of node t13. > >

How to constraint instructions reordering from patterns?

2018 May 04

How to constraint instructions reordering from patterns?

Here is a last example to illustrate my concern. The problem is about the lowering of node t13. Initial selection DAG: BB#0 '_start:entry' SelectionDAG has 44 nodes: t11: i16 = Constant<0> t0: ch = EntryToken t3: ch = llvm.clp.set.rspa t0, TargetConstant:i16<392>, Constant:i32<64> t5: ch = llvm.clp.set.rspb t3,

How to constraint instructions reordering from patterns?

2018 May 04

How to constraint instructions reordering from patterns?

Krzysztof, Thanks for your interest to my questions. In order to clarify the context, here is the C source file of my test case. The 3 builtins initialize some stack pointers. They have to be executed before any other instruction. extern float fdivfaddfmul_a(float a, float b, float c, float d); volatile static float x1,x2,x3,x4; void _start(void) { float res;

rotl: undocumented LLVM instruction?

2016 Nov 03

rotl: undocumented LLVM instruction?

Is there any way to get it to delay this optimization where it goes from this: Initial selection DAG: BB#0 'bclr64:entry' SelectionDAG has 14 nodes: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t4: i64,ch = CopyFromReg t0, Register:i64 %vreg1 t6: i64 = sub t4, Constant:i64<1> t7: i64 = shl Constant:i64<1>, t6

rotl: undocumented LLVM instruction?

2016 Nov 03

rotl: undocumented LLVM instruction?

Setting the ISD::ROTL to Expand doesn't work? (via SetOperation) You could also do a Custom hook if that's what you're looking for. On Thu, Nov 3, 2016 at 5:12 PM, Phil Tomson <phil.a.tomson at gmail.com> wrote: > ... or perhaps to rephrase: > > In 3.9 it seems to be doing a smaller combine much sooner, whereas in 3.6 > it deferred that till later in the

rotl: undocumented LLVM instruction?

2016 Nov 02

rotl: undocumented LLVM instruction?

We've recently moved our project from LLVM 3.6 to LLVM 3.9. I noticed one of our code generation tests is breaking in 3.9. The test is: ; RUN: llc < %s -march=xstg | FileCheck %s define i64 @bclr64(i64 %a, i64 %b) nounwind readnone { entry: ; CHECK: bclr r1, r0, r1, 64 %sub = sub i64 %b, 1 %shl = shl i64 1, %sub %xor = xor i64 %shl, -1 %and = and i64 %a, %xor ret i64

rotl: undocumented LLVM instruction?

2016 Nov 03

rotl: undocumented LLVM instruction?

One option may be to prevent the formation of ROTL, if possible, and then generating rol by hand. Marking it as "expand" would likely stop the DAG combiner from creating it. Then you could "preprocess" the selection DAG before the instruction selection and do the pattern matching yourself. -Krzysztof On 11/3/2016 4:24 PM, Phil Tomson via llvm-dev wrote: > I could try

[SelectionDAG] Assertion due to MachineMemOperand flags difference.

2017 Oct 13

[SelectionDAG] Assertion due to MachineMemOperand flags difference.

Hello, I've hit an assertion in SelectionDAG where we try to merge 2 loads that have the same operands but their MMO flags differ. One is dereferenceable and one is not. I'm not sure what the underlying issue here is: 1) MDSDNode with the same operands should have the same flags set on their respective MMO. The fact the flags differ when the opcode,types,operands and address-space are

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

2017 Sep 14

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

Hi All, I have a question about splitting 'EXTRACT_VECTOR_ELT' with 'v2i1'. I have a llvm IR code snippet as following: llvm IR code snippet: for.body: ; preds = %entry, %for.cond %i.022 = phi i32 [ 0, %entry ], [ %inc, %for.cond ] %0 = icmp ne <2 x i32> %vecinit1, <i32 0, i32 -23> %1 = extractelement <2 x i1>

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

2017 Sep 15

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

> extends the elements to 8bit and stores them on stack. Store is responsible for zero-extend. This is the policy... - Elena -----Original Message----- From: jingu at codeplay.com [mailto:jingu at codeplay.com] Sent: Friday, September 15, 2017 17:45 To: llvm-dev at lists.llvm.org; Demikhovsky, Elena <elena.demikhovsky at intel.com>; daniel_l_sanders at apple.com Subject: Re: Question

COPYs between register classes

2020 Feb 22

COPYs between register classes

Hi, On SystemZ there are a set of "access registers" that can be copied in and out of 32-bit GPRs with special instructions. These instructions can only perform the copy using low 32-bit parts of the 64-bit GPRs. As reported and discussed at https://bugs.llvm.org/show_bug.cgi?id=44254, this is currently broken due to the fact that the default register class for 32-bit integers is

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

2017 Sep 17

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

Please open a bugzilla ticket and attach your testcase. It will allow us to debug and fix the problem. Thanks - Elena From: JinGu [mailto:jingu at codeplay.com] Sent: Saturday, September 16, 2017 00:38 To: Demikhovsky, Elena <elena.demikhovsky at intel.com>; daniel_l_sanders at apple.com <daniel_l_sanders at apple.com>; Jon Chesterfield <jonathanchesterfield at

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

2017 Sep 18

Question about 'DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT'

> so I think we need to use non-extending load for element size less than 8bit on "DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT" like this roughly. > if (N->getOperand(0).getValueType().getVectorElementType().getSizeInBits() < 8) { > return DAG.getLoad(N->getValueType(0), dl, Store, StackPtr, MachinePointerInfo()); > } else { > return

Strange behaviour of post-legalising optimisations(?)

2019 Jun 05

Strange behaviour of post-legalising optimisations(?)

I come across a situation that I am having a hard time to understand. When I compile the following code : char *tst( char *dest, const char *src, unsigned int len ) { for (int i=0 ; i<len ; i++) { dest[i] = src[i]; } return dest; } Clang generates this for the ‘for’ body: for.body: ; preds = %for.cond %arrayidx = getelementptr inbounds i8,

[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs

2019 Dec 09

[PATCH] D70246: [InstCombine] remove identity shuffle simplification for mask with undefs

Sanjay, I'm looking at some missed optimizations caused by D70246. Here's a test case: define <4 x float> @f(i32 %t32, <4 x float>* %t24) { .entry: %t43 = insertelement <3 x i32> undef, i32 %t32, i32 2 %t44 = bitcast <3 x i32> %t43 to <3 x float> %t45 = shufflevector <3 x float> %t44, <3 x float> undef, <4 x i32> <i32 0, i32 undef,

Help with ISEL matching for an SDAG

2016 Jul 29

Help with ISEL matching for an SDAG

I have the following selection DAG: SelectionDAG has 9 nodes: t0: ch = EntryToken t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0 t16: i32,ch = load<LD1[%ptr](tbaa=<0x10023c9f448>), anyext from i8> t0, t2, undef:i64 t15: v16i8 = BUILD_VECTOR t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16, t16 t11: ch,glue = CopyToReg t0, Register:v16i8 %V2, t15

Why am I getting FrameIndex:i64<0> when I have no i64's?

2017 Nov 02

Why am I getting FrameIndex:i64<0> when I have no i64's?

Here's the IR I'm trying to compile for my backend, a 16-bit CPU: ; ModuleID = 'foo.c' source_filename = "foo.c" target datalayout = "E-m:e-p16:16:16-i1:16:16-i8:16:16-i16:16:16-i32:16:16-i64:16:16-S16-n16" target triple = "tms9900" @global_var = common global i16 0, align 2 ; Function Attrs: noinline nounwind optnone define signext i16 @dothis(i16

llc gives Segmentation fault at instruction selection [was Re: Instruction selection gives "LLVM ERROR: Cannot select"]

2016 Feb 04

llc gives Segmentation fault at instruction selection [was Re: Instruction selection gives "LLVM ERROR: Cannot select"]

Hello, Tim, Thank you for your advice. Indeed, the problem with "LLVM ERROR: Cannot select" was a false predicate that should have been true. I solved the problem by simply making the C++ function implementing the TableGen predicate used in my store instruction (very similar to the selectIntAddrMSA predicate from the Mips back end) return true instead of false. But

similar to: Ensuring chain dependencies with expansion to libcalls