similar to: [LLVMdev] 16bit loads being promoted to 32bit?

Displaying 20 results from an estimated 2000 matches similar to: "[LLVMdev] 16bit loads being promoted to 32bit?"

2009 Feb 13
0
[LLVMdev] 16bit loads being promoted to 32bit?
On Thu, Feb 12, 2009 at 4:53 PM, Villmow, Micah <Micah.Villmow at amd.com> wrote: > The > problem that I am having is somewhere along the line the 16bit load is being > promoted to a 32bit load For the given testcase, that's clearly illegal. Either there's a serious bug in LLVM, or you're misinterpreting the meaning of the DAG. Are you sure you aren't seeing a
2015 Mar 03
3
[LLVMdev] ReduceLoadWidth, DAGCombiner and non 8bit loads/extloads question.
I'm curious about this code in ReduceLoadWidth (and in DAGCombiner in general): if (LegalOperations && !TLI.isLoadExtLegal(ExtType, ExtVT)) return SDValue <http://llvm.org/docs/doxygen/html/classllvm_1_1SDValue.html>(); LegalOperations is false for the first pre-legalize pass and true for the post-legalize pass. The first pass is target-independent yes? So that makes sense.
2010 Sep 29
0
[LLVMdev] spilling & xmm register usage
On Sep 29, 2010, at 8:35 AMPDT, Ralf Karrenberg wrote: > Hello everybody, > > I have stumbled upon a test case (the attached module is a slightly > reduced version) that shows extremely reduced performance on linux > compared to windows when executed using LLVM's JIT. > > We narrowed the problem down to the actual code being generated, the > source IR on both systems
2017 Mar 02
5
Structurizing multi-exit regions
Hi, I'm trying to solve a problem from StructurizeCFG not actually handling regions with multiple exits. Sample IR attached. StructurizeCFG doesn't touch this function, exiting early on the isTopLevelRegion check. SIAnnotateControlFlow then gets confused and ends up inserting an if into one of the blocks, and the matching end.cf into one of the return/unreachable blocks. The input to
2013 Nov 08
3
[LLVMdev] Loads moving across barriers
Hi, For a long time we've been having a problem we've been working around in OpenCL where loads are moving across an intrinsic used for a barrier. Attached is the testcase, and the result of opt -S -basicaa -gvn on it. This example is essentially this: void foo(global float2* result, local float2* restrict data0, ...) { int id = get_local_id(0); // ... data0[id] = ...;
2015 Mar 03
2
[LLVMdev] ReduceLoadWidth, DAGCombiner and non 8bit loads/extloads question.
1) It's crashing because LD1 is produced due to LegalOperations=false in pre-legalize pass. Then Legalization does not know how to handle it so it asserts on a default case. I don't know if it's a reasonable expectation or not but we do not have support for it. I have not tried overriding shouldReduceLoadWidth. 2) I see, that makes sense to some degree, I'm curious if you can
2011 Sep 02
4
[LLVMdev] Some questions on SelectionDAG
Hi, all I am studying the ARM backend on SelectionDAG, I have some following questions: 1. Each operator of SDNode in SelectionDAG is required to be defined by SDNode<ISD::XXX,XXX,XXX> in .td file, right? But several operators are not defined in .td file, why? (e.g., ISD::BR_CC, ISD::CopyToReg, ISD::AssertSext) 2. The MVT::glue value is used to ensure two nodes are scheduled
2013 Mar 04
1
[LLVMdev] Custom Lowering of ARM zero-extending loads
Hi, For my research, I need to reshape the current ARM backend to support armv2a. Zero-extend half word load (ldrh) is not supported by armv2a, so I need to make the code generation to not generate ldrh instructions. I want to replace all those instances with a 32-bit load (ldr) and then and the result with 0xffff to mask out the upper bits. These are the modifications that I have made to
2016 May 24
1
BitcodeReader non explicit error
Hi, I'm working on OpenCL and I'm using clang as compiler (based on clang 3.7.0). I have a issue, I'm generating a bitcode file (that I can print before before the generation). But when I'm trying to read it again with clang, I have this issue: "error: Invalid record" How can I managed to know where it comes from? Thank you, Romaric Here is what is print before the
2014 Dec 05
3
[LLVMdev] Question on equivalence of pointer types
Is copy.0 semantically equivalent to copy.1 in the following example? define void @copy.0(i8 addrspace(1)* addrspace(1)* %src, i8 addrspace(1)* addrspace(1)* %dst) { entry: %val = load i8 addrspace(1)* addrspace(1)* %src store i8 addrspace(1)* %val, i8 addrspace(1)* addrspace(1)* %dst ret void } define void @copy.1(i8 addrspace(1)* addrspace(1)* %src, i8 addrspace(1)* addrspace(1)* %dst)
2014 Dec 09
2
[LLVMdev] Question on equivalence of pointer types
> On Dec 8, 2014, at 5:12 PM, Sanjoy Das <sanjoy at playingwithpointers.com> wrote: > > Partially answering my own question, in general these are not > equivalent because LLVM allows for pointers in different address > spaces to have different sizes. However, are they equivalent if > pointers in addrspace(1) have the same size as pointers in > addrspace(0)? > >
2017 Feb 13
2
RFC: Representing unions in TBAA
Hello all, I'm new to the llvm community. I'm learning how things work. I noticed that there has been some interest in improving how unions are handled. Bug 21725 is one example. I figured it might be a interesting place to start. I discussed this with a couple people, and below is a suggestion on how to represent unions. I would like some comments on how this fits in with how
2008 Oct 06
3
[LLVMdev] Address calculation
I am attempting to get indexing code generation working with my backend. However, it seems that the addresses being calculated is being multiplied by the width of the data type. define void @ test_input_index_constant_int(i32 addrspace(11)* %input, i32 addrspace(11)* %result) { entry: %input.addr = alloca i32 addrspace(11)* ; <i32 addrspace(11)**> [#uses=2]
2014 Oct 03
2
[LLVMdev] Weird problems with cos (was Re: [PATCH v3 2/3] R600: Add carry and borrow instructions. Use them to implement UADDO/USUBO)
Hi Tom, Matt, I'm running into strange issues with the cos test (piglit generated_tests/cl/builtin/math/builtin-float-cos-1.0.generated.c) I have been seeing random failures (incorrect results) for some time and tried to investigate. the weird part is that the failures are not 100% reproducible, sometimes the tests pass, or partly pass (it's usually float8 and float16 subtests that
2013 Nov 13
2
[LLVMdev] SCEV getMulExpr() not propagating Wrap flags
Hi folks, I'm trying to analyse this piece of IR: for.body: ; preds = %for.body, %entry %indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ] %0 = shl nsw i64 %indvars.iv, 1 %arrayidx = getelementptr inbounds i32* %b, i64 %0 %1 = load i32* %arrayidx, align 4, !tbaa !1 %add = add nsw i32 %1, %I %arrayidx3 = getelementptr
2017 Jan 03
2
Optimisation passes introducing address space casts
OK, I’ve hit one more existing regression test that I’m weary of: define void @test2_addrspacecast() { %A = alloca %T %B = alloca %T %a = addrspacecast %T* %A to i8 addrspace(1)* %b = addrspacecast %T* %B to i8 addrspace(1)* call void @llvm.memcpy.p1i8.p0i8.i64(i8 addrspace(1)* %a, i8* bitcast (%T* @G to i8*), i64 124, i32 4, i1 false) call void
2016 Mar 28
0
RFC: atomic operations on SI+
On Fri, Mar 25, 2016 at 02:22:11PM -0400, Jan Vesely wrote: > Hi Tom, Matt, > > I'm working on a project that needs few coherent atomic operations (HSA > mode: load, store, compare-and-swap) for std::atomic_uint in HCC. > > the attached patch implements atomic compare and swap for SI+ > (untested). I tried to stay within what was available, but there are > few issues
2018 Sep 25
2
byval argument causes llvm to crash after inlining
Hello, With the following reduced test case, cmd "opt -always-inline t.ll" crashes after inlining. Notice that byval argument %a will be remapped to %1 below, and consequently produces an illegal store. %1 = alloca i32, align 4 store i32 * %1, i32 addrspace(1)** %a.addr, align 8 Looks like Inliner assumes that byval arguments are from address space 0. Or this is just a bug in inliner?
2018 Sep 25
2
byval argument causes llvm to crash after inlining
It is problematic when byval argument is not from address space 0. When the default alloca address space is 0, should we consider this IR illegal? define internal i32 @bar(i32 addrspace(1)* byval %a) alwaysinline From: Reid Kleckner [mailto:rnk at google.com] Sent: Tuesday, September 25, 2018 2:38 PM To: Pan, Wei <wei.pan at intel.com> Cc: llvm-dev <llvm-dev at lists.llvm.org>
2019 Jun 21
2
Using store with operands in non-zero address space
Hello, LLVM devs. I have the following IR: %x = alloca i32, align 4 %p = alloca i32*, align 8 store i32* %x, i32** %p, align 8 Now I change module's data layout and run InferAddressSpacePass. This turns that piece of code into %x = alloca i32, align 4, addrspace(1) %p = alloca i32*, align 8, addrspace(1) store i32 addrspace(1)* %x, i32* addrspace(1)* %p, align 8 But Verifier