similar to: RFC: SROA for method argument

Displaying 20 results from an estimated 5000 matches similar to: "RFC: SROA for method argument"

2018 Mar 08
1
[RFC] jump threading on std::pair<int, bool>
Hi, While comparing the code generated by LLVM and GCC for some major libraries, I found that LLVM fails to apply jump threading with a method whose return type is std::pair<int, bool> (actually, any pair of 32-bit values like std::pair<bool, int> and std::pair<int, int>). For example, jump threading does not work for the if statement in func. std::pair<int, bool> callee(int v) {
2015 Mar 25
2
[LLVMdev] Optimization puzzle...
Hi everyone, I am wondering what's stopping the LLVM optimizer (opt -O3) from eliminating the apparently useless « icmp sgt » instruction in the following piece of LLVM IR. > ; ModuleID = 'lambda-opt.bc' > target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128" > target triple = "x86_64-apple-macosx10.10.0" > > ; Function
2015 Mar 25
3
[LLVMdev] Optimization puzzle...
Here's a version that doesn't try to do block deletion on its own. If you use -adce then -simplifycfg, you get what you want. It passes all tests except one, which is that we delete an invoke of a pure function, i.e. Transforms/ADCE/dce_pure_invoke.ll - I'm not sure why that's bad. The reason we delete it is because it returns false to I.mayHaveSideEffects(), and in particular,
2016 Dec 12
0
RFC: Adding argument allocas
On Fri, Dec 9, 2016 at 4:04 PM, James Y Knight <jyknight at google.com> wrote: > IMO, the LLVM function definitions should be a straightforward > transformation from the C function signatures, and clang should stop > mangling the function signatures with its own intimate knowledge of the > calling convention rules. > > Instead, clang could emit (still ABI-specific!)
2017 Dec 06
2
[AMDGPU] Strange results with different address spaces
> On Dec 6, 2017, at 02:28, Haidl, Michael <michael.haidl at uni-muenster.de> wrote: > > The IR goes through a backend agnostic preparation phase that brings it into SSA form and changes the AS from 0 to 1. This sounds possibly problematic to me. The IR should be created with the correct address space to begin with. Changing this in the middle sounds suspect. > After this
2013 Aug 12
2
[LLVMdev] [RFC] Poor code generation for paired load
Hi Eli, Thanks for the feedback. On Aug 9, 2013, at 8:00 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> wrote: >> Hi, >> >> I am investigating poor code generation on x86-64 involving a 64-bit >> structure with two 32-bit fields (in the attached examples float, but
2013 Aug 09
2
[LLVMdev] [RFC] Poor code generation for paired load
Hi, I am investigating poor code generation on x86-64 involving a 64-bit structure with two 32-bit fields (in the attached examples float, but similar behavior is exposed with i32, and we can probably generalize that to smaller types too). The root cause of the problem is in SROA, although I am not sure we should fix something there. That is why I need your advice. ** Problem ** 64-bits
2016 Dec 10
3
RFC: Adding argument allocas
On Fri, Dec 9, 2016 at 1:30 PM, Friedman, Eli via llvm-dev < llvm-dev at lists.llvm.org> wrote: > On 12/9/2016 8:45 AM, Reid Kleckner wrote: > > On Thu, Dec 8, 2016 at 5:37 PM, Mehdi Amini <mehdi.amini at apple.com> wrote: > >> So IIUC basically the *only* reason for this IR change is that we don’t >> want to pattern match in debug build? >> I don't
2013 Aug 10
0
[LLVMdev] [RFC] Poor code generation for paired load
On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi, > > I am investigating poor code generation on x86-64 involving a 64-bit > structure with two 32-bit fields (in the attached examples float, but > similar behavior is exposed with i32, and we can probably generalize that to > smaller types too). > The root cause of the problem is
2013 Aug 12
0
[LLVMdev] [RFC] Poor code generation for paired load
On Mon, Aug 12, 2013 at 9:59 AM, Quentin Colombet <qcolombet at apple.com> wrote: > Hi Eli, > > Thanks for the feedback. > > On Aug 9, 2013, at 8:00 PM, Eli Friedman <eli.friedman at gmail.com> wrote: > > On Fri, Aug 9, 2013 at 4:58 PM, Quentin Colombet <qcolombet at apple.com> > wrote: > > Hi, > > I am investigating poor code generation on
2017 Dec 05
2
[AMDGPU] Strange results with different address spaces
> On Dec 5, 2017, at 13:53, Matt Arsenault <arsenm2 at gmail.com> wrote: > > > >> On Dec 5, 2017, at 02:51, Haidl, Michael via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Hi dev list, >> >> I am currently exploring the integration of AMDGPU/ROCm into the PACXX project and observing some
2015 Nov 10
2
SROA and volatile memcpy/memset
Hi, I have a customer testcase where SROA splits a volatile memcpy and we end up generating bad code[1]. While this looks like a bug, simply preventing SROA from splitting volatile memory intrinsics causes basictest.ll for SROA to fail. Not only that, but it also seems like handling of volatile memory transfers was done with some intent. What are the design decisions in SROA regarding
2015 Nov 10
4
SROA and volatile memcpy/memset
On 11/10/2015 1:07 PM, Joerg Sonnenberger via llvm-dev wrote: > On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via llvm-dev wrote: >> I have a customer testcase where SROA splits a volatile memcpy and we end up >> generating bad code[1]. While this looks like a bug, simply preventing SROA >> from splitting volatile memory intrinsics causes basictest.ll for SROA
2015 Nov 11
2
SROA and volatile memcpy/memset
On 11/11/2015 8:53 AM, Hal Finkel wrote: > > SROA seems to be doing a number of things here. What about if we prevented SROA from generating multiple slices splitting volatile accesses? There might be a significant difference between that and something like this test (test/Transforms/SROA/basictest.ll): > > define i32 @test6() { > ; CHECK-LABEL: @test6( > ; CHECK: alloca i32 >
2013 Nov 24
1
[LLVMdev] wrong code generation for memcpy function in SROA optimization pass
The SROA optimization pass performs some optimizations and transforms on the memcpy function, such as ld/st operations. When someone writes code with size > sizeof(dest) in memcpy(*dest, *src, size), wrong code is very likely to be generated. For example, consider this test case: int main() { char ch; short sh = 0x1234; memcpy(&ch,&sh,2); printf("ch=0x%02x\n",ch); } At
2015 Nov 11
4
SROA and volatile memcpy/memset
On 11/11/2015 9:28 AM, Chandler Carruth wrote: > So, here is the model that LLVM is using: a volatile memcpy is lowered > to a loop of loads and stores of indeterminate width. As such, splitting > a memcpy is always valid. > > If we want a very specific load and store width for volatile accesses, I > think that the frontend should generate concrete loads and stores of a > type
2012 Sep 22
2
[LLVMdev] Heads up! New SROA implementation is going on-by-default today!
After a lot of testing and help from Duncan, Benjamin, Joerg and others, I think the new SROA is ready for some broader testing. I've fixed all the crashers and miscompiles that Duncan and Joerg have been able to find (although I'm sure there are a few left I'll tackle when there are reports), and the LNT numbers look *really* good. Here is the latest LNT run we got by flipping it on
2018 May 30
4
[SROA][DebugInfo][GSoC] Testing SROA on amalgamated sqlite source
Introduction ============ `SROA' is an early stage pass running at the very beginning of the pipeline in `-O{1,2,3}'. Greg Bedwell's report from his DExTer tool shows SROA on function as one of the major culprits of Debug Info loss. With debugify-each partially done I tried testing this on the amalgamated sqlite source. The steps are as follows: ,---- | # generate
2015 Nov 11
2
SROA and volatile memcpy/memset
On 11/11/2015 9:36 AM, Hal Finkel wrote: > ----- Original Message ----- >> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org> >> >> Yeah, the remark about devices I made in my post was a result of a >> "last-minute" thought to add some rationale. It doesn't actually >> apply >> to SROA, since there are no devices that are
2012 Nov 14
4
[LLVMdev] About a problem in SROA
Hi, For the following case, $ cat bad1.ll target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:64:128-a0:0:64-n32-S64" define internal void @test(i32 %v) { entry: %tmp = alloca i32, align 4 store i32 %v, i32* %tmp, align 4 %0 = bitcast i32* %tmp to <2 x i8>* %1 = load <2 x i8>* %0, align 4 ret void } I