thr3ads.net - search: "sroa"

RFC: A change in InstCombine canonical form

2016 Mar 16

3

...ug https://llvm.org/bugs/show_bug.cgi?id=26445) IR contains code for loading a float from float * and storing it to a float * address. After canonicalization of load in InstCombine [1], new bitcasts are added to the IR (see bottom of the email for code samples). This prevents select speculation in SROA to work. Also after SROA we have bitcasts from int32 to float. (Whereas originally after instCombine, bitcasts are only done on pointer types). === PROPOSED SOLUTION=== [1] implies that we need load canonicalization when we load a value only to store it again. The reason is to avoid generating sl...

[LLVMdev] Weird volatile propagation ?

2013 Jan 18

2

...-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.R = type { i16, i16 } @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8 define void @test(i16 zeroext %a) nounwind uwtable { %r.sroa.0 = alloca i16, align 2 %r.sroa.1 = alloca i16, align 2 store i16 %a, i16* %r.sroa.0, align 2 store i16 1, i16* %r.sroa.1, align 2 %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2 store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), align 32 %r.sroa.1.0.load2...

[LLVMdev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)

2013 Jan 20

0

...-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.R = type { i16, i16 } @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8 define void @test(i16 zeroext %a) nounwind uwtable { %r.sroa.0 = alloca i16, align 2 %r.sroa.1 = alloca i16, align 2 store i16 %a, i16* %r.sroa.0, align 2 store i16 1, i16* %r.sroa.1, align 2 %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2 store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), align 32 %r.sroa.1.0.load2...

RFC: SROA for method argument

2017 May 09

3

Hi, I am working to improve SROA to generate better code when a method has a struct in its arguments. I would appreciate it if I could have any suggestions or comments on how I can best proceed with this optimization. * Problem * I observed that LLVM often generates redundant instructions around glibc’s istreambuf_iterator. The p...

RFC: A change in InstCombine canonical form

2016 Mar 16

2

.../show_bug.cgi?id=26445) > > IR contains code for loading a float from float * and storing it to a > float * address. After canonicalization of load in InstCombine [1], new > bitcasts are added to the IR (see bottom of the email for code samples). > This prevents select speculation in SROA to work. Also after SROA we have > bitcasts from int32 to float. (Whereas originally after instCombine, > bitcasts are only done on pointer types). > > === PROPOSED SOLUTION=== > > [1] implies that we need load canonicalization when we load a value only > to store it again. The...

[SROA][DebugInfo][GSoC] Testing SROA on amalgamated sqlite source

2018 May 30

0

> On May 30, 2018, at 8:37 AM, Anast Gramm <anastasis.gramm2 at gmail.com> wrote: > > Introduction > ============ > > `SROA' is an early stage pass running at the very beginning of the > pipeline in `-O{1,2,3}'. Greg Bedwell's report from his DExTer tool > shows SROA on function as one of the major culprits of Debug Info > loss. > > With debugify-each partially done I tried testing this o...

SROA and volatile memcpy/memset

2015 Nov 10

4

On 11/10/2015 1:07 PM, Joerg Sonnenberger via llvm-dev wrote: > On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via llvm-dev wrote: >> I have a customer testcase where SROA splits a volatile memcpy and we end up >> generating bad code[1]. While this looks like a bug, simply preventing SROA >> from splitting volatile memory intrinsics causes basictest.ll for SROA to >> fail. Not only that, but it also seems like handling of volatile memory >>...

[LLVMdev] [cfe-dev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)

2013 Jan 20

2

...orry I hadn't seen this, this seems like an easy and simple deficiency in the IR intrinsic for memcpy. See below. On Sun, Jan 20, 2013 at 1:42 PM, Arnaud de Grandmaison < arnaud.allarddegrandmaison at parrot.com> wrote: > define void @test(i16 zeroext %a) nounwind uwtable { > %r.sroa.0 = alloca i16, align 2 > %r.sroa.1 = alloca i16, align 2 > store i16 %a, i16* %r.sroa.0, align 2 > store i16 1, i16* %r.sroa.1, align 2 > %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2 > store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), >...

[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

2013 Jan 28

4

...-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %struct.R = type { i16, i16 } @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8 define void @test(i16 zeroext %a) nounwind uwtable { %r.sroa.0 = alloca i16, align 2 %r.sroa.1 = alloca i16, align 2 store i16 %a, i16* %r.sroa.0, align 2 store i16 1, i16* %r.sroa.1, align 2 %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2 store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), align 32 %r.sroa.1.0.load2...

[LLVMdev] wrong code generation for memcpy function in SROA optimization pass

2013 Nov 24

1

SROA optimization pass did some optimizations and transforms for memcpy function, such as ld/st operations. When someone has written down code like size>sizeof(dest) in memcpy(*dest,*src,size), there was much likely a wrong code generation. For example, considered as such testcase: int main() { char ch...

RFC: A change in InstCombine canonical form

2016 Mar 16

3

...>>> IR contains code for loading a float from float * and storing it to a >>> float * address. After canonicalization of load in InstCombine [1], new >>> bitcasts are added to the IR (see bottom of the email for code samples). >>> This prevents select speculation in SROA to work. Also after SROA we have >>> bitcasts from int32 to float. (Whereas originally after instCombine, >>> bitcasts are only done on pointer types). >>> >>> === PROPOSED SOLUTION=== >>> >>> [1] implies that we need load canonicalization when...

SROA and volatile memcpy/memset

2015 Nov 10

2

Hi, I have a customer testcase where SROA splits a volatile memcpy and we end up generating bad code[1]. While this looks like a bug, simply preventing SROA from splitting volatile memory intrinsics causes basictest.ll for SROA to fail. Not only that, but it also seems like handling of volatile memory transfers was done with some int...

RFC: A change in InstCombine canonical form

2016 Mar 22

0

...contains code for loading a float from float * and storing it to a >>>> float * address. After canonicalization of load in InstCombine [1], new >>>> bitcasts are added to the IR (see bottom of the email for code samples). >>>> This prevents select speculation in SROA to work. Also after SROA we have >>>> bitcasts from int32 to float. (Whereas originally after instCombine, >>>> bitcasts are only done on pointer types). >>>> >>>> === PROPOSED SOLUTION=== >>>> >>>> [1] implies that we need loa...

[LLVMdev] About a problem in SROA

2012 Nov 14

4

...4:128-a0:0:64-n32-S64" define internal void @test(i32 %v) { entry: %tmp = alloca i32, align 4 store i32 %v, i32* %tmp, align 4 %0 = bitcast i32* %tmp to <2 x i8>* %1 = load <2 x i8>* %0, align 4 ret void } I have the following failure message for command line "opt -sroa bad.ll -o bad.bc". $ opt -sroa bad.ll -o bad.bc opt: llvm/lib/Transforms/Scalar/SROA.cpp:2063: llvm::Value* convertValue(const llvm::DataLayout&, llvm::IRBuilder<>&, llvm::Value*, llvm::Type*): Assertion `canConvertValue(DL, V->getType(), Ty) && "Value not conver...

[RFC] jump threading on std::pair<int, bool>

2018 Mar 08

1

...or the if statement in func. std::pair<int, bool> callee(int v) { int a = dummy(v); if (a) return std::make_pair(dummy(v), true); else return std::make_pair(v, v < 0); } int func(int v) { std::pair<int, bool> rc = callee(v); if (rc.second) { // do something } ... SROA executed before the method inlining replaces std::pair by i64 without splitting in both `callee` and `func` since at this point no access to the individual fields is seen to SROA. After inlining, jump threading fails to identify that the incoming value is a constant due to additional instructions (...

[LLVMdev] scalarrepl tuning

2010 Feb 03

1

In svn r95224 I modified the scalar replacement (SROA) pass to use different criteria to decide when it is likely to be profitable to split up an aggregate into its separate elements. The commit message has a pretty decent explanation, but I wanted to give some further detail here. I am hoping that the llvm-dev list allows messages with attachments....

[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

2013 Jan 29

0

...64-s0:64:64-f80:128:128-n8:16:32:64-S128" > target triple = "x86_64-unknown-linux-gnu" > > %struct.R = type { i16, i16 } > > @addr = constant %struct.R* inttoptr (i64 416 to %struct.R*), align 8 > > define void @test(i16 zeroext %a) nounwind uwtable { > %r.sroa.0 = alloca i16, align 2 > %r.sroa.1 = alloca i16, align 2 > store i16 %a, i16* %r.sroa.0, align 2 > store i16 1, i16* %r.sroa.1, align 2 > %r.sroa.0.0.load3 = load volatile i16* %r.sroa.0, align 2 > store volatile i16 %r.sroa.0.0.load3, i16* inttoptr (i64 416 to i16*), align 32...

RFC: A change in InstCombine canonical form

2016 Mar 22

2

...gs/show_bug.cgi?id=26445>) >> >> IR contains code for loading a float from float * and storing it to a float * address. After canonicalization of load in InstCombine [1], new bitcasts are added to the IR (see bottom of the email for code samples). This prevents select speculation in SROA to work. Also after SROA we have bitcasts from int32 to float. (Whereas originally after instCombine, bitcasts are only done on pointer types). >> >> === PROPOSED SOLUTION=== >> >> [1] implies that we need load canonicalization when we load a value only to store it again....

[LLVMdev] [RFC] Poor code generation for paired load

2013 Aug 09

2

Hi, I am investigating a poor code generation on x86-64 involving a 64-bits structure with two 32-bits fields (in the attached examples float, but similar behavior is exposed with i32, and we can probably generalize that to smaller types too). The root cause of the problem is in SROA, although I am not sure we should fix something there. That is why I need your advices. ** Problem ** 64-bits structures are usually loaded as one chunk of bits and fields are extracted from this chunk. Although this may be generally better than loading each field on its own, this can lead to po...

Optimization issues (Alias Analysis?)

2016 Jul 04

2

...oid @Test(%struct.regs* noalias nocapture sret, i32, i32, i32) local_unnamed_addr #0 { %5 = add i32 %3, -4 %6 = inttoptr i32 %5 to i32* store i32 %2, i32* %6, align 4 %7 = add i32 %3, -8 %8 = inttoptr i32 %7 to i32* store i32 %1, i32* %8, align 4 %9 = load i32, i32* %6, align 4 %.sroa.0.0..sroa_idx = getelementptr inbounds %struct.regs, %struct.regs* %0, i32 0, i32 0 store i32 %9, i32* %.sroa.0.0..sroa_idx, align 4 %.sroa.4.0..sroa_idx4 = getelementptr inbounds %struct.regs, %struct.regs* %0, i32 0, i32 1 store i32 %1, i32* %.sroa.4.0..sroa_idx4, align 4 %.sroa.7.0..sr...
