thr3ads.net - search: "memcpyopt"

Displaying 20 results from an estimated 86 matches for "memcpyopt".

2016 Nov 17

Possible MemCpyOpt bug?

Hi all,

I think I've managed to trick the legacy MemCpyOpt (MCO) into an incorrect transform, but I would like to confirm the validity of my counterexample before working on the fix. Suppose the following IR:

    %T = type { i32, i32 }
    define void @f(%T* %a, %T* %b, %T* %c, %T* %d) {
      %val = load %T, %T* %a, !alias.scope !{!10}
      ; stor...

2009 Jul 22

[LLVMdev] [PATCH] PR2218

Hello,

This patch fixes PR2218. However, I'm not entirely sure that this optimization should be in MemCpyOpt. I think that GVN is a good place as well.

Regards
--
Jakub Staszak

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr2218.patch
Type: application/octet-stream
Size: 6146 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20...

2020 Feb 10

RFC: Mark BasicAA as a CFG-only pass.

On 2/10/20 2:35 PM, Alina Sbirlea wrote:
> Hi,
>
> Here's a tentative patch of the changes for this: D74353
> <https://reviews.llvm.org/D74353>.

I suppose that, as expected, it's invalidated less often this way. Given that it's generally stateless, does this really represent a cost savings?

-Hal

> Thank you,
> Alina
>
> On Mon, Feb 10,

2009 Jul 23

[LLVMdev] [PATCH] PR2218

On Jul 22, 2009, at 1:37 PM, Jakub Staszak wrote:
> Hello,
>
> This patch fixes PR2218.

Very nice. Are you sure this fixes PR2218? The example there doesn't have any loads in it.

> However, I'm not entirely sure that this optimization should be in
> MemCpyOpt. I think that GVN is a good place as well.

Yes, you're right. My long term goal is to merge the relevant pieces of memcpyopt into GVN/DSE at some point. To do that, some more major surgery needs to be done to memdep to make it work both backward (to support GVN) and forward (to support D...

2020 Aug 19

The value of padding when storing an aggregate into memory

...gregate fills padding with undef. Here are a few clues that support this change:

- According to C17, the value of padding bytes when storing values in structures or unions is unspecified.
- IPSCCP ignores padding and directly stores a constant aggregate if possible: https://godbolt.org/z/ddWq9z
- Memcpyopt ignores padding when copying an aggregate or storing a constant: https://godbolt.org/z/hY6ndd / https://godbolt.org/z/3WMP5a
- Alive2 (with store operation updated) did not find any problematic transformation from LLVM unit tests and while running translation validation on a few C programs.

The p...

2015 Mar 08

[LLVMdev] Optimizing out redundant alloca involving byval params

...lt, isolating the pass in question should
>> be easy.
>>
>> Thank you.
>> Mircea.
>>
>> On Thu, Mar 5, 2015 at 4:39 PM Philip Reames <listmail at philipreames.com>
>> wrote:
>>
>>> Reid is right that this would go in memcpyopt, but... there's an
>>> active discussion on the commit list which will solve this through a
>>> different mechanism. There's an active desire to avoid teaching GVN and
>>> related pieces (of which memcpyopt is one) about first class aggregates.
>>> We...

2015 Mar 06

[LLVMdev] Optimizing out redundant alloca involving byval params

Reid is right that this would go in memcpyopt, but... there's an active discussion on the commit list which will solve this through a different mechanism. There's an active desire to avoid teaching GVN and related pieces (of which memcpyopt is one) about first class aggregates. We don't have enough active users of the feat...

2020 Aug 19

The value of padding when storing an aggregate into memory

Hello Alexander,

> Interesting topic. Is any such optimization reachable from C?

Yes, I think so - both PassBuilder and PassManagerBuilder add MemCpyOpt & IPSCCP in the default pass pipeline.

Juneyoung

On Wed, Aug 19, 2020 at 8:43 PM Alexander Cherepanov <ch3root at openwall.com> wrote:
> On 19/08/2020 06.05, Juneyoung Lee via llvm-dev wrote:
> > LangRef isn't clear about the value of padding when an aggregate value is...

2016 Oct 31

[PATCH] D26127: [MemorySSA] Repair AccessList invariants after insertion of new MemoryUseOrDef.

On Sun, Oct 30, 2016 at 5:03 PM, Bryant Wong <3.14472+reviews.llvm.org at gmail.com> wrote:
> To give this a bit of context, this patch stems from issues that I've
> encountered while porting MemCpyOpt to MSSA.

Okay. I'm not sure I would try to port instead of just rewrite. The whole goal of MemorySSA is to enable us to write memory optimizations in non-N^2 ways. If you just replace one querying with the other querying, this is not likely to give you this result. It may be faster (or...

2015 Mar 06

[LLVMdev] Optimizing out redundant alloca involving byval params

...the right result, isolating the pass in question should be easy.
>
> Thank you.
> Mircea.
>
> On Thu, Mar 5, 2015 at 4:39 PM Philip Reames
> <listmail at philipreames.com> wrote:
>
>     Reid is right that this would go in memcpyopt, but... there's
>     an active discussion on the commit list which will solve this
>     through a different mechanism. There's an active desire to avoid
>     teaching GVN and related pieces (of which memcpyopt is one) about
>     first class aggregates. We don't have...

2016 Aug 25

CFLAA

...1,482  loop-unswitch  Total number of instructions analyzed
109,279    -3     loop-vectorize  # loops analyzed for vectorization
526,766    -136   mem2reg         # PHI nodes inserted
4,150,078  -3     mem2reg         # alloca's promoted with a single store
4,567      6      memcpyopt       # memcpy instructions deleted
96         1      memcpyopt       # memcpys converted to memset
1,074      173    memcpyopt       # memmoves converted to memcpy
39,584     6      memcpyopt       # memsets inferred
179,629    2,475  memdep          # block queries that were completely...

2016 Aug 25

CFLAA

...109,279 -3 loop-vectorize # loops analyzed for vectorization
>> 526,766 -136 mem2reg # PHI nodes inserted
>> 4,150,078 -3 mem2reg # alloca's promoted with a single store
>> 4,567 6 memcpyopt # memcpy instructions deleted
>> 96 1 memcpyopt # memcpys converted to memset
>> 1,074 173 memcpyopt # memmoves converted to memcpy
>> 39,584 6 memcpyopt # memsets inferred
>>...

2009 Sep 02

[LLVMdev] [PATCH] PR2218

...s fine now.

Hey Jakub,

Thanks for working on this again, one more round :)

Please merge the three testcases into one file. We added a new FileCheck tool which allows you to check for the exact sequence of instructions expected, which also allows the tests to be merged into one file.

+/// MemCpyOpt::pointerIsParameter - returns true iff pointer is a parameter of
+/// C call instruction.
+bool MemCpyOpt::pointerIsParameter(Value *pointer, CallInst *C, unsigned &argI)
+{
+  CallSite CS = CallSite::get(C);
+  for (argI = 0; argI < CS.arg_size(); ++argI)

Please make this a static...

2014 Mar 11

[LLVMdev] Memcpy / Memset for address spaces >= 256

...w how to lower a Memcpy and Memset if one of the pointer operands has an address space >= 256. This is understandable since the libc's memcpy / memset don't work for these address spaces. However, both Clang (when copying a struct) and some optimization passes (LoopIdiomRecognize, MemCpyOpt) can emit memcpy / memset for these address spaces. This triggers an assert in SelectionDAGBuilder. The optimization passes could be modified to give up when they encounter an address space >= 256, but I think clang would need some new code that emits a struct copy member-by-member. I thi...

2018 Sep 27

RFC Storing BB order in llvm::Instruction for faster local dominance

On 09/27/2018 12:24 AM, Chris Lattner via llvm-dev wrote:

On Sep 26, 2018, at 11:55 AM, Reid Kleckner <rnk at google.com> wrote:

As suggested in the bug, if we were to rewrite these passes to use MemorySSA, this bottleneck would go away. I rebased a patch to do that for DSE, but finishing it off and enabling it by default is probably out of scope for me.

2018 Sep 25

RFC Storing BB order in llvm::Instruction for faster local dominance

...llvm::BasicBlock at
> all.
>
> Do you have a sense of where these expensive domtree queries are
> coming from? Are they from a couple of key places or are they
> scattered throughout the pass pipeline?

When I dug into the profile with WPA, most of the time was spent in DSE and memcpyopt, which call AAResults::callCapturesBefore, which calls llvm::PointerMayBeCapturedBefore. Neither passes in an OrderedBasicBlock, so they rebuild the OrderedBasicBlock in linear time on every query. These passes insert instructions, so it's not correct to simply create and reuse an OrderedBasicBlo...
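The cost described in this message comes from recomputing instruction order on every query. One way to avoid that, in the spirit of the RFC's title, is to cache a position number on each instruction and invalidate the cache whenever an instruction is inserted. The following is only a minimal sketch of that idea under stated assumptions: `Inst`, `Block`, `comesBefore`, and `renumberCount` are hypothetical stand-ins written for illustration, not LLVM's classes or API.

```cpp
#include <cassert>
#include <cstddef>
#include <list>

// Hypothetical stand-in for an instruction; not llvm::Instruction.
struct Inst {
    explicit Inst(int id) : id(id) {}
    int id;
    std::size_t order = 0; // cached position, meaningful only while the block's numbering is valid
};

// Hypothetical stand-in for a basic block; not llvm::BasicBlock.
class Block {
public:
    using iterator = std::list<Inst>::iterator;

    iterator end() { return insts.end(); }

    // Insert a new instruction before `pos` and mark the cached numbering stale.
    iterator insert(iterator pos, int id) {
        orderValid = false;
        return insts.insert(pos, Inst(id));
    }

    // Local ordering query: renumber lazily (linear), then compare cached numbers (constant).
    bool comesBefore(iterator a, iterator b) {
        if (!orderValid)
            renumber();
        assert(orderValid && "numbering must be fresh before comparing");
        return a->order < b->order;
    }

    std::size_t renumberCount = 0; // exposed only to show how often the linear walk runs

private:
    void renumber() {
        std::size_t next = 0;
        for (Inst &i : insts)
            i.order = next++;
        orderValid = true;
        ++renumberCount;
    }

    std::list<Inst> insts;
    bool orderValid = false;
};
```

With this layout, repeated `comesBefore` queries between insertions share a single linear renumbering, while an insertion only flips a flag. Whether the numbering lives in the instruction or in a side table is a design choice this sketch does not try to settle.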

2009 Jul 25

[LLVMdev] [PATCH] PR2218

...37 PM, Jakub Staszak wrote:
>
>> Hello,
>>
>> This patch fixes PR2218.
>
> Very nice. Are you sure this fixes PR2218? The example there
> doesn't have any loads in it.
>
>> However, I'm not entirely sure that this optimization should be in
>> MemCpyOpt. I think that GVN is a good place as well.
>
> Yes, you're right. My long term goal is to merge the relevant
> pieces of memcpyopt into GVN/DSE at some point. To do that, some
> more major surgery needs to be done to memdep to make it work both
> backward (to support GVN) a...

2009 Sep 02

[LLVMdev] [PATCH] PR2218

Hello,

I fixed my patch as you asked. Sorry for the delay, I'd been working on my SSU patch (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-August/025347.html). I hope that everything is fine now.

-Jakub

-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr2218-3.patch
Type: application/octet-stream
Size: 7511 bytes
Desc: not available
URL:

2009 Jan 25

[LLVMdev] -O4 limitations in llvm/llvm-gcc-4.2 2.5?

...
> in llvm 2.5 is still limited to dead code elimination,
> correct?

No. libLTO does the equivalent to opt -internalize -ipsccp -globalopt -constmerge -deadargelim -instcombine -inline -prune-eh -globaldce -argpromotion -instcombine -jump-threading -scalarrepl -globalsmodref-aa -licm -gvn -memcpyopt -dse -instcombine -jump-threading -mem2reg -simplifycfg -globaldce

> Will LTO ever be extended to inlining across
> files as well as constant-folding and global data
> allocation optimizations?

Our optimization passes know nothing of file boundaries. That includes the inliner, which bliss...

2015 Dec 11

Optimization of successive constant stores

...to a single instruction, e.g.:

    define void @test(%UodStructType*) {
      %2 = bitcast %UodStructType* %0 to i32*
      store i32 0x04030201, i32* %2, align 8
      ret void
    }

I don't see any optimization that would do this. Interestingly, if I store the same 8-bit constant in all four bytes, then MemCpyOpt will indeed convert this to a 32-bit store. Am I doing something wrong, or is there really no optimization pass that can clean this up?
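To make the arithmetic behind this question concrete, here is a minimal C++ sketch (not LLVM code) of what merging four adjacent byte stores into one 32-bit store means on a little-endian target, together with the all-bytes-equal case that the post reports MemCpyOpt handling. The helper names `combineLittleEndian`, `byteAt`, and `isSplat` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Value of the single 32-bit store that is equivalent to storing b0..b3 at
// increasing byte addresses on a little-endian target (b0 is the lowest byte).
std::uint32_t combineLittleEndian(std::uint8_t b0, std::uint8_t b1,
                                  std::uint8_t b2, std::uint8_t b3) {
    return static_cast<std::uint32_t>(b0) |
           (static_cast<std::uint32_t>(b1) << 8) |
           (static_cast<std::uint32_t>(b2) << 16) |
           (static_cast<std::uint32_t>(b3) << 24);
}

// Byte that a little-endian store of `value` places at byte offset `index`.
std::uint8_t byteAt(std::uint32_t value, unsigned index) {
    assert(index < 4 && "a 32-bit value has exactly four bytes");
    return static_cast<std::uint8_t>(value >> (8 * index));
}

// True when all four bytes are the same constant, i.e. the stores could be
// expressed as filling four bytes with one repeated value.
bool isSplat(std::uint8_t b0, std::uint8_t b1, std::uint8_t b2, std::uint8_t b3) {
    return b0 == b1 && b1 == b2 && b2 == b3;
}
```

Storing the bytes 1, 2, 3, 4 in address order corresponds to the constant 0x04030201 used in the IR above. The splat check covers only the repeated-byte pattern; the general case needs the endian-aware combination.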