Displaying 20 results from an estimated 86 matches for "memcpyopt".
2016 Nov 17
2
Possible MemCpyOpt bug?
Hi all,
I think I've managed to trick the legacy MemCpyOpt (MCO) into an incorrect
transform, but I would like to confirm the validity of my counterexample before
working on the fix. Suppose the following IR:
%T = type { i32, i32 }
define void @f(%T* %a, %T* %b, %T* %c, %T* %d) {
%val = load %T, %T* %a, !alias.scope !{!10}
; stor...
2009 Jul 22
2
[LLVMdev] [PATCH] PR2218
Hello,
This patch fixes PR2218. However, I'm not quite sure that this
optimization should be in MemCpyOpt. I think that GVN is a good place as
well.
Regards
--
Jakub Staszak
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr2218.patch
Type: application/octet-stream
Size: 6146 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20...
2020 Feb 10
2
RFC: Mark BasicAA as a CFG-only pass.
On 2/10/20 2:35 PM, Alina Sbirlea wrote:
> Hi,
>
> Here's a tentative patch of the changes for this: D74353
> <https://reviews.llvm.org/D74353>.
I suppose that, as expected, it's invalidated less often this way. Given
that it's generally stateless, does this really represent a cost savings?
-Hal
>
> Thank you,
> Alina
>
>
> On Mon, Feb 10,
2009 Jul 23
0
[LLVMdev] [PATCH] PR2218
On Jul 22, 2009, at 1:37 PM, Jakub Staszak wrote:
> Hello,
>
> This patch fixes PR2218.
Very nice. Are you sure this fixes PR2218? The example there doesn't
have any loads in it.
> However, I'm not quite sure that this optimization should be in
> MemCpyOpt. I think that GVN is a good place as well.
Yes, you're right. My long term goal is to merge the relevant pieces
of memcpyopt into GVN/DSE at some point. To do that, some more major
surgery needs to be done to memdep to make it work both backward (to
support GVN) and forward (to support D...
2020 Aug 19
2
The value of padding when storing an aggregate into memory
...gregate fills
padding with undef.
Here are a few clues that support this change:
- According to C17, the value of padding bytes when storing values in
structures or unions is unspecified.
- IPSCCP ignores padding and directly stores a constant aggregate if
possible: https://godbolt.org/z/ddWq9z
Memcpyopt ignores padding when copying an aggregate or storing a constant:
https://godbolt.org/z/hY6ndd / https://godbolt.org/z/3WMP5a
- Alive2 (with store operation updated) did not find any problematic
transformation from LLVM unit tests and while running translation
validation on a few C programs.
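To make the notion of padding concrete, here is a minimal C++ sketch. The struct `Padded` and the helper `paddingAfterC` are hypothetical names chosen for illustration and do not come from the thread. It only shows that an aggregate can contain bytes that belong to no field, which are the bytes whose stored value is under discussion.

```cpp
#include <cstddef>

// Hypothetical aggregate, chosen only so that padding is visible.
struct Padded {
  char c;  // occupies 1 byte
  int i;   // must start at a multiple of alignof(int)
};

// Number of bytes between the end of `c` and the start of `i`.
// These bytes belong to no field; they are the padding.
constexpr std::size_t paddingAfterC() {
  return offsetof(Padded, i) - sizeof(char);
}
```

On a typical target with a 4-byte `int`, this reports 3 padding bytes. A field-by-field copy of `Padded` never touches them, while a whole-object `memcpy` does.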
The p...
2015 Mar 08
2
[LLVMdev] Optimizing out redundant alloca involving byval params
...lt, isolating the pass in question should
>> be easy.
>>
>>
>> Thank you.
>> Mircea.
>>
>>
>> On Thu, Mar 5, 2015 at 4:39 PM Philip Reames <listmail at philipreames.com>
>> wrote:
>>
>>> Reid is right that this would go in memcpyopt, but... there's an
>>> active discussion on the commit list which will solve this through a
>>> different mechanism. There's an active desire to avoid teaching GVN and
>>> related pieces (of which memcpyopt is one) about first class aggregates.
>>> We...
2015 Mar 06
2
[LLVMdev] Optimizing out redundant alloca involving byval params
Reid is right that this would go in memcpyopt, but... there's an
active discussion on the commit list which will solve this through a
different mechanism. There's an active desire to avoid teaching GVN and
related pieces (of which memcpyopt is one) about first class
aggregates. We don't have enough active users of the feat...
2020 Aug 19
2
The value of padding when storing an aggregate into memory
Hello Alexander,
> Interesting topic. Is any such optimization reachable from C?
Yes, I think so - both PassBuilder and PassManagerBuilder add MemCpyOpt &
IPSCCP in the default pass pipeline.
Juneyoung
On Wed, Aug 19, 2020 at 8:43 PM Alexander Cherepanov <ch3root at openwall.com>
wrote:
> On 19/08/2020 06.05, Juneyoung Lee via llvm-dev wrote:
> > LangRef isn't clear about the value of padding when an aggregate value is
>...
2016 Oct 31
1
[PATCH] D26127: [MemorySSA] Repair AccessList invariants after insertion of new MemoryUseOrDef.
On Sun, Oct 30, 2016 at 5:03 PM, Bryant Wong <
3.14472+reviews.llvm.org at gmail.com> wrote:
> To give this a bit of context, this patch stems from issues that I've
> encountered while porting MemCpyOpt to MSSA.
>
Okay. I'm not sure I would try to port instead of just rewriting. The whole
goal of MemorySSA is to enable us to write memory optimizations in non-N^2
ways.
If you just replace one kind of query with the other, this is not
likely to give you this result.
It may be faster (or...
2015 Mar 06
2
[LLVMdev] Optimizing out redundant alloca involving byval params
...the right result, isolating the pass in question
should be easy.
>
> Thank you.
> Mircea.
>
> On Thu, Mar 5, 2015 at 4:39 PM Philip Reames
> <listmail at philipreames.com> wrote:
>
> Reid is right that this would go in memcpyopt, but... there's
> an active discussion on the commit list which will solve this
> through a different mechanism. There's an active desire to avoid
> teaching GVN and related pieces (of which memcpyopt is one) about
> first class aggregates. We don't have...
2016 Aug 25
2
CFLAA
...1,482 loop-unswitch Total number of instructions analyzed
  109,279      -3  loop-vectorize  # loops analyzed for vectorization
  526,766    -136  mem2reg         # PHI nodes inserted
4,150,078      -3  mem2reg         # alloca's promoted with a single store
    4,567       6  memcpyopt       # memcpy instructions deleted
       96       1  memcpyopt       # memcpys converted to memset
    1,074     173  memcpyopt       # memmoves converted to memcpy
   39,584       6  memcpyopt       # memsets inferred
  179,629   2,475  memdep          # block queries that were completely...
2016 Aug 25
4
CFLAA
...109,279 -3 loop-vectorize # loops analyzed for vectorization
>>
>> 526,766 -136 mem2reg # PHI nodes inserted
>>
>> 4,150,078 -3 mem2reg # alloca's promoted with a single store
>>
>> 4,567 6 memcpyopt # memcpy instructions deleted
>>
>> 96 1 memcpyopt # memcpys converted to memset
>>
>> 1,074 173 memcpyopt # memmoves converted to memcpy
>>
>> 39,584 6 memcpyopt # memsets inferred
>>
>>...
2009 Sep 02
0
[LLVMdev] [PATCH] PR2218
...s fine now.
Hey Jakub,
Thanks for working on this again, one more round :)
Please merge the three testcases into one file. We added a new
FileCheck tool which allows you to check for the exact sequence of
instructions expected, which also allows the tests to be merged into
one file.
+/// MemCpyOpt::pointerIsParameter - returns true iff pointer is a parameter of
+/// C call instruction.
+bool MemCpyOpt::pointerIsParameter(Value *pointer, CallInst *C, unsigned &argI)
+{
+ CallSite CS = CallSite::get(C);
+ for (argI = 0; argI < CS.arg_size(); ++argI)
Please make this a static...
2014 Mar 11
4
[LLVMdev] Memcpy / Memset for address spaces >= 256
...w how to lower a Memcpy and Memset if one
of the pointer operands has an address space >= 256. This is
understandable since the libc's memcpy / memset don't work for these
address spaces. However, both Clang (when copying a struct) and some
optimization passes (LoopIdiomRecognize, MemCpyOpt) can emit memcpy /
memset for these address spaces. This triggers an assert in
SelectionDAGBuilder. The optimization passes could be modified to give
up when they encounter an address space >= 256, but I think clang would
need some new code that emits a struct copy member-by-member. I thi...
2018 Sep 27
2
RFC Storing BB order in llvm::Instruction for faster local dominance
On 09/27/2018 12:24 AM, Chris Lattner via llvm-dev wrote:
On Sep 26, 2018, at 11:55 AM, Reid Kleckner <rnk at google.com> wrote:
As suggested in the bug, if we were to rewrite these passes to use MemorySSA, this bottleneck would go away. I rebased a patch to do that for DSE, but finishing it off and enabling it by default is probably out of scope for me.
2018 Sep 25
3
RFC Storing BB order in llvm::Instruction for faster local dominance
...llvm::BasicBlock at
> all.
>
> Do you have a sense of where these expensive domtree queries are
> coming from? Are they from a couple of key places or are they
> scattered throughout the pass pipeline?
>
When I dug into the profile with WPA, most of the time was spent in DSE and
memcpyopt, which call AAResults::callCapturesBefore, which calls
llvm::PointerMayBeCapturedBefore. Neither passes in an OrderedBasicBlock, so
they rebuild the OrderedBasicBlock in linear time on every query. These
passes insert instructions, so it's not correct to simply create and reuse
an OrderedBasicBlo...
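The trade-off described in this message can be illustrated with a toy model. This is a sketch only: `ToyInst`, `ToyBlock`, and their members are hypothetical names and are not LLVM's API. Each instruction carries an order number, the numbers are recomputed lazily in linear time, and any insertion invalidates them, so repeated "comes before" queries are cheap as long as nothing has been inserted in between.

```cpp
#include <cstddef>
#include <list>

// Hypothetical stand-in for an instruction that stores its position.
struct ToyInst {
  int id;
  std::size_t order;
};

// Hypothetical stand-in for a basic block with lazily cached ordering.
class ToyBlock {
public:
  using iterator = std::list<ToyInst>::iterator;

  iterator end() { return insts_.end(); }

  // Inserting an instruction invalidates the cached order numbers.
  iterator insertBefore(iterator pos, int id) {
    orderValid_ = false;
    return insts_.insert(pos, ToyInst{id, 0});
  }

  // Local dominance query: renumber only if the cache is stale.
  bool comesBefore(iterator a, iterator b) {
    if (!orderValid_)
      renumber();
    return a->order < b->order;
  }

  // How many linear-time renumberings have happened so far.
  std::size_t renumberings() const { return renumberCount_; }

private:
  void renumber() {
    std::size_t n = 0;
    for (ToyInst &inst : insts_)
      inst.order = n++;
    orderValid_ = true;
    ++renumberCount_;
  }

  std::list<ToyInst> insts_;
  bool orderValid_ = false;
  std::size_t renumberCount_ = 0;
};
```

In this model, two back-to-back queries trigger a single renumbering, whereas rebuilding the order on every query would pay the linear cost each time.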
2009 Jul 25
2
[LLVMdev] [PATCH] PR2218
...37 PM, Jakub Staszak wrote:
>
>> Hello,
>>
>> This patch fixes PR2218.
>
> Very nice. Are you sure this fixes PR2218? The example there
> doesn't have any loads in it.
>
>> However, I'm not quite sure that this optimization should be in
>> MemCpyOpt. I think that GVN is a good place as well.
>
> Yes, you're right. My long term goal is to merge the relevant
> pieces of memcpyopt into GVN/DSE at some point. To do that, some
> more major surgery needs to be done to memdep to make it work both
> backward (to support GVN) a...
2009 Sep 02
2
[LLVMdev] [PATCH] PR2218
Hello,
I fixed my patch as you asked. Sorry for the delay, I'd been working
on my SSU patch (http://lists.cs.uiuc.edu/pipermail/llvmdev/2009-August/025347.html).
I hope that everything is fine now.
-Jakub
-------------- next part --------------
A non-text attachment was scrubbed...
Name: pr2218-3.patch
Type: application/octet-stream
Size: 7511 bytes
Desc: not available
URL:
2009 Jan 25
0
[LLVMdev] -O4 limitations in llvm/llvm-gcc-4.2 2.5?
...> in llvm 2.5 is still limited to dead code elimination,
> correct?
No.
libLTO does the equivalent to opt -internalize -ipsccp -globalopt
-constmerge -deadargelim -instcombine -inline -prune-eh -globaldce
-argpromotion -instcombine -jump-threading -scalarrepl -globalsmodref-aa
-licm -gvn -memcpyopt -dse -instcombine -jump-threading -mem2reg
-simplifycfg -globaldce
> Will LTO ever be extended to inlining across
> files as well as constant-folding and global data
> allocation optimizations?
Our optimization passes know nothing of file boundaries. That includes
the inliner, which bliss...
2015 Dec 11
2
Optimization of successive constant stores
...to a single instruction, e.g.:
define void @test(%UodStructType*) {
%2 = bitcast %UodStructType* %0 to i32*
store i32 0x04030201, i32* %2, align 8
ret void
}
I don't see any optimization that would do this.
Interestingly, if I store the same 8-bit constant in all four bytes, then
MemCpyOpt will indeed convert this to a 32-bit store.
Am I doing something wrong, or is there really no optimization pass that
can clean this up?
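The equivalence the poster is asking the optimizer to exploit can be sketched in C++. The layout of `UodStruct` below is an assumption (the original `%UodStructType` definition is elided), and all function names are hypothetical. The sketch builds the merged 32-bit constant from the four bytes in host byte order, so on a little-endian host it equals the `0x04030201` shown in the IR above.

```cpp
#include <cstdint>
#include <cstring>

// Assumed layout: four consecutive bytes (the original type is elided).
struct UodStruct {
  std::uint8_t a, b, c, d;
};

// Four separate 8-bit constant stores.
void storeBytewise(UodStruct *s) {
  s->a = 0x01;
  s->b = 0x02;
  s->c = 0x03;
  s->d = 0x04;
}

// The 32-bit constant that covers the same four bytes in host byte order.
std::uint32_t mergedConstant() {
  const std::uint8_t bytes[4] = {0x01, 0x02, 0x03, 0x04};
  std::uint32_t value;
  std::memcpy(&value, bytes, sizeof value);
  return value;
}

// A single 32-bit store of the merged constant.
void storeMerged(UodStruct *s) {
  const std::uint32_t value = mergedConstant();
  std::memcpy(s, &value, sizeof value);
}

// True when both strategies leave identical bytes in memory.
bool sameResult() {
  UodStruct x{}, y{};
  storeBytewise(&x);
  storeMerged(&y);
  return std::memcmp(&x, &y, sizeof x) == 0;
}
```

Because both functions leave the same bytes in memory, replacing the four narrow stores with one wide store would be a legal transformation for this layout.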