similar to: [LLVMdev] Memset/memcpy: user control of loop-idiom recognizer

2014 Dec 02
3
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On Dec 3, 2014, at 6:12 AM, Eric Christopher <echristo at gmail.com> wrote: > > > >> On Tue Dec 02 2014 at 12:12:01 PM Robert Lougher <rob.lougher at gmail.com> wrote: >> On 2 December 2014 at 19:57, Joerg Sonnenberger <joerg at britannica.bec.de> wrote: >> > On Tue, Dec 02, 2014 at 07:23:01PM +0000, Robert Lougher wrote: >> >> In
2014 Dec 02
2
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 2 December 2014 at 19:57, Joerg Sonnenberger <joerg at britannica.bec.de> wrote: > On Tue, Dec 02, 2014 at 07:23:01PM +0000, Robert Lougher wrote: >> In feedback from game studios a common issue is the replacement of >> loops with calls to memcpy/memset. These loops are often >> hand-optimised, and highly-efficient and the developers strongly want >> a way to
2014 Dec 05
2
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
There are a large number of ways to lose information in translating loops into memset/memcpy calls; alignment is one of them. As previously mentioned, loop-trip-count is another. Another is size of accesses. For example, the loop may have originally been using int64_t sized copies. This has a definite impact on what the best memset/memcpy expansion is, because effectively, the loop knows that it
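To make the point about access size concrete, here is a minimal sketch in C. The function names `zero_words` and `zero_words_memset` are hypothetical and do not come from the thread. The first function is the kind of hand-written loop under discussion; the second is the byte-oriented form a loop-idiom recognizer may turn it into. Whether a given compiler performs this rewrite depends on its version and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hand-written zeroing loop (hypothetical example). The source makes two
   facts explicit: every store is 64 bits wide, and dst is aligned for
   uint64_t. */
void zero_words(uint64_t *dst, size_t n_words) {
    assert(dst != NULL || n_words == 0);
    for (size_t i = 0; i < n_words; ++i)
        dst[i] = 0;
}

/* Equivalent memset form. The call takes only a void pointer and a byte
   count, so the 64-bit access size is no longer visible at the call
   boundary unless the compiler tracks it separately. */
void zero_words_memset(uint64_t *dst, size_t n_words) {
    assert(dst != NULL || n_words == 0);
    memset(dst, 0, n_words * sizeof(uint64_t));
}
```

Both functions leave the buffer in the same state; the difference is only in how much the expansion of the second one can assume about the original loop.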
2014 Dec 06
2
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
Hal, I appreciate the clarification. That was what I was expecting (that the transformation uses intrinsics); the Intel compiler does the same thing internally, and like LLVM it is turned into an internal intrinsic, not a plain library call. Nevertheless, there are a huge number of ways (in machine code) to write "the best" memory copy or memory set sort of code if, as a programmer, you are able
2014 Dec 05
3
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 3 Dec 2014, at 23:36, Robert Lougher <rob.lougher at gmail.com> wrote: > On 2 December 2014 at 22:18, Alex Rosenberg <alexr at leftfield.org> wrote: >> >> Our C library amplifies this problem by being in a dynamic library, so the >> call has additional overhead, which for small trip counts swamps the >> copy/set. >> > > I can't imagine
2014 Dec 05
4
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On 5 December 2014 at 06:49, Sean Silva <chisophugis at gmail.com> wrote: > > > On Wed, Dec 3, 2014 at 4:23 AM, Robert Lougher <rob.lougher at gmail.com> > wrote: >> >> Hi, >> >> In feedback from game studios a common issue is the replacement of >> loops with calls to memcpy/memset. These loops are often >> hand-optimised, and
2014 Dec 06
3
[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer
On Sat, Dec 06, 2014 at 07:06:31AM -0600, Hal Finkel wrote: > - Direction (should the memory be traversed forward or backward) I don't think that this makes sense for memset and memcpy. It does matter for memmove. Joerg
2017 Aug 17
3
[cfe-dev] Disable memset synthesis
My concern wasn't a philosophical one but a pragmatic one. Learning about poor choices when lowering memset is probably quite useful. Having a flag that just turns off idiom recognition for it may just work around the problem. But the problem may still exist. In any case, I'm not fundamentally against such a flag but it just seems like something that could 1. Hide a problem 2. Get a bit
2017 Aug 16
2
[cfe-dev] Disable memset synthesis
On Tue, Aug 15, 2017 at 9:37 PM, Tim Northover via cfe-dev < cfe-dev at lists.llvm.org> wrote: > On 15 August 2017 at 19:38, bharathi seshadri via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > I find that GCC has an option -fno-tree-loop-distribute-patterns that > > can be used to disable memcpy/memset synthesis. I wonder if there is > > something similar
2014 Mar 12
2
[LLVMdev] Memcpy / Memset for address spaces >= 256
Hi David, sorry for sending you the mail two times, I forgot to send to the list the first time. On 2014-03-12 09:48, David Chisnall wrote: > I have some patches that automatically expand all memcpy and similar > if the operands are not in AS 0. I think this is probably not quite > the right approach though, and we should be asking the back end for > the function that does a memcpy
2014 Mar 11
4
[LLVMdev] Memcpy / Memset for address spaces >= 256
Hi, SelectionDAGBuilder doesn't know how to lower a Memcpy and Memset if one of the pointer operands has an address space >= 256. This is understandable since the libc's memcpy / memset don't work for these address spaces. However, both Clang (when copying a struct) and some optimization passes (LoopIdiomRecognize, MemCpyOpt) can emit memcpy / memset for these address
2017 Aug 16
3
Disable memset synthesis
Our application is 32-bit big-endian ARM and we use -O3 with LTO. clang optimizes certain initialization of structures to zero with calls to memset, which are not further lowered to move instructions. Investigating perf reports, it looks like it may be beneficial to disable this optimization that introduces a function call to memset in certain hot paths. I tried passing -fno-builtin, but that
2016 Nov 10
5
array fill idioms
I am asking for some collective wisdom/guidance. What sort of IR construct should one use to implement filling each element in an array (or vector) with the same value? In C++, this might arise in "std::fill" or "std::fill_n", when the element values in the vector are identical. In the D language, one can fill an array or a slice of an array by an assignment, e.g.
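As an illustration of the fill idiom being asked about, here is a minimal sketch in C. The names `fill_u32` and `is_byte_splat` are hypothetical and do not come from the thread. The loop stores the same 32-bit value into every element. The helper checks whether all four bytes of that value are equal, which is the case where a plain byte-wise memset would produce the same result as the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill every element of dst with the same 32-bit value (hypothetical
   example of an array fill loop). */
void fill_u32(uint32_t *dst, size_t n, uint32_t value) {
    assert(dst != NULL || n == 0);
    for (size_t i = 0; i < n; ++i)
        dst[i] = value;
}

/* Returns 1 when all four bytes of value are identical, i.e. when a
   byte-wise memset with that byte would write the same bits; 0 otherwise. */
int is_byte_splat(uint32_t value) {
    uint8_t low = (uint8_t)(value & 0xFFu);
    return value == (uint32_t)low * 0x01010101u;
}
```

For a value such as 0xABABABAB the helper returns 1, while for 0xDEADBEEF it returns 0, so a fill with the latter cannot be expressed as a single standard memset call.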
2018 Jan 24
0
[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Hi Alexandre, The script uses extended-sed syntax, so you need to run sed with the -E option. For example, when preparing the patch I created a file ( script.sed ) containing all of the lines that I copied into the commit message. Then, I ran this bash one-liner from the test directory: for f in $(find . -name '*.ll'); do sed -E -i '.sedbak' -f script.sed $f; done When I was happy
2018 Jan 24
2
[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Hello, Is there a script to update those test cases? I see mention of a sed script in the commit message but when I try it (see attached) on sed I get the following error: sed: file script line 2: invalid reference \3 on `s' command's RHS Did I lose something in a copy-paste? Is it not really a sed script? How do I run it? On Fri, Jan 19, 2018 at 9:15 AM, Daniel Neilson via
2016 Aug 25
2
CFLAA
I gathered aggregate statistics reported by “-stats” over the ~400 test files. The following table summarizes the impact. The first column is the sum where the new analysis is enabled, the second column is the delta from baseline where no CFL alias analysis is performed. I am not experienced enough to know which of these are “good” or “bad” indicators. --david 72,250 685 SLP
2016 Aug 25
4
CFLAA
(and sys::cas_flag that STATISTIC uses is a uint32 ...) On Thu, Aug 25, 2016 at 9:54 AM, Daniel Berlin <dberlin at dberlin.org> wrote: > Okay, dumb question: > Are you really getting negative numbers in the second column? > > 526,766 -136 mem2reg # PHI nodes inserted > > http://llvm.org/docs/doxygen/html/PromoteMemoryToRegister_8cpp_source.html >
2015 Nov 10
2
SROA and volatile memcpy/memset
Hi, I have a customer testcase where SROA splits a volatile memcpy and we end up generating bad code[1]. While this looks like a bug, simply preventing SROA from splitting volatile memory intrinsics causes basictest.ll for SROA to fail. Not only that, but it also seems like handling of volatile memory transfers was done with some intent. What are the design decisions in SROA regarding
2017 Jul 20
2
Which assumptions do llvm.memcpy/memmove/memset.* make when the count is 0?
Hi all, when I call the llvm.memcpy/memmove/memset.* intrinsics, typically I have to pass in valid (non-dangling, non-NULL) pointers of the given alignment. However, to what extent do these rules apply when the count is 0? Concretely (for any variant of the three aforementioned intrinsics): Is it UB to call them on a dangling pointer when count is 0? On a pointer of less than the given
2015 Nov 10
4
SROA and volatile memcpy/memset
On 11/10/2015 1:07 PM, Joerg Sonnenberger via llvm-dev wrote: > On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via llvm-dev wrote: >> I have a customer testcase where SROA splits a volatile memcpy and we end up >> generating bad code[1]. While this looks like a bug, simply preventing SROA >> from splitting volatile memory intrinsics causes basictest.ll for SROA