thr3ads.net - similar to: "[LLVMdev] Memcpy expansion: InstCombine vs SelectionDAG"

[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer

2014 Dec 05

On 3 Dec 2014, at 23:36, Robert Lougher <rob.lougher at gmail.com> wrote:
> On 2 December 2014 at 22:18, Alex Rosenberg <alexr at leftfield.org> wrote:
>>
>> Our C library amplifies this problem by being in a dynamic library, so the call has additional overhead, which for small trip counts swamps the copy/set.
>>
>
> I can't imagine

[RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

2015 Aug 19

Hi All, I'd like to float two changes to the llvm.memcpy / llvm.memmove intrinsics. (1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy intrinsic. When set to '1' (the auto-upgrade default), this argument would indicate that the source and destination arguments may perfectly alias (otherwise they must not alias at all - memcpy prohibits partial overlap). While the C
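The distinction the proposal draws (no overlap, perfect overlap, partial overlap) can be sketched in C as follows. This is only an illustration: `overlap_kind`, `classify_overlap`, and `copy_no_partial_overlap` are hypothetical names, not part of LLVM or the RFC, and the address comparison assumes a flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical names for illustration only; not an LLVM API. */
enum overlap_kind { OVERLAP_NONE, OVERLAP_PERFECT, OVERLAP_PARTIAL };

/* Classify how two n-byte regions relate. Addresses are compared as
   integers, which assumes a flat address space. */
static enum overlap_kind classify_overlap(const void *dst, const void *src,
                                          size_t n)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (n == 0)
        return OVERLAP_NONE;      /* an empty region overlaps nothing */
    if (d == s)
        return OVERLAP_PERFECT;   /* same start, same length */

    /* Regions overlap partially when their starts are closer than n. */
    uintptr_t gap = d > s ? d - s : s - d;
    return gap < n ? OVERLAP_PARTIAL : OVERLAP_NONE;
}

/* Copy that tolerates perfect overlap but rejects partial overlap,
   matching the "perfectly alias or not at all" wording above.
   memmove is used so the perfect-overlap case stays well defined in C. */
static void copy_no_partial_overlap(void *dst, const void *src, size_t n)
{
    assert(classify_overlap(dst, src, n) != OVERLAP_PARTIAL);
    memmove(dst, src, n);
}
```

In this sketch a self-copy such as `copy_no_partial_overlap(buf, buf, 8)` is accepted, while `copy_no_partial_overlap(buf + 4, buf, 8)` trips the assertion.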

[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer

2014 Dec 05

There are a large number of ways to lose information in translating loops into memset/memcpy calls; alignment is one of them. As previously mentioned, loop trip count is another. Another is the size of accesses. For example, the loop may have originally been using int64_t-sized copies. This has a definite impact on what the best memset/memcpy expansion is, because effectively, the loop knows that it
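As a rough illustration of the access-size point, compare a loop over 64-bit elements with the byte-count call it could be rewritten into. The function names are hypothetical, and this is a sketch of the idea rather than what the loop-idiom recognizer actually emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy loop written over 64-bit elements. The element type tells the
   compiler that every access is 8 bytes wide and aligned for int64_t. */
static void copy_words(int64_t *restrict dst, const int64_t *restrict src,
                       size_t count)
{
    for (size_t i = 0; i < count; ++i)
        dst[i] = src[i];
}

/* The same copy expressed as a library-style call. Only a byte count
   is passed; the element width and the alignment implied by int64_t
   are no longer visible in the call itself. */
static void copy_as_memcpy(void *dst, const void *src, size_t count)
{
    assert(count <= SIZE_MAX / sizeof(int64_t)); /* guard the multiply */
    memcpy(dst, src, count * sizeof(int64_t));
}
```

Both functions move the same bytes; the difference is only in how much the callee can assume about width and alignment.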

[RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

2015 Aug 19

On 08/19/2015 09:35 AM, Pete Cooper via llvm-dev wrote:
> Hey Lang
>> On Aug 18, 2015, at 6:04 PM, Lang Hames via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Hi All,
>>
>> I'd like to float two changes to the llvm.memcpy / llvm.memmove intrinsics.
>>
>> (1) Add an i1 <mayPerfectlyAlias> argument to the llvm.memcpy

[RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

2015 Aug 19

> On Aug 19, 2015, at 12:01 PM, Hal Finkel via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> ----- Original Message -----
>> From: "Philip Reames via llvm-dev" <llvm-dev at lists.llvm.org>
>> To: "Pete Cooper" <peter_cooper at apple.com>, "Lang Hames" <lhames at gmail.com>
>> Cc: "LLVM Developers Mailing

RFC: Inline expansion of memcmp vs call to standard library

2016 Dec 30

Can I make another suggestion: create an intrinsic for memory equality, e.g. llvm.memcmp_eq.p0i8.p0i8.i64(i8*a, i8*b, i64 len). This intrinsic would return zero if the memory regions are equal, and nonzero otherwise. However, it does NOT return any notion of "greater" or "less". Many applications require only determining equality, rather than a total ordering. Given that
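The equality-only semantics described here can be sketched in plain C. `mem_eq_result` is a hypothetical helper written for illustration; it is not the proposed intrinsic, and nothing here reflects how LLVM would lower it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring the described semantics: returns 0 when
   the two regions are equal and nonzero otherwise. The nonzero value
   says nothing about "greater" or "less". */
static int mem_eq_result(const void *a, const void *b, size_t len)
{
    const unsigned char *pa = a;
    const unsigned char *pb = b;
    unsigned char diff = 0;

    assert(len == 0 || (a != NULL && b != NULL));

    /* Accumulate differences with XOR/OR. No early exit is taken,
       because the position of the first mismatch is never needed. */
    for (size_t i = 0; i < len; ++i)
        diff |= (unsigned char)(pa[i] ^ pb[i]);

    return diff != 0;
}
```

Because the result carries no ordering, an implementation is free to compare the bytes in any order or in wider chunks, which is the freedom the suggestion is after.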

RFC: Inline expansion of memcmp vs call to standard library

2016 Dec 30

With the intrinsic support for ‘memcpy’ and ‘memset’, the operands also have associated alignment operands. I think that ‘memcmp’ should also provide the alignment information for each of the source operands (when statically known). In some cases this will lead to better alignment-aware lowering, and on targets where unaligned access is costly or fatal, it can be lowered safely.

[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer

2014 Dec 02

On Dec 3, 2014, at 6:12 AM, Eric Christopher <echristo at gmail.com> wrote:
>
>> On Tue Dec 02 2014 at 12:12:01 PM Robert Lougher <rob.lougher at gmail.com> wrote:
>> On 2 December 2014 at 19:57, Joerg Sonnenberger <joerg at britannica.bec.de> wrote:
>> > On Tue, Dec 02, 2014 at 07:23:01PM +0000, Robert Lougher wrote:
>> >> In

[LLVMdev] Memset/memcpy: user control of loop-idiom recognizer

2014 Dec 06

Hal, I appreciate the clarification. That was what I was expecting (that the transformation uses intrinsics). The Intel compiler does the same thing internally, and like LLVM it transforms into an internal intrinsic, not a plain library call. Nevertheless, there are a huge number of ways (in machine code) to write "the best" memory copy or memory set sort of code if, as a programmer, you are able

[RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

2015 Aug 20

Pete - That patch sounds great! Philip, Hal, Mehdi, Gerolf - Thanks very much for the feedback. So how about this: (1) We drop llvm.memcpy's alignment argument and use Pete's alignment-via-metadata patch (whatever version of it passes review). (2) llvm.memcpy retains its current semantics, but we teach clang, SimplifyLibCalls, etc. to add noalias metadata where we know it's safe.

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Thanks, that worked like a charm except for the following. LLVM generates:
call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* align 1 bitcast ([512 x float] addrspace(3)* @a_scratchpad to i8 addrspace(3)*), i8 addrspace(1)* align 1 %0, i64 2048, i1 false)
And we expected:
call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* bitcast ([512 x float] addrspace(3)* [[SPM0]] to i8

[RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

2015 Aug 21

Hi Hal
> By this I assume you mean some new 'nooverlap' metadata? I don't think we have any existing metadata with the correct semantics.
I was thinking we could just use the existing noalias metadata. Implicitly, the current llvm.memcpy semantics are "src and dst overlap perfectly or not at all" (perhaps we should update the docs to reflect this if we plan to rely on

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Yes, all that is correct. My question is more of a long-term question: why does the .ll printer specify the alignment if it is equivalent to the default one? That is, it seems the sed script expects the printer to not specify it (this would match the load/store behavior), but the .ll printer does specify it, which either means the printer is not ideal in this case and I should fix it, or in this case

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Good question. AFAIK, the IR-printer doesn’t understand the semantics of parameter attributes. In this case, it only knows that there is an attribute on the parameter that is integer valued (with value 1) and that has the name “align”, so it prints it out. If we don’t want it printing out ‘align 1’ then it’s up to us to not set the alignment parameter attribute to a value if that value would be 1.

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 25

Hi Alexandre,
Before the change you would have been expecting one of the following, correct?
a) call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* bitcast ([512 x float] addrspace(3)* [[SPM0]] to i8 addrspace(3)*), i8 addrspace(1)* [[APTR]], i64 2048, i32 0, i1 false)
b) call void @llvm.memcpy.p3i8.p1i8.i64(i8 addrspace(3)* bitcast ([512 x float] addrspace(3)* [[SPM0]] to i8 addrspace(3)*), i8

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 24

Hello, Is there a script to update those test cases? I see mention of a sed script in the commit message but when I try it (see attached) on sed I get the following error: sed: file script line 2: invalid reference \3 on `s' command's RHS Did I lose something in a copy-paste? Is it not really a sed script? How do I run it? On Fri, Jan 19, 2018 at 9:15 AM, Daniel Neilson via

[RFC] Generalize llvm.memcpy / llvm.memmove intrinsics.

2015 Sep 08

Hi Hal,
> If you attach noalias metadata to the memcpy call, it will apply to both the source and destination; we don't have a way to differentiate. It might be true that if you attach both noalias and alias.scope metadata to the call, then querying the call against itself will return NoModRef, but that's really hacky (and, in part, wrong, because the destination still alias with

[PATCH] D41675: Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)

2018 Jan 24

Hi Alexandre, The script uses extended-sed syntax, so you need to run sed with the -E option. For example, when preparing the patch I created a file (script.sed) containing all of the lines that I copied into the commit message. Then, I ran this bash one-liner from the test directory:
for f in $(find . -name '*.ll'); do sed -E -i '.sedbak' -f script.sed $f; done
When I was happy

llvm.memcpy for struct copy

2018 Jan 31

Hi Ma,
> how can I transform the llvm.memcpy into data move loop IR and eliminate
> the bitcast instruction ?
I'm not sure why you are concerned about memcpy and bitcasts, but if you call MCpyInst->getSource() and MCpyInst->getDest() it will look through casts and give you the 'true' source/destination. If you want to get rid of memcpy altogether, you can take a look

Change memcpy/memmove/memset to have dest and source alignment attributes

2018 Jan 02

Good day all, I’ve spent a few days resurrecting the circa-2015 work on removing the explicit alignment argument (4th arg) from the @llvm.memcpy/memmove/memset intrinsics in favour of using the alignment attribute on the pointer args of calls to the intrinsic. This work was first proposed back in August 2015 by Lang Hames: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html (item
