thr3ads.net - similar to: "SROA and volatile memcpy/memset"

Displaying 20 results from an estimated 10000 matches similar to: "SROA and volatile memcpy/memset"

2015 Nov 10

SROA and volatile memcpy/memset

On 11/10/2015 1:07 PM, Joerg Sonnenberger via llvm-dev wrote:
> On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via llvm-dev wrote:
>> I have a customer testcase where SROA splits a volatile memcpy and we end up
>> generating bad code[1]. While this looks like a bug, simply preventing SROA
>> from splitting volatile memory intrinsics causes basictest.ll for SROA

2015 Nov 11

SROA and volatile memcpy/memset

On 11/11/2015 8:53 AM, Hal Finkel wrote:
>
> SROA seems to be doing a number of things here. What about if we prevented SROA from generating multiple slices splitting volatile accesses? There might be a significant difference between that and something like this test (test/Transforms/SROA/basictest.ll):
>
> define i32 @test6() {
> ; CHECK-LABEL: @test6(
> ; CHECK: alloca i32
>

2015 Nov 11

SROA and volatile memcpy/memset

On 11/11/2015 9:28 AM, Chandler Carruth wrote:
> So, here is the model that LLVM is using: a volatile memcpy is lowered
> to a loop of loads and stores of indeterminate width. As such, splitting
> a memcpy is always valid.
>
> If we want a very specific load and store width for volatile accesses, I
> think that the frontend should generate concrete loads and stores of a
> type
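The distinction drawn in this message can be sketched at the C source level. The snippet below is only an illustration, not code from the thread: it reuses the `struct R` layout from the 2013 testcase listed further down, and the function names `store_whole` and `store_fields` are hypothetical. It contrasts a whole-struct volatile assignment, which a compiler may turn into a volatile memcpy with no promised access width, with field-wise stores, where each assignment is a separate volatile access of the field's own type.

```c
#include <stdint.h>

struct R { uint16_t a; uint16_t b; };

/* Whole-struct assignment through a volatile pointer. A compiler may lower
   this to a volatile memcpy; under the model quoted above, the width of the
   individual loads and stores is then not specified. */
void store_whole(volatile struct R *dst, struct R r) {
    *dst = r;
}

/* Field-wise assignment (hypothetical alternative). Each statement is one
   volatile store of a uint16_t, so the access width follows the field type
   rather than the compiler's memcpy lowering. */
void store_fields(volatile struct R *dst, struct R r) {
    dst->a = r.a;
    dst->b = r.b;
}
```

Both functions leave the same bytes in memory; what differs is how much the source code says about the individual accesses.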

2015 Nov 11

SROA and volatile memcpy/memset

On 11/11/2015 9:36 AM, Hal Finkel wrote:
> ----- Original Message -----
>> From: "Krzysztof Parzyszek" <kparzysz at codeaurora.org>
>>
>> Yeah, the remark about devices I made in my post was a result of a
>> "last-minute" thought to add some rationale. It doesn't actually
>> apply
>> to SROA, since there are no devices that are

2013 Jan 29

[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

I can't think of a better way to do this, so I think it's ok. I also submitted a complementary patch on llvm-commits clarifying volatile semantics.
-Andy
On Jan 28, 2013, at 8:54 AM, Arnaud A. de Grandmaison <arnaud.allarddegrandmaison at parrot.com> wrote:
> Hi All,
>
> In the language reference manual, the access behavior of the memcpy,
> memmove and memset

2013 Jan 28

[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

Hi All,
In the language reference manual, the access behavior of the memcpy, memmove and memset intrinsics is not well defined with respect to the volatile flag. The LRM even states that "it is unwise to depend on it". This forces optimization passes to be conservatively correct and prevents optimizations. A very simple example of this is:
$ cat test.c
#include <stdint.h>

2013 Jan 31

[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

Thanks Andy and Chandler, After specifying the volatile access behaviour, the second step was to autoupgrade the memmove/memcpy intrinsics, and implement (is|set)Volatile in terms of (is|set)(Src|Dest)Volatile, with no functional change. 0001-Specify-the-access-behaviour-of-the-memcpy-memmove-a.patch is the one you already reviewed, unaltered.

2013 Feb 03

[LLVMdev] Specify the volatile access behaviour of the memcpy, memmove and memset intrinsics

Same patches as before, but 0002-memcpy has been updated to put the (is|set)SrcVolatile methods to where they logically belong : MemTransferInst. This makes (is|set)Volatile methods look a bit ugly to keep compatibility with existing behaviour, but they will hopefully disappear when all users have moved to the new interface --- in the next series of patches. I plan to give a try to phabricator

2013 Jan 20

[LLVMdev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)

As a result of my investigations, the thread is also added to cfe-dev. The context: while porting my company code from the LLVM/Clang releases 3.1 to 3.2, I stumbled on a code size and performance regression. The testcase is:
$ cat test.c
#include <stdint.h>
struct R { uint16_t a; uint16_t b; };
volatile struct R * const addr = (volatile struct R *) 416;
void test(uint16_t a) {

2013 Jan 20

[LLVMdev] [cfe-dev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)

I doubt you needed to add cfe-dev here. Sorry I hadn't seen this, this seems like an easy and simple deficiency in the IR intrinsic for memcpy. See below.
On Sun, Jan 20, 2013 at 1:42 PM, Arnaud de Grandmaison <arnaud.allarddegrandmaison at parrot.com> wrote:
> define void @test(i16 zeroext %a) nounwind uwtable {
> %r.sroa.0 = alloca i16, align 2
> %r.sroa.1 = alloca i16,

2013 Jan 18

[LLVMdev] Weird volatile propagation ?

Hi All,
Using clang+llvm at head, I noticed a weird behaviour with the following reduced testcase:
$ cat test.c
#include <stdint.h>
struct R { uint16_t a; uint16_t b; };
volatile struct R * const addr = (volatile struct R *) 416;
void test(uint16_t a) { struct R r = { a, 1 }; *addr = r; }
$ clang -O2 -o - -emit-llvm -S -c test.c
; ModuleID = 'test.c'
target

2012 Sep 24

[LLVMdev] Heads up! New SROA implementation is going on-by-default today!

On Mon, Sep 24, 2012 at 3:41 AM, David Tweed <david.tweed at arm.com> wrote:
> Just a note that the following new regressions have started to show up on
> ARM/Linux:
>
Thanks for letting me know! I know that there is one serious bug that would impact in BE system. I should have that fixed today, along with a crasher. I would appreciate help tracking down any issues once the one I

2012 Sep 22

[LLVMdev] Heads up! New SROA implementation is going on-by-default today!

After a lot of testing and help from Duncan, Benjamin, Joerg and others, I think the new SROA is ready for some broader testing. I've fixed all the crashers and miscompiles that Duncan and Joerg have been able to find (although I'm sure there are a few left I'll tackle when there are reports), and the LNT numbers look *really* good. Here is the latest LNT run we got by flipping it on

2013 Nov 24

[LLVMdev] wrong code generation for memcpy function in SROA optimization pass

The SROA optimization pass did some optimizations and transforms for the memcpy function, such as ld/st operations. When someone has written code like size>sizeof(dest) in memcpy(*dest,*src,size), there was very likely wrong code generation. For example, consider this testcase:
int main() { char ch; short sh = 0x1234; memcpy(&ch,&sh,2); printf("ch=0x%02x\n",ch); }
At
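Independently of what SROA does with it, the call in this testcase copies 2 bytes into a 1-byte object, which the C language leaves undefined. A minimal sketch of a size-checked copy follows; the helper name `bounded_copy` is hypothetical and is not something proposed in the thread.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy at most dst_size bytes so that the destination
   object is never overrun. Returns the number of bytes actually copied. */
size_t bounded_copy(void *dst, size_t dst_size, const void *src, size_t n) {
    size_t count = n < dst_size ? n : dst_size;
    memcpy(dst, src, count);
    return count;
}
```

With `char ch; short sh = 0x1234;`, calling `bounded_copy(&ch, sizeof ch, &sh, sizeof sh)` copies a single byte. Which byte of `sh` that is depends on the target's endianness.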

2013 Jan 21

[LLVMdev] [cfe-dev] codegen of volatile aggregate copies (was "Weird volatile propagation" on llvm-dev)

On 01/20/2013 10:56 PM, Chandler Carruth wrote:
> I doubt you needed to add cfe-dev here. Sorry I hadn't seen this, this
> seems like an easy and simple deficiency in the IR intrinsic for
> memcpy. See below.
>
> On Sun, Jan 20, 2013 at 1:42 PM, Arnaud de Grandmaison
> <arnaud.allarddegrandmaison at parrot.com
> <mailto:arnaud.allarddegrandmaison at

2015 Aug 20

[RFC] Aggreate load/store, proposed plan

It is pretty clear people need this. Let's get this moving. I'll try to sum up the points that have been made and I'll try to address them carefully. 1/ There is no good solution for large aggregates. That is true. However, I don't think this is a reason to not address smaller aggregates, as they appear to be needed. Realistically, the proportion of aggregates that are very large

2019 Jun 06

@llvm.memcpy not honoring volatile?

The primary reason I don’t want to provide any guarantees for what instructions are used to implement volatile memcpy is that it would forbid lowering a volatile memcpy to a library call. clang uses a volatile memcpy for struct assignment in C. For example, “void f(volatile struct S*p) { p[0] = p[1]; }”. It’s not really that useful, but it’s been done that way since before clang was written.

2019 Jun 05

@llvm.memcpy not honoring volatile?

The following IR with the volatile parameter set to true
> call void @llvm.memcpy.p0i8.p0i8.i64(i8* align 1 %0, i8* align 1 %1, i64 7, i1 true)
generates the following asm:
> movl (%rsi), %eax
> movl 3(%rsi), %ecx
> movl %ecx, 3(%rdi)
> movl %eax, (%rdi)
It performs an overlapping read/write which - I believe - is violating the volatile semantic
Full example here:
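The overlap described here can be checked with a little arithmetic: two 4-byte accesses at offsets 0 and 3 cover all 7 bytes, but byte 3 falls inside both. The sketch below only models that access pattern in C; the function `count_reads` and the constants are hypothetical names, and nothing here reproduces LLVM's actual lowering.

```c
#include <stddef.h>
#include <string.h>

enum { COPY_SIZE = 7, CHUNK = 4 };

/* Hypothetical model of the two 4-byte loads in the asm above (offsets 0
   and 3). On return, reads[i] holds how many loads touch source byte i. */
void count_reads(unsigned reads[COPY_SIZE]) {
    static const size_t offsets[2] = {0, 3};
    memset(reads, 0, COPY_SIZE * sizeof reads[0]);
    for (size_t i = 0; i < 2; ++i) {
        for (size_t b = 0; b < CHUNK; ++b) {
            reads[offsets[i] + b] += 1;
        }
    }
}
```

Every byte is counted once except byte 3, which is counted twice. The same count applies to the two 4-byte stores into the destination.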

2019 Jun 05

@llvm.memcpy not honoring volatile?

On Wed, 5 Jun 2019 at 13:49, Eli Friedman via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> I don’t see any particular reason to guarantee that a volatile memcpy will access each byte exactly once. How is that useful?
I agree it's probably not that useful, but I think the non-duplicating property of volatile is ingrained strongly enough that viewing a memcpy as a single load and

2019 Jun 13

@llvm.memcpy not honoring volatile?

> On Jun 12, 2019, at 9:38 PM, James Y Knight <jyknight at google.com> wrote:
>
>
>> On Tue, Jun 11, 2019 at 12:08 PM JF Bastien via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> I think we want option 2.: keep volatile memcpy, and implement it as touching each byte exactly once. That’s unlikely to be particularly useful for every direct-to-hardware
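The "touching each byte exactly once" property named in this message can be illustrated with a plain byte loop in C. This is only a source-level sketch with a hypothetical function name, `copy_bytes_once`; it makes no claim about how LLVM would implement the option being discussed.

```c
#include <stddef.h>

/* Hypothetical byte-wise copy: every source byte is read once and every
   destination byte is written once, each through a volatile-qualified
   lvalue of type unsigned char. */
void copy_bytes_once(volatile unsigned char *dst,
                     const volatile unsigned char *src,
                     size_t n) {
    for (size_t i = 0; i < n; ++i) {
        dst[i] = src[i];
    }
}
```

The trade-off is that a loop like this gives up wider accesses, such as the 4-byte moves shown in the 2019 Jun 05 message, in exchange for one access per byte.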
