Krzysztof Parzyszek via llvm-dev
2015-Nov-10  16:41 UTC
[llvm-dev] SROA and volatile memcpy/memset
Hi,

I have a customer testcase where SROA splits a volatile memcpy and we end up generating bad code[1]. While this looks like a bug, simply preventing SROA from splitting volatile memory intrinsics causes basictest.ll for SROA to fail. Not only that, but it also seems like the handling of volatile memory transfers was done with some intent.

What are the design decisions in SROA regarding handling of volatile memcpy/memset?

[1] In our applications, in most cases volatile objects are used to communicate with external devices. Both the address and the transfer size must match what the device is expecting, so breaking up a volatile memcpy/memset into smaller pieces must be done very carefully, if at all. Generally, target-independent transformations aren't expected to have that knowledge, so it would be preferable that they leave such memory traffic alone.

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
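As an illustration of the constraint described in [1], here is a minimal sketch, assuming a hypothetical 32-bit memory-mapped register. The names `dev_reg_t`, `write_whole`, and `write_split` are invented for this example and do not come from the customer testcase. On ordinary memory both functions leave the same bytes behind, but a device that expects a single 32-bit access would observe four separate byte-wide accesses in the second case.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 32-bit device register type; the name is illustrative only. */
typedef volatile uint32_t dev_reg_t;

/* One 32-bit volatile store: the access width the device is assumed to expect. */
void write_whole(dev_reg_t *reg, uint32_t v) {
    assert(reg != NULL);
    *reg = v;
}

/* The same value written as four byte-wide volatile stores. Plain memory ends
   up with identical contents, but each store is a separate volatile access. */
void write_split(dev_reg_t *reg, uint32_t v) {
    assert(reg != NULL);
    volatile uint8_t *dst = (volatile uint8_t *)reg;
    const uint8_t *src = (const uint8_t *)&v;
    for (size_t i = 0; i < sizeof v; ++i)
        dst[i] = src[i];
}
```

Whether a particular device tolerates the split form is a property of the hardware, which is why a target-independent pass has no way to know that the split is safe.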
Joerg Sonnenberger via llvm-dev
2015-Nov-10  19:07 UTC
[llvm-dev] SROA and volatile memcpy/memset
On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via llvm-dev wrote:
> I have a customer testcase where SROA splits a volatile memcpy and we end up
> generating bad code[1]. While this looks like a bug, simply preventing SROA
> from splitting volatile memory intrinsics causes basictest.ll for SROA to
> fail. Not only that, but it also seems like handling of volatile memory
> transfers was done with some intent.

There is no such thing as a volatile memcpy or memset in standard ISO C, so what exactly are you doing and why do you expect it to work that way?

Joerg
Krzysztof Parzyszek via llvm-dev
2015-Nov-10  19:22 UTC
[llvm-dev] SROA and volatile memcpy/memset
On 11/10/2015 1:07 PM, Joerg Sonnenberger via llvm-dev wrote:
> On Tue, Nov 10, 2015 at 10:41:06AM -0600, Krzysztof Parzyszek via llvm-dev wrote:
>> I have a customer testcase where SROA splits a volatile memcpy and we end up
>> generating bad code[1]. While this looks like a bug, simply preventing SROA
>> from splitting volatile memory intrinsics causes basictest.ll for SROA to
>> fail. Not only that, but it also seems like handling of volatile memory
>> transfers was done with some intent.
>
> There is no such thing as a volatile memcpy or memset in standard ISO C,
> so what exactly are you doing and why do you expect it to work that way?

The motivating example has an aggregate copy where the aggregate is volatile, followed by a store to one of its members. (This does not have anything to do with devices.) SROA expanded this into a series of volatile loads and stores, which cannot be coalesced back into fewer instructions. This is clearly worse than doing the copy and then the member overwrite.

--- test.c ---
typedef struct {
   volatile unsigned int value;
} atomic_word_t;

typedef union {
   struct {
     unsigned char state;
     unsigned char priority;
   };
   atomic_word_t atomic;
   unsigned int full;
} mystruct_t;

mystruct_t a;

unsigned int foo(void) {
   mystruct_t x;
   mystruct_t y;

   x.full = a.atomic.value;
   y = x;
   y.priority = 7;

   return y.full;
}
--------------

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
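For readers who want to see what "the copy and then the member overwrite" amounts to, the following is a rough sketch, not what any compiler is claimed to emit. It repeats the types and `foo` from test.c so that it is self-contained, and adds a hand-written counterpart, `foo_by_hand` (a name made up for this illustration), that performs one volatile load and then overwrites the single byte occupied by `priority`. It assumes a C11 compiler (for the anonymous struct) and that `mystruct_t` is the same size as `unsigned int`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    volatile unsigned int value;
} atomic_word_t;

typedef union {
    struct {
        unsigned char state;
        unsigned char priority;
    };
    atomic_word_t atomic;
    unsigned int full;
} mystruct_t;

mystruct_t a;

/* The function from test.c, unchanged in behavior. */
unsigned int foo(void) {
    mystruct_t x;
    mystruct_t y;

    x.full = a.atomic.value;
    y = x;
    y.priority = 7;

    return y.full;
}

/* Hypothetical hand-written equivalent: one volatile load of the whole word,
   then a one-byte overwrite at the offset of `priority`. */
unsigned int foo_by_hand(void) {
    /* Assumption stated in the text above: the union is one word wide. */
    assert(sizeof(mystruct_t) == sizeof(unsigned int));

    unsigned int v = a.atomic.value;   /* the only volatile access */
    unsigned char bytes[sizeof v];

    memcpy(bytes, &v, sizeof v);
    bytes[offsetof(mystruct_t, priority)] = 7;
    memcpy(&v, bytes, sizeof v);
    return v;
}
```

The two functions return the same value for any contents of `a`; the difference under discussion is only in how many volatile accesses the optimizer leaves in the first one.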