thr3ads.net - llvm dev - [LLVMdev] load widening conflicts with AddressSanitizer [Dec 2011]

If this information is useful, please help other people find it:
Share via:

Chris Lattner

2011-Dec-17 01:56 UTC

[LLVMdev] load widening conflicts with AddressSanitizer

On Dec 16, 2011, at 3:29 PM, John Criswell <criswell at illinois.edu>
wrote:> On 12/16/11 4:45 PM, Chris Lattner wrote:
>> 
>> 
>> On Dec 16, 2011, at 2:41 PM, John Criswell wrote:
>> 
>>> On 12/16/11 4:14 PM, Chris Lattner wrote:
>>>> 
>>>> On Dec 16, 2011, at 12:39 PM, Kostya Serebryany wrote:
>>>>> > Do we consider the above transformation legal?
>>>> 
>>>> Yes, the transformation is perfectly legal for the normal
compiler.
>>> 
>>> So how do you guarantee that the behavior is predictable regardless
of hardware platform if you don't define what the behavior should be?
>> 
>> I'm not sure what you mean.  What isn't defined?
> 
> The alloca in question allocates 22 bytes.  The 64-bit load in Kostya's
original email is accessing two additional bytes past the end of the alloca
(i.e., it is accessing array "elements" a[22] and a[23]).  Accessing
that memory with a read or write is undefined behavior.  The program could
fault, read zeros, read arbitrary bit patterns, etc.
John, I think that you are missing that these operations are fully defined by
LLVM IR.  I'm not sure what languages rules you are drawing these rules
from, but they are not the rules of IR.

Doing this inside a compiler (the way we do) also is not invalid according to
the C notions of undefined behavior, as it has the "as if" rule.  I
agree that doing this at the source level would be invalid.

Again, I'm not opposed to having a way to disable these transformations, we
just need a clean way to express it.

-Chris
> In other words, the compiler is transforming this:
> 
> return a[16] + a[21];
> 
> into something like this:
> 
> unsigned long * p = &(a[16]);
> unsigned long v = *p; // This accesses memory locations a[22] and a[23];
doing so is undefined behavior
> (do some bit fiddling to extra a[16] and a[17] from v)
> 
> The original code is memory safe and exhibits defined behavior.  You can do
whatever crazy, semantics-preserving optimization you want, run it on any crazy
architecture you want, and it'll always exhibit the same behavior.
> 
> The optimized code exhibits undefined behavior. On most systems, it just
reads garbage data that is ignored by the compiler, but that's really just a
side-effect of how most OS's and architectures do things.  If you do some
crazy transforms or run on some obscure architecture, the optimized code may
break.
> 
>> 
>> [snip]
>>> What if you have a funky architecture that someone is porting LLVM
to, or someone is using x86-32 segments in an interesting way?
>> 
>> We'll burn that bridge when we get to it ;-)
> 
> ASAN got burnt; SAFECode probably got burnt, too.  If we work around it,
some poor researcher or developer may get burnt by it, too, and spend some time
figuring out why his correct-looking program is not acting properly.  In other
words, you're burning someone else's bridge.
> 
> Granted, perhaps the benefits of an incorrect optimization may outweigh the
loss of using LLVM on novel systems, but are you sure that making the
optimization work properly is going to be so detrimental?
> 
>> 
>>> Moreover, I don't really understand the rationale for allowing
a transform to introduce undefined behavior into programs that exhibit no
undefined behavior.
>> 
>> There is no undefined behavior here.  This is exactly analogous to the
code you get for bitfield accesses.  If you have an uninitialized struct and
start storing into its fields (to initialize it)  you get a series of "load
+ mask + or + store" operations.  These are loading and touching
"undefined" bits in a completely defined way.
> 
> I'll agree that they're both undefined behavior, but I don't
think they fall within the same category.  The bit-mask initializing issue is a
compromise you made because there either isn't an alternative way to do it
that has defined behavior, or such an alternative is too expensive and difficult
to implement and/or use.
> 
> This appears to be a different case.  Fixing the optimization looks simple
enough to me (am I missing something?), and I'm not convinced that fixing it
would hurt performance (although since I haven't run an experiment, that is
conjecture).
> 
> So, perhaps I should ask this: if someone took the time to fix the
transform so that it checks both the alignment *and* the allocation size and
measures the resulting performance change, how much would performance need to
suffer before the cure was deemed worse than the disease?
> 
> -- John T.
> 
> 
> 
>   
>> 
>> -Chris
>> 
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111216/ae0d6b9c/attachment.html>

John Criswell

2011-Dec-19 23:04 UTC

head link

[LLVMdev] load widening conflicts with AddressSanitizer

On 12/16/11 7:56 PM, Chris Lattner wrote:> On Dec 16, 2011, at 3:29 PM, John Criswell <criswell at illinois.edu 
> <mailto:criswell at illinois.edu>> wrote:
>> On 12/16/11 4:45 PM, Chris Lattner wrote:
>>>
>>> On Dec 16, 2011, at 2:41 PM, John Criswell wrote:
>>>
>>>> On 12/16/11 4:14 PM, Chris Lattner wrote:
>>>>> On Dec 16, 2011, at 12:39 PM, Kostya Serebryany wrote:
>>>>>>
>>>>>>     > Do we consider the above transformation legal?
>>>>>>
>>>>>
>>>>> Yes, the transformation is perfectly legal for the normal
compiler.
>>>>
>>>> So how do you guarantee that the behavior is predictable
regardless
>>>> of hardware platform if you don't define what the behavior
should be?
>>>
>>> I'm not sure what you mean.  What isn't defined?
>>
>> The alloca in question allocates 22 bytes.  The 64-bit load in 
>> Kostya's original email is accessing two additional bytes past the 
>> end of the alloca (i.e., it is accessing array "elements"
a[22] and
>> a[23]).  Accessing that memory with a read or write is undefined 
>> behavior.  The program could fault, read zeros, read arbitrary bit 
>> patterns, etc.
>
> John, I think that you are missing that these operations are fully 
> defined by LLVM IR.  I'm not sure what languages rules you are drawing 
> these rules from, but they are not the rules of IR.
I apologize for mixing C and LLVM notation.

If you want to distinguish between C and LLVM semantics, then I think 
load-widening in this particular case has two problems:

1) The load-widening transform is not guaranteed to preserve the 
semantics of the original C program unless the OS and hardware fulfill 
certain assumptions.  The original C program performs in-bounds memory 
accesses at a[16] and a[21].  The transformed equivalent does the same 
thing *except* that it also reads two bytes outside the bounds of the 
array a.  The transformed program is only equivalent *if* the OS and 
hardware fulfill certain assumptions (which, fortunately, they do on 
common systems currently in use).

2) The load-widening transform introduces behavior that, as far as I 
know, is undefined at the LLVM IR level.  It takes an LLVM program that 
loads two values from the alloca'ed memory and changes it to be a single 
load that accesses the same memory locations plus two out-of-bounds 
locations.  Again, if the OS and hardware fulfill certain assumptions, 
then the fault behavior and computation behavior of the LLVM IR before 
and after the optimization is the same, but if those assumptions don't 
hold, then the transformed program may act differently than the original.

At the LLVM IR level, performing a load or store in which part of the 
accessed memory is within bounds and part of it is out of bounds is 
undefined, correct?  At the very least, I would think that there would 
not be a defined behavior for such a load or store.

Am I making sense now?  Is there something I'm misunderstanding here?
>
> Doing this inside a compiler (the way we do) also is not invalid 
> according to the C notions of undefined behavior, as it has the "as 
> if" rule.  I agree that doing this at the source level would be
invalid.
>
> Again, I'm not opposed to having a way to disable these 
> transformations, we just need a clean way to express it.
Having a list of which optimizations are safe to run and which ones are 
not can become tedious.  I'd rather fix the optimization so that it 
always preserves the semantics of the program unless there's a very 
compelling reason not to do so (e.g., significant performance loss).

In this case, it seems easy: instead of requiring that "align 16" is 
sufficient for doing load widening, require that the memory must be 
marked "align 16" and the allocation size is a multiple of 8.  That
will
force load-widening to only occur when it is safe.  There's probably 
other ways to fix it as well (e.g., have load-widening widen to the 
largest size that meets the above two conditions).

-- John T.
>
> -Chris
>
>> In other words, the compiler is transforming this:
>>
>> return a[16] + a[21];
>>
>> into something like this:
>>
>> unsigned long * p = &(a[16]);
>> unsigned long v = *p; // This accesses memory locations a[22] and 
>> a[23]; doing so is undefined behavior
>> (do some bit fiddling to extra a[16] and a[17] from v)
>>
>> The original code is memory safe and exhibits defined behavior.  You 
>> can do whatever crazy, semantics-preserving optimization you want, 
>> run it on any crazy architecture you want, and it'll always exhibit
>> the same behavior.
>>
>> The optimized code exhibits undefined behavior. On most systems, it 
>> just reads garbage data that is ignored by the compiler, but that's
>> really just a side-effect of how most OS's and architectures do 
>> things.  If you do some crazy transforms or run on some obscure 
>> architecture, the optimized code may break.
>>
>>>
>>> [snip]
>>>> What if you have a funky architecture that someone is porting
LLVM
>>>> to, or someone is using x86-32 segments in an interesting way?
>>>
>>> We'll burn that bridge when we get to it ;-)
>>
>> ASAN got burnt; SAFECode probably got burnt, too.  If we work around 
>> it, some poor researcher or developer may get burnt by it, too, and 
>> spend some time figuring out why his correct-looking program is not 
>> acting properly.  In other words, you're burning someone else's
bridge.
>>
>> Granted, perhaps the benefits of an incorrect optimization may 
>> outweigh the loss of using LLVM on novel systems, but are you sure 
>> that making the optimization work properly is going to be so
detrimental?
>>
>>>
>>>> Moreover, I don't really understand the rationale for
allowing a
>>>> transform to introduce undefined behavior into programs that 
>>>> exhibit no undefined behavior.
>>>
>>> There is no undefined behavior here.  This is exactly analogous to 
>>> the code you get for bitfield accesses.  If you have an 
>>> uninitialized struct and start storing into its fields (to 
>>> initialize it)  you get a series of "load + mask + or +
store"
>>> operations.  These are loading and touching "undefined"
bits in a
>>> completely defined way.
>>
>> I'll agree that they're both undefined behavior, but I
don't think
>> they fall within the same category.  The bit-mask initializing issue 
>> is a compromise you made because there either isn't an alternative 
>> way to do it that has defined behavior, or such an alternative is too 
>> expensive and difficult to implement and/or use.
>>
>> This appears to be a different case.  Fixing the optimization looks 
>> simple enough to me (am I missing something?), and I'm not
convinced
>> that fixing it would hurt performance (although since I haven't run
>> an experiment, that is conjecture).
>>
>> So, perhaps I should ask this: if someone took the time to fix the 
>> transform so that it checks both the alignment *and* the allocation 
>> size and measures the resulting performance change, how much would 
>> performance need to suffer before the cure was deemed worse than the 
>> disease?
>>
>> -- John T.
>>
>>
>>
>>
>>>
>>> -Chris
>>>
>>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20111219/039bda95/attachment.html>

Chris Lattner

2011-Dec-20 00:31 UTC

head link

[LLVMdev] load widening conflicts with AddressSanitizer

On Dec 19, 2011, at 3:04 PM, John Criswell wrote:>>> The alloca in question allocates 22 bytes.  The 64-bit load in
Kostya's original email is accessing two additional bytes past the end of
the alloca (i.e., it is accessing array "elements" a[22] and a[23]). 
Accessing that memory with a read or write is undefined behavior.  The program
could fault, read zeros, read arbitrary bit patterns, etc.
>> 
>> John, I think that you are missing that these operations are fully
defined by LLVM IR.  I'm not sure what languages rules you are drawing these
rules from, but they are not the rules of IR.
> 
> I apologize for mixing C and LLVM notation.
> 
> If you want to distinguish between C and LLVM semantics, then I think
load-widening in this particular case has two problems:
> 
> 1) The load-widening transform is not guaranteed to preserve the semantics
of the original C program unless the OS and hardware fulfill certain
assumptions.
Sure, but these assumptions are true of all current hardware and OS's
supported by LLVM.
> 2) The load-widening transform introduces behavior that, as far as I know,
is undefined at the LLVM IR level.
This is incorrect, it is fully defined for LLVM IR.
> Am I making sense now?  Is there something I'm misunderstanding here?
Yes, it is that this is fully defined by LLVM IR. :)  This is not defined by C. 
This is another case where LLVM IR is more general than C is.
>> Doing this inside a compiler (the way we do) also is not invalid
according to the C notions of undefined behavior, as it has the "as
if" rule.  I agree that doing this at the source level would be invalid.
>> 
>> Again, I'm not opposed to having a way to disable these
transformations, we just need a clean way to express it.
> 
> Having a list of which optimizations are safe to run and which ones are not
can become tedious.  I'd rather fix the optimization so that it always
preserves the semantics of the program unless there's a very compelling
reason not to do so (e.g., significant performance loss).
This is one instance of a class of related optimizations, and you're
assuming that they were added for no good reason.  The only reason that someone
bothered to add them to the compiler is that they added a worthwhile performance
win.

If you're willing to move this discussion from "whether this is correct
to do" to "what should we change in the compiler to support asan and
safecode" then I think we'll have a more productive discussion.  This
doesn't seem very hard to solve to everyone's satisfaction.

-Chris

Maybe Matching Threads

Search for more maybe matching threads

llvm dev - Dec 2011 - [LLVMdev] load widening conflicts with AddressSanitizer

[LLVMdev] load widening conflicts with AddressSanitizer

[LLVMdev] load widening conflicts with AddressSanitizer

[LLVMdev] load widening conflicts with AddressSanitizer

Maybe Matching Threads