thr3ads.net - llvm dev - [LLVMdev] load widening conflicts with AddressSanitizer [Jan 2012]

If this information is useful, please help other people find it:
Share via:

Duncan Sands

2012-Jan-24 17:53 UTC

[LLVMdev] load widening conflicts with AddressSanitizer

Hi Kostya,
> On Tue, Jan 24, 2012 at 1:23 AM, Duncan Sands <baldrick at free.fr
> <mailto:baldrick at free.fr>> wrote:
>
>     Hi Kostya,
>
>      > [resurrecting an old mail thread about AddressSanitizer false
positive
>     caused by
>      > load widening]
>      >
>      > Once the Attribute::AddressSafety is set by clang (a separate
patch), fixing
>      > this bug may look as simple as this:
>
> Hi Duncan,
>
>     I don't get the point of an attribute.  There's plenty of code
out there
>     that does wide loads like this directly (without them being created by
the
>     optimizers) since,
>
>
> You mean, the source code that e.g. loads 8 bytes, where up to 7 bytes
might be
> out of array bounds?
yes.
> Yes, we've seen quite a bit of such code.
I'm not surprised.
> First, such code often appears in libc, mostly in hand-written assembly
(e.g.
> strlen), and valgrind/memcheck has a lot of trouble dealing with it
> (it basically has to intercept all such functions, which does not work when
such
> functions are inlined, so valgrind does not properly work with O2-compiled
> binaries).
> asan does not care about it (yet) because it does not instrument libc.
>
> Second, we also seen such hacks in regular C/C++ code (usually, in codecs
or
> compression code).
> Strictly speaking -- all these cases are bugs, according to either C or C++
> standard, and asan does not impose more restrictions than the standard.
> Note, that the hacks like these hurt not only address safety checkers like
> asan/memcheck/drmemory/SAFEcode/etc, but also race detectors like
> tsan/helgrind/drd/etc.
>
> We still have lots of code with these intentional OOB accesses and we want
to
> test it.
> In most cases I've met so far, the developers decided to actually fix
the bugs
> according to the C++ standard and require the memory allocation to have up
to 7
> extra bytes.
As far as I can see the C and C++ standards are not relevant.  ASAN works on
LLVM IR, not on C or C++.  Lots of different languages have LLVM frontends.  I
personally turn Ada and Fortran into LLVM IR all the time for example.  Clearly
the C standard is not relevant to LLVM IR coming from such languages.  What
matters is how LLVM IR is defined.  As far as I know this construct is perfectly
valid in LLVM IR.
> I do expect that sometimes this is impossible or undesirable.
> Then the solution would be to use __attribute__((address_safety)) to avoid
> instrumenting the tricky pieces of code.
Unfortunately there is in general no way of attaching such attributes in many
languages.

Ciao, Duncan.
>
> --kcc
>
>
>     just like the optimizers, they know it is safe and a win.
>     The attribute won't help them.  It looks like a way of just hiding
the real
>     problem, which seems to be that address sanitizer is overly strict.
>

Kostya Serebryany

2012-Jan-24 18:06 UTC

head link

[LLVMdev] load widening conflicts with AddressSanitizer

On Tue, Jan 24, 2012 at 9:53 AM, Duncan Sands <baldrick at free.fr> wrote:
> Hi Kostya,
>
>  On Tue, Jan 24, 2012 at 1:23 AM, Duncan Sands <baldrick at free.fr
>> <mailto:baldrick at free.fr>> wrote:
>>
>>    Hi Kostya,
>>
>>     > [resurrecting an old mail thread about AddressSanitizer false
>> positive
>>    caused by
>>     > load widening]
>>     >
>>     > Once the Attribute::AddressSafety is set by clang (a separate
>> patch), fixing
>>     > this bug may look as simple as this:
>>
>> Hi Duncan,
>>
>>    I don't get the point of an attribute.  There's plenty of
code out
>> there
>>    that does wide loads like this directly (without them being created
by
>> the
>>    optimizers) since,
>>
>>
>> You mean, the source code that e.g. loads 8 bytes, where up to 7 bytes
>> might be
>> out of array bounds?
>>
>
> yes.
>
>
>  Yes, we've seen quite a bit of such code.
>>
>
> I'm not surprised.
>
>
>  First, such code often appears in libc, mostly in hand-written assembly
>> (e.g.
>> strlen), and valgrind/memcheck has a lot of trouble dealing with it
>> (it basically has to intercept all such functions, which does not work
>> when such
>> functions are inlined, so valgrind does not properly work with
O2-compiled
>> binaries).
>> asan does not care about it (yet) because it does not instrument libc.
>>
>> Second, we also seen such hacks in regular C/C++ code (usually, in
codecs
>> or
>> compression code).
>> Strictly speaking -- all these cases are bugs, according to either C or
>> C++
>> standard, and asan does not impose more restrictions than the standard.
>> Note, that the hacks like these hurt not only address safety checkers
like
>> asan/memcheck/drmemory/**SAFEcode/etc, but also race detectors like
>> tsan/helgrind/drd/etc.
>>
>> We still have lots of code with these intentional OOB accesses and we
>> want to
>> test it.
>> In most cases I've met so far, the developers decided to actually
fix the
>> bugs
>> according to the C++ standard and require the memory allocation to have
>> up to 7
>> extra bytes.
>>
>
> As far as I can see the C and C++ standards are not relevant.  ASAN works
> on
> LLVM IR, not on C or C++.  Lots of different languages have LLVM
> frontends.  I
> personally turn Ada and Fortran into LLVM IR all the time for example.
>  Clearly
> the C standard is not relevant to LLVM IR coming from such languages.  What
> matters is how LLVM IR is defined.  As far as I know this construct is
> perfectly
> valid in LLVM IR.

Asan will not work for Fortran and Ada anyway (at least, out of the box).
I am not even sure that anything like asan is needed for Ada (it has bounds
checking built-in, the dynamic memory allocation is much more restrictive).
The tool is rather specific to C/C++ (and ObjectiveC probably, although we
have almost no tests for ObjectiveC, nor much knowledge in it).
Yes, the IR transformations are done on the LLVM level, but the asan
run-time library heavily depends on the C/C++ semantics and even
implementation,
and you can't really separate the asan instrumentation pass from the
run-time.

--kcc




>
>
>  I do expect that sometimes this is impossible or undesirable.
>> Then the solution would be to use __attribute__((address_safety)**) to
>> avoid
>> instrumenting the tricky pieces of code.
>>
>
> Unfortunately there is in general no way of attaching such attributes in
> many
> languages.
>
> Ciao, Duncan.
>
>
>
>> --kcc
>>
>>
>>    just like the optimizers, they know it is safe and a win.
>>    The attribute won't help them.  It looks like a way of just
hiding the
>> real
>>    problem, which seems to be that address sanitizer is overly strict.
>>
>>
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20120124/f1ecdd47/attachment.html>

Duncan Sands

2012-Jan-24 20:31 UTC

head link

[LLVMdev] load widening conflicts with AddressSanitizer

Hi Kostya,
>     As far as I can see the C and C++ standards are not relevant.  ASAN
works on
>     LLVM IR, not on C or C++.  Lots of different languages have LLVM
frontends.  I
>     personally turn Ada and Fortran into LLVM IR all the time for example. 
Clearly
>     the C standard is not relevant to LLVM IR coming from such languages. 
What
>     matters is how LLVM IR is defined.  As far as I know this construct is
perfectly
>     valid in LLVM IR.
>
>
> Asan will not work for Fortran and Ada anyway (at least, out of the box).
> I am not even sure that anything like asan is needed for Ada (it has bounds
> checking built-in, the dynamic memory allocation is much more restrictive).
> The tool is rather specific to C/C++ (and ObjectiveC probably, although we
have
> almost no tests for ObjectiveC, nor much knowledge in it).
> Yes, the IR transformations are done on the LLVM level, but the asan
run-time
> library heavily depends on the C/C++ semantics and even implementation,
> and you can't really separate the asan instrumentation pass from the
run-time.
it's pretty disappointing to hear that asan is basically just for C.  But
since
it is, I won't bother you anymore about this attribute (though I still
don't
like it much).

Ciao, Duncan.

Apparently Analagous Threads

Search for more possibly parallel threads

llvm dev - Jan 2012 - [LLVMdev] load widening conflicts with AddressSanitizer

[LLVMdev] load widening conflicts with AddressSanitizer

[LLVMdev] load widening conflicts with AddressSanitizer

[LLVMdev] load widening conflicts with AddressSanitizer

Apparently Analagous Threads