thr3ads.net - llvm dev - [LLVMdev] Weird msan problem [Aug 2014]

If this information is useful, please help other people find it:
Share via:

Keno Fischer

2014-Feb-07 05:06 UTC

[LLVMdev] Weird msan problem

Yes, it would be great to get that fixed.


On Wed, Feb 5, 2014 at 4:09 PM, Evgeniy Stepanov
<eugeni.stepanov at gmail.com>wrote:
> On Thu, Feb 6, 2014 at 12:21 AM, Keno Fischer
> <kfischer at college.harvard.edu> wrote:
> > Looks like when you materialize the stores, you should check the size
of
> the
> > the store and emit an appropriate amount of stores to the origin
shadow
> (or
> > just a memset intrinsic?).
>
> Yes. There is an old bug for that on the code.google.com site, but it
> has _never_ caused problems. I guess partial reads at a non-zero
> offset are pretty uncommon, and also LLVM IR tends to use smaller
> stores (larger ones are transformed to memset/memcpy and we handle
> those correctly). Should be fixed anyway.
>
> Note that __msan_print_shadow can lie about origin a bit if the
> address is not 4-byte aligned (it would print origins for aligned
> locations instead).
>
> >
> >
> > On Wed, Feb 5, 2014 at 2:13 PM, Keno Fischer <
> kfischer at college.harvard.edu>
> > wrote:
> >>
> >> The @entry stuff is just a gdb artifact. I've been tracking
this back a
> >> little further, and it seems there's at least an origin
propagation
> problem:
> >>
> >> (gdb) list
> >> 281             goto value_to_pointer_error;
> >> 282         }
> >> 283         else {
> >> 284             if (jl_is_cpointer_type(jvt) &&
jl_tparam0(jvt) == jt) {
> >> 285                 void *ptr = jl_unbox_voidpointer(v);
> >> 286                
assert(__msan_test_shadow(&ptr,sizeof(void**)) => -1);
> >> 287                 return (void*)ptr;
> >> 288             }
> >> 289         }
> >> 290
> >> (gdb) p __msan_print_shadow(v,16)
> >> 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff
> >>  o: 40000e23  o: 40000e23  o: 40000e23  o: 40000e23
> >> $21 = void
> >> (gdb) p __msan_print_shadow(&ptr,8)
> >> ff ff ff ff ff ff ff ff
> >>  o: 40000e23  o: 80007614
> >> $22 = void
> >>
> >> Notice the origin of the lower bits is off. Any idea as to why?
> >>
> >>
> >> On Mon, Feb 3, 2014 at 2:52 AM, Evgeniy Stepanov
> >> <eugeni.stepanov at gmail.com> wrote:
> >>>
> >>> The code for ccall looks right. Sounds like you have a very
small
> >>> range of instructions where an uninitialized value appear. You
could
> >>> try debugging at asm level. Shadow for b should be passed at
offset 0
> >>> in __msan_param_tls.
> >>>
> >>> MSan could propagate shadow through arithmetic and even some
logic
> >>> operations (like select). It could be that b is clean on
function
> >>> entry, but then something uninitialized gets mixed in.
> >>>
> >>> Also, what is this @entry stuff? It looks like bitvector_any1
has 6
> >>> arguments, but your ccall snippet only sets up tls for 3.
> >>>
> >>>
> >>> On Sun, Feb 2, 2014 at 10:39 PM, Keno Fischer
> >>> <kfischer at college.harvard.edu> wrote:
> >>> > I don't think we're doing anything unusual.
Here's the relevant
> snippet
> >>> > from
> >>> > the generated IR for ccall (after instrumentation):
> >>> >
> >>> > ccall:                                            ; preds
= %pass6,
> >>> > %113
> >>> >   %114 = phi i8* [ %110, %113 ], [ %108, %pass6 ]
> >>> >   %115 = inttoptr i64 %17 to i64*
> >>> >   %116 = bitcast i8* %69 to i32*, !dbg !7
> >>> >   %117 = bitcast i8* %114 to i32 (i32*, i64, i64)*, !dbg
!7
> >>> >   store i64 %_msret13, i64* getelementptr inbounds ([1000
x i64]*
> >>> > @__msan_param_tls, i64 0, i64 0), align 8, !dbg !7
> >>> >   store i32 %70, i32* getelementptr inbounds ([1000 x
i32]*
> >>> > @__msan_param_origin_tls, i64 0, i64 0), align 4, !dbg !7
> >>> >   store i64 0, i64* inttoptr (i64 add (i64 ptrtoint
([1000 x i64]*
> >>> > @__msan_param_tls to i64), i64 8) to i64*), align 8, !dbg
!7
> >>> >   store i32 0, i32* inttoptr (i64 add (i64 ptrtoint
([1000 x i32]*
> >>> > @__msan_param_origin_tls to i64), i64 8) to i32*), align
4, !dbg !7
> >>> >   store i64 0, i64* inttoptr (i64 add (i64 ptrtoint
([1000 x i64]*
> >>> > @__msan_param_tls to i64), i64 16) to i64*), align 8,
!dbg !7
> >>> >   store i32 0, i32* inttoptr (i64 add (i64 ptrtoint
([1000 x i32]*
> >>> > @__msan_param_origin_tls to i64), i64 16) to i32*), align
4, !dbg !7
> >>> >   store i32 0, i32* bitcast ([8 x i64]*
@__msan_retval_tls to i32*),
> >>> > align
> >>> > 8, !dbg !7
> >>> >   %118 = call i32 %117(i32* %116, i64 %87, i64 %107),
!dbg !7,
> >>> > !julia_type
> >>> > !9
> >>> >
> >>> >
> >>> >
> >>> > On Sun, Feb 2, 2014 at 6:18 AM, Evgeniy Stepanov
> >>> > <eugeni.stepanov at gmail.com>
> >>> > wrote:
> >>> >>
> >>> >> How is ccall() implemented? If it manually sets up a
stack frame,
> then
> >>> >> it also needs to store argument shadow values in
paramtls.
> >>> >>
> >>> >> I don't think there is an overflow, unless you
have a _lot_ of
> >>> >> arguments in a function call.
> >>> >>
> >>> >> On Sun, Feb 2, 2014 at 9:26 AM, Keno Fischer
> >>> >> <kfischer at college.harvard.edu> wrote:
> >>> >> > Also, I was looking at the instrumented LLVM
code and I noticed
> that
> >>> >> > the
> >>> >> > LLVM global variable is generated with size
1000, while
> >>> >> > kMsanParamTlsSizeInWords == 100, could there be
some sort of
> >>> >> > overflow
> >>> >> > happening?
> >>> >> >
> >>> >> >
> >>> >> > On Sat, Feb 1, 2014 at 6:44 PM, Keno Fischer
> >>> >> > <kfischer at college.harvard.edu>
> >>> >> > wrote:
> >>> >> >>
> >>> >> >> I have verified that both TLS
implementations indeed find the
> same
> >>> >> >> area
> >>> >> >> of
> >>> >> >> memory. Anything else I could look for?
> >>> >> >>
> >>> >> >>
> >>> >> >> On Tue, Jan 28, 2014 at 4:28 PM, Keno
Fischer
> >>> >> >> <kfischer at college.harvard.edu>
wrote:
> >>> >> >>>
> >>> >> >>> Yes, both JIT code and the native
runtime are instrumented. I am
> >>> >> >>> under
> >>> >> >>> the impressions that the the C library
should guarantee that
> from
> >>> >> >>> the
> >>> >> >>> way
> >>> >> >>> the relocations are implemented as long
as both native and JITed
> >>> >> >>> code
> >>> >> >>> are on
> >>> >> >>> the same thread (but I will verify this
and report back).
> >>> >> >>>
> >>> >> >>>
> >>> >> >>> On Tue, Jan 28, 2014 at 2:41 AM, Evgeniy
Stepanov
> >>> >> >>> <eugeni.stepanov at gmail.com>
wrote:
> >>> >> >>>>
> >>> >> >>>> I assume there are transitions
between JITted code and native
> >>> >> >>>> helper
> >>> >> >>>> functions. How are you handling
them? Are native functions
> >>> >> >>>> MSan-instrumented?
> >>> >> >>>> MSan is passing shadow across
function calls in TLS slots. Does
> >>> >> >>>> your
> >>> >> >>>> TLS implementation guarantee that
accesses to __msan_param_tls
> >>> >> >>>> from
> >>> >> >>>> JITted and from native code map to
the same memory?
> >>> >> >>>>
> >>> >> >>>>
> >>> >> >>>> On Mon, Jan 27, 2014 at 11:36 PM,
Evgeniy Stepanov
> >>> >> >>>> <eugeni.stepanov at gmail.com>
wrote:
> >>> >> >>>> > This is really cool. I've
not heard of anyone using MSan with
> >>> >> >>>> > MSJIT
> >>> >> >>>> > before.
> >>> >> >>>> >
> >>> >> >>>> >
> >>> >> >>>> > On Mon, Jan 27, 2014 at 7:44
PM, Keno Fischer
> >>> >> >>>> > <kfischer at
college.harvard.edu> wrote:
> >>> >> >>>> >> Hello everybody,
> >>> >> >>>> >>
> >>> >> >>>> >> I've run into some
strange behavior with memory sanitizer
> that
> >>> >> >>>> >> I
> >>> >> >>>> >> can't
> >>> >> >>>> >> explain and hope somebody
with more knowledge of the
> >>> >> >>>> >> implementation
> >>> >> >>>> >> would be
> >>> >> >>>> >> able to help me out or at
least point me into the right
> >>> >> >>>> >> direction.
> >>> >> >>>> >>
> >>> >> >>>> >> For background, I'm
using memory sanitizer to check Julia
> >>> >> >>>> >> (julialang.org),
> >>> >> >>>> >> which uses (or at least
will once I track down a few bugs)
> >>> >> >>>> >> MCJIT
> >>> >> >>>> >> for
> >>> >> >>>> >> the
> >>> >> >>>> >> code compilation. So far I
have rebuilt the runtime and all
> >>> >> >>>> >> dependencies
> >>> >> >>>> >> (including LLVM, libcxx,
etc.) with memory sanitizer enabled
> >>> >> >>>> >> and
> >>> >> >>>> >> added the
> >>> >> >>>> >> instrumentation pass in the
appropriate place in the julia
> >>> >> >>>> >> code
> >>> >> >>>> >> generator.
> >>> >> >>>> >>
> >>> >> >>>> >> I'm now going through
the usual bootstrap which basically
> >>> >> >>>> >> loads
> >>> >> >>>> >> the
> >>> >> >>>> >> standard
> >>> >> >>>> >> library and compiles it,
does inference, etc. This works
> fine
> >>> >> >>>> >> for
> >>> >> >>>> >> several
> >>> >> >>>> >> hours (this is usually much
faster - by which I mean several
> >>> >> >>>> >> hundred
> >>> >> >>>> >> time -
> >>> >> >>>> >> I suspect the issue is with
MCJIT having to process a ton
> more
> >>> >> >>>> >> relocations
> >>> >> >>>> >> and code and being
inefficient at it, but I can't prove
> that).
> >>> >> >>>> >> That's
> >>> >> >>>> >> not
> >>> >> >>>> >> the issue however.
Eventually, I get
> >>> >> >>>> >>
> >>> >> >>>> >> ==17150== WARNING:
MemorySanitizer:
> use-of-uninitialized-value
> >>> >> >>>> >>     #0 0x7f417cea3189 in
bitvector_any1
> >>> >> >>>> >>
/home/kfischer/julia-san/src/support/bitvector.c:177
> >>> >> >>>> >> [ snip ]
> >>> >> >>>> >>
> >>> >> >>>> >>   Uninitialized value was
created by a heap allocation
> >>> >> >>>> >>     #0 0x7f41815de543 in
__interceptor_malloc
> >>> >> >>>> >>
> >>> >> >>>> >>
> >>> >> >>>> >>
> >>> >> >>>> >>
>
/home/kfischer/julia-san/deps/llvm-svn/projects/compiler-rt/lib/msan/msan_interceptors.cc:854
> >>> >> >>>> >>     #1 0x7f417cc7d7f1 in
alloc_big
> >>> >> >>>> >>
/home/kfischer/julia-san/src/gc.c:355
> >>> >> >>>> >> [snip]
> >>> >> >>>> >>
> >>> >> >>>> >> Now, by going through it in
the debugger, I see
> >>> >> >>>> >>
> >>> >> >>>> >> (gdb) f 3
> >>> >> >>>> >> #3  0x00007f417cea318a in
bitvector_any1 (b=0x60c000607240,
> >>> >> >>>> >> b at entry=<optimized
out>, offs=0, offs at entry=<optimized
> out>,
> >>> >> >>>> >> nbits=256,
> >>> >> >>>> >> nbits at
entry=<optimized out>)
> >>> >> >>>> >>     at bitvector.c:177
> >>> >> >>>> >> 177         if ((b[0] &
mask) != 0) return 1;
> >>> >> >>>> >> (gdb) p
__msan_print_shadow(&b,8)
> >>> >> >>>> >> ff ff ff ff ff ff ff ff
> >>> >> >>>> >>  o: 3f0010a6  o: 80007666
> >>> >> >>>> >>
> >>> >> >>>> >> which seems to indicate
that the local variable b has
> >>> >> >>>> >> uninitialized
> >>> >> >>>> >> data.
> >>> >> >>>> >> I'm having a hard time
believing that though, since if I
> look
> >>> >> >>>> >> at
> >>> >> >>>> >> the
> >>> >> >>>> >> functions before it, the
place where it's coming from is
> >>> >> >>>> >> initialized:
> >>> >> >>>> >>
> >>> >> >>>> >> #4  0x00007f41755208a8 in
julia_isempty248 ()
> >>> >> >>>> >> #5  0x00007f417c163e3d in
jl_apply (f=0x606000984d60,
> >>> >> >>>> >> f at entry=<optimized
> >>> >> >>>> >> out>,
args=0x7fff9132da20, args at entry=<optimized out>,
> >>> >> >>>> >> nargs=1,
> >>> >> >>>> >>     nargs at
entry=<optimized out>) at ./julia.h:1043
> >>> >> >>>> >>
> >>> >> >>>> >> (here's the code of
that julia function for reference)
> >>> >> >>>> >>
> >>> >> >>>> >> isempty(s::IntSet) >
>>> >> >>>> >>     !s.fill1s &&
ccall(:bitvector_any1, Uint32,
> (Ptr{Uint32},
> >>> >> >>>> >> Uint64,
> >>> >> >>>> >> Uint64), s.bits, 0,
s.limit)==0
> >>> >> >>>> >>
> >>> >> >>>> >> Looking at where that value
is coming from:
> >>> >> >>>> >>
> >>> >> >>>> >> (gdb) f 5
> >>> >> >>>> >> #5  0x00007f417c163e3d in
jl_apply (f=0x606000984d60,
> >>> >> >>>> >> f at entry=<optimized
> >>> >> >>>> >> out>,
args=0x7fff9132da20, args at entry=<optimized out>,
> >>> >> >>>> >> nargs=1,
> >>> >> >>>> >>     nargs at
entry=<optimized out>) at ./julia.h:1043
> >>> >> >>>> >> 1043        return
f->fptr((jl_value_t*)f, args, nargs);
> >>> >> >>>> >> (gdb) p
((jl_array_t*)((void**)args[0])[1])->data
> >>> >> >>>> >> $43 = (void *)
0x60c000607240
> >>> >> >>>> >> (gdb) p
> >>> >> >>>> >>
> __msan_print_shadow(((jl_array_t*)((void**)args[0])[1]),0x30)
> >>> >> >>>> >> 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00
> 00
> >>> >> >>>> >> 00
> >>> >> >>>> >> 00
> >>> >> >>>> >> 00 00
> >>> >> >>>> >> 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00
> 00
> >>> >> >>>> >> 00
> >>> >> >>>> >> 00
> >>> >> >>>> >>  o: d800496  o: d800496  o:
d800496  o: d800496  o: d800496
> >>> >> >>>> >> o:
> >>> >> >>>> >> d800496  o:
> >>> >> >>>> >> d800496  o: d800496  o:
d800496  o: d800496  o: d800496  o:
> >>> >> >>>> >> d800496
> >>> >> >>>> >>
> >>> >> >>>> >> There are no uninitialized
values to be seen anywhere and
> the
> >>> >> >>>> >> `b`
> >>> >> >>>> >> value
> >>> >> >>>> >> isn't touched before
that line, so I'm a little stumped.
> >>> >> >>>> >>
> >>> >> >>>> >> One note I should make is
that I did have to implement TLS
> >>> >> >>>> >> support
> >>> >> >>>> >> myself in
> >>> >> >>>> >> MCJIT for this to work
(I'll upstream the patch soon), so I
> >>> >> >>>> >> may
> >>> >> >>>> >> have
> >>> >> >>>> >> made a
> >>> >> >>>> >> mistake, but I haven't
found anything wrong yet. If nothing
> >>> >> >>>> >> looks
> >>> >> >>>> >> unusual,
> >>> >> >>>> >> I'd also appreciate
pointers on what to look for in the TLS
> >>> >> >>>> >> variables.
> >>> >> >>>> >>
> >>> >> >>>> >> Thank you for your help,
> >>> >> >>>> >> Keno
> >>> >> >>>> >>
> >>> >> >>>> >>
> >>> >> >>>> >>
_______________________________________________
> >>> >> >>>> >> LLVM Developers mailing
list
> >>> >> >>>> >> LLVMdev at cs.uiuc.edu     
http://llvm.cs.uiuc.edu
> >>> >> >>>> >>
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >>> >> >>>> >>
> >>> >> >>>
> >>> >> >>>
> >>> >> >>
> >>> >> >
> >>> >
> >>> >
> >>
> >>
> >
>-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20140207/fb2da76f/attachment.html>

Keno Fischer

2014-Aug-01 22:27 UTC

head link

[LLVMdev] Weird msan problem

I'm about to have another pass at this, so was wondering if this was
ever fixed (the reason I'm taking another pass is that I managed to
find the MCJIT issue that was causing this to take forever).

On Fri, Feb 7, 2014 at 12:06 AM, Keno Fischer
<kfischer at college.harvard.edu> wrote:> Yes, it would be great to get that fixed.
>
>
> On Wed, Feb 5, 2014 at 4:09 PM, Evgeniy Stepanov <eugeni.stepanov at
gmail.com>
> wrote:
>>
>> On Thu, Feb 6, 2014 at 12:21 AM, Keno Fischer
>> <kfischer at college.harvard.edu> wrote:
>> > Looks like when you materialize the stores, you should check the
size of
>> > the
>> > the store and emit an appropriate amount of stores to the origin
shadow
>> > (or
>> > just a memset intrinsic?).
>>
>> Yes. There is an old bug for that on the code.google.com site, but it
>> has _never_ caused problems. I guess partial reads at a non-zero
>> offset are pretty uncommon, and also LLVM IR tends to use smaller
>> stores (larger ones are transformed to memset/memcpy and we handle
>> those correctly). Should be fixed anyway.
>>
>> Note that __msan_print_shadow can lie about origin a bit if the
>> address is not 4-byte aligned (it would print origins for aligned
>> locations instead).
>>
>> >
>> >
>> > On Wed, Feb 5, 2014 at 2:13 PM, Keno Fischer
>> > <kfischer at college.harvard.edu>
>> > wrote:
>> >>
>> >> The @entry stuff is just a gdb artifact. I've been
tracking this back a
>> >> little further, and it seems there's at least an origin
propagation
>> >> problem:
>> >>
>> >> (gdb) list
>> >> 281             goto value_to_pointer_error;
>> >> 282         }
>> >> 283         else {
>> >> 284             if (jl_is_cpointer_type(jvt) &&
jl_tparam0(jvt) == jt)
>> >> {
>> >> 285                 void *ptr = jl_unbox_voidpointer(v);
>> >> 286                
assert(__msan_test_shadow(&ptr,sizeof(void**)) =>> >> -1);
>> >> 287                 return (void*)ptr;
>> >> 288             }
>> >> 289         }
>> >> 290
>> >> (gdb) p __msan_print_shadow(v,16)
>> >> 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff
>> >>  o: 40000e23  o: 40000e23  o: 40000e23  o: 40000e23
>> >> $21 = void
>> >> (gdb) p __msan_print_shadow(&ptr,8)
>> >> ff ff ff ff ff ff ff ff
>> >>  o: 40000e23  o: 80007614
>> >> $22 = void
>> >>
>> >> Notice the origin of the lower bits is off. Any idea as to
why?
>> >>
>> >>
>> >> On Mon, Feb 3, 2014 at 2:52 AM, Evgeniy Stepanov
>> >> <eugeni.stepanov at gmail.com> wrote:
>> >>>
>> >>> The code for ccall looks right. Sounds like you have a
very small
>> >>> range of instructions where an uninitialized value appear.
You could
>> >>> try debugging at asm level. Shadow for b should be passed
at offset 0
>> >>> in __msan_param_tls.
>> >>>
>> >>> MSan could propagate shadow through arithmetic and even
some logic
>> >>> operations (like select). It could be that b is clean on
function
>> >>> entry, but then something uninitialized gets mixed in.
>> >>>
>> >>> Also, what is this @entry stuff? It looks like
bitvector_any1 has 6
>> >>> arguments, but your ccall snippet only sets up tls for 3.
>> >>>
>> >>>
>> >>> On Sun, Feb 2, 2014 at 10:39 PM, Keno Fischer
>> >>> <kfischer at college.harvard.edu> wrote:
>> >>> > I don't think we're doing anything unusual.
Here's the relevant
>> >>> > snippet
>> >>> > from
>> >>> > the generated IR for ccall (after instrumentation):
>> >>> >
>> >>> > ccall:                                            ;
preds = %pass6,
>> >>> > %113
>> >>> >   %114 = phi i8* [ %110, %113 ], [ %108, %pass6 ]
>> >>> >   %115 = inttoptr i64 %17 to i64*
>> >>> >   %116 = bitcast i8* %69 to i32*, !dbg !7
>> >>> >   %117 = bitcast i8* %114 to i32 (i32*, i64, i64)*,
!dbg !7
>> >>> >   store i64 %_msret13, i64* getelementptr inbounds
([1000 x i64]*
>> >>> > @__msan_param_tls, i64 0, i64 0), align 8, !dbg !7
>> >>> >   store i32 %70, i32* getelementptr inbounds ([1000 x
i32]*
>> >>> > @__msan_param_origin_tls, i64 0, i64 0), align 4,
!dbg !7
>> >>> >   store i64 0, i64* inttoptr (i64 add (i64 ptrtoint
([1000 x i64]*
>> >>> > @__msan_param_tls to i64), i64 8) to i64*), align 8,
!dbg !7
>> >>> >   store i32 0, i32* inttoptr (i64 add (i64 ptrtoint
([1000 x i32]*
>> >>> > @__msan_param_origin_tls to i64), i64 8) to i32*),
align 4, !dbg !7
>> >>> >   store i64 0, i64* inttoptr (i64 add (i64 ptrtoint
([1000 x i64]*
>> >>> > @__msan_param_tls to i64), i64 16) to i64*), align 8,
!dbg !7
>> >>> >   store i32 0, i32* inttoptr (i64 add (i64 ptrtoint
([1000 x i32]*
>> >>> > @__msan_param_origin_tls to i64), i64 16) to i32*),
align 4, !dbg !7
>> >>> >   store i32 0, i32* bitcast ([8 x i64]*
@__msan_retval_tls to i32*),
>> >>> > align
>> >>> > 8, !dbg !7
>> >>> >   %118 = call i32 %117(i32* %116, i64 %87, i64 %107),
!dbg !7,
>> >>> > !julia_type
>> >>> > !9
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Sun, Feb 2, 2014 at 6:18 AM, Evgeniy Stepanov
>> >>> > <eugeni.stepanov at gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> How is ccall() implemented? If it manually sets
up a stack frame,
>> >>> >> then
>> >>> >> it also needs to store argument shadow values in
paramtls.
>> >>> >>
>> >>> >> I don't think there is an overflow, unless
you have a _lot_ of
>> >>> >> arguments in a function call.
>> >>> >>
>> >>> >> On Sun, Feb 2, 2014 at 9:26 AM, Keno Fischer
>> >>> >> <kfischer at college.harvard.edu> wrote:
>> >>> >> > Also, I was looking at the instrumented LLVM
code and I noticed
>> >>> >> > that
>> >>> >> > the
>> >>> >> > LLVM global variable is generated with size
1000, while
>> >>> >> > kMsanParamTlsSizeInWords == 100, could there
be some sort of
>> >>> >> > overflow
>> >>> >> > happening?
>> >>> >> >
>> >>> >> >
>> >>> >> > On Sat, Feb 1, 2014 at 6:44 PM, Keno Fischer
>> >>> >> > <kfischer at college.harvard.edu>
>> >>> >> > wrote:
>> >>> >> >>
>> >>> >> >> I have verified that both TLS
implementations indeed find the
>> >>> >> >> same
>> >>> >> >> area
>> >>> >> >> of
>> >>> >> >> memory. Anything else I could look for?
>> >>> >> >>
>> >>> >> >>
>> >>> >> >> On Tue, Jan 28, 2014 at 4:28 PM, Keno
Fischer
>> >>> >> >> <kfischer at college.harvard.edu>
wrote:
>> >>> >> >>>
>> >>> >> >>> Yes, both JIT code and the native
runtime are instrumented. I
>> >>> >> >>> am
>> >>> >> >>> under
>> >>> >> >>> the impressions that the the C
library should guarantee that
>> >>> >> >>> from
>> >>> >> >>> the
>> >>> >> >>> way
>> >>> >> >>> the relocations are implemented as
long as both native and
>> >>> >> >>> JITed
>> >>> >> >>> code
>> >>> >> >>> are on
>> >>> >> >>> the same thread (but I will verify
this and report back).
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>> On Tue, Jan 28, 2014 at 2:41 AM,
Evgeniy Stepanov
>> >>> >> >>> <eugeni.stepanov at gmail.com>
wrote:
>> >>> >> >>>>
>> >>> >> >>>> I assume there are transitions
between JITted code and native
>> >>> >> >>>> helper
>> >>> >> >>>> functions. How are you handling
them? Are native functions
>> >>> >> >>>> MSan-instrumented?
>> >>> >> >>>> MSan is passing shadow across
function calls in TLS slots.
>> >>> >> >>>> Does
>> >>> >> >>>> your
>> >>> >> >>>> TLS implementation guarantee
that accesses to __msan_param_tls
>> >>> >> >>>> from
>> >>> >> >>>> JITted and from native code map
to the same memory?
>> >>> >> >>>>
>> >>> >> >>>>
>> >>> >> >>>> On Mon, Jan 27, 2014 at 11:36
PM, Evgeniy Stepanov
>> >>> >> >>>> <eugeni.stepanov at
gmail.com> wrote:
>> >>> >> >>>> > This is really cool.
I've not heard of anyone using MSan
>> >>> >> >>>> > with
>> >>> >> >>>> > MSJIT
>> >>> >> >>>> > before.
>> >>> >> >>>> >
>> >>> >> >>>> >
>> >>> >> >>>> > On Mon, Jan 27, 2014 at
7:44 PM, Keno Fischer
>> >>> >> >>>> > <kfischer at
college.harvard.edu> wrote:
>> >>> >> >>>> >> Hello everybody,
>> >>> >> >>>> >>
>> >>> >> >>>> >> I've run into some
strange behavior with memory sanitizer
>> >>> >> >>>> >> that
>> >>> >> >>>> >> I
>> >>> >> >>>> >> can't
>> >>> >> >>>> >> explain and hope
somebody with more knowledge of the
>> >>> >> >>>> >> implementation
>> >>> >> >>>> >> would be
>> >>> >> >>>> >> able to help me out or
at least point me into the right
>> >>> >> >>>> >> direction.
>> >>> >> >>>> >>
>> >>> >> >>>> >> For background, I'm
using memory sanitizer to check Julia
>> >>> >> >>>> >> (julialang.org),
>> >>> >> >>>> >> which uses (or at least
will once I track down a few bugs)
>> >>> >> >>>> >> MCJIT
>> >>> >> >>>> >> for
>> >>> >> >>>> >> the
>> >>> >> >>>> >> code compilation. So
far I have rebuilt the runtime and all
>> >>> >> >>>> >> dependencies
>> >>> >> >>>> >> (including LLVM,
libcxx, etc.) with memory sanitizer
>> >>> >> >>>> >> enabled
>> >>> >> >>>> >> and
>> >>> >> >>>> >> added the
>> >>> >> >>>> >> instrumentation pass in
the appropriate place in the julia
>> >>> >> >>>> >> code
>> >>> >> >>>> >> generator.
>> >>> >> >>>> >>
>> >>> >> >>>> >> I'm now going
through the usual bootstrap which basically
>> >>> >> >>>> >> loads
>> >>> >> >>>> >> the
>> >>> >> >>>> >> standard
>> >>> >> >>>> >> library and compiles
it, does inference, etc. This works
>> >>> >> >>>> >> fine
>> >>> >> >>>> >> for
>> >>> >> >>>> >> several
>> >>> >> >>>> >> hours (this is usually
much faster - by which I mean
>> >>> >> >>>> >> several
>> >>> >> >>>> >> hundred
>> >>> >> >>>> >> time -
>> >>> >> >>>> >> I suspect the issue is
with MCJIT having to process a ton
>> >>> >> >>>> >> more
>> >>> >> >>>> >> relocations
>> >>> >> >>>> >> and code and being
inefficient at it, but I can't prove
>> >>> >> >>>> >> that).
>> >>> >> >>>> >> That's
>> >>> >> >>>> >> not
>> >>> >> >>>> >> the issue however.
Eventually, I get
>> >>> >> >>>> >>
>> >>> >> >>>> >> ==17150== WARNING:
MemorySanitizer:
>> >>> >> >>>> >>
use-of-uninitialized-value
>> >>> >> >>>> >>     #0 0x7f417cea3189
in bitvector_any1
>> >>> >> >>>> >>
/home/kfischer/julia-san/src/support/bitvector.c:177
>> >>> >> >>>> >> [ snip ]
>> >>> >> >>>> >>
>> >>> >> >>>> >>   Uninitialized value
was created by a heap allocation
>> >>> >> >>>> >>     #0 0x7f41815de543
in __interceptor_malloc
>> >>> >> >>>> >>
>> >>> >> >>>> >>
>> >>> >> >>>> >>
>> >>> >> >>>> >>
>> >>> >> >>>> >>
/home/kfischer/julia-san/deps/llvm-svn/projects/compiler-rt/lib/msan/msan_interceptors.cc:854
>> >>> >> >>>> >>     #1 0x7f417cc7d7f1
in alloc_big
>> >>> >> >>>> >>
/home/kfischer/julia-san/src/gc.c:355
>> >>> >> >>>> >> [snip]
>> >>> >> >>>> >>
>> >>> >> >>>> >> Now, by going through
it in the debugger, I see
>> >>> >> >>>> >>
>> >>> >> >>>> >> (gdb) f 3
>> >>> >> >>>> >> #3  0x00007f417cea318a
in bitvector_any1 (b=0x60c000607240,
>> >>> >> >>>> >> b at
entry=<optimized out>, offs=0, offs at entry=<optimized
>> >>> >> >>>> >> out>,
>> >>> >> >>>> >> nbits=256,
>> >>> >> >>>> >> nbits at
entry=<optimized out>)
>> >>> >> >>>> >>     at bitvector.c:177
>> >>> >> >>>> >> 177         if ((b[0]
& mask) != 0) return 1;
>> >>> >> >>>> >> (gdb) p
__msan_print_shadow(&b,8)
>> >>> >> >>>> >> ff ff ff ff ff ff ff ff
>> >>> >> >>>> >>  o: 3f0010a6  o:
80007666
>> >>> >> >>>> >>
>> >>> >> >>>> >> which seems to indicate
that the local variable b has
>> >>> >> >>>> >> uninitialized
>> >>> >> >>>> >> data.
>> >>> >> >>>> >> I'm having a hard
time believing that though, since if I
>> >>> >> >>>> >> look
>> >>> >> >>>> >> at
>> >>> >> >>>> >> the
>> >>> >> >>>> >> functions before it,
the place where it's coming from is
>> >>> >> >>>> >> initialized:
>> >>> >> >>>> >>
>> >>> >> >>>> >> #4  0x00007f41755208a8
in julia_isempty248 ()
>> >>> >> >>>> >> #5  0x00007f417c163e3d
in jl_apply (f=0x606000984d60,
>> >>> >> >>>> >> f at
entry=<optimized
>> >>> >> >>>> >> out>,
args=0x7fff9132da20, args at entry=<optimized out>,
>> >>> >> >>>> >> nargs=1,
>> >>> >> >>>> >>     nargs at
entry=<optimized out>) at ./julia.h:1043
>> >>> >> >>>> >>
>> >>> >> >>>> >> (here's the code of
that julia function for reference)
>> >>> >> >>>> >>
>> >>> >> >>>> >> isempty(s::IntSet)
>> >>> >> >>>> >>     !s.fill1s
&& ccall(:bitvector_any1, Uint32,
>> >>> >> >>>> >> (Ptr{Uint32},
>> >>> >> >>>> >> Uint64,
>> >>> >> >>>> >> Uint64), s.bits, 0,
s.limit)==0
>> >>> >> >>>> >>
>> >>> >> >>>> >> Looking at where that
value is coming from:
>> >>> >> >>>> >>
>> >>> >> >>>> >> (gdb) f 5
>> >>> >> >>>> >> #5  0x00007f417c163e3d
in jl_apply (f=0x606000984d60,
>> >>> >> >>>> >> f at
entry=<optimized
>> >>> >> >>>> >> out>,
args=0x7fff9132da20, args at entry=<optimized out>,
>> >>> >> >>>> >> nargs=1,
>> >>> >> >>>> >>     nargs at
entry=<optimized out>) at ./julia.h:1043
>> >>> >> >>>> >> 1043        return
f->fptr((jl_value_t*)f, args, nargs);
>> >>> >> >>>> >> (gdb) p
((jl_array_t*)((void**)args[0])[1])->data
>> >>> >> >>>> >> $43 = (void *)
0x60c000607240
>> >>> >> >>>> >> (gdb) p
>> >>> >> >>>> >>
>> >>> >> >>>> >>
__msan_print_shadow(((jl_array_t*)((void**)args[0])[1]),0x30)
>> >>> >> >>>> >> 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00
>> >>> >> >>>> >> 00
>> >>> >> >>>> >> 00
>> >>> >> >>>> >> 00
>> >>> >> >>>> >> 00 00
>> >>> >> >>>> >> 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00
>> >>> >> >>>> >> 00
>> >>> >> >>>> >> 00
>> >>> >> >>>> >> 00
>> >>> >> >>>> >>  o: d800496  o: d800496
o: d800496  o: d800496  o: d800496
>> >>> >> >>>> >> o:
>> >>> >> >>>> >> d800496  o:
>> >>> >> >>>> >> d800496  o: d800496  o:
d800496  o: d800496  o: d800496  o:
>> >>> >> >>>> >> d800496
>> >>> >> >>>> >>
>> >>> >> >>>> >> There are no
uninitialized values to be seen anywhere and
>> >>> >> >>>> >> the
>> >>> >> >>>> >> `b`
>> >>> >> >>>> >> value
>> >>> >> >>>> >> isn't touched
before that line, so I'm a little stumped.
>> >>> >> >>>> >>
>> >>> >> >>>> >> One note I should make
is that I did have to implement TLS
>> >>> >> >>>> >> support
>> >>> >> >>>> >> myself in
>> >>> >> >>>> >> MCJIT for this to work
(I'll upstream the patch soon), so I
>> >>> >> >>>> >> may
>> >>> >> >>>> >> have
>> >>> >> >>>> >> made a
>> >>> >> >>>> >> mistake, but I
haven't found anything wrong yet. If nothing
>> >>> >> >>>> >> looks
>> >>> >> >>>> >> unusual,
>> >>> >> >>>> >> I'd also appreciate
pointers on what to look for in the TLS
>> >>> >> >>>> >> variables.
>> >>> >> >>>> >>
>> >>> >> >>>> >> Thank you for your
help,
>> >>> >> >>>> >> Keno
>> >>> >> >>>> >>
>> >>> >> >>>> >>
>> >>> >> >>>> >>
_______________________________________________
>> >>> >> >>>> >> LLVM Developers mailing
list
>> >>> >> >>>> >> LLVMdev at cs.uiuc.edu 
http://llvm.cs.uiuc.edu
>> >>> >> >>>> >>
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >>> >> >>>> >>
>> >>> >> >>>
>> >>> >> >>>
>> >>> >> >>
>> >>> >> >
>> >>> >
>> >>> >
>> >>
>> >>
>> >
>
>

Evgeniy Stepanov

2014-Aug-04 09:26 UTC

head link

[LLVMdev] Weird msan problem

__msan_print_shadow has been fixed and improved a lot.

Origins for 8-byte stores are not fixed yet.

Btw, try -fsanitize-memory-track-origins=2.



On Sat, Aug 2, 2014 at 2:27 AM, Keno Fischer
<kfischer at college.harvard.edu> wrote:> I'm about to have another pass at this, so was wondering if this was
> ever fixed (the reason I'm taking another pass is that I managed to
> find the MCJIT issue that was causing this to take forever).
>
> On Fri, Feb 7, 2014 at 12:06 AM, Keno Fischer
> <kfischer at college.harvard.edu> wrote:
>> Yes, it would be great to get that fixed.
>>
>>
>> On Wed, Feb 5, 2014 at 4:09 PM, Evgeniy Stepanov <eugeni.stepanov at
gmail.com>
>> wrote:
>>>
>>> On Thu, Feb 6, 2014 at 12:21 AM, Keno Fischer
>>> <kfischer at college.harvard.edu> wrote:
>>> > Looks like when you materialize the stores, you should check
the size of
>>> > the
>>> > the store and emit an appropriate amount of stores to the
origin shadow
>>> > (or
>>> > just a memset intrinsic?).
>>>
>>> Yes. There is an old bug for that on the code.google.com site, but
it
>>> has _never_ caused problems. I guess partial reads at a non-zero
>>> offset are pretty uncommon, and also LLVM IR tends to use smaller
>>> stores (larger ones are transformed to memset/memcpy and we handle
>>> those correctly). Should be fixed anyway.
>>>
>>> Note that __msan_print_shadow can lie about origin a bit if the
>>> address is not 4-byte aligned (it would print origins for aligned
>>> locations instead).
>>>
>>> >
>>> >
>>> > On Wed, Feb 5, 2014 at 2:13 PM, Keno Fischer
>>> > <kfischer at college.harvard.edu>
>>> > wrote:
>>> >>
>>> >> The @entry stuff is just a gdb artifact. I've been
tracking this back a
>>> >> little further, and it seems there's at least an
origin propagation
>>> >> problem:
>>> >>
>>> >> (gdb) list
>>> >> 281             goto value_to_pointer_error;
>>> >> 282         }
>>> >> 283         else {
>>> >> 284             if (jl_is_cpointer_type(jvt) &&
jl_tparam0(jvt) == jt)
>>> >> {
>>> >> 285                 void *ptr = jl_unbox_voidpointer(v);
>>> >> 286                
assert(__msan_test_shadow(&ptr,sizeof(void**)) =>>> >> -1);
>>> >> 287                 return (void*)ptr;
>>> >> 288             }
>>> >> 289         }
>>> >> 290
>>> >> (gdb) p __msan_print_shadow(v,16)
>>> >> 00 00 00 00 00 00 00 00 ff ff ff ff ff ff ff ff
>>> >>  o: 40000e23  o: 40000e23  o: 40000e23  o: 40000e23
>>> >> $21 = void
>>> >> (gdb) p __msan_print_shadow(&ptr,8)
>>> >> ff ff ff ff ff ff ff ff
>>> >>  o: 40000e23  o: 80007614
>>> >> $22 = void
>>> >>
>>> >> Notice the origin of the lower bits is off. Any idea as to
why?
>>> >>
>>> >>
>>> >> On Mon, Feb 3, 2014 at 2:52 AM, Evgeniy Stepanov
>>> >> <eugeni.stepanov at gmail.com> wrote:
>>> >>>
>>> >>> The code for ccall looks right. Sounds like you have a
very small
>>> >>> range of instructions where an uninitialized value
appear. You could
>>> >>> try debugging at asm level. Shadow for b should be
passed at offset 0
>>> >>> in __msan_param_tls.
>>> >>>
>>> >>> MSan could propagate shadow through arithmetic and
even some logic
>>> >>> operations (like select). It could be that b is clean
on function
>>> >>> entry, but then something uninitialized gets mixed in.
>>> >>>
>>> >>> Also, what is this @entry stuff? It looks like
bitvector_any1 has 6
>>> >>> arguments, but your ccall snippet only sets up tls for
3.
>>> >>>
>>> >>>
>>> >>> On Sun, Feb 2, 2014 at 10:39 PM, Keno Fischer
>>> >>> <kfischer at college.harvard.edu> wrote:
>>> >>> > I don't think we're doing anything
unusual. Here's the relevant
>>> >>> > snippet
>>> >>> > from
>>> >>> > the generated IR for ccall (after
instrumentation):
>>> >>> >
>>> >>> > ccall:                                           
; preds = %pass6,
>>> >>> > %113
>>> >>> >   %114 = phi i8* [ %110, %113 ], [ %108, %pass6 ]
>>> >>> >   %115 = inttoptr i64 %17 to i64*
>>> >>> >   %116 = bitcast i8* %69 to i32*, !dbg !7
>>> >>> >   %117 = bitcast i8* %114 to i32 (i32*, i64,
i64)*, !dbg !7
>>> >>> >   store i64 %_msret13, i64* getelementptr
inbounds ([1000 x i64]*
>>> >>> > @__msan_param_tls, i64 0, i64 0), align 8, !dbg
!7
>>> >>> >   store i32 %70, i32* getelementptr inbounds
([1000 x i32]*
>>> >>> > @__msan_param_origin_tls, i64 0, i64 0), align 4,
!dbg !7
>>> >>> >   store i64 0, i64* inttoptr (i64 add (i64
ptrtoint ([1000 x i64]*
>>> >>> > @__msan_param_tls to i64), i64 8) to i64*), align
8, !dbg !7
>>> >>> >   store i32 0, i32* inttoptr (i64 add (i64
ptrtoint ([1000 x i32]*
>>> >>> > @__msan_param_origin_tls to i64), i64 8) to
i32*), align 4, !dbg !7
>>> >>> >   store i64 0, i64* inttoptr (i64 add (i64
ptrtoint ([1000 x i64]*
>>> >>> > @__msan_param_tls to i64), i64 16) to i64*),
align 8, !dbg !7
>>> >>> >   store i32 0, i32* inttoptr (i64 add (i64
ptrtoint ([1000 x i32]*
>>> >>> > @__msan_param_origin_tls to i64), i64 16) to
i32*), align 4, !dbg !7
>>> >>> >   store i32 0, i32* bitcast ([8 x i64]*
@__msan_retval_tls to i32*),
>>> >>> > align
>>> >>> > 8, !dbg !7
>>> >>> >   %118 = call i32 %117(i32* %116, i64 %87, i64
%107), !dbg !7,
>>> >>> > !julia_type
>>> >>> > !9
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > On Sun, Feb 2, 2014 at 6:18 AM, Evgeniy Stepanov
>>> >>> > <eugeni.stepanov at gmail.com>
>>> >>> > wrote:
>>> >>> >>
>>> >>> >> How is ccall() implemented? If it manually
sets up a stack frame,
>>> >>> >> then
>>> >>> >> it also needs to store argument shadow values
in paramtls.
>>> >>> >>
>>> >>> >> I don't think there is an overflow,
unless you have a _lot_ of
>>> >>> >> arguments in a function call.
>>> >>> >>
>>> >>> >> On Sun, Feb 2, 2014 at 9:26 AM, Keno Fischer
>>> >>> >> <kfischer at college.harvard.edu>
wrote:
>>> >>> >> > Also, I was looking at the instrumented
LLVM code and I noticed
>>> >>> >> > that
>>> >>> >> > the
>>> >>> >> > LLVM global variable is generated with
size 1000, while
>>> >>> >> > kMsanParamTlsSizeInWords == 100, could
there be some sort of
>>> >>> >> > overflow
>>> >>> >> > happening?
>>> >>> >> >
>>> >>> >> >
>>> >>> >> > On Sat, Feb 1, 2014 at 6:44 PM, Keno
Fischer
>>> >>> >> > <kfischer at college.harvard.edu>
>>> >>> >> > wrote:
>>> >>> >> >>
>>> >>> >> >> I have verified that both TLS
implementations indeed find the
>>> >>> >> >> same
>>> >>> >> >> area
>>> >>> >> >> of
>>> >>> >> >> memory. Anything else I could look
for?
>>> >>> >> >>
>>> >>> >> >>
>>> >>> >> >> On Tue, Jan 28, 2014 at 4:28 PM,
Keno Fischer
>>> >>> >> >> <kfischer at
college.harvard.edu> wrote:
>>> >>> >> >>>
>>> >>> >> >>> Yes, both JIT code and the
native runtime are instrumented. I
>>> >>> >> >>> am
>>> >>> >> >>> under
>>> >>> >> >>> the impressions that the the C
library should guarantee that
>>> >>> >> >>> from
>>> >>> >> >>> the
>>> >>> >> >>> way
>>> >>> >> >>> the relocations are implemented
as long as both native and
>>> >>> >> >>> JITed
>>> >>> >> >>> code
>>> >>> >> >>> are on
>>> >>> >> >>> the same thread (but I will
verify this and report back).
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >> >>> On Tue, Jan 28, 2014 at 2:41 AM,
Evgeniy Stepanov
>>> >>> >> >>> <eugeni.stepanov at
gmail.com> wrote:
>>> >>> >> >>>>
>>> >>> >> >>>> I assume there are
transitions between JITted code and native
>>> >>> >> >>>> helper
>>> >>> >> >>>> functions. How are you
handling them? Are native functions
>>> >>> >> >>>> MSan-instrumented?
>>> >>> >> >>>> MSan is passing shadow
across function calls in TLS slots.
>>> >>> >> >>>> Does
>>> >>> >> >>>> your
>>> >>> >> >>>> TLS implementation guarantee
that accesses to __msan_param_tls
>>> >>> >> >>>> from
>>> >>> >> >>>> JITted and from native code
map to the same memory?
>>> >>> >> >>>>
>>> >>> >> >>>>
>>> >>> >> >>>> On Mon, Jan 27, 2014 at
11:36 PM, Evgeniy Stepanov
>>> >>> >> >>>> <eugeni.stepanov at
gmail.com> wrote:
>>> >>> >> >>>> > This is really cool.
I've not heard of anyone using MSan
>>> >>> >> >>>> > with
>>> >>> >> >>>> > MSJIT
>>> >>> >> >>>> > before.
>>> >>> >> >>>> >
>>> >>> >> >>>> >
>>> >>> >> >>>> > On Mon, Jan 27, 2014 at
7:44 PM, Keno Fischer
>>> >>> >> >>>> > <kfischer at
college.harvard.edu> wrote:
>>> >>> >> >>>> >> Hello everybody,
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> I've run into
some strange behavior with memory sanitizer
>>> >>> >> >>>> >> that
>>> >>> >> >>>> >> I
>>> >>> >> >>>> >> can't
>>> >>> >> >>>> >> explain and hope
somebody with more knowledge of the
>>> >>> >> >>>> >> implementation
>>> >>> >> >>>> >> would be
>>> >>> >> >>>> >> able to help me out
or at least point me into the right
>>> >>> >> >>>> >> direction.
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> For background,
I'm using memory sanitizer to check Julia
>>> >>> >> >>>> >> (julialang.org),
>>> >>> >> >>>> >> which uses (or at
least will once I track down a few bugs)
>>> >>> >> >>>> >> MCJIT
>>> >>> >> >>>> >> for
>>> >>> >> >>>> >> the
>>> >>> >> >>>> >> code compilation.
So far I have rebuilt the runtime and all
>>> >>> >> >>>> >> dependencies
>>> >>> >> >>>> >> (including LLVM,
libcxx, etc.) with memory sanitizer
>>> >>> >> >>>> >> enabled
>>> >>> >> >>>> >> and
>>> >>> >> >>>> >> added the
>>> >>> >> >>>> >> instrumentation
pass in the appropriate place in the julia
>>> >>> >> >>>> >> code
>>> >>> >> >>>> >> generator.
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> I'm now going
through the usual bootstrap which basically
>>> >>> >> >>>> >> loads
>>> >>> >> >>>> >> the
>>> >>> >> >>>> >> standard
>>> >>> >> >>>> >> library and
compiles it, does inference, etc. This works
>>> >>> >> >>>> >> fine
>>> >>> >> >>>> >> for
>>> >>> >> >>>> >> several
>>> >>> >> >>>> >> hours (this is
usually much faster - by which I mean
>>> >>> >> >>>> >> several
>>> >>> >> >>>> >> hundred
>>> >>> >> >>>> >> time -
>>> >>> >> >>>> >> I suspect the issue
is with MCJIT having to process a ton
>>> >>> >> >>>> >> more
>>> >>> >> >>>> >> relocations
>>> >>> >> >>>> >> and code and being
inefficient at it, but I can't prove
>>> >>> >> >>>> >> that).
>>> >>> >> >>>> >> That's
>>> >>> >> >>>> >> not
>>> >>> >> >>>> >> the issue however.
Eventually, I get
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> ==17150== WARNING:
MemorySanitizer:
>>> >>> >> >>>> >>
use-of-uninitialized-value
>>> >>> >> >>>> >>     #0
0x7f417cea3189 in bitvector_any1
>>> >>> >> >>>> >>
/home/kfischer/julia-san/src/support/bitvector.c:177
>>> >>> >> >>>> >> [ snip ]
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>   Uninitialized
value was created by a heap allocation
>>> >>> >> >>>> >>     #0
0x7f41815de543 in __interceptor_malloc
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
/home/kfischer/julia-san/deps/llvm-svn/projects/compiler-rt/lib/msan/msan_interceptors.cc:854
>>> >>> >> >>>> >>     #1
0x7f417cc7d7f1 in alloc_big
>>> >>> >> >>>> >>
/home/kfischer/julia-san/src/gc.c:355
>>> >>> >> >>>> >> [snip]
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> Now, by going
through it in the debugger, I see
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> (gdb) f 3
>>> >>> >> >>>> >> #3 
0x00007f417cea318a in bitvector_any1 (b=0x60c000607240,
>>> >>> >> >>>> >> b at
entry=<optimized out>, offs=0, offs at entry=<optimized
>>> >>> >> >>>> >> out>,
>>> >>> >> >>>> >> nbits=256,
>>> >>> >> >>>> >> nbits at
entry=<optimized out>)
>>> >>> >> >>>> >>     at
bitvector.c:177
>>> >>> >> >>>> >> 177         if
((b[0] & mask) != 0) return 1;
>>> >>> >> >>>> >> (gdb) p
__msan_print_shadow(&b,8)
>>> >>> >> >>>> >> ff ff ff ff ff ff
ff ff
>>> >>> >> >>>> >>  o: 3f0010a6  o:
80007666
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> which seems to
indicate that the local variable b has
>>> >>> >> >>>> >> uninitialized
>>> >>> >> >>>> >> data.
>>> >>> >> >>>> >> I'm having a
hard time believing that though, since if I
>>> >>> >> >>>> >> look
>>> >>> >> >>>> >> at
>>> >>> >> >>>> >> the
>>> >>> >> >>>> >> functions before
it, the place where it's coming from is
>>> >>> >> >>>> >> initialized:
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> #4 
0x00007f41755208a8 in julia_isempty248 ()
>>> >>> >> >>>> >> #5 
0x00007f417c163e3d in jl_apply (f=0x606000984d60,
>>> >>> >> >>>> >> f at
entry=<optimized
>>> >>> >> >>>> >> out>,
args=0x7fff9132da20, args at entry=<optimized out>,
>>> >>> >> >>>> >> nargs=1,
>>> >>> >> >>>> >>     nargs at
entry=<optimized out>) at ./julia.h:1043
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> (here's the
code of that julia function for reference)
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> isempty(s::IntSet)
>>> >>> >> >>>> >>     !s.fill1s
&& ccall(:bitvector_any1, Uint32,
>>> >>> >> >>>> >> (Ptr{Uint32},
>>> >>> >> >>>> >> Uint64,
>>> >>> >> >>>> >> Uint64), s.bits, 0,
s.limit)==0
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> Looking at where
that value is coming from:
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> (gdb) f 5
>>> >>> >> >>>> >> #5 
0x00007f417c163e3d in jl_apply (f=0x606000984d60,
>>> >>> >> >>>> >> f at
entry=<optimized
>>> >>> >> >>>> >> out>,
args=0x7fff9132da20, args at entry=<optimized out>,
>>> >>> >> >>>> >> nargs=1,
>>> >>> >> >>>> >>     nargs at
entry=<optimized out>) at ./julia.h:1043
>>> >>> >> >>>> >> 1043        return
f->fptr((jl_value_t*)f, args, nargs);
>>> >>> >> >>>> >> (gdb) p
((jl_array_t*)((void**)args[0])[1])->data
>>> >>> >> >>>> >> $43 = (void *)
0x60c000607240
>>> >>> >> >>>> >> (gdb) p
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
__msan_print_shadow(((jl_array_t*)((void**)args[0])[1]),0x30)
>>> >>> >> >>>> >> 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> >>> >> >>>> >> 00
>>> >>> >> >>>> >> 00
>>> >>> >> >>>> >> 00
>>> >>> >> >>>> >> 00 00
>>> >>> >> >>>> >> 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>> >>> >> >>>> >> 00
>>> >>> >> >>>> >> 00
>>> >>> >> >>>> >> 00
>>> >>> >> >>>> >>  o: d800496  o:
d800496  o: d800496  o: d800496  o: d800496
>>> >>> >> >>>> >> o:
>>> >>> >> >>>> >> d800496  o:
>>> >>> >> >>>> >> d800496  o: d800496
o: d800496  o: d800496  o: d800496  o:
>>> >>> >> >>>> >> d800496
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> There are no
uninitialized values to be seen anywhere and
>>> >>> >> >>>> >> the
>>> >>> >> >>>> >> `b`
>>> >>> >> >>>> >> value
>>> >>> >> >>>> >> isn't touched
before that line, so I'm a little stumped.
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> One note I should
make is that I did have to implement TLS
>>> >>> >> >>>> >> support
>>> >>> >> >>>> >> myself in
>>> >>> >> >>>> >> MCJIT for this to
work (I'll upstream the patch soon), so I
>>> >>> >> >>>> >> may
>>> >>> >> >>>> >> have
>>> >>> >> >>>> >> made a
>>> >>> >> >>>> >> mistake, but I
haven't found anything wrong yet. If nothing
>>> >>> >> >>>> >> looks
>>> >>> >> >>>> >> unusual,
>>> >>> >> >>>> >> I'd also
appreciate pointers on what to look for in the TLS
>>> >>> >> >>>> >> variables.
>>> >>> >> >>>> >>
>>> >>> >> >>>> >> Thank you for your
help,
>>> >>> >> >>>> >> Keno
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
>>> >>> >> >>>> >>
_______________________________________________
>>> >>> >> >>>> >> LLVM Developers
mailing list
>>> >>> >> >>>> >> LLVMdev at
cs.uiuc.edu         http://llvm.cs.uiuc.edu
>>> >>> >> >>>> >>
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>>> >>> >> >>>> >>
>>> >>> >> >>>
>>> >>> >> >>>
>>> >>> >> >>
>>> >>> >> >
>>> >>> >
>>> >>> >
>>> >>
>>> >>
>>> >
>>
>>

llvm dev - Aug 2014 - [LLVMdev] Weird msan problem

[LLVMdev] Weird msan problem

[LLVMdev] Weird msan problem

[LLVMdev] Weird msan problem