> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: > > On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith wrote: >> >>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: >>> >>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith wrote: >>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at least, the >>>> C++ side of it. This was a rocky day, but I suppose that's what I get >>>> for failing to stage the change in smaller pieces. >>>> >>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) breakage, >>>> so if I've missed some problem in the sea of buildbot errors, please >>>> flag me down. >>>> >>>> I'll follow up soon with bitcode and assembly changes! >>> >>> Hi Duncan, >>> >>> I started getting random assertion failures in some tests yesterday, and I think >>> it may be related to this change. Here is the stack trace: >>> >>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 >>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 >>> #2 0x00007ffff59edb96 in __assert_fail_base () from /lib64/libc.so.6 >>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 >>> #4 0x00007ffff3a30e92 in llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #5 0x00007ffff3a30fd3 in llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #6 0x00007ffff3a40eed in llvm::MDNode::getTemporary(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #12 0x00007ffff3427174 in llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #13 0x00007ffff3755786 in (anonymous namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #14 0x00007ffff375767f in llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () >>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>> #16 0x00007ffff6c9d8cf in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from /opt/buildbot/lib/libOpenCL.so.1 >>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, bool) () from /opt/buildbot/lib/libOpenCL.so.1 >>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () from /opt/buildbot/lib/libOpenCL.so.1 >>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from /opt/buildbot/lib/libOpenCL.so.1 >>> #20 0x00007ffff6b5d179 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from /opt/buildbot/lib/libOpenCL.so.1 >>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm (llvm_ctx=..., >>> source="\n__kernel void test_fn(__local float *sSharedStorage, __global float *srcValues, __global uint *offsets, __global float *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 );\n sSha"..., headers=..., name="input.cl", triple="r600--", processor="verde", opts="", >>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, r_log=...) at llvm/invocation.cpp:255 >>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., opts=..., r_log=...) >>> at llvm/invocation.cpp:710 >>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) >>> at core/program.cpp:63 >>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, user_data=0x0) >>> at api/program.cpp:182 >>> >>> Does this look related? If so, let me know what other information you need to >>> try to debug this issue. >> >> This could be related; I'm not sure. >> > > I'm pretty sure that this commit is the cause of the regression. > > r223801 works and r223810 does not, and I don't think any of the other > commits in that range could cause this. > >> It looks like a leak detection assertion, and I didn't need to change >> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and >> `MDNode::deleteTemporary()` in the same ways it used to (and I didn't >> touch the implementation of those). >> >> Can you reproduce this with `llvm-link`? If so, that sounds like the >> best place to start. > > I can't reproduce this using llvm-link unfortunately. Any other ideas?(Continuing via email, since Tom stepped away IRC.) Tom, from the trace [1], it the problematic pointer (0x27e4c80) only shows up once. [1]: http://people.freedesktop.org/~tstellar/md-crash.out That means that something *else* -- other than `MDNode::getTemporary()` -- must be adding that address to garbage and failing to remove it. I just dug into `LeakDetector::addGarbageObject()` and it stores *all* calls to `addGarbage()` in the same place. There are a fair number of these in the IR: $ git grep -e addGarbageObject -w -- lib/IR/ lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); lib/IR/Instruction.cpp: if (!P) LeakDetector::addGarbageObject(this); lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); I think the next step is to identify who called `addGarbageObject()` with the problematic address, and what the stack trace was. The weird thing is, I also noticed a semantic change I made here accidentally. `addGarbageObject()` has two overloads: `void*` and `const Value*`. `MDNode::getTemporary()` used to match the latter, but now it matches the former. The weird part: all the other calls to `addGarbageObject()` look like they send in a `Value *`. Do you have any other calls to `addGarbageObject()`? Do they match `void *`? Also, what happens with the attached patch? (If this fixes your problem, I think it's just papering over something...) -------------- next part -------------- A non-text attachment was scrubbed... Name: 0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch Type: application/octet-stream Size: 4377 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141210/8ad53b94/attachment.obj> -------------- next part --------------
+zalman at google.com> On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > >> >> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: >> >> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith wrote: >>> >>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: >>>> >>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith wrote: >>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at least, the >>>>> C++ side of it. This was a rocky day, but I suppose that's what I get >>>>> for failing to stage the change in smaller pieces. >>>>> >>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) breakage, >>>>> so if I've missed some problem in the sea of buildbot errors, please >>>>> flag me down. >>>>> >>>>> I'll follow up soon with bitcode and assembly changes! >>>> >>>> Hi Duncan, >>>> >>>> I started getting random assertion failures in some tests yesterday, and I think >>>> it may be related to this change. Here is the stack trace: >>>> >>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 >>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 >>>> #2 0x00007ffff59edb96 in __assert_fail_base () from /lib64/libc.so.6 >>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 >>>> #4 0x00007ffff3a30e92 in llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #5 0x00007ffff3a30fd3 in llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #6 0x00007ffff3a40eed in llvm::MDNode::getTemporary(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #12 0x00007ffff3427174 in llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #13 0x00007ffff3755786 in (anonymous namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #14 0x00007ffff375767f in llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>> #16 0x00007ffff6c9d8cf in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from /opt/buildbot/lib/libOpenCL.so.1 >>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, bool) () from /opt/buildbot/lib/libOpenCL.so.1 >>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () from /opt/buildbot/lib/libOpenCL.so.1 >>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from /opt/buildbot/lib/libOpenCL.so.1 >>>> #20 0x00007ffff6b5d179 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from /opt/buildbot/lib/libOpenCL.so.1 >>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm (llvm_ctx=..., >>>> source="\n__kernel void test_fn(__local float *sSharedStorage, __global float *srcValues, __global uint *offsets, __global float *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 );\n sSha"..., headers=..., name="input.cl", triple="r600--", processor="verde", opts="", >>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, r_log=...) at llvm/invocation.cpp:255 >>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., opts=..., r_log=...) >>>> at llvm/invocation.cpp:710 >>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) >>>> at core/program.cpp:63 >>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, user_data=0x0) >>>> at api/program.cpp:182 >>>> >>>> Does this look related? If so, let me know what other information you need to >>>> try to debug this issue. >>> >>> This could be related; I'm not sure. >>> >> >> I'm pretty sure that this commit is the cause of the regression. >> >> r223801 works and r223810 does not, and I don't think any of the other >> commits in that range could cause this. >> >>> It looks like a leak detection assertion, and I didn't need to change >>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and >>> `MDNode::deleteTemporary()` in the same ways it used to (and I didn't >>> touch the implementation of those). >>> >>> Can you reproduce this with `llvm-link`? If so, that sounds like the >>> best place to start. >> >> I can't reproduce this using llvm-link unfortunately. Any other ideas? > > (Continuing via email, since Tom stepped away IRC.) > > Tom, from the trace [1], it the problematic pointer (0x27e4c80) only shows > up once. > > [1]: http://people.freedesktop.org/~tstellar/md-crash.out > > That means that something *else* -- other than `MDNode::getTemporary()` > -- must be adding that address to garbage and failing to remove it. > > I just dug into `LeakDetector::addGarbageObject()` and it stores *all* > calls to `addGarbage()` in the same place. There are a fair number of > these in the IR: > > $ git grep -e addGarbageObject -w -- lib/IR/ > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > lib/IR/Instruction.cpp: if (!P) LeakDetector::addGarbageObject(this); > lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); > > I think the next step is to identify who called `addGarbageObject()` with > the problematic address, and what the stack trace was. > > The weird thing is, I also noticed a semantic change I made here > accidentally. `addGarbageObject()` has two overloads: `void*` and > `const Value*`. `MDNode::getTemporary()` used to match the latter, but > now it matches the former. > > The weird part: all the other calls to `addGarbageObject()` look like they > send in a `Value *`. > > Do you have any other calls to `addGarbageObject()`? Do they match > `void *`? > > Also, what happens with the attached patch? (If this fixes your problem, > I think it's just papering over something...) > > <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch>Zalman also had a reproduction, and he's been able to track it down to an `addGarbageObject()` call from `MachineBasicBlock`. It looks like the MBB gets deallocated but `removeGarbageObject()` isn't called yet. CC'ing him here so he can join the thread once he's gotten a little further. BTW, if this is blocking anyone, I can commit the patch I attached to my previous email (or you can apply it locally). I think it's probably the right thing eventually -- it improves the output when there *is* an issue -- but I haven't committed it yet since it'll cover up the problem.
I need to take off until later this evening. So far I've found the following: The assertion failure I'm seeing is due to the to the addGarbageObject call in: void ilist_traits<MachineInstr>::removeNodeFromList(MachineInstr *N) in MachineBasicBlock.cpp . This is somewhat puzzling as the destructor for MachineBasicBlock calls removeGarbageObject so it should be impossible for new to return that pointer again. The only explanation I can come up with is that this class does not have a virtual destructor and the object in question is being deleted with a different type than MachineBasicBlock. Also, changing removeGarbage in LeaksContext.h to: void removeGarbage(const T* o) { if (o == Cache) Cache = nullptr; // Cache hit else { assert(Ts.count(o) == 1 && "Removing Object not in set!"); Ts.erase(o); } } causes the added assertion to fail in my code. I'm not sure if the leaks detector is supposed to disallow removing an object that was never added, but if so, it seems worth adding the assertion. (And if not, perhaps explicitly documenting that it is allowed to remove and object that was never added or has already been removed.) And while I don't expect anyone to pull and build Halide to repro this, here's how to do it: 1) Clone https://github.com/halide/Halide.git to a directory I'll call "halide". 2) Symlink halide/llvm to your llvm build. (This may not be necessary as llvm-config may take care of everything.) 3) Make sure llvm-config from the right development build is on your path. 4) Type "make test_boundary_conditions" in the halide dir. (Or "make OPTIMIZE-g test_boundary_conditions" ) This should work on Linux or Mac. I've been debugging on Linux, but I expect this repros on Mac OS X as well. Thanks, -Z- On Wed, Dec 10, 2014 at 5:27 PM, Duncan P. N. Exon Smith < dexonsmith at apple.com> wrote:> +zalman at google.com > > > On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith <dexonsmith at apple.com> > wrote: > > > >> > >> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: > >> > >> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith wrote: > >>> > >>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: > >>>> > >>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith > wrote: > >>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at > least, the > >>>>> C++ side of it. This was a rocky day, but I suppose that's what I > get > >>>>> for failing to stage the change in smaller pieces. > >>>>> > >>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) > breakage, > >>>>> so if I've missed some problem in the sea of buildbot errors, please > >>>>> flag me down. > >>>>> > >>>>> I'll follow up soon with bitcode and assembly changes! > >>>> > >>>> Hi Duncan, > >>>> > >>>> I started getting random assertion failures in some tests yesterday, > and I think > >>>> it may be related to this change. Here is the stack trace: > >>>> > >>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 > >>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 > >>>> #2 0x00007ffff59edb96 in __assert_fail_base () from /lib64/libc.so.6 > >>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 > >>>> #4 0x00007ffff3a30e92 in > llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () > from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #5 0x00007ffff3a30fd3 in > llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/ > libLLVM-3.6svn.so > >>>> #6 0x00007ffff3a40eed in > llvm::MDNode::getTemporary(llvm::LLVMContext&, > llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/ > libLLVM-3.6svn.so > >>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #12 0x00007ffff3427174 in llvm::RemapInstruction(llvm::Instruction*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #13 0x00007ffff3755786 in (anonymous > namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from > /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #14 0x00007ffff375767f in llvm::Linker::linkInModule(llvm::Module*) > () from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, > llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #16 0x00007ffff6c9d8cf in > clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from > /opt/buildbot/lib/libOpenCL.so.1 > >>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, bool) > () from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () > from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from > /opt/buildbot/lib/libOpenCL.so.1 > >>>> #20 0x00007ffff6b5d179 in > clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from > /opt/buildbot/lib/libOpenCL.so.1 > >>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm > (llvm_ctx=..., > >>>> source="\n__kernel void test_fn(__local float *sSharedStorage, > __global float *srcValues, __global uint *offsets, __global float > *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 > );\n sSha"..., headers=..., name="input.cl", triple="r600--", > processor="verde", opts="", > >>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, > r_log=...) at llvm/invocation.cpp:255 > >>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm (source=..., > headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., opts=..., > r_log=...) > >>>> at llvm/invocation.cpp:710 > >>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, > devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) > >>>> at core/program.cpp:63 > >>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, > num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, > user_data=0x0) > >>>> at api/program.cpp:182 > >>>> > >>>> Does this look related? If so, let me know what other information > you need to > >>>> try to debug this issue. > >>> > >>> This could be related; I'm not sure. > >>> > >> > >> I'm pretty sure that this commit is the cause of the regression. > >> > >> r223801 works and r223810 does not, and I don't think any of the other > >> commits in that range could cause this. > >> > >>> It looks like a leak detection assertion, and I didn't need to change > >>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and > >>> `MDNode::deleteTemporary()` in the same ways it used to (and I didn't > >>> touch the implementation of those). > >>> > >>> Can you reproduce this with `llvm-link`? If so, that sounds like the > >>> best place to start. > >> > >> I can't reproduce this using llvm-link unfortunately. Any other ideas? > > > > (Continuing via email, since Tom stepped away IRC.) > > > > Tom, from the trace [1], it the problematic pointer (0x27e4c80) only > shows > > up once. > > > > [1]: http://people.freedesktop.org/~tstellar/md-crash.out > > > > That means that something *else* -- other than `MDNode::getTemporary()` > > -- must be adding that address to garbage and failing to remove it. > > > > I just dug into `LeakDetector::addGarbageObject()` and it stores *all* > > calls to `addGarbage()` in the same place. There are a fair number of > > these in the IR: > > > > $ git grep -e addGarbageObject -w -- lib/IR/ > > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Instruction.cpp: if (!P) LeakDetector::addGarbageObject(this); > > lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); > > > > I think the next step is to identify who called `addGarbageObject()` with > > the problematic address, and what the stack trace was. > > > > The weird thing is, I also noticed a semantic change I made here > > accidentally. `addGarbageObject()` has two overloads: `void*` and > > `const Value*`. `MDNode::getTemporary()` used to match the latter, but > > now it matches the former. > > > > The weird part: all the other calls to `addGarbageObject()` look like > they > > send in a `Value *`. > > > > Do you have any other calls to `addGarbageObject()`? Do they match > > `void *`? > > > > Also, what happens with the attached patch? (If this fixes your problem, > > I think it's just papering over something...) > > > > <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch> > > Zalman also had a reproduction, and he's been able to track it down to > an `addGarbageObject()` call from `MachineBasicBlock`. It looks like > the MBB gets deallocated but `removeGarbageObject()` isn't called yet. > > CC'ing him here so he can join the thread once he's gotten a little > further. > > BTW, if this is blocking anyone, I can commit the patch I attached to > my previous email (or you can apply it locally). I think it's probably > the right thing eventually -- it improves the output when there *is* an > issue -- but I haven't committed it yet since it'll cover up the > problem.-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141210/887df356/attachment.html>
On Wed, Dec 10, 2014 at 05:27:45PM -0800, Duncan P. N. Exon Smith wrote:> +zalman at google.com >Hi Duncan, This patch plus another small change fixes the assertion failure for me. With the patch alone, the void* overload of addGarbageObject() was being used by MDNode::getTemporary(), so I had to cast the object as an MDNode*: diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp index cd5edd2..916d216 100644 --- a/lib/IR/Metadata.cpp +++ b/lib/IR/Metadata.cpp @@ -564,7 +564,7 @@ MDNode *MDNode::getMDNode(LLVMContext &Context, ArrayRef<Metadata *> MDs, MDNodeFwdDecl *MDNode::getTemporary(LLVMContext &Context, ArrayRef<Metadata *> MDs) { MDNodeFwdDecl *N = new (MDs.size()) MDNodeFwdDecl(Context, MDs); - LeakDetector::addGarbageObject(N); + LeakDetector::addGarbageObject((MDNode*)N); return N; } I'm in favor of committing this. -Tom> > On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: > > > >> > >> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: > >> > >> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith wrote: > >>> > >>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: > >>>> > >>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith wrote: > >>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at least, the > >>>>> C++ side of it. This was a rocky day, but I suppose that's what I get > >>>>> for failing to stage the change in smaller pieces. > >>>>> > >>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) breakage, > >>>>> so if I've missed some problem in the sea of buildbot errors, please > >>>>> flag me down. > >>>>> > >>>>> I'll follow up soon with bitcode and assembly changes! > >>>> > >>>> Hi Duncan, > >>>> > >>>> I started getting random assertion failures in some tests yesterday, and I think > >>>> it may be related to this change. Here is the stack trace: > >>>> > >>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 > >>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 > >>>> #2 0x00007ffff59edb96 in __assert_fail_base () from /lib64/libc.so.6 > >>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 > >>>> #4 0x00007ffff3a30e92 in llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #5 0x00007ffff3a30fd3 in llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #6 0x00007ffff3a40eed in llvm::MDNode::getTemporary(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #12 0x00007ffff3427174 in llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #13 0x00007ffff3755786 in (anonymous namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #14 0x00007ffff375767f in llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () > >>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>> #16 0x00007ffff6c9d8cf in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, bool) () from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #20 0x00007ffff6b5d179 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from /opt/buildbot/lib/libOpenCL.so.1 > >>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm (llvm_ctx=..., > >>>> source="\n__kernel void test_fn(__local float *sSharedStorage, __global float *srcValues, __global uint *offsets, __global float *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 );\n sSha"..., headers=..., name="input.cl", triple="r600--", processor="verde", opts="", > >>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, r_log=...) at llvm/invocation.cpp:255 > >>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., opts=..., r_log=...) > >>>> at llvm/invocation.cpp:710 > >>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) > >>>> at core/program.cpp:63 > >>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, user_data=0x0) > >>>> at api/program.cpp:182 > >>>> > >>>> Does this look related? If so, let me know what other information you need to > >>>> try to debug this issue. > >>> > >>> This could be related; I'm not sure. > >>> > >> > >> I'm pretty sure that this commit is the cause of the regression. > >> > >> r223801 works and r223810 does not, and I don't think any of the other > >> commits in that range could cause this. > >> > >>> It looks like a leak detection assertion, and I didn't need to change > >>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and > >>> `MDNode::deleteTemporary()` in the same ways it used to (and I didn't > >>> touch the implementation of those). > >>> > >>> Can you reproduce this with `llvm-link`? If so, that sounds like the > >>> best place to start. > >> > >> I can't reproduce this using llvm-link unfortunately. Any other ideas? > > > > (Continuing via email, since Tom stepped away IRC.) > > > > Tom, from the trace [1], it the problematic pointer (0x27e4c80) only shows > > up once. > > > > [1]: http://people.freedesktop.org/~tstellar/md-crash.out > > > > That means that something *else* -- other than `MDNode::getTemporary()` > > -- must be adding that address to garbage and failing to remove it. > > > > I just dug into `LeakDetector::addGarbageObject()` and it stores *all* > > calls to `addGarbage()` in the same place. There are a fair number of > > these in the IR: > > > > $ git grep -e addGarbageObject -w -- lib/IR/ > > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > > lib/IR/Instruction.cpp: if (!P) LeakDetector::addGarbageObject(this); > > lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); > > > > I think the next step is to identify who called `addGarbageObject()` with > > the problematic address, and what the stack trace was. > > > > The weird thing is, I also noticed a semantic change I made here > > accidentally. `addGarbageObject()` has two overloads: `void*` and > > `const Value*`. `MDNode::getTemporary()` used to match the latter, but > > now it matches the former. > > > > The weird part: all the other calls to `addGarbageObject()` look like they > > send in a `Value *`. > > > > Do you have any other calls to `addGarbageObject()`? Do they match > > `void *`? > > > > Also, what happens with the attached patch? (If this fixes your problem, > > I think it's just papering over something...) > > > > <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch> > > Zalman also had a reproduction, and he's been able to track it down to > an `addGarbageObject()` call from `MachineBasicBlock`. It looks like > the MBB gets deallocated but `removeGarbageObject()` isn't called yet. > > CC'ing him here so he can join the thread once he's gotten a little > further. > > BTW, if this is blocking anyone, I can commit the patch I attached to > my previous email (or you can apply it locally). I think it's probably > the right thing eventually -- it improves the output when there *is* an > issue -- but I haven't committed it yet since it'll cover up the > problem.