I committed: r224058 = 966942da9e68b59c31ce770e7f94c55a63482c6b r224060 = da75f7277e3a129aed8ef8aa4e0d84de40b76fd4 r224061 = f88e4c8e9171045454b2c8e05054c2af8da3fe4f Let me know if somehow you're still hitting the problem. r224061 removes leak detection entirely from `MachineInstr`. There aren't any leaks to be had there, since they're allocated in a custom allocator. They're just dropped away once `MachineFunction` is deleted. @Zalman, thanks again for your help digging into this.> On 2014 Dec 11, at 09:05, Tom Stellard <tom at stellard.net> wrote: > > On Thu, Dec 11, 2014 at 11:52:34AM -0500, Tom Stellard wrote: >> On Wed, Dec 10, 2014 at 05:27:45PM -0800, Duncan P. N. Exon Smith wrote: >>> +zalman at google.com >>> >> >> Hi Duncan, >> >> This patch plus another small change fixes the assertion failure for >> me. With the patch alone, the void* overload of addGarbageObject() >> was being used by MDNode::getTemporary(), so I had to cast the object as >> an MDNode*: >> >> diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp >> index cd5edd2..916d216 100644 >> --- a/lib/IR/Metadata.cpp >> +++ b/lib/IR/Metadata.cpp >> @@ -564,7 +564,7 @@ MDNode *MDNode::getMDNode(LLVMContext &Context, >> ArrayRef<Metadata *> MDs, >> MDNodeFwdDecl *MDNode::getTemporary(LLVMContext &Context, >> ArrayRef<Metadata *> MDs) { >> MDNodeFwdDecl *N = new (MDs.size()) MDNodeFwdDecl(Context, MDs); >> - LeakDetector::addGarbageObject(N); >> + LeakDetector::addGarbageObject((MDNode*)N); >> return N; >> } > > Sorry, after more extensive testing, this doesn't work. It looks like > you need to add const MDNode * overloads to addGarbageObject() adding > them for addGarbageObjectImpl() doesn't seem to work: > > diff --git a/include/llvm/IR/LeakDetector.h > b/include/llvm/IR/LeakDetector.h > index e0b131e..b272eaf 100644 > --- a/include/llvm/IR/LeakDetector.h > +++ b/include/llvm/IR/LeakDetector.h > @@ -79,6 +79,17 @@ struct LeakDetector { > #endif > } > > + static void addGarbageObject(const MDNode *Object) { > +#ifndef NDEBUG > + addGarbageObjectImpl(Object); > +#endif > + } > + static void removeGarbageObject(const MDNode *Object) { > +#ifndef NDEBUG > + removeGarbageObjectImpl(Object); > +#endif > + } > + > private: > // If we are debugging, the actual implementations will be called... > static void addGarbageObjectImpl(const Value *Object); >> >> >> I'm in favor of committing this. >> >> -Tom >> >> >>>> On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith <dexonsmith at apple.com> wrote: >>>> >>>>> >>>>> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: >>>>> >>>>> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith wrote: >>>>>> >>>>>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: >>>>>>> >>>>>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith wrote: >>>>>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at least, the >>>>>>>> C++ side of it. This was a rocky day, but I suppose that's what I get >>>>>>>> for failing to stage the change in smaller pieces. >>>>>>>> >>>>>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) breakage, >>>>>>>> so if I've missed some problem in the sea of buildbot errors, please >>>>>>>> flag me down. >>>>>>>> >>>>>>>> I'll follow up soon with bitcode and assembly changes! >>>>>>> >>>>>>> Hi Duncan, >>>>>>> >>>>>>> I started getting random assertion failures in some tests yesterday, and I think >>>>>>> it may be related to this change. Here is the stack trace: >>>>>>> >>>>>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 >>>>>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 >>>>>>> #2 0x00007ffff59edb96 in __assert_fail_base () from /lib64/libc.so.6 >>>>>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 >>>>>>> #4 0x00007ffff3a30e92 in llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #5 0x00007ffff3a30fd3 in llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #6 0x00007ffff3a40eed in llvm::MDNode::getTemporary(llvm::LLVMContext&, llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #12 0x00007ffff3427174 in llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #13 0x00007ffff3755786 in (anonymous namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #14 0x00007ffff375767f in llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so >>>>>>> #16 0x00007ffff6c9d8cf in clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from /opt/buildbot/lib/libOpenCL.so.1 >>>>>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, bool) () from /opt/buildbot/lib/libOpenCL.so.1 >>>>>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () from /opt/buildbot/lib/libOpenCL.so.1 >>>>>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from /opt/buildbot/lib/libOpenCL.so.1 >>>>>>> #20 0x00007ffff6b5d179 in clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from /opt/buildbot/lib/libOpenCL.so.1 >>>>>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm (llvm_ctx=..., >>>>>>> source="\n__kernel void test_fn(__local float *sSharedStorage, __global float *srcValues, __global uint *offsets, __global float *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 );\n sSha"..., headers=..., name="input.cl", triple="r600--", processor="verde", opts="", >>>>>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, r_log=...) at llvm/invocation.cpp:255 >>>>>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., opts=..., r_log=...) >>>>>>> at llvm/invocation.cpp:710 >>>>>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) >>>>>>> at core/program.cpp:63 >>>>>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, user_data=0x0) >>>>>>> at api/program.cpp:182 >>>>>>> >>>>>>> Does this look related? If so, let me know what other information you need to >>>>>>> try to debug this issue. >>>>>> >>>>>> This could be related; I'm not sure. >>>>>> >>>>> >>>>> I'm pretty sure that this commit is the cause of the regression. >>>>> >>>>> r223801 works and r223810 does not, and I don't think any of the other >>>>> commits in that range could cause this. >>>>> >>>>>> It looks like a leak detection assertion, and I didn't need to change >>>>>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and >>>>>> `MDNode::deleteTemporary()` in the same ways it used to (and I didn't >>>>>> touch the implementation of those). >>>>>> >>>>>> Can you reproduce this with `llvm-link`? If so, that sounds like the >>>>>> best place to start. >>>>> >>>>> I can't reproduce this using llvm-link unfortunately. Any other ideas? >>>> >>>> (Continuing via email, since Tom stepped away IRC.) >>>> >>>> Tom, from the trace [1], it the problematic pointer (0x27e4c80) only shows >>>> up once. >>>> >>>> [1]: http://people.freedesktop.org/~tstellar/md-crash.out >>>> >>>> That means that something *else* -- other than `MDNode::getTemporary()` >>>> -- must be adding that address to garbage and failing to remove it. >>>> >>>> I just dug into `LeakDetector::addGarbageObject()` and it stores *all* >>>> calls to `addGarbage()` in the same place. There are a fair number of >>>> these in the IR: >>>> >>>> $ git grep -e addGarbageObject -w -- lib/IR/ >>>> lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); >>>> lib/IR/Instruction.cpp: if (!P) LeakDetector::addGarbageObject(this); >>>> lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); >>>> >>>> I think the next step is to identify who called `addGarbageObject()` with >>>> the problematic address, and what the stack trace was. >>>> >>>> The weird thing is, I also noticed a semantic change I made here >>>> accidentally. `addGarbageObject()` has two overloads: `void*` and >>>> `const Value*`. `MDNode::getTemporary()` used to match the latter, but >>>> now it matches the former. >>>> >>>> The weird part: all the other calls to `addGarbageObject()` look like they >>>> send in a `Value *`. >>>> >>>> Do you have any other calls to `addGarbageObject()`? Do they match >>>> `void *`? >>>> >>>> Also, what happens with the attached patch? (If this fixes your problem, >>>> I think it's just papering over something...) >>>> >>>> <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch> >>> >>> Zalman also had a reproduction, and he's been able to track it down to >>> an `addGarbageObject()` call from `MachineBasicBlock`. It looks like >>> the MBB gets deallocated but `removeGarbageObject()` isn't called yet. >>> >>> CC'ing him here so he can join the thread once he's gotten a little >>> further. >>> >>> BTW, if this is blocking anyone, I can commit the patch I attached to >>> my previous email (or you can apply it locally). I think it's probably >>> the right thing eventually -- it improves the output when there *is* an >>> issue -- but I haven't committed it yet since it'll cover up the >>> problem. >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
The assertion no longer fires in Halide with top-of-tree llvm. Thank you for the fix. -Z- On Thu, Dec 11, 2014 at 1:56 PM, Duncan P. N. Exon Smith < dexonsmith at apple.com> wrote:> I committed: > > r224058 = 966942da9e68b59c31ce770e7f94c55a63482c6b > r224060 = da75f7277e3a129aed8ef8aa4e0d84de40b76fd4 > r224061 = f88e4c8e9171045454b2c8e05054c2af8da3fe4f > > Let me know if somehow you're still hitting the problem. > > r224061 removes leak detection entirely from `MachineInstr`. There aren't > any leaks to be had there, since they're allocated in a custom allocator. > They're just dropped away once `MachineFunction` is deleted. > > @Zalman, thanks again for your help digging into this. > > > On 2014 Dec 11, at 09:05, Tom Stellard <tom at stellard.net> wrote: > > > > On Thu, Dec 11, 2014 at 11:52:34AM -0500, Tom Stellard wrote: > >> On Wed, Dec 10, 2014 at 05:27:45PM -0800, Duncan P. N. Exon Smith wrote: > >>> +zalman at google.com > >>> > >> > >> Hi Duncan, > >> > >> This patch plus another small change fixes the assertion failure for > >> me. With the patch alone, the void* overload of addGarbageObject() > >> was being used by MDNode::getTemporary(), so I had to cast the object as > >> an MDNode*: > >> > >> diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp > >> index cd5edd2..916d216 100644 > >> --- a/lib/IR/Metadata.cpp > >> +++ b/lib/IR/Metadata.cpp > >> @@ -564,7 +564,7 @@ MDNode *MDNode::getMDNode(LLVMContext &Context, > >> ArrayRef<Metadata *> MDs, > >> MDNodeFwdDecl *MDNode::getTemporary(LLVMContext &Context, > >> ArrayRef<Metadata *> MDs) { > >> MDNodeFwdDecl *N = new (MDs.size()) MDNodeFwdDecl(Context, MDs); > >> - LeakDetector::addGarbageObject(N); > >> + LeakDetector::addGarbageObject((MDNode*)N); > >> return N; > >> } > > > > Sorry, after more extensive testing, this doesn't work. It looks like > > you need to add const MDNode * overloads to addGarbageObject() adding > > them for addGarbageObjectImpl() doesn't seem to work: > > > > diff --git a/include/llvm/IR/LeakDetector.h > > b/include/llvm/IR/LeakDetector.h > > index e0b131e..b272eaf 100644 > > --- a/include/llvm/IR/LeakDetector.h > > +++ b/include/llvm/IR/LeakDetector.h > > @@ -79,6 +79,17 @@ struct LeakDetector { > > #endif > > } > > > > + static void addGarbageObject(const MDNode *Object) { > > +#ifndef NDEBUG > > + addGarbageObjectImpl(Object); > > +#endif > > + } > > + static void removeGarbageObject(const MDNode *Object) { > > +#ifndef NDEBUG > > + removeGarbageObjectImpl(Object); > > +#endif > > + } > > + > > private: > > // If we are debugging, the actual implementations will be called... > > static void addGarbageObjectImpl(const Value *Object); > >> > >> > >> I'm in favor of committing this. > >> > >> -Tom > >> > >> > >>>> On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith < > dexonsmith at apple.com> wrote: > >>>> > >>>>> > >>>>> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: > >>>>> > >>>>> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith > wrote: > >>>>>> > >>>>>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: > >>>>>>> > >>>>>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith > wrote: > >>>>>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at > least, the > >>>>>>>> C++ side of it. This was a rocky day, but I suppose that's what > I get > >>>>>>>> for failing to stage the change in smaller pieces. > >>>>>>>> > >>>>>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) > breakage, > >>>>>>>> so if I've missed some problem in the sea of buildbot errors, > please > >>>>>>>> flag me down. > >>>>>>>> > >>>>>>>> I'll follow up soon with bitcode and assembly changes! > >>>>>>> > >>>>>>> Hi Duncan, > >>>>>>> > >>>>>>> I started getting random assertion failures in some tests > yesterday, and I think > >>>>>>> it may be related to this change. Here is the stack trace: > >>>>>>> > >>>>>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 > >>>>>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 > >>>>>>> #2 0x00007ffff59edb96 in __assert_fail_base () from > /lib64/libc.so.6 > >>>>>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 > >>>>>>> #4 0x00007ffff3a30e92 in > llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () > from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #5 0x00007ffff3a30fd3 in > llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/ > libLLVM-3.6svn.so > >>>>>>> #6 0x00007ffff3a40eed in > llvm::MDNode::getTemporary(llvm::LLVMContext&, > llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/ > libLLVM-3.6svn.so > >>>>>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #12 0x00007ffff3427174 in > llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value > const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, > llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, > llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #13 0x00007ffff3755786 in (anonymous > namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from > /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #14 0x00007ffff375767f in > llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/ > libLLVM-3.6svn.so > >>>>>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, > llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > >>>>>>> #16 0x00007ffff6c9d8cf in > clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from > /opt/buildbot/lib/libOpenCL.so.1 > >>>>>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, > bool) () from /opt/buildbot/lib/libOpenCL.so.1 > >>>>>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () > from /opt/buildbot/lib/libOpenCL.so.1 > >>>>>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from > /opt/buildbot/lib/libOpenCL.so.1 > >>>>>>> #20 0x00007ffff6b5d179 in > clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from > /opt/buildbot/lib/libOpenCL.so.1 > >>>>>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm > (llvm_ctx=..., > >>>>>>> source="\n__kernel void test_fn(__local float *sSharedStorage, > __global float *srcValues, __global uint *offsets, __global float > *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 > );\n sSha"..., headers=..., name="input.cl", triple="r600--", > processor="verde", opts="", > >>>>>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, > r_log=...) at llvm/invocation.cpp:255 > >>>>>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm > (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., > opts=..., r_log=...) > >>>>>>> at llvm/invocation.cpp:710 > >>>>>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, > devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) > >>>>>>> at core/program.cpp:63 > >>>>>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, > num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, > user_data=0x0) > >>>>>>> at api/program.cpp:182 > >>>>>>> > >>>>>>> Does this look related? If so, let me know what other information > you need to > >>>>>>> try to debug this issue. > >>>>>> > >>>>>> This could be related; I'm not sure. > >>>>>> > >>>>> > >>>>> I'm pretty sure that this commit is the cause of the regression. > >>>>> > >>>>> r223801 works and r223810 does not, and I don't think any of the > other > >>>>> commits in that range could cause this. > >>>>> > >>>>>> It looks like a leak detection assertion, and I didn't need to > change > >>>>>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and > >>>>>> `MDNode::deleteTemporary()` in the same ways it used to (and I > didn't > >>>>>> touch the implementation of those). > >>>>>> > >>>>>> Can you reproduce this with `llvm-link`? If so, that sounds like > the > >>>>>> best place to start. > >>>>> > >>>>> I can't reproduce this using llvm-link unfortunately. Any other > ideas? > >>>> > >>>> (Continuing via email, since Tom stepped away IRC.) > >>>> > >>>> Tom, from the trace [1], it the problematic pointer (0x27e4c80) only > shows > >>>> up once. > >>>> > >>>> [1]: http://people.freedesktop.org/~tstellar/md-crash.out > >>>> > >>>> That means that something *else* -- other than > `MDNode::getTemporary()` > >>>> -- must be adding that address to garbage and failing to remove it. > >>>> > >>>> I just dug into `LeakDetector::addGarbageObject()` and it stores *all* > >>>> calls to `addGarbage()` in the same place. There are a fair number of > >>>> these in the IR: > >>>> > >>>> $ git grep -e addGarbageObject -w -- lib/IR/ > >>>> lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > >>>> lib/IR/Instruction.cpp: if (!P) > LeakDetector::addGarbageObject(this); > >>>> lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); > >>>> > >>>> I think the next step is to identify who called `addGarbageObject()` > with > >>>> the problematic address, and what the stack trace was. > >>>> > >>>> The weird thing is, I also noticed a semantic change I made here > >>>> accidentally. `addGarbageObject()` has two overloads: `void*` and > >>>> `const Value*`. `MDNode::getTemporary()` used to match the latter, > but > >>>> now it matches the former. > >>>> > >>>> The weird part: all the other calls to `addGarbageObject()` look like > they > >>>> send in a `Value *`. > >>>> > >>>> Do you have any other calls to `addGarbageObject()`? Do they match > >>>> `void *`? > >>>> > >>>> Also, what happens with the attached patch? (If this fixes your > problem, > >>>> I think it's just papering over something...) > >>>> > >>>> <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch> > >>> > >>> Zalman also had a reproduction, and he's been able to track it down to > >>> an `addGarbageObject()` call from `MachineBasicBlock`. It looks like > >>> the MBB gets deallocated but `removeGarbageObject()` isn't called yet. > >>> > >>> CC'ing him here so he can join the thread once he's gotten a little > >>> further. > >>> > >>> BTW, if this is blocking anyone, I can commit the patch I attached to > >>> my previous email (or you can apply it locally). I think it's probably > >>> the right thing eventually -- it improves the output when there *is* an > >>> issue -- but I haven't committed it yet since it'll cover up the > >>> problem. > >> _______________________________________________ > >> LLVM Developers mailing list > >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20141211/0b45819b/attachment.html>
On Thu, Dec 11, 2014 at 02:41:20PM -0800, Zalman Stern wrote:> The assertion no longer fires in Halide with top-of-tree llvm. Thank you > for the fix.+1. Thanks for the quick fix. -Tom> > -Z- > > > On Thu, Dec 11, 2014 at 1:56 PM, Duncan P. N. Exon Smith < > dexonsmith at apple.com> wrote: > > > I committed: > > > > r224058 = 966942da9e68b59c31ce770e7f94c55a63482c6b > > r224060 = da75f7277e3a129aed8ef8aa4e0d84de40b76fd4 > > r224061 = f88e4c8e9171045454b2c8e05054c2af8da3fe4f > > > > Let me know if somehow you're still hitting the problem. > > > > r224061 removes leak detection entirely from `MachineInstr`. There aren't > > any leaks to be had there, since they're allocated in a custom allocator. > > They're just dropped away once `MachineFunction` is deleted. > > > > @Zalman, thanks again for your help digging into this. > > > > > On 2014 Dec 11, at 09:05, Tom Stellard <tom at stellard.net> wrote: > > > > > > On Thu, Dec 11, 2014 at 11:52:34AM -0500, Tom Stellard wrote: > > >> On Wed, Dec 10, 2014 at 05:27:45PM -0800, Duncan P. N. Exon Smith wrote: > > >>> +zalman at google.com > > >>> > > >> > > >> Hi Duncan, > > >> > > >> This patch plus another small change fixes the assertion failure for > > >> me. With the patch alone, the void* overload of addGarbageObject() > > >> was being used by MDNode::getTemporary(), so I had to cast the object as > > >> an MDNode*: > > >> > > >> diff --git a/lib/IR/Metadata.cpp b/lib/IR/Metadata.cpp > > >> index cd5edd2..916d216 100644 > > >> --- a/lib/IR/Metadata.cpp > > >> +++ b/lib/IR/Metadata.cpp > > >> @@ -564,7 +564,7 @@ MDNode *MDNode::getMDNode(LLVMContext &Context, > > >> ArrayRef<Metadata *> MDs, > > >> MDNodeFwdDecl *MDNode::getTemporary(LLVMContext &Context, > > >> ArrayRef<Metadata *> MDs) { > > >> MDNodeFwdDecl *N = new (MDs.size()) MDNodeFwdDecl(Context, MDs); > > >> - LeakDetector::addGarbageObject(N); > > >> + LeakDetector::addGarbageObject((MDNode*)N); > > >> return N; > > >> } > > > > > > Sorry, after more extensive testing, this doesn't work. It looks like > > > you need to add const MDNode * overloads to addGarbageObject() adding > > > them for addGarbageObjectImpl() doesn't seem to work: > > > > > > diff --git a/include/llvm/IR/LeakDetector.h > > > b/include/llvm/IR/LeakDetector.h > > > index e0b131e..b272eaf 100644 > > > --- a/include/llvm/IR/LeakDetector.h > > > +++ b/include/llvm/IR/LeakDetector.h > > > @@ -79,6 +79,17 @@ struct LeakDetector { > > > #endif > > > } > > > > > > + static void addGarbageObject(const MDNode *Object) { > > > +#ifndef NDEBUG > > > + addGarbageObjectImpl(Object); > > > +#endif > > > + } > > > + static void removeGarbageObject(const MDNode *Object) { > > > +#ifndef NDEBUG > > > + removeGarbageObjectImpl(Object); > > > +#endif > > > + } > > > + > > > private: > > > // If we are debugging, the actual implementations will be called... > > > static void addGarbageObjectImpl(const Value *Object); > > >> > > >> > > >> I'm in favor of committing this. > > >> > > >> -Tom > > >> > > >> > > >>>> On 2014 Dec 10, at 15:57, Duncan P. N. Exon Smith < > > dexonsmith at apple.com> wrote: > > >>>> > > >>>>> > > >>>>> On 2014 Dec 10, at 14:08, Tom Stellard <tom at stellard.net> wrote: > > >>>>> > > >>>>> On Wed, Dec 10, 2014 at 11:21:08AM -0800, Duncan P. N. Exon Smith > > wrote: > > >>>>>> > > >>>>>>> On 2014 Dec 10, at 08:40, Tom Stellard <tom at stellard.net> wrote: > > >>>>>>> > > >>>>>>> On Tue, Dec 09, 2014 at 09:22:16PM -0800, Duncan P. N. Exon Smith > > wrote: > > >>>>>>>> The `Metadata`/`Value` split (PR21532) landed in r223802 -- at > > least, the > > >>>>>>>> C++ side of it. This was a rocky day, but I suppose that's what > > I get > > >>>>>>>> for failing to stage the change in smaller pieces. > > >>>>>>>> > > >>>>>>>> As of r223916 (lldb), I'm not aware of any remaining (in-tree) > > breakage, > > >>>>>>>> so if I've missed some problem in the sea of buildbot errors, > > please > > >>>>>>>> flag me down. > > >>>>>>>> > > >>>>>>>> I'll follow up soon with bitcode and assembly changes! > > >>>>>>> > > >>>>>>> Hi Duncan, > > >>>>>>> > > >>>>>>> I started getting random assertion failures in some tests > > yesterday, and I think > > >>>>>>> it may be related to this change. Here is the stack trace: > > >>>>>>> > > >>>>>>> #0 0x00007ffff59f4c39 in raise () from /lib64/libc.so.6 > > >>>>>>> #1 0x00007ffff59f6348 in abort () from /lib64/libc.so.6 > > >>>>>>> #2 0x00007ffff59edb96 in __assert_fail_base () from > > /lib64/libc.so.6 > > >>>>>>> #3 0x00007ffff59edc42 in __assert_fail () from /lib64/libc.so.6 > > >>>>>>> #4 0x00007ffff3a30e92 in > > llvm::LeakDetectorImpl<void>::addGarbage(void const*) [clone .part.19] () > > from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #5 0x00007ffff3a30fd3 in > > llvm::LeakDetector::addGarbageObjectImpl(void*) () from /opt/buildbot/lib/ > > libLLVM-3.6svn.so > > >>>>>>> #6 0x00007ffff3a40eed in > > llvm::MDNode::getTemporary(llvm::LLVMContext&, > > llvm::ArrayRef<llvm::Metadata*>) () from /opt/buildbot/lib/ > > libLLVM-3.6svn.so > > >>>>>>> #7 0x00007ffff3426b3f in MapValueImpl(llvm::Metadata const*, > > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #8 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, > > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #9 0x00007ffff3426bd6 in MapValueImpl(llvm::Metadata const*, > > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #10 0x00007ffff3426eed in llvm::MapValue(llvm::Metadata const*, > > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #11 0x00007ffff3426f39 in llvm::MapValue(llvm::MDNode const*, > > llvm::ValueMap<llvm::Value const*, llvm::WeakVH, > > llvm::ValueMapConfig<llvm::Value const*, llvm::sys::SmartMutex<false> > >&, > > llvm::RemapFlags, llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #12 0x00007ffff3427174 in > > llvm::RemapInstruction(llvm::Instruction*, llvm::ValueMap<llvm::Value > > const*, llvm::WeakVH, llvm::ValueMapConfig<llvm::Value const*, > > llvm::sys::SmartMutex<false> > >&, llvm::RemapFlags, > > llvm::ValueMapTypeRemapper*, llvm::ValueMaterializer*) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #13 0x00007ffff3755786 in (anonymous > > namespace)::ModuleLinker::linkGlobalValueBody(llvm::GlobalValue&) () from > > /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #14 0x00007ffff375767f in > > llvm::Linker::linkInModule(llvm::Module*) () from /opt/buildbot/lib/ > > libLLVM-3.6svn.so > > >>>>>>> #15 0x00007ffff3758cfb in llvm::Linker::LinkModules(llvm::Module*, > > llvm::Module*, std::function<void (llvm::DiagnosticInfo const&)>) () > > >>>>>>> from /opt/buildbot/lib/libLLVM-3.6svn.so > > >>>>>>> #16 0x00007ffff6c9d8cf in > > clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) () from > > /opt/buildbot/lib/libOpenCL.so.1 > > >>>>>>> #17 0x00007ffff6e61f23 in clang::ParseAST(clang::Sema&, bool, > > bool) () from /opt/buildbot/lib/libOpenCL.so.1 > > >>>>>>> #18 0x00007ffff6c9e6bb in clang::CodeGenAction::ExecuteAction() () > > from /opt/buildbot/lib/libOpenCL.so.1 > > >>>>>>> #19 0x00007ffff6b7ead6 in clang::FrontendAction::Execute() () from > > /opt/buildbot/lib/libOpenCL.so.1 > > >>>>>>> #20 0x00007ffff6b5d179 in > > clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) () from > > /opt/buildbot/lib/libOpenCL.so.1 > > >>>>>>> #21 0x00007ffff6b1282c in (anonymous namespace)::compile_llvm > > (llvm_ctx=..., > > >>>>>>> source="\n__kernel void test_fn(__local float *sSharedStorage, > > __global float *srcValues, __global uint *offsets, __global float > > *destBuffer, uint alignmentOffset )\n{\n int tid = get_global_id( 0 > > );\n sSha"..., headers=..., name="input.cl", triple="r600--", > > processor="verde", opts="", > > >>>>>>> address_spaces=..., optimization_level=@0x7fffffff21cc: 2, > > r_log=...) at llvm/invocation.cpp:255 > > >>>>>>> #22 0x00007ffff6b140c8 in clover::compile_program_llvm > > (source=..., headers=..., ir=ir at entry=PIPE_SHADER_IR_NATIVE, target=..., > > opts=..., r_log=...) > > >>>>>>> at llvm/invocation.cpp:710 > > >>>>>>> #23 0x00007ffff6b0a371 in clover::program::build (this=this at entry=0x23a0530, > > devs=..., opts=opts at entry=0x7ffff793dc0d "", headers=...) > > >>>>>>> at core/program.cpp:63 > > >>>>>>> #24 0x00007ffff6af31c4 in clBuildProgram (d_prog=0x23a0538, > > num_devs=0, d_devs=0x0, p_opts=<optimized out>, pfn_notify=0x0, > > user_data=0x0) > > >>>>>>> at api/program.cpp:182 > > >>>>>>> > > >>>>>>> Does this look related? If so, let me know what other information > > you need to > > >>>>>>> try to debug this issue. > > >>>>>> > > >>>>>> This could be related; I'm not sure. > > >>>>>> > > >>>>> > > >>>>> I'm pretty sure that this commit is the cause of the regression. > > >>>>> > > >>>>> r223801 works and r223810 does not, and I don't think any of the > > other > > >>>>> commits in that range could cause this. > > >>>>> > > >>>>>> It looks like a leak detection assertion, and I didn't need to > > change > > >>>>>> that logic at all. `ValueMap` calls `MDNode::getTemporary()` and > > >>>>>> `MDNode::deleteTemporary()` in the same ways it used to (and I > > didn't > > >>>>>> touch the implementation of those). > > >>>>>> > > >>>>>> Can you reproduce this with `llvm-link`? If so, that sounds like > > the > > >>>>>> best place to start. > > >>>>> > > >>>>> I can't reproduce this using llvm-link unfortunately. Any other > > ideas? > > >>>> > > >>>> (Continuing via email, since Tom stepped away IRC.) > > >>>> > > >>>> Tom, from the trace [1], it the problematic pointer (0x27e4c80) only > > shows > > >>>> up once. > > >>>> > > >>>> [1]: http://people.freedesktop.org/~tstellar/md-crash.out > > >>>> > > >>>> That means that something *else* -- other than > > `MDNode::getTemporary()` > > >>>> -- must be adding that address to garbage and failing to remove it. > > >>>> > > >>>> I just dug into `LeakDetector::addGarbageObject()` and it stores *all* > > >>>> calls to `addGarbage()` in the same place. There are a fair number of > > >>>> these in the IR: > > >>>> > > >>>> $ git grep -e addGarbageObject -w -- lib/IR/ > > >>>> lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/BasicBlock.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Function.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Globals.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Instruction.cpp: LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Instruction.cpp: if (!P) > > LeakDetector::addGarbageObject(this); > > >>>> lib/IR/Metadata.cpp: LeakDetector::addGarbageObject(N); > > >>>> > > >>>> I think the next step is to identify who called `addGarbageObject()` > > with > > >>>> the problematic address, and what the stack trace was. > > >>>> > > >>>> The weird thing is, I also noticed a semantic change I made here > > >>>> accidentally. `addGarbageObject()` has two overloads: `void*` and > > >>>> `const Value*`. `MDNode::getTemporary()` used to match the latter, > > but > > >>>> now it matches the former. > > >>>> > > >>>> The weird part: all the other calls to `addGarbageObject()` look like > > they > > >>>> send in a `Value *`. > > >>>> > > >>>> Do you have any other calls to `addGarbageObject()`? Do they match > > >>>> `void *`? > > >>>> > > >>>> Also, what happens with the attached patch? (If this fixes your > > problem, > > >>>> I think it's just papering over something...) > > >>>> > > >>>> <0001-IR-Detect-Metadata-leaks-separately-from-generic-obj.patch> > > >>> > > >>> Zalman also had a reproduction, and he's been able to track it down to > > >>> an `addGarbageObject()` call from `MachineBasicBlock`. It looks like > > >>> the MBB gets deallocated but `removeGarbageObject()` isn't called yet. > > >>> > > >>> CC'ing him here so he can join the thread once he's gotten a little > > >>> further. > > >>> > > >>> BTW, if this is blocking anyone, I can commit the patch I attached to > > >>> my previous email (or you can apply it locally). I think it's probably > > >>> the right thing eventually -- it improves the output when there *is* an > > >>> issue -- but I haven't committed it yet since it'll cover up the > > >>> problem. > > >> _______________________________________________ > > >> LLVM Developers mailing list > > >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > > >