On Thu, 09 Dec 2010 11:24:19 -0600
greened at obbligato.org (David A. Greene) wrote:

> greened at obbligato.org (David A. Greene) writes:
>
> > Often I run a few different builds in parallel, with different
> > obj/build directories. Is it possible that the test infrastructure
> > writes something to the source directories or some common temp
> > directory? That could confuse things when doing parallel
> > build/test and would explain all these failures. When I don't run
> > in parallel, things seem to work much better.
>
> There is definitely something to this. If I take a random failing
> testcase and run the test in isolation in the shell, it works. So
> what, if anything, does lit/FileCheck/etc. do that might run
> interference if there is another copy of lit/FileCheck/etc. running
> at the same time? I tried strace -etrace=file but nothing obvious
> pops out.

I don't see anything wrong with FileCheck either.

However, look here: that .bc file is in the *source* tree, not the obj tree:

not llvm-dis < /ptmp/dag/llvm-project.official/llvm/trunk/test/Bitcode/null-type.ll.bc > /dev/null |& grep "Invalid MODULE_CODE_FUNCTION record"

And I guess the .bc file is created by something during the test; multiple builds in parallel == everything breaks.
Daniel, is the .bc file created by 'lit'?

Best regards,
--Edwin

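One way to test the hypothesis that a test run writes into the source tree is to snapshot the tree before and after the run and compare. The following is a minimal sketch under that assumption; the function names are illustrative and are not part of lit or FileCheck, and it only detects changes visible as a different modification time or size.

```python
import os


def snapshot(root):
    """Map each file under root (by relative path) to (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished between the walk and the stat
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state


def changed_paths(before, after):
    """Return sorted relative paths that were added, removed, or modified."""
    every_path = before.keys() | after.keys()
    return sorted(p for p in every_path if before.get(p) != after.get(p))
```

Take one snapshot of the source directory, run the tests, take a second snapshot, and pass both to `changed_paths`; an empty list suggests the run did not touch the source tree, provided nothing else modified it in the meantime.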
Török Edwin <edwintorok at gmail.com> writes:

> I don't see anything wrong with FileCheck either.
>
> However, look here: that .bc file is in the *source* tree, not the obj tree:
> not llvm-dis < /ptmp/dag/llvm-project.official/llvm/trunk/test/Bitcode/null-type.ll.bc > /dev/null |& grep "Invalid MODULE_CODE_FUNCTION record"

I think that's there from checkout. You can see it on the web.

It seems I spoke too soon about build-level parallelization (running
multiple parallel builds in parallel). 4.1.2 works fine for
Debug+Asserts builds but fails horribly on Release+Asserts builds on
SLES 10.1 when run by itself.

I'm assuming there's nothing wrong with doing an LLVM parallel build (make -j).

Sigh. More compilers to try...

-Dave

On Thu, Dec 9, 2010 at 9:46 AM, Török Edwin <edwintorok at gmail.com> wrote:

> On Thu, 09 Dec 2010 11:24:19 -0600
> greened at obbligato.org (David A. Greene) wrote:
>
>> greened at obbligato.org (David A. Greene) writes:
>>
>> > Often I run a few different builds in parallel, with different
>> > obj/build directories. Is it possible that the test infrastructure
>> > writes something to the source directories or some common temp
>> > directory? That could confuse things when doing parallel
>> > build/test and would explain all these failures. When I don't run
>> > in parallel, things seem to work much better.
>>
>> There is definitely something to this. If I take a random failing
>> testcase and run the test in isolation in the shell, it works. So
>> what, if anything, does lit/FileCheck/etc. do that might run
>> interference if there is another copy of lit/FileCheck/etc. running
>> at the same time? I tried strace -etrace=file but nothing obvious
>> pops out.

Perhaps it's this? - http://llvm.org/bugs/show_bug.cgi?id=8199

-jason

> I don't see anything wrong with FileCheck either.
>
> However, look here: that .bc file is in the *source* tree, not the obj tree:
> not llvm-dis < /ptmp/dag/llvm-project.official/llvm/trunk/test/Bitcode/null-type.ll.bc > /dev/null |& grep "Invalid MODULE_CODE_FUNCTION record"
>
> And I guess the .bc file is created by something during the test; multiple builds in parallel == everything breaks.
> Daniel, is the .bc file created by 'lit'?
>
> Best regards,
> --Edwin
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

On Thu, 09 Dec 2010 11:57:04 -0600
greened at obbligato.org (David A. Greene) wrote:

> Török Edwin <edwintorok at gmail.com> writes:
>
> > I don't see anything wrong with FileCheck either.
> >
> > However, look here: that .bc file is in the *source* tree, not the
> > obj tree:
> > not llvm-dis < /ptmp/dag/llvm-project.official/llvm/trunk/test/Bitcode/null-type.ll.bc > /dev/null |& grep "Invalid MODULE_CODE_FUNCTION record"
>
> I think that's there from checkout. You can see it on the web.

Ah OK.

> It seems I spoke too soon about build-level parallelization (running
> multiple parallel builds in parallel). 4.1.2 works fine for
> Debug+Asserts builds but fails horribly on Release+Asserts builds on
> SLES 10.1 when run by itself.

Can you try running some of the failing lines manually and see if they still fail?
Also, are you doing these tests on LLVM 2.8, or trunk?

There's one caveat with running tests though: if you have opt, llc, etc. in your path, then 'make check' might use those instead of the ones you just built.
IIRC I've bumped into that when I had an old LLVM 2.6 in /usr/local, and 2.7 tests failed.

> I'm assuming there's nothing wrong
> with doing an LLVM parallel build (make -j).

make -j is fine, I do it all the time. I've never run make check in parallel though.

> Sigh. More compilers to try...

Try 4.3 if SLES has it.

Best regards,
--Edwin

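The PATH caveat can be checked before running 'make check'. The following is a minimal sketch, assuming the freshly built tools all live in one directory; the function name, the default tool list, and any build path passed in are illustrative and not taken from the thread. It only reports which copy an ordinary PATH lookup would find and does not reproduce how 'make check' itself resolves tools.

```python
import os
import shutil


def shadowed_tools(build_bin, tools=("opt", "llc", "llvm-dis"), path=None):
    """Return {tool: found_path} for tools that PATH resolves outside build_bin.

    build_bin is the directory holding the tools you just built (an assumption
    of this sketch). path=None means the current $PATH is searched.
    """
    build_bin = os.path.realpath(build_bin)
    shadowed = {}
    for tool in tools:
        found = shutil.which(tool, path=path)
        if found is None:
            continue  # not on PATH at all, so nothing can shadow the build
        # Compare directories after resolving symlinks in the directory part.
        if os.path.realpath(os.path.dirname(found)) != build_bin:
            shadowed[tool] = found
    return shadowed
```

Calling `shadowed_tools` with your own obj tree's bin directory returns an empty dict when the first match on PATH for every listed tool is the fresh build; any entry it returns names a tool that an older installation, such as one in /usr/local, would supply instead.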
Jason Kim <jasonwkim at google.com> writes:

>>> There is definitely something to this. If I take a random failing
>>> testcase and run the test in isolation in the shell, it works. So
>>> what, if anything, does lit/FileCheck/etc. do that might run
>>> interference if there is another copy of lit/FileCheck/etc. running
>>> at the same time? I tried strace -etrace=file but nothing obvious
>>> pops out.
>
> Perhaps it's this? - http://llvm.org/bugs/show_bug.cgi?id=8199

Good catch! I think that is probably it. I have old installations of
llvm-gcc/opt/llc/etc. that I use to tell the new build where llvm-gcc
is. It would explain why results for Debug/Release differ, because I do
separate installations for each build flavor and branch.

This may also explain why I've sometimes seen tests working in the past
where others see failures.

Wow, this probably explains a lot.

For now, I think it should work if I tweak the way I do the build: always
build LLVM first without pointing it to llvm-gcc, test it, then build
llvm-gcc and re-build LLVM. It will take much longer, though. :(

-Dave