Hi folks,

I'm investigating the LNT failures on our bot and found that I cannot reproduce the BenchmarkGame pass.

I've compiled it with GCC and Clang on both ARM and x86_64, with -O3 or with the arguments that the test-suite passes to it, and all I can get is the result below:

Found duplicate: 420094
Found duplicate: 341335
Found duplicate: 150397
Found duplicate: 157527
Found duplicate: 269724

But not the one that is in the reference output:

Found duplicate: 4
Found duplicate: 485365
Found duplicate: 417267
Found duplicate: 436989
Found duplicate: 60067

If I run LNT on my machine (x86_64), that test fails, and if I change the reference output to the one above, it passes.

On the ARM buildbot I'm also getting the same results, so I'm really surprised that the x86_64 LNT buildbot is passing. PowerPC is also failing, and I suspect for the same reason.

Is there any chance that the results are not being checked correctly? Any other ideas? I'm tempted to just change the reference output and see what happens with the other bots...

thanks,
--renato
> Is there any chance that the results are not being checked correctly? Any
> other ideas?

I think I vaguely convinced myself that the infrastructure didn't actually check whether tests it classified as benchmarks passed or failed. Not sure I had any good evidence for it beyond things like what you're seeing.

> I'm tempted to just change the reference output and see what
> happens with the other bots...

Could be worth a try. But if that thing really is generating random numbers, I'm not sure replacing one genuine cast-iron random number with another is the best long-term solution.

Tim.
On 12 March 2013 14:24, Tim Northover <t.p.northover at gmail.com> wrote:

> Could be worth a try. But if that thing really is generating random
> numbers I'm not sure replacing one genuine cast-iron random number
> with another is the best solution long-term.

The test initializes with srand(1), so in theory the output shouldn't differ between compilers, since Clang is using the same libraries. Also, if the "native" result is generated by GCC, then all problems go away, since the result will be target dependent (or rather, library dependent).

Is there a way to turn on dynamic generation of the native file instead of copying it from the reference_output?

cheers,
--renato
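[For context: the C standard only requires srand() to make the rand() sequence reproducible within a single C library; the algorithm itself is unspecified, so glibc and Darwin's libc can legitimately produce different sequences from the same seed. A minimal repro sketch (hypothetical file name, not part of the test-suite) that makes the library dependence visible:

--
/* rand_check.c -- print the first few values rand() yields after srand(1).
 * The C standard guarantees this sequence is reproducible within one C
 * library, but leaves the algorithm unspecified, so glibc and Darwin's
 * libc are free to print different numbers here. */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int i;
    srand(1);
    for (i = 0; i < 5; i++)
        printf("%d\n", rand());
    return 0;
}
--

Built with either GCC or Clang on one machine, the output is identical, since both sit on the same libc; across platforms it can differ. That is consistent with the ARM and x86_64 Linux bots above agreeing with each other while Darwin matches the reference output.]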
Hi Renato,

This is probably a platform-specific dependency, where the Linux output file differs from the Darwin one. I fixed up a lot of those in the past, but the random number issue blocks some others. For reference see LLVM r111522.

On my machine I get output that matches the reference output:

--
ddunbar at ozzy-2:BenchmarkGame (master)$ clang puzzle.c && ./a.out
Found duplicate: 4
Found duplicate: 485365
Found duplicate: 417267
Found duplicate: 436989
Found duplicate: 60067
--

The best solution is that which I mention in r111522 - build some extra runtime support code that each benchmark can use, and include a platform-stable RNG in it.

 - Daniel
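[r111522 doesn't prescribe a particular algorithm for that support code; purely as an illustration, a fixed linear congruential generator would make the sequence byte-identical on every host. A minimal sketch, with hypothetical names, using the classic Park-Miller "minimal standard" constants:

--
/* stable_rand.h -- sketch of a platform-stable replacement for
 * srand()/rand(). Any fixed algorithm works; the point is that every
 * C library on every target produces the same sequence for the same
 * seed. */
#include <stdint.h>

static uint32_t stable_state = 1;

static void stable_srand(uint32_t seed) {
    /* The Lehmer generator requires a nonzero state. */
    stable_state = seed ? seed : 1;
}

static uint32_t stable_rand(void) {
    /* state = state * 16807 mod (2^31 - 1), computed in 64 bits
     * to avoid overflow. */
    stable_state = (uint32_t)(((uint64_t)stable_state * 16807u) % 2147483647u);
    return stable_state;
}
--

A benchmark that called stable_srand(1)/stable_rand() instead of srand(1)/rand() could then share a single reference output across Linux, Darwin, ARM, and PowerPC.]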
On Tue, Mar 12, 2013 at 7:24 AM, Tim Northover <t.p.northover at gmail.com> wrote:

> > Is there any chance that the results are not being checked correctly? Any
> > other ideas?
>
> I think I vaguely convinced myself that the infrastructure didn't
> actually check whether tests it classified as benchmarks passed or
> failed. Not sure I had any good evidence for it other than things like
> you're seeing.

This is false. Every test gets compared against some kind of expected output file (which includes the exit code). The correct output is either:

 a. a reference output file, or
 b. the output from a natively run executable,

depending on some of the test parameters.

 - Daniel