On 04/15/2013 11:01 AM, David Blaikie wrote:
> On Wed, Apr 3, 2013 at 12:12 AM, Török Edwin <edwin+ml-debian at etorok.net> wrote:
>> On 04/03/2013 01:20 AM, Renato Golin wrote:
>>> Hi Torok,
>>>
>>> I've used a hard-coded list on the input parameter and still got some output (slightly) scrambled between two different bots...
>>>
>>> I thought the dbdir could be the culprit, but it has only one file. Attached is the output of both.
>>
>> The version of ClamAV in the LLVM test-suite is quite old,
>
> In the interests of having relevant metrics - should we update to a
> more recent version?

The reason for adding ClamAV to the test-suite was more for testing correctness than performance (there were quite a few GCC bugs triggered by ClamAV's code, for example). To do a proper performance test for ClamAV you need to run it for at least half an hour on a large set of representative files, i.e. not something for the LLVM test-suite. Otherwise what might seem like a 20% improvement could very well be just a 0.2% improvement in practice.

You can try to update to a newer version. I think the script I used to convert ClamAV's source code to LLVM-test-suite-friendly source code is still in the repository, although it may need updating, as some new files have probably been added upstream. Unfortunately I don't have the time to do this update myself at the moment.

--Edwin
On Fri, Apr 19, 2013 at 1:13 PM, Renato Golin <renato.golin at linaro.org> wrote:
> On 19 April 2013 17:48, Török Edwin <edwin at etorok.net> wrote:
>> Otherwise what might seem like a 20% improvement
>> could very well be just a 0.2% improvement in practice.
>
> This is (maybe to a lesser extent) what happens with most of our
> benchmarks, and running them 3 times doesn't add much confidence but
> makes the run much slower. I end up treating the test-suite as a
> functionality and correctness test, rather than as useful benchmark data.
>
> I agree it would be great to have a decent benchmark infrastructure for
> LLVM, but I'm not sure the test-suite is the appropriate place. A
> different type of run that cranks the inputs up to 11 and lets the
> applications run for longer, done once a week or so, wouldn't be a bad
> idea, though.

A simple benchmark that we run "all the time" is how long it takes for us to compile ourselves. Do we track this?

-- Sean Silva
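As a rough illustration of what tracking that could look like (this is not existing test-suite or buildbot infrastructure; the build directory, build command, and log file below are all assumptions), a minimal sketch in Python that times a clean rebuild and appends the result to a CSV keyed by revision:

#!/usr/bin/env python3
# Hypothetical sketch: time a clean self-build and append the result to a
# CSV so compile time can be tracked across revisions. BUILD_DIR, BUILD_CMD,
# and LOG_FILE are assumptions, not part of any existing LLVM tooling.
import csv
import subprocess
import time
from datetime import datetime, timezone

BUILD_DIR = "build"            # assumed out-of-tree build directory
BUILD_CMD = ["make", "-j8"]    # assumed build invocation
LOG_FILE = "self-compile-times.csv"

def current_revision():
    # Assumes the source tree is a git checkout.
    out = subprocess.run(["git", "rev-parse", "--short", "HEAD"],
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def timed_build():
    # Clean first so every measurement covers a full rebuild.
    subprocess.run(["make", "clean"], cwd=BUILD_DIR, check=True)
    start = time.monotonic()
    subprocess.run(BUILD_CMD, cwd=BUILD_DIR, check=True)
    return time.monotonic() - start

if __name__ == "__main__":
    elapsed = timed_build()
    with open(LOG_FILE, "a", newline="") as f:
        csv.writer(f).writerow([datetime.now(timezone.utc).isoformat(),
                                current_revision(), f"{elapsed:.1f}"])
    print(f"build took {elapsed:.1f}s")

Run after each commit (or nightly), the CSV gives a crude but continuous record of self-compile time over revisions.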
| This is (maybe to a lesser extent) what happens with most of our
| benchmarks, and running them 3 times doesn't add much confidence but
| makes the run much slower. I end up treating the test-suite as a
| functionality and correctness test, rather than as useful benchmark data.
|
| I agree it would be great to have a decent benchmark infrastructure for
| LLVM, but I'm not sure the test-suite is the appropriate place. A
| different type of run that cranks the inputs up to 11 and lets the
| applications run for longer, done once a week or so, wouldn't be a bad
| idea, though.

Performance benchmarking is also coupled to the memory-space layout (both code placement and run-time data placement), which can make a comparison of a single "before" compile against a single "after" compile unrepresentative; see, e.g., http://plasma.cs.umass.edu/emery/stabilizer

Addressing this without going quite as far as the Stabilizer work does is turning out to be quite challenging.

Cheers,
Dave
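One rough way to get a feel for that coupling without pulling in Stabilizer itself is to time the same, unmodified binary under a range of environment sizes: padding the environment shifts where the stack starts, which is a known source of layout-induced variance. If the resulting spread is comparable to the speedup being claimed, a single before/after run isn't saying much. A minimal sketch, where the benchmark name ./bench is hypothetical:

#!/usr/bin/env python3
# Hypothetical sketch: estimate how sensitive a benchmark binary is to
# memory-layout perturbation by padding the environment (which shifts the
# initial stack placement) and timing the same binary under each layout.
# This only illustrates the coupling; it is not what Stabilizer does
# (Stabilizer re-randomizes code, stack, and heap layout at runtime).
import os
import statistics
import subprocess
import time

BENCH = ["./bench"]                  # assumed benchmark binary and arguments
PERTURBATIONS = range(0, 4096, 256)  # bytes of environment padding to try

def run_once(pad_bytes):
    env = dict(os.environ)
    env["LAYOUT_PAD"] = "x" * pad_bytes  # dummy variable; only its size matters
    start = time.monotonic()
    subprocess.run(BENCH, env=env, check=True,
                   stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.monotonic() - start

if __name__ == "__main__":
    times = [run_once(pad) for pad in PERTURBATIONS]
    mean = statistics.mean(times)
    spread = (max(times) - min(times)) / mean * 100
    print(f"mean {mean:.3f}s, layout-induced spread ~{spread:.1f}% "
          f"over {len(times)} layouts")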