Matthias Braun via llvm-dev
2016-Mar-24 00:54 UTC
[llvm-dev] [test-suite] r261857 - [cmake] Add support for arbitrary metrics
Let's move this to llvm-dev. I should describe my goals/motivation for the work I have been putting into the llvm-test-suite lately. This is how I see the llvm-test-suite today:

- We provide a familiar cmake build system so people have a known environment in which to tweak compilation flags.
- Together with the benchmark executable we build a .test file that describes how to invoke the benchmark and can be run by the familiar llvm-lit tool:
  - Running a benchmark means executing its executable with a certain set of flags. Some of the SPEC benchmarks even require multiple invocations with different flags.
  - There is a set of steps to verify that the benchmark worked correctly. This usually means invoking "diff" or "fpcmp" to compare the results with a reference file.
- The lit benchmark driver modifies these benchmark descriptions to create a test plan. In the simplest case this means prefixing the executable with "timeit" and collecting the resulting number. But we are adding more features: collecting code size, running the benchmark on a remote device, prefixing instrumentation tools like the Linux "perf" tool, a utility task that collects and merges PGO data files after a benchmark run, and so on.

This allows us to add new instrumentation and metrics in the future without touching the benchmarks themselves. It works best for bigger benchmarks that run for a while (a few seconds minimum), and it works nicely with benchmark suites like SPEC, Geekbench, MediaBench, etc. Let's call this "macro benchmarking".

Having said all that, you make a very good case for what we should call "micro benchmarking". The google benchmark library does indeed look like a fantastic tool. We should definitely evaluate how we can integrate it into the llvm test-suite; we could think of it as a new flavor of benchmarks. We won't be able to redesign SPEC, but we can surely find things like TSVC which we could adapt to this.
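To make the .test flow described above concrete, here is a hypothetical example of such a script. The benchmark name, paths, tolerance, and the "Score:" output line are invented for illustration; the RUN/VERIFY keywords, the %o (output file) and %S (source dir) substitutions, and the fpcmp comparator come from the litsupport driver, and METRIC is the keyword added by r261857:

```shell
# example.test -- hypothetical; file names and output format are invented.
# RUN: how to invoke the benchmark (the lit driver may prefix this with
# timeit, perf, a remote-execution wrapper, etc.).
RUN: ./mybenchmark --input %S/data/input1.txt > %o
# VERIFY: check the run was correct by comparing against a reference file.
VERIFY: fpcmp -r 0.001 %o %S/data/reference_output
# METRIC: extract an extra named metric from the benchmark's output.
METRIC: score: grep "Score:" %o | awk '{print $2}'
```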
I have no immediate plans to put much more work into the test-suite, but I agree that micro benchmarking would be an exciting addition to our testing strategy. I'd be happy to review patches or talk through possible designs on IRC. - Matthias> https://github.com/google/benchmark <https://github.com/google/benchmark>> On Mar 23, 2016, at 3:45 PM, Hal Finkel <hfinkel at anl.gov> wrote: > > ----- Original Message ----- >> From: "Hal Finkel via llvm-commits" <llvm-commits at lists.llvm.org> >> To: "Matthias Braun" <mbraun at apple.com> >> Cc: "nd" <nd at arm.com>, "llvm-commits" <llvm-commits at lists.llvm.org> >> Sent: Wednesday, March 23, 2016 5:19:37 PM >> Subject: Re: [test-suite] r261857 - [cmake] Add support for arbitrary metrics >> >> ----- Original Message ----- >>> From: "Matthias Braun" <mbraun at apple.com> >>> To: "Hal Finkel" <hfinkel at anl.gov> >>> Cc: "James Molloy" <James.Molloy at arm.com>, "nd" <nd at arm.com>, >>> "llvm-commits" <llvm-commits at lists.llvm.org> >>> Sent: Friday, March 4, 2016 12:36:36 PM >>> Subject: Re: [test-suite] r261857 - [cmake] Add support for >>> arbitrary metrics >>> >>> A test can report "internal" metrics now. Though I don't think lnt >>> would split those into a notion of sub-tests I think. >>> It would be an interesting feature to add. Though if we have the >>> choice to modify a benchmark, we should still prefer smaller >>> independent ones IMO as that gives a better idea when some of the >>> other metrics change (compiletime, codesize, hopefully things like >>> memory usage or performance counters in the future). >> >> Unless the kernels are large, their code size within the context of a >> complete executable might be hard to track regardless (because by >> the time you add in the static libc startup code, ELF headers, etc. >> any change would be a smaller percentage of the total). 
Explicitly >> instrumenting the code to mark regions of interest is probably best >> (which is true for timing too), but that seems like a separate >> (although worthwhile) project. >> >> In any case, for TSVC, for example, the single test has 136 kernels; >> which I currently group into 18 binaries. I have a float and double >> version for each, so we have 36 total binaries. What you're >> suggesting would have us produce 272 separate executables, just for >> TSVC. Ideally, I'd like aligned and unaligned variants of each of >> these. I've not done that because I thought that 72 executables >> would be a bit much, but that's 544 executables if I generate one >> per kernel variant. >> >> The LCALS benchmark, which I'd really like to add sometime soon, has >> another ~100 kernels, which is ~200 to do both float and double >> (which we should do). >> >> What do you think is reasonable here? >> > > Also, we might want to consider updating some of these tests to use Google's benchmark library (https://github.com/google/benchmark <https://github.com/google/benchmark>). Have you looked at this? Aside from giving us a common output format, the real advantage of using a driver library like this is that it lets us dynamically pick the number of loop iterations based on per-iteration timing. This appeals to me because the number of iterations that is reasonable for some embedded device is normally quite different from what is reasonable for a server-class machine. Doing this, however, means that we definitely can't rely on overall executable timing. Thoughts? > > -Hal > >> Thanks again, >> Hal >> >>> >>> - Matthias >>> >>>> On Mar 4, 2016, at 8:22 AM, Hal Finkel <hfinkel at anl.gov> wrote: >>>> >>>> Hi James, >>>> >>>> If I'm reading this correctly, you can have multiple metrics per >>>> test. Is that correct? >>>> >>>> I'd really like to support tests with internal timers (i.e. 
a >>>> timer >>>> per kernel), so that we can have more fine-grained timing without >>>> splitting executables into multiple parts (e.g. as I had to do >>>> with TSVC). >>>> >>>> Thanks again, >>>> Hal >>>> >>>> ----- Original Message ----- >>>>> From: "James Molloy via llvm-commits" >>>>> <llvm-commits at lists.llvm.org> >>>>> To: "Matthias Braun" <mbraun at apple.com> >>>>> Cc: "nd" <nd at arm.com>, "llvm-commits" >>>>> <llvm-commits at lists.llvm.org> >>>>> Sent: Friday, February 26, 2016 3:06:01 AM >>>>> Subject: Re: [test-suite] r261857 - [cmake] Add support for >>>>> arbitrary metrics >>>>> >>>>> Hi Matthias, >>>>> >>>>> Thanks :) I’ve been working internally to move all our testing >>>>> from >>>>> ad-hoc driver scripts to CMake+LIT-based. Currently I have CMake >>>>> drivers for a very popular mobile benchmark (but the pre-release >>>>> version so pushing this upstream might be difficult), and EEMBC >>>>> (automotive, telecom, consumer). >>>>> >>>>> I really want all of these to live upstream, but I have to do a >>>>> bit >>>>> of legal checking before I can push them. In the meantime I’m >>>>> happy >>>>> to add an example to the repositories; alternatively I could >>>>> modify >>>>> the SPEC drivers to also compute SPECrate as a metric? >>>>> >>>>> Cheers, >>>>> >>>>> James >>>>> >>>>>> On 25 Feb 2016, at 21:33, Matthias Braun <mbraun at apple.com> >>>>>> wrote: >>>>>> >>>>>> Hi James, >>>>>> >>>>>> thanks for working on the test-suite. It's nice to see new >>>>>> capabilities added to the lit system. >>>>>> >>>>>> Are you planing to add tests that use this? If not we should >>>>>> really >>>>>> have at least an example/unit-test type thing in the >>>>>> repository. 
>>>>>> >>>>>> - Matthias >>>>>> >>>>>>> On Feb 25, 2016, at 3:06 AM, James Molloy via llvm-commits >>>>>>> <llvm-commits at lists.llvm.org> wrote: >>>>>>> >>>>>>> Author: jamesm >>>>>>> Date: Thu Feb 25 05:06:15 2016 >>>>>>> New Revision: 261857 >>>>>>> >>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=261857&view=rev >>>>>>> Log: >>>>>>> [cmake] Add support for arbitrary metrics >>>>>>> >>>>>>> This allows a .test script to specify a command to get a >>>>>>> metric >>>>>>> for the test. For example: >>>>>>> >>>>>>> METRIC: score: grep "Score:" %o | awk '{print $2}' >>>>>>> >>>>>>> Modified: >>>>>>> test-suite/trunk/cmake/modules/SingleMultiSource.cmake >>>>>>> test-suite/trunk/litsupport/test.py >>>>>>> test-suite/trunk/litsupport/testscript.py >>>>>>> >>>>>>> Modified: >>>>>>> test-suite/trunk/cmake/modules/SingleMultiSource.cmake >>>>>>> URL: >>>>>>> http://llvm.org/viewvc/llvm-project/test-suite/trunk/cmake/modules/SingleMultiSource.cmake?rev=261857&r1=261856&r2=261857&view=diff >>>>>>> =============================================================================>>>>>>> --- test-suite/trunk/cmake/modules/SingleMultiSource.cmake >>>>>>> (original) >>>>>>> +++ test-suite/trunk/cmake/modules/SingleMultiSource.cmake Thu >>>>>>> Feb >>>>>>> 25 05:06:15 2016 >>>>>>> @@ -223,3 +223,17 @@ macro(llvm_test_verify) >>>>>>> set(TESTSCRIPT "${TESTSCRIPT}VERIFY: ${JOINED_ARGUMENTS}\n") >>>>>>> endif() >>>>>>> endmacro() >>>>>>> + >>>>>>> +macro(llvm_test_metric) >>>>>>> + CMAKE_PARSE_ARGUMENTS(ARGS "" "RUN_TYPE;METRIC" "" ${ARGN}) >>>>>>> + if(NOT DEFINED TESTSCRIPT) >>>>>>> + set(TESTSCRIPT "" PARENT_SCOPE) >>>>>>> + endif() >>>>>>> + # ARGS_UNPARSED_ARGUMENTS is a semicolon-separated list. >>>>>>> Change >>>>>>> it into a >>>>>>> + # whitespace-separated string. 
>>>>>>> + string(REPLACE ";" " " JOINED_ARGUMENTS >>>>>>> "${ARGS_UNPARSED_ARGUMENTS}") >>>>>>> + if(NOT DEFINED ARGS_RUN_TYPE OR "${ARGS_RUN_TYPE}" STREQUAL >>>>>>> "${TEST_SUITE_RUN_TYPE}") >>>>>>> + set(TESTSCRIPT "${TESTSCRIPT}METRIC: ${ARGS_METRIC}: >>>>>>> ${JOINED_ARGUMENTS}\n") >>>>>>> + endif() >>>>>>> +endmacro() >>>>>>> + >>>>>>> \ No newline at end of file >>>>>>> >>>>>>> Modified: test-suite/trunk/litsupport/test.py >>>>>>> URL: >>>>>>> http://llvm.org/viewvc/llvm-project/test-suite/trunk/litsupport/test.py?rev=261857&r1=261856&r2=261857&view=diff >>>>>>> =============================================================================>>>>>>> --- test-suite/trunk/litsupport/test.py (original) >>>>>>> +++ test-suite/trunk/litsupport/test.py Thu Feb 25 05:06:15 >>>>>>> 2016 >>>>>>> @@ -56,7 +56,7 @@ class TestSuiteTest(FileBasedTest): >>>>>>> res = testscript.parse(test.getSourcePath()) >>>>>>> if litConfig.noExecute: >>>>>>> return lit.Test.Result(Test.PASS) >>>>>>> - runscript, verifyscript = res >>>>>>> + runscript, verifyscript, metricscripts = res >>>>>>> >>>>>>> # Apply the usual lit substitutions (%s, %S, %p, %T, >>>>>>> ...) 
>>>>>>> tmpDir, tmpBase = getTempPaths(test) >>>>>>> @@ -65,6 +65,8 @@ class TestSuiteTest(FileBasedTest): >>>>>>> substitutions += [('%o', outfile)] >>>>>>> runscript = applySubstitutions(runscript, substitutions) >>>>>>> verifyscript = applySubstitutions(verifyscript, >>>>>>> substitutions) >>>>>>> + metricscripts = {k: applySubstitutions(v, >>>>>>> substitutions) >>>>>>> + for k,v in metricscripts.items()} >>>>>>> context = TestContext(test, litConfig, runscript, >>>>>>> verifyscript, tmpDir, >>>>>>> tmpBase) >>>>>>> >>>>>>> @@ -80,6 +82,7 @@ class TestSuiteTest(FileBasedTest): >>>>>>> output = "" >>>>>>> n_runs = 1 >>>>>>> runtimes = [] >>>>>>> + metrics = {} >>>>>>> for n in range(n_runs): >>>>>>> res = runScript(context, runscript) >>>>>>> if isinstance(res, lit.Test.Result): >>>>>>> @@ -94,6 +97,15 @@ class TestSuiteTest(FileBasedTest): >>>>>>> output += "\n" + err >>>>>>> return lit.Test.Result(Test.FAIL, output) >>>>>>> >>>>>>> + # Execute metric extraction scripts. >>>>>>> + for metric, script in metricscripts.items(): >>>>>>> + res = runScript(context, script) >>>>>>> + if isinstance(res, lit.Test.Result): >>>>>>> + return res >>>>>>> + >>>>>>> + out, err, exitCode, timeoutInfo = res >>>>>>> + metrics.setdefault(metric, >>>>>>> list()).append(float(out)) >>>>>>> + >>>>>>> try: >>>>>>> runtime = runsafely.getTime(context) >>>>>>> runtimes.append(runtime) >>>>>>> @@ -128,6 +140,8 @@ class TestSuiteTest(FileBasedTest): >>>>>>> result = lit.Test.Result(Test.PASS, output) >>>>>>> if len(runtimes) > 0: >>>>>>> result.addMetric('exec_time', >>>>>>> lit.Test.toMetricValue(runtimes[0])) >>>>>>> + for metric, values in metrics.items(): >>>>>>> + result.addMetric(metric, >>>>>>> lit.Test.toMetricValue(values[0])) >>>>>>> compiletime.collect(context, result) >>>>>>> >>>>>>> return result >>>>>>> >>>>>>> Modified: test-suite/trunk/litsupport/testscript.py >>>>>>> URL: >>>>>>> 
http://llvm.org/viewvc/llvm-project/test-suite/trunk/litsupport/testscript.py?rev=261857&r1=261856&r2=261857&view=diff >>>>>>> =============================================================================>>>>>>> --- test-suite/trunk/litsupport/testscript.py (original) >>>>>>> +++ test-suite/trunk/litsupport/testscript.py Thu Feb 25 >>>>>>> 05:06:15 >>>>>>> 2016 >>>>>>> @@ -22,13 +22,18 @@ def parse(filename): >>>>>>> # Collect the test lines from the script. >>>>>>> runscript = [] >>>>>>> verifyscript = [] >>>>>>> - keywords = ['RUN:', 'VERIFY:'] >>>>>>> + metricscripts = {} >>>>>>> + keywords = ['RUN:', 'VERIFY:', 'METRIC:'] >>>>>>> for line_number, command_type, ln in \ >>>>>>> parseIntegratedTestScriptCommands(filename, >>>>>>> keywords): >>>>>>> if command_type == 'RUN': >>>>>>> _parseShellCommand(runscript, ln) >>>>>>> elif command_type == 'VERIFY': >>>>>>> _parseShellCommand(verifyscript, ln) >>>>>>> + elif command_type == 'METRIC': >>>>>>> + metric, ln = ln.split(':', 1) >>>>>>> + metricscript >>>>>>> metricscripts.setdefault(metric.strip(), list()) >>>>>>> + _parseShellCommand(metricscript, ln) >>>>>>> else: >>>>>>> raise ValueError("unknown script command type: %r" % >>>>>>> ( >>>>>>> command_type,)) >>>>>>> @@ -43,4 +48,4 @@ def parse(filename): >>>>>>> raise ValueError("Test has unterminated RUN/VERIFY >>>>>>> lines " + >>>>>>> "(ending with '\\')") >>>>>>> >>>>>>> - return runscript, verifyscript >>>>>>> + return runscript, verifyscript, metricscripts >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> llvm-commits mailing list >>>>>>> llvm-commits at lists.llvm.org >>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits >>>>> >>>>> _______________________________________________ >>>>> llvm-commits mailing list >>>>> llvm-commits at lists.llvm.org >>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits >>>> >>>> -- >>>> Hal Finkel >>>> Assistant Computational Scientist >>>> Leadership Computing Facility 
>>>> Argonne National Laboratory >>> >> >> -- >> Hal Finkel >> Assistant Computational Scientist >> Leadership Computing Facility >> Argonne National Laboratory >> _______________________________________________ >> llvm-commits mailing list >> llvm-commits at lists.llvm.org <mailto:llvm-commits at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits> >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory
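Hal's point quoted above, that a driver library can dynamically pick the number of loop iterations based on per-iteration timing, can be sketched briefly. This is a simplified model of the idea, not the google benchmark library's actual algorithm; the function name and the minimum-time threshold are made up:

```python
import time

def measure_kernel(kernel, min_time=0.5):
    """Keep doubling the iteration count until the kernel has run for at
    least `min_time` seconds total, then report the per-iteration time.

    A fast server and a slow embedded device both end up with a useful
    number of iterations this way, but it also means overall executable
    runtime is no longer a meaningful metric by itself.
    """
    iterations = 1
    while True:
        start = time.perf_counter()
        for _ in range(iterations):
            kernel()
        elapsed = time.perf_counter() - start
        if elapsed >= min_time:
            return elapsed / iterations
        iterations *= 2

# Example: time a trivial kernel; the value is machine-dependent.
per_iter = measure_kernel(lambda: sum(range(100)), min_time=0.05)
print(per_iter)
```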
Mehdi Amini via llvm-dev
2016-Mar-24 01:00 UTC
[llvm-dev] [test-suite] r261857 - [cmake] Add support for arbitrary metrics
> On Mar 23, 2016, at 5:54 PM, Matthias Braun via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> [earlier part of the quoted message snipped; it is reproduced in full above]
>
> Having said all that, you make a very good case for what we should call "micro benchmarking". The google benchmark library does indeed look like a fantastic tool. We should definitely evaluate how we can integrate it into the llvm test-suite; we could think of it as a new flavor of benchmarks. We won't be able to redesign SPEC, but we can surely find things like TSVC which we could adapt to this. I have no immediate plans to put much more work into the test-suite, but I agree that micro benchmarking would be an exciting addition to our testing strategy. I'd be happy to review patches or talk through possible designs on IRC.

Note: I suggested making the Halide test infrastructure compatible with the google benchmark framework during the initial review, because long term Halide can generate interesting micro-benchmarks.

--
Mehdi

> [remainder of the quoted thread, including the full r261857 patch, snipped; see the previous message]
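The METRIC handling in the patch quoted above boils down to splitting each "METRIC: <name>: <command>" line at its first two colons, so that a metric name maps to a list of shell commands. A minimal standalone sketch of that parsing step (with the litsupport helpers omitted, and the function name invented here) might look like:

```python
def parse_metric_lines(lines):
    """Map each metric name to its list of extraction commands,
    mirroring the METRIC logic in litsupport/testscript.py."""
    metricscripts = {}
    prefix = 'METRIC:'
    for ln in lines:
        if not ln.startswith(prefix):
            continue
        rest = ln[len(prefix):]
        # Only the first ':' separates the metric name from the command,
        # so colons inside the shell command (e.g. in "Score:") survive.
        metric, command = rest.split(':', 1)
        metricscripts.setdefault(metric.strip(), []).append(command.strip())
    return metricscripts

scripts = parse_metric_lines(
    ['METRIC: score: grep "Score:" %o | awk \'{print $2}\''])
print(scripts['score'][0])  # grep "Score:" %o | awk '{print $2}'
```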
Matthias Braun via llvm-dev
2016-Mar-24 05:08 UTC
[llvm-dev] [test-suite] r261857 - [cmake] Add support for arbitrary metrics
Okay, I was intrigued, tried it, and it turns out you can make a patch for basic google benchmark support in 40 minutes: http://reviews.llvm.org/D18428

So there is a base now if someone wants to write benchmarks for it in the future.

- Matthias

> On Mar 23, 2016, at 6:00 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
>> On Mar 23, 2016, at 5:54 PM, Matthias Braun via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>
>> Let's move this to llvm-dev. I should describe my goals/motivation for the work I have been putting into the llvm-testsuite lately. This is how I see the llvm-test-suite today:
>>
>> - We provide a familiar cmake build system so people have a known environment to tweak compilation flags.
>> - Together with the benchmark executable we build a .test file that describes how to invoke the benchmark and can be run by the familiar llvm-lit tool:
>>   - Running a benchmark means executing its executable with a certain set of flags. Some of the SPEC benchmarks even require multiple invocations with different flags.
>>   - There is a set of steps to verify that the benchmark worked correctly. This usually means invoking "diff" or "fpcmp" and comparing the results with a reference file.
>> - The lit benchmark driver modifies these benchmark descriptions to create a test plan. In the simplest case this means prefixing the executable with "timeit" and collecting the number. But we are adding more features like collecting code size, running the benchmark on a remote device, prefixing different instrumentation tools like the linux "perf" tool, a utility task that collects and merges PGO data files after a benchmark run, ...
>>
>> This allows us to add new instrumentation and metrics in the future without touching the benchmarks themselves. It works best for bigger benchmarks that run for a while (a few seconds minimum).
>> It works nicely with benchmark suites like SPEC, geekbench, mediabench, ... Let's call this "macro benchmarking".
>>
>> Having said all that, you make a very good case for what we should call "micro benchmarking". The google benchmarking library does indeed look like a fantastic tool. We should definitely evaluate how we can integrate this into the llvm test-suite; we can think of it as a new flavor of benchmarks. We won't be able to redesign SPEC, but we surely can find things like TSVC which we could adapt to this. I have no immediate plans to put much more work into the test-suite, but I agree that micro benchmarking would be an exciting addition to our testing strategy. I'd be happy to review patches or talk through possible designs on IRC.
>
> Note: I suggested making the Halide test infrastructure compatible with the google benchmark framework during the initial review, because long term Halide can generate interesting micro-benchmarks.
>
> --
> Mehdi
>
>> - Matthias
>>
>>> https://github.com/google/benchmark
>>
>>> On Mar 23, 2016, at 3:45 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>
>>> ----- Original Message -----
>>>> From: "Hal Finkel via llvm-commits" <llvm-commits at lists.llvm.org>
>>>> To: "Matthias Braun" <mbraun at apple.com>
>>>> Cc: "nd" <nd at arm.com>, "llvm-commits" <llvm-commits at lists.llvm.org>
>>>> Sent: Wednesday, March 23, 2016 5:19:37 PM
>>>> Subject: Re: [test-suite] r261857 - [cmake] Add support for arbitrary metrics
>>>>
>>>> ----- Original Message -----
>>>>> From: "Matthias Braun" <mbraun at apple.com>
>>>>> To: "Hal Finkel" <hfinkel at anl.gov>
>>>>> Cc: "James Molloy" <James.Molloy at arm.com>, "nd" <nd at arm.com>,
>>>>> "llvm-commits" <llvm-commits at lists.llvm.org>
>>>>> Sent: Friday, March 4, 2016 12:36:36 PM
>>>>> Subject: Re: [test-suite] r261857 - [cmake] Add support for arbitrary metrics
>>>>>
>>>>> A test can report "internal" metrics now. Though I don't think lnt would split those into a notion of sub-tests. It would be an interesting feature to add. Though if we have the choice to modify a benchmark, we should still prefer smaller, independent ones IMO, as that gives a better idea when some of the other metrics change (compiletime, codesize, hopefully things like memory usage or performance counters in the future).
>>>>
>>>> Unless the kernels are large, their code size within the context of a complete executable might be hard to track regardless (because by the time you add in the static libc startup code, ELF headers, etc., any change would be a smaller percentage of the total). Explicitly instrumenting the code to mark regions of interest is probably best (which is true for timing too), but that seems like a separate (although worthwhile) project.
>>>>
>>>> In any case, for TSVC, for example, the single test has 136 kernels, which I currently group into 18 binaries. I have a float and a double version for each, so we have 36 total binaries. What you're suggesting would have us produce 272 separate executables, just for TSVC. Ideally, I'd like aligned and unaligned variants of each of these. I've not done that because I thought that 72 executables would be a bit much, but that's 544 executables if I generate one per kernel variant.
>>>>
>>>> The LCALS benchmark, which I'd really like to add sometime soon, has another ~100 kernels, which is ~200 to do both float and double (which we should do).
>>>>
>>>> What do you think is reasonable here?
>>>>
>>>
>>> Also, we might want to consider updating some of these tests to use Google's benchmark library (https://github.com/google/benchmark). Have you looked at this? Aside from giving us a common output format, the real advantage of using a driver library like this is that it lets us dynamically pick the number of loop iterations based on per-iteration timing. This appeals to me because the number of iterations that is reasonable for some embedded device is normally quite different from what is reasonable for a server-class machine. Doing this, however, means that we definitely can't rely on overall executable timing. Thoughts?
>>>
>>> -Hal
>>>
>>>> Thanks again,
>>>> Hal
>>>>
>>>>> - Matthias
>>>>>
>>>>>> On Mar 4, 2016, at 8:22 AM, Hal Finkel <hfinkel at anl.gov> wrote:
>>>>>>
>>>>>> Hi James,
>>>>>>
>>>>>> If I'm reading this correctly, you can have multiple metrics per test. Is that correct?
>>>>>>
>>>>>> I'd really like to support tests with internal timers (i.e. a timer per kernel), so that we can have more fine-grained timing without splitting executables into multiple parts (e.g. as I had to do with TSVC).
>>>>>>
>>>>>> Thanks again,
>>>>>> Hal
>>>>>>
>>>>>> ----- Original Message -----
>>>>>>> From: "James Molloy via llvm-commits" <llvm-commits at lists.llvm.org>
>>>>>>> To: "Matthias Braun" <mbraun at apple.com>
>>>>>>> Cc: "nd" <nd at arm.com>, "llvm-commits" <llvm-commits at lists.llvm.org>
>>>>>>> Sent: Friday, February 26, 2016 3:06:01 AM
>>>>>>> Subject: Re: [test-suite] r261857 - [cmake] Add support for arbitrary metrics
>>>>>>>
>>>>>>> Hi Matthias,
>>>>>>>
>>>>>>> Thanks :) I’ve been working internally to move all our testing from ad-hoc driver scripts to CMake+LIT-based ones. Currently I have CMake drivers for a very popular mobile benchmark (but the pre-release version, so pushing this upstream might be difficult), and EEMBC (automotive, telecom, consumer).
>>>>>>>
>>>>>>> I really want all of these to live upstream, but I have to do a bit of legal checking before I can push them. In the meantime I’m happy to add an example to the repositories; alternatively I could modify the SPEC drivers to also compute SPECrate as a metric?
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> James
>>>>>>>
>>>>>>>> On 25 Feb 2016, at 21:33, Matthias Braun <mbraun at apple.com> wrote:
>>>>>>>>
>>>>>>>> Hi James,
>>>>>>>>
>>>>>>>> thanks for working on the test-suite. It's nice to see new capabilities added to the lit system.
>>>>>>>>
>>>>>>>> Are you planning to add tests that use this? If not, we should really have at least an example/unit-test type thing in the repository.
>>>>>>>>
>>>>>>>> - Matthias
>>>>>>>>
>>>>>>>>> On Feb 25, 2016, at 3:06 AM, James Molloy via llvm-commits <llvm-commits at lists.llvm.org> wrote:
>>>>>>>>>
>>>>>>>>> Author: jamesm
>>>>>>>>> Date: Thu Feb 25 05:06:15 2016
>>>>>>>>> New Revision: 261857
>>>>>>>>>
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project?rev=261857&view=rev
>>>>>>>>> Log:
>>>>>>>>> [cmake] Add support for arbitrary metrics
>>>>>>>>>
>>>>>>>>> This allows a .test script to specify a command to get a metric for the test. For example:
>>>>>>>>>
>>>>>>>>> METRIC: score: grep "Score:" %o | awk '{print $2}'
>>>>>>>>>
>>>>>>>>> Modified:
>>>>>>>>>     test-suite/trunk/cmake/modules/SingleMultiSource.cmake
>>>>>>>>>     test-suite/trunk/litsupport/test.py
>>>>>>>>>     test-suite/trunk/litsupport/testscript.py
>>>>>>>>>
>>>>>>>>> Modified: test-suite/trunk/cmake/modules/SingleMultiSource.cmake
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/cmake/modules/SingleMultiSource.cmake?rev=261857&r1=261856&r2=261857&view=diff
>>>>>>>>> ==============================================================================
>>>>>>>>> --- test-suite/trunk/cmake/modules/SingleMultiSource.cmake (original)
>>>>>>>>> +++ test-suite/trunk/cmake/modules/SingleMultiSource.cmake Thu Feb 25 05:06:15 2016
>>>>>>>>> @@ -223,3 +223,17 @@ macro(llvm_test_verify)
>>>>>>>>>      set(TESTSCRIPT "${TESTSCRIPT}VERIFY: ${JOINED_ARGUMENTS}\n")
>>>>>>>>>    endif()
>>>>>>>>> endmacro()
>>>>>>>>> +
>>>>>>>>> +macro(llvm_test_metric)
>>>>>>>>> +  CMAKE_PARSE_ARGUMENTS(ARGS "" "RUN_TYPE;METRIC" "" ${ARGN})
>>>>>>>>> +  if(NOT DEFINED TESTSCRIPT)
>>>>>>>>> +    set(TESTSCRIPT "" PARENT_SCOPE)
>>>>>>>>> +  endif()
>>>>>>>>> +  # ARGS_UNPARSED_ARGUMENTS is a semicolon-separated list. Change it into a
>>>>>>>>> +  # whitespace-separated string.
>>>>>>>>> +  string(REPLACE ";" " " JOINED_ARGUMENTS "${ARGS_UNPARSED_ARGUMENTS}")
>>>>>>>>> +  if(NOT DEFINED ARGS_RUN_TYPE OR "${ARGS_RUN_TYPE}" STREQUAL "${TEST_SUITE_RUN_TYPE}")
>>>>>>>>> +    set(TESTSCRIPT "${TESTSCRIPT}METRIC: ${ARGS_METRIC}: ${JOINED_ARGUMENTS}\n")
>>>>>>>>> +  endif()
>>>>>>>>> +endmacro()
>>>>>>>>> +
>>>>>>>>> \ No newline at end of file
>>>>>>>>>
>>>>>>>>> Modified: test-suite/trunk/litsupport/test.py
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/litsupport/test.py?rev=261857&r1=261856&r2=261857&view=diff
>>>>>>>>> ==============================================================================
>>>>>>>>> --- test-suite/trunk/litsupport/test.py (original)
>>>>>>>>> +++ test-suite/trunk/litsupport/test.py Thu Feb 25 05:06:15 2016
>>>>>>>>> @@ -56,7 +56,7 @@ class TestSuiteTest(FileBasedTest):
>>>>>>>>>          res = testscript.parse(test.getSourcePath())
>>>>>>>>>          if litConfig.noExecute:
>>>>>>>>>              return lit.Test.Result(Test.PASS)
>>>>>>>>> -        runscript, verifyscript = res
>>>>>>>>> +        runscript, verifyscript, metricscripts = res
>>>>>>>>>
>>>>>>>>>          # Apply the usual lit substitutions (%s, %S, %p, %T, ...)
>>>>>>>>>          tmpDir, tmpBase = getTempPaths(test)
>>>>>>>>> @@ -65,6 +65,8 @@ class TestSuiteTest(FileBasedTest):
>>>>>>>>>          substitutions += [('%o', outfile)]
>>>>>>>>>          runscript = applySubstitutions(runscript, substitutions)
>>>>>>>>>          verifyscript = applySubstitutions(verifyscript, substitutions)
>>>>>>>>> +        metricscripts = {k: applySubstitutions(v, substitutions)
>>>>>>>>> +                         for k,v in metricscripts.items()}
>>>>>>>>>          context = TestContext(test, litConfig, runscript, verifyscript, tmpDir, tmpBase)
>>>>>>>>>
>>>>>>>>> @@ -80,6 +82,7 @@ class TestSuiteTest(FileBasedTest):
>>>>>>>>>          output = ""
>>>>>>>>>          n_runs = 1
>>>>>>>>>          runtimes = []
>>>>>>>>> +        metrics = {}
>>>>>>>>>          for n in range(n_runs):
>>>>>>>>>              res = runScript(context, runscript)
>>>>>>>>>              if isinstance(res, lit.Test.Result):
>>>>>>>>> @@ -94,6 +97,15 @@ class TestSuiteTest(FileBasedTest):
>>>>>>>>>                  output += "\n" + err
>>>>>>>>>                  return lit.Test.Result(Test.FAIL, output)
>>>>>>>>>
>>>>>>>>> +            # Execute metric extraction scripts.
>>>>>>>>> +            for metric, script in metricscripts.items():
>>>>>>>>> +                res = runScript(context, script)
>>>>>>>>> +                if isinstance(res, lit.Test.Result):
>>>>>>>>> +                    return res
>>>>>>>>> +
>>>>>>>>> +                out, err, exitCode, timeoutInfo = res
>>>>>>>>> +                metrics.setdefault(metric, list()).append(float(out))
>>>>>>>>> +
>>>>>>>>>              try:
>>>>>>>>>                  runtime = runsafely.getTime(context)
>>>>>>>>>                  runtimes.append(runtime)
>>>>>>>>> @@ -128,6 +140,8 @@ class TestSuiteTest(FileBasedTest):
>>>>>>>>>          result = lit.Test.Result(Test.PASS, output)
>>>>>>>>>          if len(runtimes) > 0:
>>>>>>>>>              result.addMetric('exec_time', lit.Test.toMetricValue(runtimes[0]))
>>>>>>>>> +        for metric, values in metrics.items():
>>>>>>>>> +            result.addMetric(metric, lit.Test.toMetricValue(values[0]))
>>>>>>>>>          compiletime.collect(context, result)
>>>>>>>>>
>>>>>>>>>          return result
>>>>>>>>>
>>>>>>>>> Modified: test-suite/trunk/litsupport/testscript.py
>>>>>>>>> URL: http://llvm.org/viewvc/llvm-project/test-suite/trunk/litsupport/testscript.py?rev=261857&r1=261856&r2=261857&view=diff
>>>>>>>>> ==============================================================================
>>>>>>>>> --- test-suite/trunk/litsupport/testscript.py (original)
>>>>>>>>> +++ test-suite/trunk/litsupport/testscript.py Thu Feb 25 05:06:15 2016
>>>>>>>>> @@ -22,13 +22,18 @@ def parse(filename):
>>>>>>>>>      # Collect the test lines from the script.
>>>>>>>>>      runscript = []
>>>>>>>>>      verifyscript = []
>>>>>>>>> -    keywords = ['RUN:', 'VERIFY:']
>>>>>>>>> +    metricscripts = {}
>>>>>>>>> +    keywords = ['RUN:', 'VERIFY:', 'METRIC:']
>>>>>>>>>      for line_number, command_type, ln in \
>>>>>>>>>              parseIntegratedTestScriptCommands(filename, keywords):
>>>>>>>>>          if command_type == 'RUN':
>>>>>>>>>              _parseShellCommand(runscript, ln)
>>>>>>>>>          elif command_type == 'VERIFY':
>>>>>>>>>              _parseShellCommand(verifyscript, ln)
>>>>>>>>> +        elif command_type == 'METRIC':
>>>>>>>>> +            metric, ln = ln.split(':', 1)
>>>>>>>>> +            metricscript = metricscripts.setdefault(metric.strip(), list())
>>>>>>>>> +            _parseShellCommand(metricscript, ln)
>>>>>>>>>          else:
>>>>>>>>>              raise ValueError("unknown script command type: %r" % (
>>>>>>>>>                  command_type,))
>>>>>>>>> @@ -43,4 +48,4 @@ def parse(filename):
>>>>>>>>>          raise ValueError("Test has unterminated RUN/VERIFY lines " +
>>>>>>>>>                           "(ending with '\\')")
>>>>>>>>>
>>>>>>>>> -    return runscript, verifyscript
>>>>>>>>> +    return runscript, verifyscript, metricscripts
>>>>>>>>>
>>>>>>>>> _______________________________________________
>>>>>>>>> llvm-commits mailing list
>>>>>>>>> llvm-commits at lists.llvm.org
>>>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> llvm-commits mailing list
>>>>>>> llvm-commits at lists.llvm.org
>>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>>>
>>>>>> --
>>>>>> Hal Finkel
>>>>>> Assistant Computational Scientist
>>>>>> Leadership Computing Facility
>>>>>> Argonne National Laboratory
>>>>>
>>>>
>>>> --
>>>> Hal Finkel
>>>> Assistant Computational Scientist
>>>> Leadership Computing Facility
>>>> Argonne National Laboratory
>>>> _______________________________________________
>>>> llvm-commits mailing list
>>>>
>>>> llvm-commits at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-commits
>>>>
>>>
>>> --
>>> Hal Finkel
>>> Assistant Computational Scientist
>>> Leadership Computing Facility
>>> Argonne National Laboratory
>>
>> _______________________________________________
>> LLVM Developers mailing list
>> llvm-dev at lists.llvm.org
>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
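To make the .test format discussed in this thread concrete, here is a hypothetical benchmark description; the benchmark name, flags, and reference file are invented for illustration, but the RUN:/VERIFY: keywords and the METRIC: syntax follow the r261857 commit quoted above (%S and %o are lit substitutions for the source directory and the output file):

```
RUN: %S/mybench --size 100 > %o
VERIFY: fpcmp -r 0.001 %o %S/mybench.reference_output
METRIC: score: grep "Score:" %o | awk '{print $2}'
```

The lit driver can then rewrite the RUN line (for example, prefixing it with timeit or a remote-execution wrapper) without the benchmark itself changing.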
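Hal's point about adaptive iteration counts can be illustrated with a small sketch. This is plain Python, not the actual google/benchmark implementation, and the `benchmark` helper and its `min_time` parameter are invented for illustration: keep doubling the iteration count until the measured wall time crosses a minimum threshold, then report the per-iteration time.

```python
import time

def benchmark(kernel, min_time=0.5):
    """Run `kernel` with a growing iteration count until the total
    measured time exceeds `min_time` seconds, then report the iteration
    count used and the per-iteration time. This roughly mimics how an
    adaptive benchmark driver picks iteration counts per machine."""
    iterations = 1
    while True:
        start = time.perf_counter()
        for _ in range(iterations):
            kernel()
        elapsed = time.perf_counter() - start
        if elapsed >= min_time:
            return iterations, elapsed / iterations
        # Too fast to measure reliably on this machine; double and retry.
        iterations *= 2

if __name__ == "__main__":
    iters, per_iter = benchmark(lambda: sum(range(10000)), min_time=0.05)
    print("iterations:", iters, "sec/iter:", per_iter)
```

A fast machine ends up with a large iteration count and an embedded device with a small one, which is exactly why overall executable timing stops being comparable across machines.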
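The METRIC: parsing change in the diff above is easier to read outside diff form. The following is a simplified, self-contained approximation (the function name is invented; the real litsupport code also handles RUN:/VERIFY: keywords, backslash line continuations, and proper shell-command parsing):

```python
def parse_metric_lines(lines):
    """Collect 'METRIC: <name>: <command>' lines into a dict mapping each
    metric name to its list of shell commands, mirroring the metricscripts
    dict introduced in r261857 (simplified sketch)."""
    metricscripts = {}
    for ln in lines:
        if not ln.startswith('METRIC:'):
            continue
        # Everything after the keyword is '<metric name>: <shell command>';
        # split only at the first ':' so the command may contain colons.
        metric, command = ln[len('METRIC:'):].split(':', 1)
        metricscripts.setdefault(metric.strip(), []).append(command.strip())
    return metricscripts
```

For example, `parse_metric_lines(['METRIC: score: grep "Score:" %o'])` yields `{'score': ['grep "Score:" %o']}`; the driver later runs each command and records its stdout as the metric value.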