thr3ads.net - llvm dev - [LLVMdev] Use perf tool for more accurate time measuring on Linux [May 2014]

Yi Kong

2014-May-16 16:37 UTC

[LLVMdev] Use perf tool for more accurate time measuring on Linux

Hi all,

The LLVM benchmarking system produces very noisy results, even on quiet
machines. One source of inaccuracy is the timing tool we are using.
Because it is a user-space tool, the OS can context switch it out, and
the resulting measurement becomes an outlier. perf stat instead uses the
kernel's SW_TASK_CLOCK counter to measure time, which is more accurate
and is not affected by context switches.

I've implemented a wrapper script over perf stat which mimics the
behaviour of timeit tool in test suite, so that nothing else needs to
be modified. The script is not yet feature complete as timeit, but
enough to run nightly tests.

I carried out experiments on several machines and saw different levels
of improvement. I am no longer seeing outlier results, and MAD is
considerably lower. The number of run-by-run changes reported over the
same revision shrank from around 10 to only 2-3. The MAD dropped from
around 0.01 to 0.003 on a quiet machine.
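
For readers unfamiliar with the statistic: assuming MAD here means the
median absolute deviation, i.e. the median of |x_i - median(x)| over the
samples of one benchmark, it can be computed with a small shell helper.
The sketch below is illustrative only. The function names `median` and
`mad` and the sample values are made up, and this is not the code the
benchmarking system itself uses.

```shell
#!/bin/bash
# Median of whitespace-separated numbers read from stdin.
median() {
    tr -s ' ' '\n' | sort -g | awk '
        NF { v[n++] = $1 }
        END {
            if (n == 0) exit 1
            if (n % 2) print v[(n - 1) / 2]
            else print (v[n / 2 - 1] + v[n / 2]) / 2
        }'
}

# Median absolute deviation of the arguments: median(|x_i - median(x)|).
mad() {
    local m x
    m=$(echo "$@" | median) || return 1
    for x in "$@"; do
        awk -v x="$x" -v m="$m" 'BEGIN { d = x - m; if (d < 0) d = -d; print d }'
    done | median
}

# Hypothetical samples: a single outlier (100) barely moves the MAD.
mad 1 2 3 4 100   # prints 1
```

Because both steps use medians, one outlier sample shifts the result far
less than it would shift a standard deviation.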

I've attached the patch and please experiment with it.

Cheers,
Yi Kong
-------------- next part --------------
Index: tools/timeit.sh
===================================================================
--- tools/timeit.sh	(revision 0)
+++ tools/timeit.sh	(revision 0)
@@ -0,0 +1,25 @@
+#! /bin/bash
+# A wrapper over perf to provide similar functionality to timeit.c
+
+REPORT=/dev/stderr
+OUTPUT=/dev/stdout
+
+# FIXME: Have similar behaviour as timeit.c
+while [[ $1 = -* ]]; do
+	if [ $1 = "--summary" ]; then
+		REPORT=$2
+	elif [ $1 = "--redirect-output" ]; then
+		OUTPUT=$2
+	fi
+	shift 2
+done
+
+perf stat -o stats $@ > $OUTPUT
+
+echo exit $? > $REPORT
+awk -F' ' '{if ($2 == "task-clock") print "user",$1/1000; else if($2 =="seconds") print "real",$1;}' stats >> $REPORT
+
+# FIXME: How to measure sys time?
+echo sys 0.0000 >> $REPORT
+
+rm stats

Property changes on: tools/timeit.sh
___________________________________________________________________
Added: svn:executable
   + *

Index: tools/Makefile
===================================================================
--- tools/Makefile	(revision 208774)
+++ tools/Makefile	(working copy)
@@ -8,8 +8,13 @@
 all:: timeit-target
 endif
 
+ifeq ($(TARGET_OS),Linux)
+timeit: timeit.sh
+	cp -f $< $@
+else
 timeit: timeit.c
 	$(ORIGINAL_CC) $(CFLAGS) -O3 -o $@ $<
+endif
 
 timeit-target: timeit.c
 	$(LD_ENV_OVERRIDES) $(LCC) -o $@ $< $(LDFLAGS) $(CFLAGS) $(TARGET_FLAGS) -O3
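
One caveat about the option loop in the script above: in bash, `shift 2`
fails without shifting anything when only one argument is left, so a
trailing option makes the loop spin forever, and the unquoted `$@` splits
arguments that contain spaces. A minimal sketch of a more defensive loop
follows, assuming only the two options handled by the patch matter.
`parse_args` and `CMD` are hypothetical names, not part of the patch.

```shell
#!/bin/bash
REPORT=/dev/stderr
OUTPUT=/dev/stdout
CMD=()

# Consume leading "--option value" pairs; keep the rest as the command.
parse_args() {
    while [[ $# -gt 0 && $1 == -* ]]; do
        case $1 in
            --summary)         REPORT=$2 ;;
            --redirect-output) OUTPUT=$2 ;;
            *) ;;  # like the patch, skip unknown options and their value
        esac
        # Stop with an error instead of looping when the value is missing.
        shift 2 || return 1
    done
    CMD=("$@")
}

parse_args --summary /tmp/r.txt --redirect-output /tmp/o.txt ./prog "two words"
echo "report=$REPORT output=$OUTPUT argc=${#CMD[@]}"
```

The command would then be run as `perf stat -o stats "${CMD[@]}" > "$OUTPUT"`,
which keeps each original argument intact.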

Yi Kong

2014-May-16 16:40 UTC


Also some screenshots. Both running on a quiet x86 machine.

Timeit: 10 samples per run (aggregated by minimum, 0.05 MWU significance level)
Perf stat: 8 samples per run (aggregated by minimum, 0.05 MWU significance level)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: perfit.png
Type: image/png
Size: 39723 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140516/6f1e46e5/attachment.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: timeit.png
Type: image/png
Size: 70089 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20140516/6f1e46e5/attachment-0001.png>

Hal Finkel

2014-May-16 17:08 UTC


That sounds great, thanks for working on this!

First, we'd definitely need more documentation on what perf is and how to
get it. The testing guide (and the LNT dependencies), at least, need to be
updated. FWIW, I don't have any machines on which this is already installed
(and so it certainly is not installed by default). On Ubuntu, it is claimed
that both linux-base and linux-tools-common provide a 'perf' utility, and on
rpm systems it looks like there is a perf package.
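
Since perf may be missing, a wrapper or build step could check for it
before selecting the perf-based timer. A minimal sketch, assuming a
fallback to the existing timeit binary is acceptable. `have_tool` and
`choose_timer` are hypothetical helpers, and the file names follow the
patch.

```shell
#!/bin/bash
# Succeed if the named program can be found on PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Use the perf wrapper only on Linux with perf installed, else timeit.
choose_timer() {
    if [ "$(uname -s)" = "Linux" ] && have_tool perf; then
        echo timeit.sh
    else
        echo timeit
    fi
}

echo "timer: $(choose_timer)"
```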

+# FIXME: How to measure sys time?
+echo sys 0.0000 >> $REPORT

Is this just the difference between the real time and the task time (or would
that be a reasonable approximation)?

Thanks again,
Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory

Yi Kong

2014-May-16 17:17 UTC


On 16 May 2014 18:08, Hal Finkel <hfinkel at anl.gov> wrote:
> +# FIXME: How to measure sys time?
> +echo sys 0.0000 >> $REPORT
>
> Is this just the difference between the real time and the task time (or
> would that be a reasonable approximation)?
Not in every case, as the sleep example below shows, but for most programs
that should be a reasonable approximation. Since the test suite does not
care about the sys time, we can leave it for now.
$ time -p sleep 5
real 5.00
user 0.00
sys 0.00
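
The sleep example can be reproduced in a script with bash's `time`
keyword and the `TIMEFORMAT` variable, which makes the gap between real
and user time visible with no sys time to account for it. This is a
sketch only. `measure` is a hypothetical helper, and it says nothing
about how perf stat or timeit.c report these values.

```shell
#!/bin/bash
# Print "real user sys" (seconds) for a command, using bash's time keyword.
measure() {
    local TIMEFORMAT='%R %U %S'    # elapsed, user CPU, system CPU
    { time "$@" >/dev/null 2>&1; } 2>&1
}

read -r real user sys < <(measure sleep 1)

# For a sleeping process real - user is about 1 s, yet sys stays near 0,
# so the difference is not a usable estimate of sys time here.
awk -v r="$real" -v u="$user" -v s="$sys" \
    'BEGIN { printf "real-user=%.3f sys=%.3f\n", r - u, s }'
```

For CPU-bound benchmarks that rarely block, the difference is much
closer to the sys time, which matches the "most programs" remark above.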

Tobias Grosser

2014-May-16 18:13 UTC


I think this is a great idea. I installed perf on all my performance 
trackers. As you already experimented with it on x86, I would be fine 
with just committing the patch. We can adjust it if the results are
not as expected.

Tobias

Hal Finkel

2014-May-16 18:19 UTC


----- Original Message -----
> From: "Tobias Grosser" <tobias at grosser.es>
> Sent: Friday, May 16, 2014 1:13:43 PM
> Subject: Re: Use perf tool for more accurate time measuring on Linux
>
> I think this is a great idea. I installed perf on all my performance
> trackers. As you already experimented with it on x86, I would be fine
> with just committing the patch. We can adjust it if the results are
> not as expected.
This is a general change for anything running the test suite on a Linux
target, so I'd wait until we have some testing on non-x86 platforms before
we enable it by default for all architectures. And we have other x86
buildbots besides yours, right?

 -Hal
-- 
Hal Finkel
Assistant Computational Scientist
Leadership Computing Facility
Argonne National Laboratory
