thr3ads.net - llvm dev - [llvm-dev] RFC: Add bitcode tests to test-suite [Mar 2016]

If this information is useful, please help other people find it:
Share via:

Mehdi Amini via llvm-dev

2016-Mar-01 01:06 UTC

[llvm-dev] RFC: Add bitcode tests to test-suite

Sent from my iPhone
> On Feb 29, 2016, at 3:39 PM, Alina Sbirlea <alina.sbirlea at
gmail.com> wrote:
> 
> 
> 
>> On Mon, Feb 29, 2016 at 2:06 PM, Mehdi Amini <mehdi.amini at
apple.com> wrote:
>> 
>>> On Feb 29, 2016, at 1:50 PM, Alina Sbirlea <alina.sbirlea at
gmail.com> wrote:
>>> 
>>> 
>>> 
>>> On Mon, Feb 29, 2016 at 12:18 PM, Mehdi Amini <mehdi.amini at
apple.com> wrote:
>>>> 
>>>>>> On Feb 29, 2016, at 11:40 AM, Mehdi Amini via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
>>>>>> 
>>>>>> 
>>>>>> On Feb 29, 2016, at 11:16 AM, Alina Sbirlea via
llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>>>> 
>>>>>> All,
>>>>>> 
>>>>>> To get the discussion going in a focused manner, here
is an initial patch with a running test. The test is from the Halide suite and
is checking the correctness of several simd operations.
>>>>>> (Notes: the patch is large due to the number of
operations being tested;
>>>>>> I expect a lot of changes before actually landing it,
this is simply to continue the discussion using a concrete example.)
>>>>>> http://reviews.llvm.org/D17726
>>>>> 
>>>>> I can't figure how to download the patch *with the
bitcode files* from Phabricator. Can you push this on github (or somewhere
else)? (or if I missed how to proceed...).
>>>> 
>>>> I was able to figure how get them "one by one", it
would still be more convenient to have an archive or a repo to clone somewhere.
>>>> 
>>>>>> A few questions/todos to start the discussion:
>>>>>> 1. What is a good location for these tests? They are in
a separate Bitcode directory atm, but using the llvm_multisource. This may
change to more closely model the approach for external tests (see next item).
>>>> 
>>>> A good location would be their own external repository IMO :)
>>>> 
>>>>>> 2. There is a single .cpp file testing all operations
provided by individual bitcode files. I expect this to change. Instead of using
llvm_multisource to have the same test run with specific arguments, each run
testing a single operation.
>>>>>> 3. The building approach I took is to first link all
bitcode files into a single one, then obtain the assembly for it, which cmake
knows to take as an input source.
>>>> 
>>>> Yeah, so I'd rather have a split-build model, with a split
execution model. Having a gigantic bitcode file to debug an issue is not
friendly.
>>>> I'd expect to have a .cpp file that contains the main and
the logic to run test, and then every test that is linked-in to be executed, a
bit like gtests is doing (there are multiple registering mechanisms that would
avoid to declare explicitly a test in the header).
>>>> -> filters.h and filter_headers.h should just go away.
>>> 
>>> I agree, this is related to point 2. The plan here is to update the
current test .cpp file to test each operation individually. In this model it
will be enough to link with a single bitcode file per test.
>>>  
>>>> 
>>>> Also on the test in general: we should have an idea for each
test what it is doing and how.
>>>> I was expecting your tests to be on the pattern of having an
implementation in C++ and an implementation in Halide bitcode of a filters (or
whatever) and run both on random data and verifies that the result is matching.
>>>> Unfortunately from what I can see you are feeding the tests
with random data, and the tests are "blackboxes" that set an error
flag if they detect an error.
>>>> This is not super robust: the compiler can mess with the error
checking and eliminate it for instance, making any error undetected.
>>> 
>>> The Halide bitcode filters compare the result of vectorized
operations vs scalar runs of the same code. The error code against which we
compare the output will be set to loose tolerance - it is currently 0. We're
interested in codegen bugs that return the wrong value entirely, not accuracy
differences (especially for floating point tests).
>>> With the new error threshold, the data fed into may be random or
read from provided input files, I can do either.
>>> The filters will still look somewhat like blackboxes, though the
name of the filter says what operation it's  being tested and the
disassembled bitcode files are reasonably readable.
>>> Using your suggestion, the driver .cpp file will test one operation
at a time (argvs set accordingly) and return right away once an error is found.
Sound about right?
>> 
>> All of this is great. 
>> The part that is not clear to me is why isn't it to have (what does
it buy us over, or why is it better for us compared to) a possible a C/C++
reference implementation of the filter, and hoisting (and refactor) all the
logic to feed the tests and validate the output *out* of the filters. A filter
would be just the mathematical function performed and nothing more (separation
of concerns, more robust framework, easier debugging when things-go-wrong,
etc.).
> 
> I believe the answer is that Halide generates vectorized code in a way that
is not generated by llvm when starting from C/C++.
I don't really see how *this* addresses my point.  This is justifying why
your bitcode is interesting and why we are having this conversation at all :)
It does not say why we can't have a scalar naive C/C++ impl along with the
bitcode for filter.

> Having a C/C++ scalar reference would involve quite a bit of effort for all
tests. The primary reason Halide is being used is that you don't need to
write a lot of C/C++ code to get different optimizations for the same code (e.g.
vectorized vs scalar is a one line difference).
Yes, this is what is nice with Halide: "write once, codegen multiple
variant". But it does not mean you can't write a c++ reference for
every Halide filter  (not for every codegen variant of a filter!)

It's been 2 years since my last experiments with Halide, but my memories
were that there was a C backend?
I had in mind for each test to have (possibly in a separate directory for each
test):
 - the halide source for the filter.
 - the c/c++ (maybe generated?) for the filter.
 - the bitcode generated for the filter (potentially multiple variant depending
on the CPU support and/or the schedule).

Then some common code/infrastructure to interact with the individual filters,
loading them, executing the variants for a filter, and checking results.

If the reference c/c++ can't be generated by halide (or obtained somehow),
and we can't do better than the current tests infrastructure you have, then
I'm worried about the cost/benefit for this test-suite.

> So we can get really fast test coverage for possible codegen bugs by
comparing that different layout optimizations in Halide give the same result.
> I think having each filter tested separately should give a good separation
of concerns and easy debugging for each particular test.
This is great for halide validation, we are all agreeing with this I think. The
question is where is the tradeoff for the LLVM project. I'm trying to make
sure that the extra coverage doesn't come with a burden to debug and triage
issues when something will break: i.e. the tests need to be very friendly to
interact with. This is the motivation for my comments so far.
Other people in the community may have a different opinion/appreciation of the
situation, this just represents my current thoughts.

Hope it makes sense.

-- 
Mehdi

> 
>  
>> 
>>>> Also, just looking quickly at one IR I'm surprised by
things like:
>>>> 
>>>> "assert succeeded165":                            ;
preds = %"assert succeeded146"
>>>>   %buf_host181 = getelementptr inbounds %struct.buffer_t,
%struct.buffer_t* %error_op_pcmpeqq_272.buffer, i64 0, i32 1
>>>>   %23 = bitcast i8** %buf_host181 to double**
>>>>   %error_op_pcmpeqq_272.host226227232 = load double*, double**
%23, align 8
>>>>   %24 = icmp eq %struct.buffer_t* %error_op_pcmpeqq_272.buffer,
null
>>>>   br i1 %24, label %"assert failed183", label
%"assert succeeded184", !prof !4
>>>> 
>>>> Here you have as check for nullptr at %24, but you already
loaded %error_op_pcmpeqq_272.host226227232 from this pointer just before!
>>> 
>>> It's checking that the host value loaded from buffer_t is not
null. I don't see what's wrong with this. What am I missing?
>> 
>> I may be misreading it, my impression when skimming through the code
was that it seems equivalent to:
>> 
>> foo(buffer_t *out) {
>>   auto value = out->host;
>>   if (!out) {
>>     error("nullptr");
>>   }
>> }
>> 
>> 
>> In case I haven't been clear: I think this work is valuable for the
project, and thank you for putting some effort into it :)
>> 
>> -- 
>> Mehdi
>> 
>> 
>> 
>> 
>> 
>> 
>> 
>>> 
>>>> 
>>>> 
>>>>>> A separate discussion is on reading metadata (mcpu and
mattr) in llc. I added a script to work around that for now.
>>>> 
>>>> The generic way of doing it in llvm is (I think) to use
function attributes:
>>>> 
>>>> attributes #0 = { "target-cpu"="x86-64"
"target-features"="+avx2" }
>>>> 
>>>> You shouldn't need it on the command line I think?
>>> 
>>> Yes, I believe so too. Currently these are set in mcpu and mattr by
Halide and not read in by llc, hence the need for feeding them as parameters.
It's a separate issue that we'll need to go into in depth, but I
don't want it to interfere with getting feedback on how to best publish
these tests.
>>>  
>>>> 
>>>> -- 
>>>> Mehdi
>>>> 
>>>> 
>>>> 
>>>>>> 
>>>>>> Looking forward to your feedback!
>>>>>> 
>>>>>> Thanks,
>>>>>> Alina
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Fri, Feb 19, 2016 at 6:50 AM, Kristof Beyls
<kristof.beyls at arm.com> wrote:
>>>>>>> 
>>>>>>> 
>>>>>>>> On 18/02/2016 19:12, Alina Sbirlea via llvm-dev
wrote:
>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I have more questions for Alina. What kind
of tests do you have:
>>>>>>>>> 
>>>>>>>>> - "the compiler takes the bitcode and
generates code without crashing"
>>>>>>>>> - "the compiled test runs without
crashing"
>>>>>>>>> - "the compiled test will produce an
output that be checked against a reference"
>>>>>>>>> - "the compiled test is meaningful as
a benchmarks"
>>>>>>>> 
>>>>>>>> We have all 4 kinds of tests in Halide. The
bitcode files for the first category is already available and I'm working on
building the ones for the next 3. We'd like to include all incrementally.
>>>>>>> It seems to me that the first category ("the
compiler takes the bitcode and generates code without crashing") are tests
that should be added to the "make check-all" tests in the LLVM
subproject, rather than the test-suite subproject?
>>>>>>> Or if these tests currently don't crash the
compiler anymore, the bugs must have been fixed, and there should already be
equivalent tests?
>>>>>> 
>>>>>> _______________________________________________
>>>>>> LLVM Developers mailing list
>>>>>> llvm-dev at lists.llvm.org
>>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
>>>>> 
>>>>> _______________________________________________
>>>>> LLVM Developers mailing list
>>>>> llvm-dev at lists.llvm.org
>>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> -------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160229/7a0a890c/attachment.html>

Hal Finkel via llvm-dev

2016-Mar-01 06:50 UTC

head link

[llvm-dev] RFC: Add bitcode tests to test-suite

----- Original Message -----
> From: "Mehdi Amini via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "Alina Sbirlea" <alina.sbirlea at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, February 29, 2016 7:06:51 PM
> Subject: Re: [llvm-dev] RFC: Add bitcode tests to test-suite
> Sent from my iPhone
> On Feb 29, 2016, at 3:39 PM, Alina Sbirlea < alina.sbirlea at gmail.com
> > wrote:
> > On Mon, Feb 29, 2016 at 2:06 PM, Mehdi Amini <
> > mehdi.amini at apple.com
> > > wrote:
> 
> > > > On Feb 29, 2016, at 1:50 PM, Alina Sbirlea <
> > > > alina.sbirlea at gmail.com
> > > > > wrote:
> > > 
> > 
> 
> > > > On Mon, Feb 29, 2016 at 12:18 PM, Mehdi Amini <
> > > > mehdi.amini at apple.com
> > > > > wrote:
> > > 
> > 
> 
> > > > > > On Feb 29, 2016, at 11:40 AM, Mehdi Amini via
llvm-dev <
> > > > > > llvm-dev at lists.llvm.org > wrote:
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > On Feb 29, 2016, at 11:16 AM, Alina Sbirlea
via llvm-dev
> > > > > > > <
> > > > > > > llvm-dev at lists.llvm.org > wrote:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > All,
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > To get the discussion going in a focused
manner, here is
> > > > > > > an
> > > > > > > initial
> > > > > > > patch with a running test. The test is from
the Halide
> > > > > > > suite
> > > > > > > and
> > > > > > > is
> > > > > > > checking the correctness of several simd
operations.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > (Notes: the patch is large due to the number
of
> > > > > > > operations
> > > > > > > being
> > > > > > > tested;
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > I expect a lot of changes before actually
landing it,
> > > > > > > this
> > > > > > > is
> > > > > > > simply
> > > > > > > to continue the discussion using a concrete
example.)
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > http://reviews.llvm.org/D17726
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > I can't figure how to download the patch *with
the bitcode
> > > > > > files*
> > > > > > from Phabricator. Can you push this on github (or
somewhere
> > > > > > else)?
> > > > > > (or if I missed how to proceed...).
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > I was able to figure how get them "one by
one", it would
> > > > > still
> > > > > be
> > > > > more convenient to have an archive or a repo to clone
> > > > > somewhere.
> > > > 
> > > 
> > 
> 
> > > > > > > A few questions/todos to start the
discussion:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > 1. What is a good location for these tests?
They are in a
> > > > > > > separate
> > > > > > > Bitcode directory atm, but using the
llvm_multisource.
> > > > > > > This
> > > > > > > may
> > > > > > > change to more closely model the approach for
external
> > > > > > > tests
> > > > > > > (see
> > > > > > > next item).
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > A good location would be their own external repository
IMO :)
> > > > 
> > > 
> > 
> 
> > > > > > > 2. There is a single .cpp file testing all
operations
> > > > > > > provided
> > > > > > > by
> > > > > > > individual bitcode files. I expect this to
change.
> > > > > > > Instead
> > > > > > > of
> > > > > > > using
> > > > > > > llvm_multisource to have the same test run
with specific
> > > > > > > arguments,
> > > > > > > each run testing a single operation.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > 3. The building approach I took is to first
link all
> > > > > > > bitcode
> > > > > > > files
> > > > > > > into a single one, then obtain the assembly
for it, which
> > > > > > > cmake
> > > > > > > knows to take as an input source.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > Yeah, so I'd rather have a split-build model, with
a split
> > > > > execution
> > > > > model. Having a gigantic bitcode file to debug an issue
is
> > > > > not
> > > > > friendly.
> > > > 
> > > 
> > 
> 
> > > > > I'd expect to have a .cpp file that contains the
main and the
> > > > > logic
> > > > > to run test, and then every test that is linked-in to
be
> > > > > executed,
> > > > > a
> > > > > bit like gtests is doing (there are multiple
registering
> > > > > mechanisms
> > > > > that would avoid to declare explicitly a test in the
header).
> > > > 
> > > 
> > 
> 
> > > > > -> filters.h and filter_headers.h should just go
away.
> > > > 
> > > 
> > 
> 
> > > > I agree, this is related to point 2. The plan here is to
update
> > > > the
> > > > current test .cpp file to test each operation individually.
In
> > > > this
> > > > model it will be enough to link with a single bitcode file
per
> > > > test.
> > > 
> > 
> 
> > > > > Also on the test in general: we should have an idea for
each
> > > > > test
> > > > > what it is doing and how.
> > > > 
> > > 
> > 
> 
> > > > > I was expecting your tests to be on the pattern of
having an
> > > > > implementation in C++ and an implementation in Halide
bitcode
> > > > > of
> > > > > a
> > > > > filters (or whatever) and run both on random data and
> > > > > verifies
> > > > > that
> > > > > the result is matching.
> > > > 
> > > 
> > 
> 
> > > > > Unfortunately from what I can see you are feeding the
tests
> > > > > with
> > > > > random data, and the tests are "blackboxes"
that set an error
> > > > > flag
> > > > > if they detect an error.
> > > > 
> > > 
> > 
> 
> > > > > This is not super robust: the compiler can mess with
the
> > > > > error
> > > > > checking and eliminate it for instance, making any
error
> > > > > undetected.
> > > > 
> > > 
> > 
> 
> > > > The Halide bitcode filters compare the result of vectorized
> > > > operations vs scalar runs of the same code. The error code
> > > > against
> > > > which we compare the output will be set to loose tolerance -
it
> > > > is
> > > > currently 0. We're interested in codegen bugs that
return the
> > > > wrong
> > > > value entirely, not accuracy differences (especially for
> > > > floating
> > > > point tests).
> > > 
> > 
> 
> > > > With the new error threshold, the data fed into may be
random
> > > > or
> > > > read
> > > > from provided input files, I can do either.
> > > 
> > 
> 
> > > > The filters will still look somewhat like blackboxes, though
> > > > the
> > > > name
> > > > of the filter says what operation it's being tested and
the
> > > > disassembled bitcode files are reasonably readable.
> > > 
> > 
> 
> > > > Using your suggestion, the driver .cpp file will test one
> > > > operation
> > > > at a time (argvs set accordingly) and return right away once
an
> > > > error is found. Sound about right?
> > > 
> > 
> 
> > > All of this is great.
> > 
> 
> > > The part that is not clear to me is why isn't it to have
(what
> > > does
> > > it buy us over, or why is it better for us compared to) a
> > > possible
> > > a
> > > C/C++ reference implementation of the filter, and hoisting (and
> > > refactor) all the logic to feed the tests and validate the output
> > > *out* of the filters. A filter would be just the mathematical
> > > function performed and nothing more (separation of concerns, more
> > > robust framework, easier debugging when things-go-wrong, etc.).
> > 
> 
> > I believe the answer is that Halide generates vectorized code in a
> > way that is not generated by llvm when starting from C/C++.
> 
> I don't really see how *this* addresses my point. This is justifying
> why your bitcode is interesting and why we are having this
> conversation at all :)
> It does not say why we can't have a scalar naive C/C++ impl along
> with the bitcode for filter.
> > Having a C/C++ scalar reference would involve quite a bit of effort
> > for all tests. The primary reason Halide is being used is that you
> > don't need to write a lot of C/C++ code to get different
> > optimizations for the same code (e.g. vectorized vs scalar is a one
> > line difference).
> 
> Yes, this is what is nice with Halide: "write once, codegen multiple
> variant". But it does not mean you can't write a c++ reference for
> every Halide filter (not for every codegen variant of a filter!)
> It's been 2 years since my last experiments with Halide, but my
> memories were that there was a C backend?
> I had in mind for each test to have (possibly in a separate directory
> for each test):
> - the halide source for the filter.
> - the c/c++ (maybe generated?) for the filter.
> - the bitcode generated for the filter (potentially multiple variant
> depending on the CPU support and/or the schedule).
> Then some common code/infrastructure to interact with the individual
> filters, loading them, executing the variants for a filter, and
> checking results.
> If the reference c/c++ can't be generated by halide (or obtained
> somehow), and we can't do better than the current tests
> infrastructure you have, then I'm worried about the cost/benefit for
> this test-suite.
I think that a C/C++ version would be nice to have, but not necessary. IR
generated by non-Clang frontends and/or IR going through alternate optimization
pipelines tend to hit bugs that are much harder to hit with Clang alone. It
would help to have a description of what each test does, but including, for
example, the Halide source code for each test will hopefully be enough of a
guide.
> > So we can get really fast test coverage for possible codegen bugs
> > by
> > comparing that different layout optimizations in Halide give the
> > same result.
> 
> > I think having each filter tested separately should give a good
> > separation of concerns and easy debugging for each particular test.
> 
> This is great for halide validation, we are all agreeing with this I
> think. The question is where is the tradeoff for the LLVM project.
> I'm trying to make sure that the extra coverage doesn't come with a
> burden to debug and triage issues when something will break: i.e.
> the tests need to be very friendly to interact with.I don't think that any non-trivial tests are truly "friendly" to
interact with. Tracking down self-hosting bugs is not friendly, and those
aren't anywhere near the worst ;) -- These tests with their simple driver
seem like good input that bugpoint can reduce (assuming the tests runtimes are
not too long), and that's friendlier than most of the other multisource
tests.

-Hal 
> This is the motivation for my comments so far.
> Other people in the community may have a different
> opinion/appreciation of the situation, this just represents my
> current thoughts.
> Hope it makes sense.
> --
> Mehdi
> > > > > Also, just looking quickly at one IR I'm surprised
by things
> > > > > like:
> > > > 
> > > 
> > 
> 
> > > > > "assert succeeded165": ; preds =
%"assert succeeded146"
> > > > 
> > > 
> > 
> 
> > > > > %buf_host181 = getelementptr inbounds %struct.buffer_t,
> > > > > %struct.buffer_t* %error_op_pcmpeqq_272.buffer, i64 0,
i32 1
> > > > 
> > > 
> > 
> 
> > > > > %23 = bitcast i8** %buf_host181 to double**
> > > > 
> > > 
> > 
> 
> > > > > %error_op_pcmpeqq_272.host226227232 = load double*,
double**
> > > > > %23,
> > > > > align 8
> > > > 
> > > 
> > 
> 
> > > > > %24 = icmp eq %struct.buffer_t*
%error_op_pcmpeqq_272.buffer,
> > > > > null
> > > > 
> > > 
> > 
> 
> > > > > br i1 %24, label %"assert failed183", label
%"assert
> > > > > succeeded184",
> > > > > !prof !4
> > > > 
> > > 
> > 
> 
> > > > > Here you have as check for nullptr at %24, but you
already
> > > > > loaded
> > > > > %error_op_pcmpeqq_272.host226227232 from this pointer
just
> > > > > before!
> > > > 
> > > 
> > 
> 
> > > > It's checking that the host value loaded from buffer_t
is not
> > > > null.
> > > > I
> > > > don't see what's wrong with this. What am I missing?
> > > 
> > 
> 
> > > I may be misreading it, my impression when skimming through the
> > > code
> > > was that it seems equivalent to:
> > 
> 
> > > foo(buffer_t *out) {
> > 
> 
> > > auto value = out->host;
> > 
> 
> > > if (!out) {
> > 
> 
> > > error("nullptr");
> > 
> 
> > > }
> > 
> 
> > > }
> > 
> 
> > > In case I haven't been clear: I think this work is valuable
for
> > > the
> > > project, and thank you for putting some effort into it :)
> > 
> 
> > > --
> > 
> 
> > > Mehdi
> > 
> 
> > > > > > > A separate discussion is on reading metadata
(mcpu and
> > > > > > > mattr)
> > > > > > > in
> > > > > > > llc.
> > > > > > > I added a script to work around that for now.
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > The generic way of doing it in llvm is (I think) to use
> > > > > function
> > > > > attributes:
> > > > 
> > > 
> > 
> 
> > > > > attributes #0 = {
"target-cpu"="x86-64"
> > > > > "target-features"="+avx2"
> > > > > }
> > > > 
> > > 
> > 
> 
> > > > > You shouldn't need it on the command line I think?
> > > > 
> > > 
> > 
> 
> > > > Yes, I believe so too. Currently these are set in mcpu and
> > > > mattr
> > > > by
> > > > Halide and not read in by llc, hence the need for feeding
them
> > > > as
> > > > parameters. It's a separate issue that we'll need to
go into in
> > > > depth, but I don't want it to interfere with getting
feedback
> > > > on
> > > > how
> > > > to best publish these tests.
> > > 
> > 
> 
> > > > > --
> > > > 
> > > 
> > 
> 
> > > > > Mehdi
> > > > 
> > > 
> > 
> 
> > > > > > > Looking forward to your feedback!
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Thanks,
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > Alina
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > On Fri, Feb 19, 2016 at 6:50 AM, Kristof
Beyls <
> > > > > > > kristof.beyls at arm.com > wrote:
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > On 18/02/2016 19:12, Alina Sbirlea via
llvm-dev wrote:
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > I have more questions for
Alina. What kind of tests
> > > > > > > > > > do
> > > > > > > > > > you
> > > > > > > > > > have:
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > - "the compiler takes the
bitcode and generates
> > > > > > > > > > code
> > > > > > > > > > without
> > > > > > > > > > crashing"
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > - "the compiled test runs
without crashing"
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > - "the compiled test will
produce an output that be
> > > > > > > > > > checked
> > > > > > > > > > against
> > > > > > > > > > a
> > > > > > > > > > reference"
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > > - "the compiled test is
meaningful as a benchmarks"
> > > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > > We have all 4 kinds of tests in
Halide. The bitcode
> > > > > > > > > files
> > > > > > > > > for
> > > > > > > > > the
> > > > > > > > > first category is already available
and I'm working
> > > > > > > > > on
> > > > > > > > > building
> > > > > > > > > the
> > > > > > > > > ones for the next 3. We'd like
to include all
> > > > > > > > > incrementally.
> > > > > > > > 
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > It seems to me that the first category
("the compiler
> > > > > > > > takes
> > > > > > > > the
> > > > > > > > bitcode and generates code without
crashing") are tests
> > > > > > > > that
> > > > > > > > should
> > > > > > > > be added to the "make
check-all" tests in the LLVM
> > > > > > > > subproject,
> > > > > > > > rather than the test-suite subproject?
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > > Or if these tests currently don't
crash the compiler
> > > > > > > > anymore,
> > > > > > > > the
> > > > > > > > bugs must have been fixed, and there
should already be
> > > > > > > > equivalent
> > > > > > > > tests?
> > > > > > > 
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > >
_______________________________________________
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > LLVM Developers mailing list
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > > llvm-dev at lists.llvm.org
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > >
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > > > 
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > _______________________________________________
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > LLVM Developers mailing list
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > > llvm-dev at lists.llvm.org
> > > > > 
> > > > 
> > > 
> > 
> 
> > > > > >
http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> > > > > 
> > > > 
> > > 
> > 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-- 

Hal Finkel 
Assistant Computational Scientist 
Leadership Computing Facility 
Argonne National Laboratory 
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160301/cb16d059/attachment.html>

Mehdi Amini via llvm-dev

2016-Mar-01 07:16 UTC

head link

[llvm-dev] RFC: Add bitcode tests to test-suite

> On Feb 29, 2016, at 10:50 PM, Hal Finkel <hfinkel at anl.gov> wrote:
> 
> 
> 
> From: "Mehdi Amini via llvm-dev" <llvm-dev at
lists.llvm.org>
> To: "Alina Sbirlea" <alina.sbirlea at gmail.com>
> Cc: "llvm-dev" <llvm-dev at lists.llvm.org>
> Sent: Monday, February 29, 2016 7:06:51 PM
> Subject: Re: [llvm-dev] RFC: Add bitcode tests to test-suite
> 
> 
> 
> Sent from my iPhone
> 
> On Feb 29, 2016, at 3:39 PM, Alina Sbirlea <alina.sbirlea at gmail.com
<mailto:alina.sbirlea at gmail.com>> wrote:
> 
> 
> 
> On Mon, Feb 29, 2016 at 2:06 PM, Mehdi Amini <mehdi.amini at apple.com
<mailto:mehdi.amini at apple.com>> wrote:
> 
> On Feb 29, 2016, at 1:50 PM, Alina Sbirlea <alina.sbirlea at gmail.com
<mailto:alina.sbirlea at gmail.com>> wrote:
> 
> 
> 
> On Mon, Feb 29, 2016 at 12:18 PM, Mehdi Amini <mehdi.amini at apple.com
<mailto:mehdi.amini at apple.com>> wrote:
> 
> On Feb 29, 2016, at 11:40 AM, Mehdi Amini via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> 
> 
> On Feb 29, 2016, at 11:16 AM, Alina Sbirlea via llvm-dev <llvm-dev at
lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote:
> 
> All,
> 
> To get the discussion going in a focused manner, here is an initial patch
with a running test. The test is from the Halide suite and is checking the
correctness of several simd operations.
> (Notes: the patch is large due to the number of operations being tested; 
> I expect a lot of changes before actually landing it, this is simply to
continue the discussion using a concrete example.)
> http://reviews.llvm.org/D17726 <http://reviews.llvm.org/D17726>
> 
> I can't figure how to download the patch *with the bitcode files* from
Phabricator. Can you push this on github (or somewhere else)? (or if I missed
how to proceed...).
> 
> I was able to figure how get them "one by one", it would still be
more convenient to have an archive or a repo to clone somewhere.
> 
> A few questions/todos to start the discussion:
> 1. What is a good location for these tests? They are in a separate Bitcode
directory atm, but using the llvm_multisource. This may change to more closely
model the approach for external tests (see next item).
> 
> A good location would be their own external repository IMO :)
> 
> 2. There is a single .cpp file testing all operations provided by
individual bitcode files. I expect this to change. Instead of using
llvm_multisource to have the same test run with specific arguments, each run
testing a single operation.
> 3. The building approach I took is to first link all bitcode files into a
single one, then obtain the assembly for it, which cmake knows to take as an
input source.
> 
> Yeah, so I'd rather have a split-build model, with a split execution
model. Having a gigantic bitcode file to debug an issue is not friendly.
> I'd expect to have a .cpp file that contains the main and the logic to
run test, and then every test that is linked-in to be executed, a bit like
gtests is doing (there are multiple registering mechanisms that would avoid to
declare explicitly a test in the header).
> -> filters.h and filter_headers.h should just go away.
> 
> I agree, this is related to point 2. The plan here is to update the current
test .cpp file to test each operation individually. In this model it will be
enough to link with a single bitcode file per test.
>  
> 
> Also on the test in general: we should have an idea for each test what it
is doing and how.
> I was expecting your tests to be on the pattern of having an implementation
in C++ and an implementation in Halide bitcode of a filters (or whatever) and
run both on random data and verifies that the result is matching.
> Unfortunately from what I can see you are feeding the tests with random
data, and the tests are "blackboxes" that set an error flag if they
detect an error.
> This is not super robust: the compiler can mess with the error checking and
eliminate it for instance, making any error undetected.
> 
> The Halide bitcode filters compare the result of vectorized operations vs
scalar runs of the same code. The error code against which we compare the output
will be set to loose tolerance - it is currently 0. We're interested in
codegen bugs that return the wrong value entirely, not accuracy differences
(especially for floating point tests).
> With the new error threshold, the data fed into may be random or read from
provided input files, I can do either.
> The filters will still look somewhat like blackboxes, though the name of
the filter says what operation it's  being tested and the disassembled
bitcode files are reasonably readable.
> Using your suggestion, the driver .cpp file will test one operation at a
time (argvs set accordingly) and return right away once an error is found. Sound
about right?
> 
> All of this is great. 
> The part that is not clear to me is why isn't it to have (what does it
buy us over, or why is it better for us compared to) a possible a C/C++
reference implementation of the filter, and hoisting (and refactor) all the
logic to feed the tests and validate the output *out* of the filters. A filter
would be just the mathematical function performed and nothing more (separation
of concerns, more robust framework, easier debugging when things-go-wrong,
etc.).
> 
> I believe the answer is that Halide generates vectorized code in a way that
is not generated by llvm when starting from C/C++.
> 
> I don't really see how *this* addresses my point.  This is justifying
why your bitcode is interesting and why we are having this conversation at all
:)
> It does not say why we can't have a scalar naive C/C++ impl along with
the bitcode for filter.
> 
> 
> Having a C/C++ scalar reference would involve quite a bit of effort for all
tests. The primary reason Halide is being used is that you don't need to
write a lot of C/C++ code to get different optimizations for the same code (e.g.
vectorized vs scalar is a one line difference).
> 
> Yes, this is what is nice with Halide: "write once, codegen multiple
variant". But it does not mean you can't write a c++ reference for
every Halide filter  (not for every codegen variant of a filter!)
> 
> It's been 2 years since my last experiments with Halide, but my
memories were that there was a C backend?
> I had in mind for each test to have (possibly in a separate directory for
each test):
>  - the halide source for the filter.
>  - the c/c++ (maybe generated?) for the filter.
>  - the bitcode generated for the filter (potentially multiple variant
depending on the CPU support and/or the schedule).
> 
> Then some common code/infrastructure to interact with the individual
filters, loading them, executing the variants for a filter, and checking
results.
> 
> If the reference c/c++ can't be generated by halide (or obtained
somehow), and we can't do better than the current tests infrastructure you
have, then I'm worried about the cost/benefit for this test-suite.
> 
> 
> I think that a C/C++ version would be nice to have, but not necessary.
> IR generated by non-Clang frontends and/or IR going through alternate
optimization pipelines tend to hit bugs that are much harder to hit with Clang
alone.
To make it clear: the point of the C++ reference I was asking for is *not to
stress clang* at all. It is intended to compute the *golden result* to be
compared to the runs of each variant for a filter.  Having a reference is
important when a diff is detected in the output and you need to figure out what
is going on.

> It would help to have a description of what each test does, but including,
for example, the Halide source code for each test will hopefully be enough of a
guide.
> 
> So we can get really fast test coverage for possible codegen bugs by
comparing that different layout optimizations in Halide give the same result.
> I think having each filter tested separately should give a good separation
of concerns and easy debugging for each particular test.
> 
> This is great for halide validation, we are all agreeing with this I think.
The question is where is the tradeoff for the LLVM project. I'm trying to
make sure that the extra coverage doesn't come with a burden to debug and
triage issues when something will break: i.e. the tests need to be very friendly
to interact with.
> 
> I don't think that any non-trivial tests are truly "friendly"
to interact with.
Yeah, I just think we shouldn't make it arbitrarily worse by not having a
good infrastructure to begin with :)

In the end, if we expect this suite to be accepted and maintained as a
"first-class citizen" by LLVM developers (i.e. accepting things like
reverting a commit if it breaks something in this suite), we'd better make
sure the burden to interact with it is minimal.

> Tracking down self-hosting bugs is not friendly, and those aren't
anywhere near the worst ;) --
> These tests with their simple driver seem like good input that bugpoint can
reduce (assuming the tests runtimes are not too long), and that's friendlier
than most of the other multisource tests.
If we're talking exclusively about crashes, I agree. However if we're
considering correctness issue as well (miscompiles), I believe that the
structural changes I proposed are very important to easily perform bugpoint on
them for instance (or bugpoint would just turn the test into "return
false;").
Also I believe these changes are necessary to perform timing measurement for
these tests, if we are interested in the quality of the optimization/codegen (to
be hooked into something like https://github.com/google/benchmark ?).

Best,

-- 
Mehdi


> 
>  -Hal
> 
> This is the motivation for my comments so far.
> Other people in the community may have a different opinion/appreciation of
the situation, this just represents my current thoughts.
> 
> Hope it makes sense.
> 
> -- 
> Mehdi
> 
> 
> 
>  
> 
> Also, just looking quickly at one IR I'm surprised by things like:
> 
> "assert succeeded165":                            ; preds =
%"assert succeeded146"
>   %buf_host181 = getelementptr inbounds %struct.buffer_t, %struct.buffer_t*
%error_op_pcmpeqq_272.buffer, i64 0, i32 1
>   %23 = bitcast i8** %buf_host181 to double**
>   %error_op_pcmpeqq_272.host226227232 = load double*, double** %23, align 8
>   %24 = icmp eq %struct.buffer_t* %error_op_pcmpeqq_272.buffer, null
>   br i1 %24, label %"assert failed183", label %"assert
succeeded184", !prof !4
> 
> Here you have as check for nullptr at %24, but you already loaded
%error_op_pcmpeqq_272.host226227232 from this pointer just before!
> 
> It's checking that the host value loaded from buffer_t is not null. I
don't see what's wrong with this. What am I missing?
> 
> I may be misreading it, my impression when skimming through the code was
that it seems equivalent to:
> 
> foo(buffer_t *out) {
>   auto value = out->host;
>   if (!out) {
>     error("nullptr");
>   }
> }
> 
> 
> In case I haven't been clear: I think this work is valuable for the
project, and thank you for putting some effort into it :)
> 
> -- 
> Mehdi
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
> A separate discussion is on reading metadata (mcpu and mattr) in llc. I
added a script to work around that for now.
> 
> The generic way of doing it in llvm is (I think) to use function
attributes:
> 
> attributes #0 = { "target-cpu"="x86-64"
"target-features"="+avx2" }
> 
> You shouldn't need it on the command line I think?
> 
> Yes, I believe so too. Currently these are set in mcpu and mattr by Halide
and not read in by llc, hence the need for feeding them as parameters. It's
a separate issue that we'll need to go into in depth, but I don't want
it to interfere with getting feedback on how to best publish these tests.
>  
> 
> -- 
> Mehdi
> 
> 
> 
> 
> Looking forward to your feedback!
> 
> Thanks,
> Alina
> 
> 
> 
> On Fri, Feb 19, 2016 at 6:50 AM, Kristof Beyls <kristof.beyls at arm.com
<mailto:kristof.beyls at arm.com>> wrote:
> 
> 
> On 18/02/2016 19:12, Alina Sbirlea via llvm-dev wrote:
> 
> 
> I have more questions for Alina. What kind of tests do you have:
> 
> - "the compiler takes the bitcode and generates code without
crashing"
> - "the compiled test runs without crashing"
> - "the compiled test will produce an output that be checked against a
reference"
> - "the compiled test is meaningful as a benchmarks"
> 
> We have all 4 kinds of tests in Halide. The bitcode files for the first
category is already available and I'm working on building the ones for the
next 3. We'd like to include all incrementally.
>  
> 
> It seems to me that the first category ("the compiler takes the
bitcode and generates code without crashing") are tests that should be
added to the "make check-all" tests in the LLVM subproject, rather
than the test-suite subproject?
> Or if these tests currently don't crash the compiler anymore, the bugs
must have been fixed, and there should already be equivalent tests?
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
<http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev>
> 
> 
> 
> 
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
> 
> 
> 
> -- 
> Hal Finkel
> Assistant Computational Scientist
> Leadership Computing Facility
> Argonne National Laboratory
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20160229/23053204/attachment-0001.html>

Apparently Analagous Threads

Search for more reasonably related threads

llvm dev - Mar 2016 - RFC: Add bitcode tests to test-suite

[llvm-dev] RFC: Add bitcode tests to test-suite

[llvm-dev] RFC: Add bitcode tests to test-suite

[llvm-dev] RFC: Add bitcode tests to test-suite

Apparently Analagous Threads