thr3ads.net - llvm dev - [llvm-dev] Debug info interacting with optimization and code generation [Oct 2016]

Dehao Chen via llvm-dev

2016-Oct-07 20:27 UTC

[llvm-dev] Debug info interacting with optimization and code generation

In theory, the compiler should generate bit-identical code with and without
debug info, i.e.:
# clang -c -O2 -g a.cc -o a.g.o
# clang -c -O2 -g0 a.cc -o a.g0.o
# strip a.g.o a.g0.o
# diff a.g.o a.g0.o
The diff should find the two binaries identical. For brevity, in the rest of
this mail, I'll refer to this requirement as "codegen consistency" (any
better name?)
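
As a minimal sketch of the final `diff` step above, the following compares two already-stripped object files byte for byte. The function name `objects_identical` and the placeholder file contents are illustrative assumptions, not part of any LLVM tooling; real inputs would still come from the `clang` and `strip` commands shown above.

```python
import filecmp
import os
import tempfile


def objects_identical(path_a: str, path_b: str) -> bool:
    """Return True if the two files contain exactly the same bytes."""
    # shallow=False forces a content comparison instead of trusting os.stat().
    return filecmp.cmp(path_a, path_b, shallow=False)


if __name__ == "__main__":
    # Demo with stand-in files; real use would pass the stripped a.g.o / a.g0.o.
    with tempfile.TemporaryDirectory() as tmp:
        a = os.path.join(tmp, "a.g.o")
        b = os.path.join(tmp, "a.g0.o")
        for path in (a, b):
            with open(path, "wb") as f:
                f.write(b"\x7fELF-placeholder")
        print(objects_identical(a, b))
```

A boolean result like this is easy to turn into a pass/fail regression check, although it gives no hint about where the two files diverge.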

Unfortunately, LLVM does not guarantee codegen consistency. Recently, I've
spent quite some time trying to fix related issues (e.g.
https://reviews.llvm.org/D25286 and https://reviews.llvm.org/D25098). The
most recent issue I'm looking at is that during isel, the IROrder is used
by both debug info and the actual codegen, which is relatively harder to fix.

I initially thought that it was just a couple of careless bugs to fix, but
it looks like there are many more issues than I expected. So I'm calling on
the community for help:

* Is there anyone else who also cares about codegen consistency?
* Any volunteers to help fix codegen consistency issues? (It is easy to
find issues: just build speccpu with -g and -g0, then compare the
"objdump -d" output.)
* How do we set up a regression test to ensure future changes do not break
codegen consistency?
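
The "compare the objdump -d output" step in the list above could be sketched roughly as follows. This assumes the two disassembly listings have already been captured as text; the helper names `normalize_disassembly` and `disassembly_diff` are hypothetical. The only normalization applied is dropping the header line that embeds the object file name, since that line always differs between `a.g.o` and `a.g0.o`.

```python
import difflib
from typing import List


def normalize_disassembly(text: str) -> List[str]:
    """Drop lines that legitimately differ between the two objdump runs."""
    kept = []
    for line in text.splitlines():
        stripped = line.strip()
        # objdump prints "<file>:  file format <fmt>", which embeds the file name.
        if not stripped or "file format" in stripped:
            continue
        kept.append(stripped)
    return kept


def disassembly_diff(with_g: str, without_g: str) -> List[str]:
    """Return unified-diff lines; an empty list means no codegen difference."""
    return list(
        difflib.unified_diff(
            normalize_disassembly(with_g),
            normalize_disassembly(without_g),
            fromfile="-g",
            tofile="-g0",
            lineterm="",
        )
    )
```

Compared with a plain byte comparison, a textual diff of the disassembly points directly at the first instruction that differs, which is usually what a bug report needs.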

Any comments?

Thanks,
Dehao

David Blaikie via llvm-dev

2016-Oct-07 20:35 UTC

On Fri, Oct 7, 2016 at 1:27 PM Dehao Chen <dehao at google.com> wrote:
> * How do we set up a regression test to ensure future changes do not break
> codegen consistency?
>
Specific test cases would be checked in as usual - beyond that, probably a
self-host that checks for consistency (like a 3-stage bootstrap checks that
stages 2 and 3 are identical). Potentially other workloads could be added if
a self-host didn't offer enough certainty for common cases.
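
A self-host consistency check of this kind amounts to comparing two build trees file by file. The sketch below is one possible shape for that comparison, assuming both trees already exist on disk (for example, one built with -g and then stripped, the other built with -g0); `hash_tree` and `mismatched_files` are hypothetical helper names, not existing LLVM utilities.

```python
import hashlib
import os
import tempfile
from typing import Dict, List


def hash_tree(root: str) -> Dict[str, str]:
    """Map each file's path relative to root to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            with open(full, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            digests[os.path.relpath(full, root)] = digest
    return digests


def mismatched_files(tree_a: str, tree_b: str) -> List[str]:
    """Return relative paths that are missing from one tree or differ in content."""
    a, b = hash_tree(tree_a), hash_tree(tree_b)
    return sorted(p for p in a.keys() | b.keys() if a.get(p) != b.get(p))
```

An empty result means the two trees match; any returned path is a candidate for a reduced test case.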

It's an abstract good/intended goal, for sure - but it's not been a
priority for anyone (as you've seen), so just hasn't been pushed very
hard/far.

- Dave

Xinliang David Li via llvm-dev

2016-Oct-07 21:20 UTC

On Fri, Oct 7, 2016 at 1:35 PM, David Blaikie <dblaikie at gmail.com> wrote:
> Specific test cases would be checked in as usual - beyond that, probably a
> self-host that checks for consistency (like a 3 stage bootstrap checks that
> stage 2 and 3 are identical). Potentially other workloads could be added if
> a selfhost didn't offer enough certainty for common cases.
>
> It's an abstract good/intended goal, for sure - but it's not been a
> priority for anyone (as you've seen), so just hasn't been pushed very
> hard/far.
>
I agree with you that this is a good/intended goal, but it is not an
'abstract' good goal :)

David

Krzysztof Parzyszek via llvm-dev

2016-Oct-07 21:20 UTC

On 10/7/2016 3:35 PM, David Blaikie via llvm-dev wrote:
> It's an abstract good/intended goal, for sure - but it's not been a
> priority for anyone (as you've seen), so just hasn't been pushed very
> hard/far.

I wasn't aware of this problem as of late, but with our own compiler 
(for Hexagon) we've made efforts in the past to make sure that -g did 
not affect codegen. Some issues must have crept back in. This is 
definitely something that needs to be fixed.

-Krzysztof

-- 
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, 
hosted by The Linux Foundation

Robinson, Paul via llvm-dev

2016-Oct-07 21:33 UTC

(Resend with llvm-dev added back)

At Sony we have an internal test run that compares generated code with/without
-g, in our suite of regression tests.  See our lightning talk slides from
EuroLLVM 2015.  I believe we list some PRs in there for things we have found and
fixed in the past.
http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf

At the moment we have a backlog of about a half-dozen differences worth
investigating.  I have to admit we have not yet looked at whether some of your
recent work has fixed any of them; it is not our top priority, although
obviously it is something we do look at and keep track of.
There are some very minor differences in instruction order that we see, and I
think in most cases that is because -g emits .cfi directives which act as
scheduling barriers.  It might be the case that if we enabled exceptions, we
would not see these as -g differences; we have not experimented with that.
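
One way to triage such instruction-order differences is to separate "same instructions, different order" from genuinely different code. The sketch below is a hypothetical helper, not the internal Sony tooling: it assumes the input is the assembly text of a single function body, and the set of ignored directives (`.cfi_*`, `.loc`, `.file`) is an assumption chosen for illustration.

```python
from collections import Counter
from typing import List

# Directives treated as non-code for this comparison (an assumption).
IGNORED_EXACT = (".loc", ".file")
IGNORED_PREFIX = ".cfi_"


def code_lines(asm: str) -> List[str]:
    """Keep instruction and label lines; drop blanks and the ignored directives."""
    lines = []
    for raw in asm.splitlines():
        line = raw.strip()
        if not line:
            continue
        first = line.split()[0]
        if first.startswith(IGNORED_PREFIX) or first in IGNORED_EXACT:
            continue
        lines.append(line)
    return lines


def classify(asm_g: str, asm_g0: str) -> str:
    """Return 'identical', 'reordered', or 'different' for two assembly listings."""
    a, b = code_lines(asm_g), code_lines(asm_g0)
    if a == b:
        return "identical"
    if Counter(a) == Counter(b):
        return "reordered"
    return "different"
```

A "reordered" result is consistent with a scheduling difference of the kind described above, while "different" suggests that a transformation itself was affected.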
--paulr


Adrian Prantl via llvm-dev

2016-Oct-07 22:25 UTC

> On Oct 7, 2016, at 1:27 PM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
> * Is there anyone else who also cares about codegen consistency?
We have in the past always treated situations where the presence of debug info
caused different code to be emitted as pretty serious bugs. Typically these bugs
came from code that didn't properly skip over debug intrinsics when doing
peephole-style transformations.
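
The class of bug described here can be illustrated with a toy model. Everything in the sketch below is hypothetical: instructions are plain strings, `dbg_value` stands in for a debug intrinsic, and the "peephole" simply deletes a `push` that is directly followed by a `pop`. It is not LLVM code, only a demonstration of why a pass that inspects the next instruction must skip debug entries.

```python
from typing import List

DEBUG_OP = "dbg_value"  # stand-in for a debug intrinsic; emits no real code


def fold_push_pop(insts: List[str], skip_debug: bool) -> List[str]:
    """Remove each 'push' whose next instruction is 'pop' (a toy peephole).

    With skip_debug=False the pass only looks at the immediately following
    entry, so an intervening debug pseudo-instruction blocks the fold.
    """
    out = []
    i = 0
    while i < len(insts):
        if insts[i] == "push":
            j = i + 1
            if skip_debug:
                while j < len(insts) and insts[j] == DEBUG_OP:
                    j += 1
            if j < len(insts) and insts[j] == "pop":
                # Keep any skipped debug entries, drop the push/pop pair.
                out.extend(insts[i + 1 : j])
                i = j + 1
                continue
        out.append(insts[i])
        i += 1
    return out


def real_code(insts: List[str]) -> List[str]:
    """Strip debug pseudo-instructions, leaving what would be emitted."""
    return [op for op in insts if op != DEBUG_OP]
```

With `skip_debug=False`, the input `["push", "dbg_value", "pop", "ret"]` keeps its push/pop pair while `["push", "pop", "ret"]` loses it, so the emitted code depends on whether debug info is present. With `skip_debug=True`, both inputs reduce to the same real code.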
> * Any volunteers to help fix codegen consistency issues? (It is easy to
> find issues, just build speccpu with -g and -g0, then compare the "objdump
> -d" output)
I certainly don't mind getting CC'ed on any PRs that we find :-)

-- adrian

Xinliang David Li via llvm-dev

2016-Oct-07 22:28 UTC

A good start is to file upstream bugs for the issues found in SPEC and in the
clang self-build. Once those bugs are fixed, we need to set up bots that do a
3-stage bootstrap of clang to ensure no regressions are introduced.

David

Greg Bedwell via llvm-dev

2016-Nov-11 21:56 UTC

FWIW, the fix that Rob has just added a patch for (
https://reviews.llvm.org/D26554 ) fixes a case of debug info affecting
optimization, found using the utils/check_cfc tool from Russ's presentation
below on a large game codebase.

> At Sony we have an internal test run that compares generated code
> with/without -g, in our suite of regression tests.  See our lightning talk
> slides from EuroLLVM 2015.  I believe we list some PRs in there for things
> we have found and fixed in the past.
>
> http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf