Dehao Chen via llvm-dev
2016-Oct-07 20:27 UTC
[llvm-dev] Debug info interacting with optimization and code generation
In theory, a compiler should generate bit-identical code with and without debug info. I.e.

  # clang -c -O2 -g a.cc -o a.g.o
  # clang -c -O2 -g0 a.cc -o a.g0.o
  # strip a.g.o a.g0.o
  # diff a.g.o a.g0.o

The diff should find the two binaries identical. For brevity, in the rest of this mail I'll refer to this requirement as "codegen consistency" (any better name?)

Unfortunately, LLVM does not guarantee codegen consistency. Recently I've spent quite some time trying to fix related issues (e.g. https://reviews.llvm.org/D25286 and https://reviews.llvm.org/D25098). The most recent issue I'm looking at is that during isel the IROrder is used by both debug info and the actual codegen, which is relatively harder to fix.

I initially thought that it was just a couple of careless bugs to fix, but it looks like there are many more issues than I expected. So I'm calling on the community for help:

* Is there anyone else who also cares about codegen consistency?
* Any volunteers to help fix codegen consistency issues? (It is easy to find issues: just build speccpu with -g and -g0, then compare the "objdump -d" output.)
* How do we set up a regression test to ensure future changes do not break codegen consistency?

Any comments?

Thanks,
Dehao
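For readers who want to try the -g versus -g0 comparison on a single file, it could be scripted roughly as below. This is only a sketch, not a tool from this thread: the function names `disassembly_lines` and `codegen_differs` and the output file names are made up for illustration, and it assumes `clang` and `objdump` are on the PATH. It compares "objdump -d" text rather than stripped binaries, and drops the "file format" header line because that line embeds the object file name.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def disassembly_lines(objdump_output: str) -> list[str]:
    """Return the non-blank lines of `objdump -d` output, minus the
    'file format' header line (it contains the object file name)."""
    return [
        line.rstrip()
        for line in objdump_output.splitlines()
        if line.strip() and "file format" not in line
    ]


def codegen_differs(source: str, workdir: str = ".", opt: str = "-O2") -> bool:
    """Compile `source` with -g and with -g0 and report whether the
    disassembly differs. Hypothetical helper; assumes clang and objdump exist."""
    dumps = []
    for flag in ("-g", "-g0"):
        obj = Path(workdir) / f"out{flag}.o"
        # Same command line as in the mail, varying only the debug flag.
        subprocess.run(["clang", "-c", opt, flag, source, "-o", str(obj)], check=True)
        dump = subprocess.run(
            ["objdump", "-d", str(obj)], check=True, capture_output=True, text=True
        )
        dumps.append(disassembly_lines(dump.stdout))
    return dumps[0] != dumps[1]
```

Calling `codegen_differs("a.cc")` would then return True when the two builds disagree. Comparing disassembly text is looser than the byte-for-byte diff of stripped objects described in the mail, so a clean result here does not prove the binaries are identical.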
David Blaikie via llvm-dev
2016-Oct-07 20:35 UTC
[llvm-dev] Debug info interacting with optimization and code generation
On Fri, Oct 7, 2016 at 1:27 PM Dehao Chen <dehao at google.com> wrote:
> * How do we set up a regression test to ensure future changes do not break
> codegen consistency?

Specific test cases would be checked in as usual - beyond that, probably a self-host that checks for consistency (like a 3 stage bootstrap checks that stage 2 and 3 are identical). Potentially other workloads could be added if a selfhost didn't offer enough certainty for common cases.

It's an abstract good/intended goal, for sure - but it's not been a priority for anyone (as you've seen), so just hasn't been pushed very hard/far.

- Dave
Xinliang David Li via llvm-dev
2016-Oct-07 21:20 UTC
[llvm-dev] Debug info interacting with optimization and code generation
On Fri, Oct 7, 2016 at 1:35 PM, David Blaikie <dblaikie at gmail.com> wrote:
> It's an abstract good/intended goal, for sure - but it's not been a
> priority for anyone (as you've seen), so just hasn't been pushed very
> hard/far.

I agree with you that this is a good/intended goal, but it is not an 'abstract' good goal :)

David
Krzysztof Parzyszek via llvm-dev
2016-Oct-07 21:20 UTC
[llvm-dev] Debug info interacting with optimization and code generation
On 10/7/2016 3:35 PM, David Blaikie via llvm-dev wrote:
> It's an abstract good/intended goal, for sure - but it's not been a
> priority for anyone (as you've seen), so just hasn't been pushed very
> hard/far.

I wasn't aware of this problem as of late, but with our own compiler (for Hexagon) we've made efforts in the past to make sure that -g did not affect codegen. Some issues must have crept back in. This is definitely something that needs to be fixed.

-Krzysztof

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
Robinson, Paul via llvm-dev
2016-Oct-07 21:33 UTC
[llvm-dev] Debug info interacting with optimization and code generation
(Resend with llvm-dev added back)

At Sony we have an internal test run that compares generated code with/without -g, in our suite of regression tests. See our lightning talk slides from EuroLLVM 2015. I believe we list some PRs in there for things we have found and fixed in the past.

http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf

At the moment we have a backlog of about a half-dozen differences worth investigating. I have to admit we have not yet looked at whether some of your recent work has fixed any of them; it is not our top priority, although obviously it is something we do look at and keep track of.

There are some very minor differences in instruction order that we see, and I think in most cases that is because -g emits .cfi directives which act as scheduling barriers. It might be the case that if we enabled exceptions, we would not see these as -g differences; we have not experimented with that.

--paulr
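When triaging a backlog of -g differences, it may help to separate cases where only the instruction order changed (as with the scheduling-barrier effect mentioned above) from cases where the instructions themselves differ. The sketch below is one rough way to do that; `classify_difference` is a hypothetical helper, not part of the Sony test run, and it assumes each function has already been reduced to a list of instruction strings with addresses and encodings removed.

```python
from collections import Counter


def classify_difference(with_g, without_g):
    """Classify two instruction lists for the same function.

    Returns "identical", "reordered" (same instructions, different order),
    or "different" (the sets of instructions themselves disagree).
    """
    if with_g == without_g:
        return "identical"
    # Counter compares the lists as multisets, ignoring position.
    if Counter(with_g) == Counter(without_g):
        return "reordered"
    return "different"
```

A "reordered" result is only a hint that scheduling is involved; it says nothing about why the order changed, and register allocation differences would show up as "different" even when the cause is the same.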
Adrian Prantl via llvm-dev
2016-Oct-07 22:25 UTC
[llvm-dev] Debug info interacting with optimization and code generation
> On Oct 7, 2016, at 1:27 PM, Dehao Chen via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> * Is there anyone else who also cares about codegen consistency?

We have in the past always treated situations where the presence of debug info caused different code to be emitted as pretty serious bugs. Typically these bugs came from code that didn't properly skip over debug intrinsics when doing peephole-style transformations.

> * Any volunteers to help fix codegen consistency issues? (It is easy to find issues, just build speccpu with -g and -g0, then compare the "objdump -d" output)

I certainly don't mind getting CC'ed on any PRs that we find :-)

-- adrian
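To illustrate the class of bug described here (a peephole that does not skip over debug intrinsics), the following is a toy model, not LLVM code. Instructions are plain strings, anything starting with "dbg." stands in for a debug intrinsic, and the hypothetical pass `drop_repeated` removes an instruction when the next one is identical. With `skip_debug=False` the pass looks at the literal next instruction, so an intervening debug intrinsic blocks the transformation and the -g build keeps an instruction that the -g0 build drops.

```python
def is_debug(inst):
    """Toy stand-in for 'is this a debug intrinsic?'."""
    return inst.startswith("dbg.")


def next_real(insts, i):
    """Index of the first non-debug instruction after position i, or None."""
    j = i + 1
    while j < len(insts) and is_debug(insts[j]):
        j += 1
    return j if j < len(insts) else None


def drop_repeated(insts, skip_debug):
    """Drop an instruction when the following instruction is identical.

    skip_debug=False models the bug: the literal next instruction is used.
    skip_debug=True models the fix: debug intrinsics are skipped over.
    """
    out = []
    for i, inst in enumerate(insts):
        if skip_debug:
            j = next_real(insts, i)
        else:
            j = i + 1 if i + 1 < len(insts) else None
        if not is_debug(inst) and j is not None and insts[j] == inst:
            continue  # redundant: the next real instruction repeats it
        out.append(inst)
    return out


def strip_debug(insts):
    """Remove debug intrinsics, as if comparing final machine code."""
    return [inst for inst in insts if not is_debug(inst)]


plain = ["mov r0, r1", "mov r0, r1", "ret"]
debug = ["mov r0, r1", "dbg.value r0", "mov r0, r1", "ret"]
```

Running the buggy variant on `debug` and stripping the debug entries leaves both `mov` instructions, while the same pass on `plain` leaves one; the fixed variant gives the same result for both inputs.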
Xinliang David Li via llvm-dev
2016-Oct-07 22:28 UTC
[llvm-dev] Debug info interacting with optimization and code generation
A good start is to file upstream bugs for the issues found in SPEC and in clang self-builds. Once those bugs are fixed, we need to set up bots that do a 3-stage bootstrap of clang to ensure no regressions are introduced.

David

On Fri, Oct 7, 2016 at 3:25 PM, Adrian Prantl <aprantl at apple.com> wrote:
> We have in the past always treated situations where the presence of debug
> info caused different code to be emitted as pretty serious bugs. Typically
> these bugs came from code that didn't properly skip over debug intrinsics
> when doing peephole-style transformations.
>
> I certainly don't mind getting CC'ed on any PRs that we find :-)
>
> -- adrian
Greg Bedwell via llvm-dev
2016-Nov-11 21:56 UTC
[llvm-dev] Debug info interacting with optimization and code generation
FWIW, the fix that Rob has just added a patch for ( https://reviews.llvm.org/D26554 ) fixes a case of debug info affecting optimization, found using the utils/check_cfc tool from Russ's presentation below on a large game codebase.

> At Sony we have an internal test run that compares generated code
> with/without -g, in our suite of regression tests. See our lightning talk
> slides from EuroLLVM 2015. I believe we list some PRs in there for things
> we have found and fixed in the past.
>
> http://llvm.org/devmtg/2015-04/slides/Verifying_code_gen_dash_g_final.pdf