Eric Christopher via llvm-dev
2020-Jul-23  01:05 UTC
[llvm-dev] New pass manager for optimization pipeline status and questions
FWIW I'm in favor of this direction while making sure that we keep focus on removing the vestiges of the old pass manager for the code health impact to the project. -eric On Wed, Jul 22, 2020 at 3:15 PM Philip Reames via llvm-dev < llvm-dev at lists.llvm.org> wrote:> (I'm probably going to derail your thread, sorry about that.) > > I think at this point, we should just bite the bullet and make the switch > to NPM by default for Clang's optimization pipeline. Today. > > Why? Because many of our downstream consumers have already switched. > Google has. We (Azul) have. I think I've heard the same for a couple > other major contributors. Why does this matter? Testing. At the current > moment, the vast majority of testing the project gets is exercising NPM, > not LPM. > > NPM is functionally complete for Clang optimization. There might be a few > missing cases around the sanitizers, but last I heard those were on the > edge of being fixed. > > I think we should make the switch, and deal with any fall out as > regressions. If we made the change immediately after a release branch, > we'd have several months to address any major issues before the next > release. > > Philip > On 7/22/20 2:39 PM, Arthur Eubanks via llvm-dev wrote: > > Hi all, > > I wanted to give a quick update on the status of NPM for the IR > optimization pipeline and ask some questions. > > In the past I believe there were thoughts that NPM was basically ready > because all of check-llvm and check-clang passed when > -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=ON was specified. But that CMake > flag did not apply to opt and any tests running something like `opt > -foo-pass -bar-pass` (which is the vast majority of check-llvm tests) were > still using the legacy PM. The intended way to use NPM was to use the > -passes flag, e.g. `opt -passes='foo,bar'`. > > I've added a -enable-new-pm flag to opt to force running NPM passes even > when `opt -foo-pass` is used. This is because I didn't want to go through > every single test and figure out which ones should be using both -foo-pass > and -passes=foo. Switching on -enable-new-pm currently leads to ~1800 > check-llvm failures. I've documented the failed tests count per directory > in https://bugs.llvm.org/show_bug.cgi?id=46651 (some have been fixed > since that was posted). > > This has led to real bugs in NPM being discovered and fixed (e.g. some > optnone issues). > > But a large portion of the remaining failures are because codegen-only > passes haven't been ported to NPM yet. That's fine for the optimization > pipeline NPM transition since it doesn't affect the optimization pipeline, > but it does present an issue with the approach of the -enable-new-pm flag > (which would by default become true alongside the NPM transition). Lots of > tests are testing codegen-specific passes via opt (e.g. `opt > -amdgpu-lower-intrinsics`) and they can't use NPM (yet). > > I was thinking either we have a way of identifying codegen-only passes and > revert back to the legacy PM in opt whenever we see one, or we go back to > considering the originally intended approach of adding an equivalent > `-passes=` RUN to all tests that should be also running under NPM. > > I'm not sure of a nice and clean solution to identify codegen-only passes. > We could go and update every instance of INITIALIZE_PASS to take another > parameter indicating if it's codegen-only. Or we could just have a central > list somewhere where we check if the pass is in some hardcoded list or has > some prefix (e.g. "x86-"). > > The approach of adding equivalent `-passes=` RUN lines to all relevant > tests seems daunting, but not exactly sure how daunting. Maybe it's > possible to script something and see what fails? We'd still need some way > to identify codegen-only passes to make sure we don't miss anything, and > we'd need to distinguish between analyses and normal passes. Also, it would > slow down test execution since we'd run a lot more tests twice, but maybe > that's not such a big deal? Maybe it's good to have most tests running > against the legacy PM even when NPM is on by default? > > Thoughts? > > This is split off from > http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html. > > > > _______________________________________________ > LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200722/8c28dba3/attachment.html>
Sjoerd Meijer via llvm-dev
2020-Jul-23  09:59 UTC
[llvm-dev] New pass manager for optimization pipeline status and questions
I am not in favour of just flipping the switch and then deal with all the fall-out, because we see major regressions that would be unacceptable for our users. Thus, not only would this be very disruptive, also our releases are based on a certain trunk versions, so we would need to revert back to the legacy PM downstream and thus diverge from upstream which wouldn't be ideal for us. About the regressions, see the message/thread that I kicked off earlier (http://lists.llvm.org/pipermail/llvm-dev/2020-July/143646.html) which was quickly followed up by this thread. I would like to see here if we are interesting in defining a few criteria that must be met before we switch: 1. Correctness, which obviously always must come first: looks like this is covered by bots that are running with the NPM, and by downstream users. >From the latest messages, I am getting we are there, or nearly there. 2. Performance (i.e. optimising for speed), 3. Code-size. With 1) correctness box covered and ticked, is now the time to look at codegen quality: 2) performance and 3) code-size? Would it be reasonable that we create a plan or timeline to address this, and thus allow time that these issues can be addressed? We are now ready to start tuning the NPM for code-size. Perhaps we are late to the NPM party (but that was a priority and bandwidth issue), but perhaps with correctness fixed this is actually the right time. I only ran numbers for code-size, and haven't even looked at performance numbers yet, which we would also need to do and takes time. Cheers, Sjoerd. ________________________________ From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Eric Christopher via llvm-dev <llvm-dev at lists.llvm.org> Sent: 23 July 2020 02:05 To: Philip Reames <listmail at philipreames.com>; Alina Sbirlea <asbirlea at google.com>; Chandler Carruth <chandlerc at gmail.com> Cc: llvm-dev <llvm-dev at lists.llvm.org> Subject: Re: [llvm-dev] New pass manager for optimization pipeline status and questions FWIW I'm in favor of this direction while making sure that we keep focus on removing the vestiges of the old pass manager for the code health impact to the project. -eric On Wed, Jul 22, 2020 at 3:15 PM Philip Reames via llvm-dev <llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>> wrote: (I'm probably going to derail your thread, sorry about that.) I think at this point, we should just bite the bullet and make the switch to NPM by default for Clang's optimization pipeline. Today. Why? Because many of our downstream consumers have already switched. Google has. We (Azul) have. I think I've heard the same for a couple other major contributors. Why does this matter? Testing. At the current moment, the vast majority of testing the project gets is exercising NPM, not LPM. NPM is functionally complete for Clang optimization. There might be a few missing cases around the sanitizers, but last I heard those were on the edge of being fixed. I think we should make the switch, and deal with any fall out as regressions. If we made the change immediately after a release branch, we'd have several months to address any major issues before the next release. Philip On 7/22/20 2:39 PM, Arthur Eubanks via llvm-dev wrote: Hi all, I wanted to give a quick update on the status of NPM for the IR optimization pipeline and ask some questions. In the past I believe there were thoughts that NPM was basically ready because all of check-llvm and check-clang passed when -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=ON was specified. But that CMake flag did not apply to opt and any tests running something like `opt -foo-pass -bar-pass` (which is the vast majority of check-llvm tests) were still using the legacy PM. The intended way to use NPM was to use the -passes flag, e.g. `opt -passes='foo,bar'`. I've added a -enable-new-pm flag to opt to force running NPM passes even when `opt -foo-pass` is used. This is because I didn't want to go through every single test and figure out which ones should be using both -foo-pass and -passes=foo. Switching on -enable-new-pm currently leads to ~1800 check-llvm failures. I've documented the failed tests count per directory in https://bugs.llvm.org/show_bug.cgi?id=46651 (some have been fixed since that was posted). This has led to real bugs in NPM being discovered and fixed (e.g. some optnone issues). But a large portion of the remaining failures are because codegen-only passes haven't been ported to NPM yet. That's fine for the optimization pipeline NPM transition since it doesn't affect the optimization pipeline, but it does present an issue with the approach of the -enable-new-pm flag (which would by default become true alongside the NPM transition). Lots of tests are testing codegen-specific passes via opt (e.g. `opt -amdgpu-lower-intrinsics`) and they can't use NPM (yet). I was thinking either we have a way of identifying codegen-only passes and revert back to the legacy PM in opt whenever we see one, or we go back to considering the originally intended approach of adding an equivalent `-passes=` RUN to all tests that should be also running under NPM. I'm not sure of a nice and clean solution to identify codegen-only passes. We could go and update every instance of INITIALIZE_PASS to take another parameter indicating if it's codegen-only. Or we could just have a central list somewhere where we check if the pass is in some hardcoded list or has some prefix (e.g. "x86-"). The approach of adding equivalent `-passes=` RUN lines to all relevant tests seems daunting, but not exactly sure how daunting. Maybe it's possible to script something and see what fails? We'd still need some way to identify codegen-only passes to make sure we don't miss anything, and we'd need to distinguish between analyses and normal passes. Also, it would slow down test execution since we'd run a lot more tests twice, but maybe that's not such a big deal? Maybe it's good to have most tests running against the legacy PM even when NPM is on by default? Thoughts? This is split off from http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html. _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev _______________________________________________ LLVM Developers mailing list llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200723/2547840f/attachment.html>
Alina Sbirlea via llvm-dev
2020-Jul-24  18:51 UTC
[llvm-dev] New pass manager for optimization pipeline status and questions
Hi all, The current plan is to prioritize enabling the NPM as soon as possible, and that includes addressing any blockers that are known or arise. This means prioritizing those blockers over other LLVM work. The current umbrella bug is PR46649 <https://bugs.llvm.org/show_bug.cgi?id=46649>. Philip's point is spot on that we are deficient now in the testing of the LegacyPassManager, because so many have already made the switch (FWIW Google switched more than 2 years ago). It's not constructive for the LLVM community to just flip the switch and break current LPM users. The purpose of these communications to llvm-dev and the bug tracking is to be informative as to the planned direction and make as quick of a progress as possible. Please keep in mind that the work on the NPM has been going on for many years and many customers have switched years ago, and delaying this for even an additional year is not acceptable for the code health and stability of LLVM. My point is that we want and should work with users to make the transition smooth, but we do very much need user (meaning companies using LLVM) involvement here in order to not delay the switch further. Best, Alina On Thu, Jul 23, 2020 at 2:59 AM Sjoerd Meijer <Sjoerd.Meijer at arm.com> wrote:> I am not in favour of just flipping the switch and then deal with all the > fall-out, because we see major regressions that would be unacceptable for > our users. Thus, not only would this be very disruptive, also our releases > are based on a certain trunk versions, so we would need to revert back to > the legacy PM downstream and thus diverge from upstream which wouldn't be > ideal for us. About the regressions, see the message/thread that I kicked > off earlier ( > http://lists.llvm.org/pipermail/llvm-dev/2020-July/143646.html) which was > quickly followed up by this thread. > > I would like to see here if we are interesting in defining a few criteria > that must be met before we switch: > > 1. Correctness, which obviously always must come first: looks like > this is covered by bots that are running with the NPM, and by downstream > users. From the latest messages, I am getting we are there, or nearly there. > 2. Performance (i.e. optimising for speed), > 3. Code-size. > > With 1) correctness box covered and ticked, is now the time to look at > codegen quality: 2) performance and 3) code-size? Would it be reasonable > that we create a plan or timeline to address this, and thus allow time that > these issues can be addressed? > > We are now ready to start tuning the NPM for code-size. Perhaps we are > late to the NPM party (but that was a priority and bandwidth issue), but > perhaps with correctness fixed this is actually the right time. I only ran > numbers for code-size, and haven't even looked at performance numbers yet, > which we would also need to do and takes time. > > Cheers, > Sjoerd. > > > ------------------------------ > *From:* llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Eric > Christopher via llvm-dev <llvm-dev at lists.llvm.org> > *Sent:* 23 July 2020 02:05 > *To:* Philip Reames <listmail at philipreames.com>; Alina Sbirlea < > asbirlea at google.com>; Chandler Carruth <chandlerc at gmail.com> > *Cc:* llvm-dev <llvm-dev at lists.llvm.org> > *Subject:* Re: [llvm-dev] New pass manager for optimization pipeline > status and questions > > FWIW I'm in favor of this direction while making sure that we keep focus > on removing the vestiges of the old pass manager for the code health impact > to the project. > > -eric > > On Wed, Jul 22, 2020 at 3:15 PM Philip Reames via llvm-dev < > llvm-dev at lists.llvm.org> wrote: > > (I'm probably going to derail your thread, sorry about that.) > > I think at this point, we should just bite the bullet and make the switch > to NPM by default for Clang's optimization pipeline. Today. > > Why? Because many of our downstream consumers have already switched. > Google has. We (Azul) have. I think I've heard the same for a couple > other major contributors. Why does this matter? Testing. At the current > moment, the vast majority of testing the project gets is exercising NPM, > not LPM. > > NPM is functionally complete for Clang optimization. There might be a few > missing cases around the sanitizers, but last I heard those were on the > edge of being fixed. > > I think we should make the switch, and deal with any fall out as > regressions. If we made the change immediately after a release branch, > we'd have several months to address any major issues before the next > release. > > Philip > On 7/22/20 2:39 PM, Arthur Eubanks via llvm-dev wrote: > > Hi all, > > I wanted to give a quick update on the status of NPM for the IR > optimization pipeline and ask some questions. > > In the past I believe there were thoughts that NPM was basically ready > because all of check-llvm and check-clang passed when > -DENABLE_EXPERIMENTAL_NEW_PASS_MANAGER=ON was specified. But that CMake > flag did not apply to opt and any tests running something like `opt > -foo-pass -bar-pass` (which is the vast majority of check-llvm tests) were > still using the legacy PM. The intended way to use NPM was to use the > -passes flag, e.g. `opt -passes='foo,bar'`. > > I've added a -enable-new-pm flag to opt to force running NPM passes even > when `opt -foo-pass` is used. This is because I didn't want to go through > every single test and figure out which ones should be using both -foo-pass > and -passes=foo. Switching on -enable-new-pm currently leads to ~1800 > check-llvm failures. I've documented the failed tests count per directory > in https://bugs.llvm.org/show_bug.cgi?id=46651 (some have been fixed > since that was posted). > > This has led to real bugs in NPM being discovered and fixed (e.g. some > optnone issues). > > But a large portion of the remaining failures are because codegen-only > passes haven't been ported to NPM yet. That's fine for the optimization > pipeline NPM transition since it doesn't affect the optimization pipeline, > but it does present an issue with the approach of the -enable-new-pm flag > (which would by default become true alongside the NPM transition). Lots of > tests are testing codegen-specific passes via opt (e.g. `opt > -amdgpu-lower-intrinsics`) and they can't use NPM (yet). > > I was thinking either we have a way of identifying codegen-only passes and > revert back to the legacy PM in opt whenever we see one, or we go back to > considering the originally intended approach of adding an equivalent > `-passes=` RUN to all tests that should be also running under NPM. > > I'm not sure of a nice and clean solution to identify codegen-only passes. > We could go and update every instance of INITIALIZE_PASS to take another > parameter indicating if it's codegen-only. Or we could just have a central > list somewhere where we check if the pass is in some hardcoded list or has > some prefix (e.g. "x86-"). > > The approach of adding equivalent `-passes=` RUN lines to all relevant > tests seems daunting, but not exactly sure how daunting. Maybe it's > possible to script something and see what fails? We'd still need some way > to identify codegen-only passes to make sure we don't miss anything, and > we'd need to distinguish between analyses and normal passes. Also, it would > slow down test execution since we'd run a lot more tests twice, but maybe > that's not such a big deal? Maybe it's good to have most tests running > against the legacy PM even when NPM is on by default? > > Thoughts? > > This is split off from > http://lists.llvm.org/pipermail/llvm-dev/2020-July/143395.html. > > > > _______________________________________________ > LLVM Developers mailing listllvm-dev at lists.llvm.orghttps://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200724/b511d75c/attachment.html>