Frédéric Riss via llvm-dev
2016-Jun-01 01:02 UTC
[llvm-dev] The state of IRPGO (3 remaining work items)
> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com <mailto:vsk at apple.com>> wrote: > > > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com <mailto:davidxl at google.com>> wrote: > > > > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com <mailto:chisophugis at gmail.com>> wrote: > > Jake and I have been integrating IRPGO on PS4, and we've identified 3 remaining work items. > > > > Sean, thanks for the write up. It matches very well with what we think as well. > > + 1 > > > > - Driver changes > > > > We'd like to make IRPGO the default on PS4. We also think that it would be beneficial to make IRPGO the default PGO on all platforms (coverage would continue to use FE instr as it does currently, of course). In previous conversations (e.g. http://reviews.llvm.org/D15829 <http://reviews.llvm.org/D15829>) it has come up that Apple have requirements that would prevent them from moving to IRPGO as the default PGO, at least without a deprecation period of one or two releases. > > Sean pointed out the problematic scenario in D15829 (in plan "C"): > > ``` > All existing user workflows continue to work, except for workflows that attempt to llvm-profdata merge some old frontend profile data (e.g. they have checked-in to version control and represents some special workload) with the profile data from new binaries. > ``` > > We can address this issue by (1) making sure llvm-profdata emits a helpful warning when merging an FE-based profile with an IR-based one, and (2) keeping an option to use FE instrumentation for PGO. Having (2) helps people who can't (or don't want) to switch to IR PGO. > > > > I'd like to get consensus on a path forward. > > As a point of discussion, how about we make IRPGO the default on all platforms except Apple platforms. > > I'd really rather not introduce this inconsistency. I'm worried that it might lead to Darwin becoming a second-tier platform for PGO. > > Fred (CC'd) is following up with some of our internal users to check if we can change the default behavior of -fprofile-instr-generate. He should be able to chime in on this soon.Sorry it took me so long. I’ve discussed the change in behavior quiet extensively, and I after having changed my mind a couple times, I would argue in favor of keeping the current behavior for the existing flags. I think adding a new switch for IRPGO is a better option. The argument that weighted most on my opinion is the proposed interaction with -fcoverage-mapping, and it is not at all platform specific. With the proposed new behavior, turning coverage on and off in your build system will generate a binary with different performance characteristics and this feels really wrong. I would actually make the IRPGO mode completely incompatible with the -fcoverage-mapping flag. Fred> > > At its core I don't think -fprofile-instr-generate *implies* FE-based instrumentation. So, I'd like to see the driver do this (on all platforms): > > * -fprofile-instr-generate: IR instrumentation > * -fprofile-instr-generate=IR: IR instrumentation > * -fprofile-instr-generate=FE: FE instrumentation > * -fprofile-instr-generate -fcoverage-mapping: FE + coverage instrumentation > > It's a bit ugly because the meaning of -fprofile-instr-generate becomes context-sensitive. But, (1) it doesn't break existing common workflows and (2) it makes it easier to ship IRPGO. The big caveat here is that we'll need to wait a bit and see if our internal users are OK with this. > > Is there a reason to even have the possibility for FEPGO in the long run? From what I can tell, at most we would add a -fuse-the-old-pgo-because-i-want-to-merge-with-old-profiles option to hold people over until they can regenerate their profiles with the current compiler. We can add a flag to control what pre-instrumentation is done to retain the source-level robustness of FEPGO (e.g. -fpgo-no-simplify-before-instrumenting or something). > > > One alternative is to introduce a separate driver flag for IRPGO. This might not work well for Sony's existing users. I'd be interested in any feedback about this approach. > > Personally, I would prefer to maintaining command line compatibility for PGO in Clang (i.e. users don't have to modify their build systems). > > > -- Sean Silva > > > > > I really don't like fragmenting things like this (e.g. if a third-party tests "clang's" PGO they will get something different depending on the platform), but I don't see another way given Apple's constraints. > > > > I'd like to see IRPGO to be the default as well, but the first thing we need is a driver level option to make the switch (prof-gen) -- currently we rely on -Xclang option to switch between two modes, which is less than ideal. > > > > If the concern from Apple is that the old profile still need to work, then this is problem already solved. The reason is that -fprofile-instr-use can automatically detect the type of the profile and switch the mode. > > It's not just that. As Sean pointed out, we're concerned about old profiles inter-operating poorly with new ones. > > thanks, > vedant > > > > - Pre-instrumentation passes > > > > Pre-instrumentation optimization has been critical for reducing the overhead of PGO for the PS4 games we tested (as expected). However, in our measurements (and we are glad to provide more info) the main benefit was inlining (also as expected). A simple pass of inlining at threshold 100 appeared to give all the benefits. Even inlining at threshold 0 gave almost all the benefits. For example, the passes initially proposed in http://reviews.llvm.org/D15828did <http://reviews.llvm.org/D15828did> not improve over just inlining with threshold 100. > > > > (due to PR27299 we also need to add simplifycfg after inlining to clean up, but this doesn't affect the instrumentation overhead in our measurements) > > > > Bottom line: for our use cases, inlining does all the work, but we're not opposed to having more passes, which might be beneficial for non-game workloads (which is most code). > > > > > > > > Yes, Rong is re-collecting performance data before submitting the patch. > > > > - Warnings > > > > We identified 3 classes of issues which manifest as spammy warnings when applying profile data with IRPGO (these affect FEPGO also I believe, but we looked in depth at IRPGO): > > > > 1. The main concerning one is that getPGOFuncName mangles the filename into the counter name. This causes us to get instrprof_error::unknown_function when the pgo-use build is done in a different build directory from the training build (which is a reasonable thing to support). In this situation, PGO data is useless for all `static` functions (and as a byproduct results in a huge volume of warnings). > > > > This can be enhanced with an user option to override the behavior. Can you help filing a tracking bug? > > > > > > 2. In different TU's, pre-instr inlining might make different inlining decisions (for example, different functions may be available for inlining), causing hash mismatch errors (instrprof_error::hash_mismatch). In building a large game, we only saw 8 instance of this, so it is not as severe as 1, but would be good to fix. > > > > > > Rong has a patch addressing that -- will submit after cleanup pass change is done. > > > > > > 3. A .cpp file may be compiled and put into an archive, but then not selected by the linker and will therefore not result in a counter in the profraw. When compiling this file with pgo-use, instrprof_error::unknown_function will result and a warning will be emitted. > > > > yes -- this is a common problem to other compilers as well. > > > > > > > > Case 1 can be fixed using a function hash or other unique identifier instead of a file path. David, in D20195 you mentioned that Rong was working on a patch that would fix 2; we are looking forward to that. > > > > > > Right. > > > > For 3, I unfortunately do not know of any solution. I don't think there is a way for us to make this warning reliable in the face of this circumstance. So my conclusion is that instrprof_error::unknown_function at least must be defaulted to off unfortunately. > > > > yes, this can be annoying. If the warnings can be buffered, then the compiler can check if this is due to missing profile for the whole file and can reduce the warnings into one single warning (source file has no profile data). Making it off by default sounds fine to me too if it is too noisy. > > > > thanks, > > > > David > > > > > > -- Sean Silva-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160531/1f8d7014/attachment.html>
Xinliang David Li via llvm-dev
2016-Jun-01 19:02 UTC
[llvm-dev] The state of IRPGO (3 remaining work items)
On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com> wrote:> > On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote: > >> >> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> >> wrote: >> > >> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> >> wrote: >> > Jake and I have been integrating IRPGO on PS4, and we've identified 3 >> remaining work items. >> > >> > Sean, thanks for the write up. It matches very well with what we think >> as well. >> >> + 1 >> >> >> > - Driver changes >> > >> > We'd like to make IRPGO the default on PS4. We also think that it would >> be beneficial to make IRPGO the default PGO on all platforms (coverage >> would continue to use FE instr as it does currently, of course). In >> previous conversations (e.g. http://reviews.llvm.org/D15829) it has come >> up that Apple have requirements that would prevent them from moving to >> IRPGO as the default PGO, at least without a deprecation period of one or >> two releases. >> >> Sean pointed out the problematic scenario in D15829 (in plan "C"): >> >> ``` >> All existing user workflows continue to work, except for workflows that >> attempt to llvm-profdata merge some old frontend profile data (e.g. they >> have checked-in to version control and represents some special workload) >> with the profile data from new binaries. >> ``` >> >> We can address this issue by (1) making sure llvm-profdata emits a >> helpful warning when merging an FE-based profile with an IR-based one, and >> (2) keeping an option to use FE instrumentation for PGO. Having (2) helps >> people who can't (or don't want) to switch to IR PGO. >> >> >> > I'd like to get consensus on a path forward. >> > As a point of discussion, how about we make IRPGO the default on all >> platforms except Apple platforms. >> >> I'd really rather not introduce this inconsistency. I'm worried that it >> might lead to Darwin becoming a second-tier platform for PGO. >> >> Fred (CC'd) is following up with some of our internal users to check if >> we can change the default behavior of -fprofile-instr-generate. He should >> be able to chime in on this soon. >> > >Fred,> Sorry it took me so long. I’ve discussed the change in behavior quiet > extensively, and I after having changed my mind a couple times, I would > argue in favor of keeping the current behavior for the existing flags. I > think adding a new switch for IRPGO is a better option. The argument that > weighted most on my opinion is the proposed interaction with > -fcoverage-mapping, and it is not at all platform specific. With the > proposed new behavior, turning coverage on and off in your build system > will generate a binary with different performance characteristics and this > feels really wrong. >This is an interesting observation, but IMO this should not be a problem in practice: - Coverage testing and PGO users are not overlapping. They have completely different objectives and expectations. For instance, coverage users care about covering the cold/rarely executed paths and find ways to make them appear in the path, and in the meantime reduce the 'hotness' of real hot paths in order to reduce testing overhead, while PGO users will do the opposite. - Coverage testing and PGO are two different things. Using PGO infrastructure for coverage is actually an implementation detail. This is why it is better to let -fcoverage-mapping to turn on FE instrumentation automatically without needing the user to know about this dependency/detail. There is also an ASAN based coverage support. It would be nice to unify various ways of doing coverage testing with unified use model, reporting tools and interfaces etc. - Longer term, we will add more advanced features in IR based PGO, so unless we duplicate all the work also in FE based instrumentation, the longer term picture is that IR based instrumentation will become the default choice for PGO users, so it seems natural to simplify their use models by making IR based instrumentation the default. On the other hand, the proposed flip won't create additional complexity to coverage testing users.> I would actually make the IRPGO mode completely incompatible with the > -fcoverage-mapping flag. > >yes, agree. thanks, David> Fred > > > >> >> At its core I don't think -fprofile-instr-generate *implies* FE-based >> instrumentation. So, I'd like to see the driver do this (on all platforms): >> >> * -fprofile-instr-generate: IR instrumentation >> * -fprofile-instr-generate=IR: IR instrumentation >> * -fprofile-instr-generate=FE: FE instrumentation >> * -fprofile-instr-generate -fcoverage-mapping: FE + coverage >> instrumentation >> >> It's a bit ugly because the meaning of -fprofile-instr-generate becomes >> context-sensitive. But, (1) it doesn't break existing common workflows and >> (2) it makes it easier to ship IRPGO. The big caveat here is that we'll >> need to wait a bit and see if our internal users are OK with this. >> > > Is there a reason to even have the possibility for FEPGO in the long run? > From what I can tell, at most we would add a -fuse-the-old-pgo-because-i- > want-to-merge-with-old-profiles option to hold people over until they can > regenerate their profiles with the current compiler. We can add a flag to > control what pre-instrumentation is done to retain the source-level > robustness of FEPGO (e.g. -fpgo-no-simplify-before-instrumenting or > something). > > >> One alternative is to introduce a separate driver flag for IRPGO. This >> might not work well for Sony's existing users. I'd be interested in any >> feedback about this approach. >> > > Personally, I would prefer to maintaining command line compatibility for > PGO in Clang (i.e. users don't have to modify their build systems). > > > -- Sean Silva > > >> >> >> > I really don't like fragmenting things like this (e.g. if a third-party >> tests "clang's" PGO they will get something different depending on the >> platform), but I don't see another way given Apple's constraints. >> > >> > I'd like to see IRPGO to be the default as well, but the first thing we >> need is a driver level option to make the switch (prof-gen) -- currently we >> rely on -Xclang option to switch between two modes, which is less than >> ideal. >> > >> > If the concern from Apple is that the old profile still need to work, >> then this is problem already solved. The reason is that -fprofile-instr-use >> can automatically detect the type of the profile and switch the mode. >> >> It's not just that. As Sean pointed out, we're concerned about old >> profiles inter-operating poorly with new ones. >> >> thanks, >> vedant >> >> >> > - Pre-instrumentation passes >> > >> > Pre-instrumentation optimization has been critical for reducing the >> overhead of PGO for the PS4 games we tested (as expected). However, in our >> measurements (and we are glad to provide more info) the main benefit was >> inlining (also as expected). A simple pass of inlining at threshold 100 >> appeared to give all the benefits. Even inlining at threshold 0 gave almost >> all the benefits. For example, the passes initially proposed in >> http://reviews.llvm.org/D15828did not improve over just inlining with >> threshold 100. >> > >> > (due to PR27299 we also need to add simplifycfg after inlining to clean >> up, but this doesn't affect the instrumentation overhead in our >> measurements) >> > >> > Bottom line: for our use cases, inlining does all the work, but we're >> not opposed to having more passes, which might be beneficial for non-game >> workloads (which is most code). >> > >> > >> > >> > Yes, Rong is re-collecting performance data before submitting the patch. >> > >> > - Warnings >> > >> > We identified 3 classes of issues which manifest as spammy warnings >> when applying profile data with IRPGO (these affect FEPGO also I believe, >> but we looked in depth at IRPGO): >> > >> > 1. The main concerning one is that getPGOFuncName mangles the filename >> into the counter name. This causes us to get >> instrprof_error::unknown_function when the pgo-use build is done in a >> different build directory from the training build (which is a reasonable >> thing to support). In this situation, PGO data is useless for all `static` >> functions (and as a byproduct results in a huge volume of warnings). >> > >> > This can be enhanced with an user option to override the behavior. Can >> you help filing a tracking bug? >> > >> > >> > 2. In different TU's, pre-instr inlining might make different inlining >> decisions (for example, different functions may be available for inlining), >> causing hash mismatch errors (instrprof_error::hash_mismatch). In building >> a large game, we only saw 8 instance of this, so it is not as severe as 1, >> but would be good to fix. >> > >> > >> > Rong has a patch addressing that -- will submit after cleanup pass >> change is done. >> > >> > >> > 3. A .cpp file may be compiled and put into an archive, but then not >> selected by the linker and will therefore not result in a counter in the >> profraw. When compiling this file with pgo-use, >> instrprof_error::unknown_function will result and a warning will be emitted. >> > >> > yes -- this is a common problem to other compilers as well. >> > >> > >> > >> > Case 1 can be fixed using a function hash or other unique identifier >> instead of a file path. David, in D20195 you mentioned that Rong was >> working on a patch that would fix 2; we are looking forward to that. >> > >> > >> > Right. >> > >> > For 3, I unfortunately do not know of any solution. I don't think there >> is a way for us to make this warning reliable in the face of this >> circumstance. So my conclusion is that instrprof_error::unknown_function at >> least must be defaulted to off unfortunately. >> > >> > yes, this can be annoying. If the warnings can be buffered, then the >> compiler can check if this is due to missing profile for the whole file and >> can reduce the warnings into one single warning (source file has no profile >> data). Making it off by default sounds fine to me too if it is too noisy. >> > >> > thanks, >> > >> > David >> > >> > >> > -- Sean Silva >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160601/0b39be53/attachment-0001.html>
Sean Silva via llvm-dev
2016-Jun-01 20:46 UTC
[llvm-dev] The state of IRPGO (3 remaining work items)
On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com> wrote:> > On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote: > >> >> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> >> wrote: >> > >> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> >> wrote: >> > Jake and I have been integrating IRPGO on PS4, and we've identified 3 >> remaining work items. >> > >> > Sean, thanks for the write up. It matches very well with what we think >> as well. >> >> + 1 >> >> >> > - Driver changes >> > >> > We'd like to make IRPGO the default on PS4. We also think that it would >> be beneficial to make IRPGO the default PGO on all platforms (coverage >> would continue to use FE instr as it does currently, of course). In >> previous conversations (e.g. http://reviews.llvm.org/D15829) it has come >> up that Apple have requirements that would prevent them from moving to >> IRPGO as the default PGO, at least without a deprecation period of one or >> two releases. >> >> Sean pointed out the problematic scenario in D15829 (in plan "C"): >> >> ``` >> All existing user workflows continue to work, except for workflows that >> attempt to llvm-profdata merge some old frontend profile data (e.g. they >> have checked-in to version control and represents some special workload) >> with the profile data from new binaries. >> ``` >> >> We can address this issue by (1) making sure llvm-profdata emits a >> helpful warning when merging an FE-based profile with an IR-based one, and >> (2) keeping an option to use FE instrumentation for PGO. Having (2) helps >> people who can't (or don't want) to switch to IR PGO. >> >> >> > I'd like to get consensus on a path forward. >> > As a point of discussion, how about we make IRPGO the default on all >> platforms except Apple platforms. >> >> I'd really rather not introduce this inconsistency. I'm worried that it >> might lead to Darwin becoming a second-tier platform for PGO. >> >> Fred (CC'd) is following up with some of our internal users to check if >> we can change the default behavior of -fprofile-instr-generate. He should >> be able to chime in on this soon. >> > > Sorry it took me so long. >Hi Fred, My understanding is that you were specifically investigating whether Apple needed compatibility for merging indexed profiles. Is that compatibility needed? The only compelling argument I have heard to continue to expose FEPGO is that Apple may have a compatibility requirement for merging indexed profiles from previous compiler versions. Even if this is a requirement, then I still intend to make IRPGO the default and only PGO going forward, at least on PS4. I think that doing the same for all platforms in the upstream compiler probably makes sense as well, since an internal Apple vendor compatibility requirement should not penalize all users of the open source project.> I’ve discussed the change in behavior quiet extensively, and I after > having changed my mind a couple times, I would argue in favor of keeping > the current behavior for the existing flags. I think adding a new switch > for IRPGO is a better option. The argument that weighted most on my opinion > is the proposed interaction with -fcoverage-mapping, and it is not at all > platform specific. With the proposed new behavior, turning coverage on and > off in your build system will generate a binary with different performance > characteristics and this feels really wrong. >Bob already mentioned in the other thread that `-fprofile-instr-generate -fcoverage-mapping` was sufficiently different from `-fprofile-instr-generate` that `-fprofile-instr-generate -fcoverage-mapping` was not an acceptable workaround that could be used for enabling FEPGO during a transitionary period, so I'm not convinced that your argument here makes sense. I also share David's opinion that this is not going to be an issue in practice. I think it makes sense for PGO and coverage to have different overheads. Coverage inherently has to trace all locations at source level, while PGO has more freedom. Also, David's point about redundant work on FEPGO is a good one. We don't want to continue maintaining two different PGO's.> I would actually make the IRPGO mode completely incompatible with the > -fcoverage-mapping flag. >I'm not sure what you mean by this. Nobody is proposing anything that would make -fcoverage-mapping do anything related to IRPGO. -- Sean Silva> > Fred > > > >> >> At its core I don't think -fprofile-instr-generate *implies* FE-based >> instrumentation. So, I'd like to see the driver do this (on all platforms): >> >> * -fprofile-instr-generate: IR instrumentation >> * -fprofile-instr-generate=IR: IR instrumentation >> * -fprofile-instr-generate=FE: FE instrumentation >> * -fprofile-instr-generate -fcoverage-mapping: FE + coverage >> instrumentation >> >> It's a bit ugly because the meaning of -fprofile-instr-generate becomes >> context-sensitive. But, (1) it doesn't break existing common workflows and >> (2) it makes it easier to ship IRPGO. The big caveat here is that we'll >> need to wait a bit and see if our internal users are OK with this. >> > > Is there a reason to even have the possibility for FEPGO in the long run? > From what I can tell, at most we would add a -fuse-the-old-pgo-because-i- > want-to-merge-with-old-profiles option to hold people over until they can > regenerate their profiles with the current compiler. We can add a flag to > control what pre-instrumentation is done to retain the source-level > robustness of FEPGO (e.g. -fpgo-no-simplify-before-instrumenting or > something). > > >> One alternative is to introduce a separate driver flag for IRPGO. This >> might not work well for Sony's existing users. I'd be interested in any >> feedback about this approach. >> > > Personally, I would prefer to maintaining command line compatibility for > PGO in Clang (i.e. users don't have to modify their build systems). > > > -- Sean Silva > > >> >> >> > I really don't like fragmenting things like this (e.g. if a third-party >> tests "clang's" PGO they will get something different depending on the >> platform), but I don't see another way given Apple's constraints. >> > >> > I'd like to see IRPGO to be the default as well, but the first thing we >> need is a driver level option to make the switch (prof-gen) -- currently we >> rely on -Xclang option to switch between two modes, which is less than >> ideal. >> > >> > If the concern from Apple is that the old profile still need to work, >> then this is problem already solved. The reason is that -fprofile-instr-use >> can automatically detect the type of the profile and switch the mode. >> >> It's not just that. As Sean pointed out, we're concerned about old >> profiles inter-operating poorly with new ones. >> >> thanks, >> vedant >> >> >> > - Pre-instrumentation passes >> > >> > Pre-instrumentation optimization has been critical for reducing the >> overhead of PGO for the PS4 games we tested (as expected). However, in our >> measurements (and we are glad to provide more info) the main benefit was >> inlining (also as expected). A simple pass of inlining at threshold 100 >> appeared to give all the benefits. Even inlining at threshold 0 gave almost >> all the benefits. For example, the passes initially proposed in >> http://reviews.llvm.org/D15828did not improve over just inlining with >> threshold 100. >> > >> > (due to PR27299 we also need to add simplifycfg after inlining to clean >> up, but this doesn't affect the instrumentation overhead in our >> measurements) >> > >> > Bottom line: for our use cases, inlining does all the work, but we're >> not opposed to having more passes, which might be beneficial for non-game >> workloads (which is most code). >> > >> > >> > >> > Yes, Rong is re-collecting performance data before submitting the patch. >> > >> > - Warnings >> > >> > We identified 3 classes of issues which manifest as spammy warnings >> when applying profile data with IRPGO (these affect FEPGO also I believe, >> but we looked in depth at IRPGO): >> > >> > 1. The main concerning one is that getPGOFuncName mangles the filename >> into the counter name. This causes us to get >> instrprof_error::unknown_function when the pgo-use build is done in a >> different build directory from the training build (which is a reasonable >> thing to support). In this situation, PGO data is useless for all `static` >> functions (and as a byproduct results in a huge volume of warnings). >> > >> > This can be enhanced with an user option to override the behavior. Can >> you help filing a tracking bug? >> > >> > >> > 2. In different TU's, pre-instr inlining might make different inlining >> decisions (for example, different functions may be available for inlining), >> causing hash mismatch errors (instrprof_error::hash_mismatch). In building >> a large game, we only saw 8 instance of this, so it is not as severe as 1, >> but would be good to fix. >> > >> > >> > Rong has a patch addressing that -- will submit after cleanup pass >> change is done. >> > >> > >> > 3. A .cpp file may be compiled and put into an archive, but then not >> selected by the linker and will therefore not result in a counter in the >> profraw. When compiling this file with pgo-use, >> instrprof_error::unknown_function will result and a warning will be emitted. >> > >> > yes -- this is a common problem to other compilers as well. >> > >> > >> > >> > Case 1 can be fixed using a function hash or other unique identifier >> instead of a file path. David, in D20195 you mentioned that Rong was >> working on a patch that would fix 2; we are looking forward to that. >> > >> > >> > Right. >> > >> > For 3, I unfortunately do not know of any solution. I don't think there >> is a way for us to make this warning reliable in the face of this >> circumstance. So my conclusion is that instrprof_error::unknown_function at >> least must be defaulted to off unfortunately. >> > >> > yes, this can be annoying. If the warnings can be buffered, then the >> compiler can check if this is due to missing profile for the whole file and >> can reduce the warnings into one single warning (source file has no profile >> data). Making it off by default sounds fine to me too if it is too noisy. >> > >> > thanks, >> > >> > David >> > >> > >> > -- Sean Silva >> > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160601/85b8d9be/attachment.html>
Frédéric Riss via llvm-dev
2016-Jun-02 00:46 UTC
[llvm-dev] The state of IRPGO (3 remaining work items)
> On Jun 1, 2016, at 1:46 PM, Sean Silva <chisophugis at gmail.com> wrote: > > > > On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com <mailto:friss at apple.com>> wrote: > >> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com <mailto:chisophugis at gmail.com>> wrote: >> >> >> >> On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com <mailto:vsk at apple.com>> wrote: >> >> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com <mailto:davidxl at google.com>> wrote: >> > >> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com <mailto:chisophugis at gmail.com>> wrote: >> > Jake and I have been integrating IRPGO on PS4, and we've identified 3 remaining work items. >> > >> > Sean, thanks for the write up. It matches very well with what we think as well. >> >> + 1 >> >> >> > - Driver changes >> > >> > We'd like to make IRPGO the default on PS4. We also think that it would be beneficial to make IRPGO the default PGO on all platforms (coverage would continue to use FE instr as it does currently, of course). In previous conversations (e.g. http://reviews.llvm.org/D15829 <http://reviews.llvm.org/D15829>) it has come up that Apple have requirements that would prevent them from moving to IRPGO as the default PGO, at least without a deprecation period of one or two releases. >> >> Sean pointed out the problematic scenario in D15829 (in plan "C"): >> >> ``` >> All existing user workflows continue to work, except for workflows that attempt to llvm-profdata merge some old frontend profile data (e.g. they have checked-in to version control and represents some special workload) with the profile data from new binaries. >> ``` >> >> We can address this issue by (1) making sure llvm-profdata emits a helpful warning when merging an FE-based profile with an IR-based one, and (2) keeping an option to use FE instrumentation for PGO. Having (2) helps people who can't (or don't want) to switch to IR PGO. >> >> >> > I'd like to get consensus on a path forward. >> > As a point of discussion, how about we make IRPGO the default on all platforms except Apple platforms. >> >> I'd really rather not introduce this inconsistency. I'm worried that it might lead to Darwin becoming a second-tier platform for PGO. >> >> Fred (CC'd) is following up with some of our internal users to check if we can change the default behavior of -fprofile-instr-generate. He should be able to chime in on this soon. > > Sorry it took me so long. > > Hi Fred, > > My understanding is that you were specifically investigating whether Apple needed compatibility for merging indexed profiles. Is that compatibility needed? The only compelling argument I have heard to continue to expose FEPGO is that Apple may have a compatibility requirement for merging indexed profiles from previous compiler versions.Sorry no, my comment had nothing to do with merging profiles. I understand that this will break, and it might very well be an issue for us, but I think there is a more fundamental issue with the proposed plan. As you bring it up though, this is a user visible breakage that shouldn’t be disregarded completely.> Even if this is a requirement, then I still intend to make IRPGO the default and only PGO going forward, at least on PS4. I think that doing the same for all platforms in the upstream compiler probably makes sense as well, since an internal Apple vendor compatibility requirement should not penalize all users of the open source project.Again, I’m not expressing an Apple requirement, just trying to discuss the specifics of the proposed implementation. My goal is not to hinder anything, and I want our platforms to be able to use IRPGO reliably if users see the need for it.> I’ve discussed the change in behavior quiet extensively, and I after having changed my mind a couple times, I would argue in favor of keeping the current behavior for the existing flags. I think adding a new switch for IRPGO is a better option. The argument that weighted most on my opinion is the proposed interaction with -fcoverage-mapping, and it is not at all platform specific. With the proposed new behavior, turning coverage on and off in your build system will generate a binary with different performance characteristics and this feels really wrong. > > Bob already mentioned in the other thread that `-fprofile-instr-generate -fcoverage-mapping` was sufficiently different from `-fprofile-instr-generate` that `-fprofile-instr-generate -fcoverage-mapping` was not an acceptable workaround that could be used for enabling FEPGO during a transitionary period, so I'm not convinced that your argument here makes sense.I’m not sure what you’re referring to here, and I have a hard time parsing the sentence. I suppose “was not an acceptable” should read “was an acceptable”? I would be surprised that Bob ever agreed to completely transition away from FEPGO. I didn’t even understand that getting rid of FEPGO was on table as you seem to imply bellow.> I also share David's opinion that this is not going to be an issue in practice. I think it makes sense for PGO and coverage to have different overheads. Coverage inherently has to trace all locations at source level, while PGO has more freedom.I’m sorry if I wasn’t clear, but I’m not talking about instrumentation overhead, I’m talking about the performance of the binary generated using the profiles. If we go the route of making the meaning of -fprofile-instr-generate depend on whether -fcoverage-mapping gets passed, then we change the kind of instrumentation and thus the input to the optimizations behind the user’s back. I wouldn’t be surprised that using profiles generated by FEPGO and IRPGO give you a final executable with measurably different performance characteristics. If you’re tracking your performance, this can be really painful. Recently we wasted days investigating performance regressions that were due to buggy profiles. I strongly believe having an option seemingly unrelated to PGO change this behavior is wrong and can cause actual pain for our end users.> Also, David's point about redundant work on FEPGO is a good one. We don't want to continue maintaining two different PGO’s.Are you implying that LLVM should drop FEPGO? It’s a totally sensible thing to do to use your tests as training data for your profile generation. It’s also a very valid thing to do to use your tests to do coverage. Xcode does both of these things. I would see it a a big regression to not support doing both at the same time (this would mean doubling compile+testing time for users of both). As the instrumentation needs to stay there for coverage anyway, I expect FEPGO to stay there and maintained (we care a lot about coverage). I’m not saying that all the work going into IRPGO should be duplicated in FEPGO, but what’s there and working should keep working. Also, FEPGO has a lot of nice characteristics like resilience to IRGEN changes. If you have archived profiles, then when you switch compilers your performance shouldn’t degrade with FEPGO (modulo optimization bugs), while it’s much more likely to degrade with IRPGO. It overall looks like a much better option for people who do not need the lower instrumentation overhead.> I would actually make the IRPGO mode completely incompatible with the -fcoverage-mapping flag. > > I'm not sure what you mean by this. Nobody is proposing anything that would make -fcoverage-mapping do anything related to IRPGO.What I mean is that -f<whatever enables IRPGO> should error out when passed at the same time as -fcoverage-mapping. Thanks! Fred> -- Sean Silva > > > > Fred > >> >> >> At its core I don't think -fprofile-instr-generate *implies* FE-based instrumentation. So, I'd like to see the driver do this (on all platforms): >> >> * -fprofile-instr-generate: IR instrumentation >> * -fprofile-instr-generate=IR: IR instrumentation >> * -fprofile-instr-generate=FE: FE instrumentation >> * -fprofile-instr-generate -fcoverage-mapping: FE + coverage instrumentation >> >> It's a bit ugly because the meaning of -fprofile-instr-generate becomes context-sensitive. But, (1) it doesn't break existing common workflows and (2) it makes it easier to ship IRPGO. The big caveat here is that we'll need to wait a bit and see if our internal users are OK with this. >> >> Is there a reason to even have the possibility for FEPGO in the long run? From what I can tell, at most we would add a -fuse-the-old-pgo-because-i-want-to-merge-with-old-profiles option to hold people over until they can regenerate their profiles with the current compiler. We can add a flag to control what pre-instrumentation is done to retain the source-level robustness of FEPGO (e.g. -fpgo-no-simplify-before-instrumenting or something). >> >> >> One alternative is to introduce a separate driver flag for IRPGO. This might not work well for Sony's existing users. I'd be interested in any feedback about this approach. >> >> Personally, I would prefer to maintaining command line compatibility for PGO in Clang (i.e. users don't have to modify their build systems). >> >> >> -- Sean Silva >> >> >> >> > I really don't like fragmenting things like this (e.g. if a third-party tests "clang's" PGO they will get something different depending on the platform), but I don't see another way given Apple's constraints. >> > >> > I'd like to see IRPGO to be the default as well, but the first thing we need is a driver level option to make the switch (prof-gen) -- currently we rely on -Xclang option to switch between two modes, which is less than ideal. >> > >> > If the concern from Apple is that the old profile still need to work, then this is problem already solved. The reason is that -fprofile-instr-use can automatically detect the type of the profile and switch the mode. >> >> It's not just that. As Sean pointed out, we're concerned about old profiles inter-operating poorly with new ones. >> >> thanks, >> vedant >> >> >> > - Pre-instrumentation passes >> > >> > Pre-instrumentation optimization has been critical for reducing the overhead of PGO for the PS4 games we tested (as expected). However, in our measurements (and we are glad to provide more info) the main benefit was inlining (also as expected). A simple pass of inlining at threshold 100 appeared to give all the benefits. Even inlining at threshold 0 gave almost all the benefits. For example, the passes initially proposed in http://reviews.llvm.org/D15828did <http://reviews.llvm.org/D15828did> not improve over just inlining with threshold 100. >> > >> > (due to PR27299 we also need to add simplifycfg after inlining to clean up, but this doesn't affect the instrumentation overhead in our measurements) >> > >> > Bottom line: for our use cases, inlining does all the work, but we're not opposed to having more passes, which might be beneficial for non-game workloads (which is most code). >> > >> > >> > >> > Yes, Rong is re-collecting performance data before submitting the patch. >> > >> > - Warnings >> > >> > We identified 3 classes of issues which manifest as spammy warnings when applying profile data with IRPGO (these affect FEPGO also I believe, but we looked in depth at IRPGO): >> > >> > 1. The main concerning one is that getPGOFuncName mangles the filename into the counter name. This causes us to get instrprof_error::unknown_function when the pgo-use build is done in a different build directory from the training build (which is a reasonable thing to support). In this situation, PGO data is useless for all `static` functions (and as a byproduct results in a huge volume of warnings). >> > >> > This can be enhanced with an user option to override the behavior. Can you help filing a tracking bug? >> > >> > >> > 2. In different TU's, pre-instr inlining might make different inlining decisions (for example, different functions may be available for inlining), causing hash mismatch errors (instrprof_error::hash_mismatch). In building a large game, we only saw 8 instance of this, so it is not as severe as 1, but would be good to fix. >> > >> > >> > Rong has a patch addressing that -- will submit after cleanup pass change is done. >> > >> > >> > 3. A .cpp file may be compiled and put into an archive, but then not selected by the linker and will therefore not result in a counter in the profraw. When compiling this file with pgo-use, instrprof_error::unknown_function will result and a warning will be emitted. >> > >> > yes -- this is a common problem to other compilers as well. >> > >> > >> > >> > Case 1 can be fixed using a function hash or other unique identifier instead of a file path. David, in D20195 you mentioned that Rong was working on a patch that would fix 2; we are looking forward to that. >> > >> > >> > Right. >> > >> > For 3, I unfortunately do not know of any solution. I don't think there is a way for us to make this warning reliable in the face of this circumstance. So my conclusion is that instrprof_error::unknown_function at least must be defaulted to off unfortunately. >> > >> > yes, this can be annoying. If the warnings can be buffered, then the compiler can check if this is due to missing profile for the whole file and can reduce the warnings into one single warning (source file has no profile data). Making it off by default sounds fine to me too if it is too noisy. >> > >> > thanks, >> > >> > David >> > >> > >> > -- Sean Silva-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160601/a00b6b81/attachment-0001.html>
Sean Silva via llvm-dev
2016-Jun-02 07:13 UTC
[llvm-dev] The state of IRPGO (3 remaining work items)
On Wed, Jun 1, 2016 at 12:02 PM, Xinliang David Li <davidxl at google.com> wrote:> > > On Tue, May 31, 2016 at 6:02 PM, Frédéric Riss <friss at apple.com> wrote: > >> >> On May 24, 2016, at 5:21 PM, Sean Silva <chisophugis at gmail.com> wrote: >> >> >> >> On Tue, May 24, 2016 at 3:41 PM, Vedant Kumar <vsk at apple.com> wrote: >> >>> >>> > On May 23, 2016, at 8:56 PM, Xinliang David Li <davidxl at google.com> >>> wrote: >>> > >>> > On Mon, May 23, 2016 at 8:23 PM, Sean Silva <chisophugis at gmail.com> >>> wrote: >>> > Jake and I have been integrating IRPGO on PS4, and we've identified 3 >>> remaining work items. >>> > >>> > Sean, thanks for the write up. It matches very well with what we think >>> as well. >>> >>> + 1 >>> >>> >>> > - Driver changes >>> > >>> > We'd like to make IRPGO the default on PS4. We also think that it >>> would be beneficial to make IRPGO the default PGO on all platforms >>> (coverage would continue to use FE instr as it does currently, of course). >>> In previous conversations (e.g. http://reviews.llvm.org/D15829) it has >>> come up that Apple have requirements that would prevent them from moving to >>> IRPGO as the default PGO, at least without a deprecation period of one or >>> two releases. >>> >>> Sean pointed out the problematic scenario in D15829 (in plan "C"): >>> >>> ``` >>> All existing user workflows continue to work, except for workflows that >>> attempt to llvm-profdata merge some old frontend profile data (e.g. they >>> have checked-in to version control and represents some special workload) >>> with the profile data from new binaries. >>> ``` >>> >>> We can address this issue by (1) making sure llvm-profdata emits a >>> helpful warning when merging an FE-based profile with an IR-based one, and >>> (2) keeping an option to use FE instrumentation for PGO. Having (2) helps >>> people who can't (or don't want) to switch to IR PGO. >>> >>> >>> > I'd like to get consensus on a path forward. >>> > As a point of discussion, how about we make IRPGO the default on all >>> platforms except Apple platforms. >>> >>> I'd really rather not introduce this inconsistency. I'm worried that it >>> might lead to Darwin becoming a second-tier platform for PGO. >>> >>> Fred (CC'd) is following up with some of our internal users to check if >>> we can change the default behavior of -fprofile-instr-generate. He should >>> be able to chime in on this soon. >>> >> >> > Fred, > > >> Sorry it took me so long. I’ve discussed the change in behavior quiet >> extensively, and I after having changed my mind a couple times, I would >> argue in favor of keeping the current behavior for the existing flags. I >> think adding a new switch for IRPGO is a better option. The argument that >> weighted most on my opinion is the proposed interaction with >> -fcoverage-mapping, and it is not at all platform specific. With the >> proposed new behavior, turning coverage on and off in your build system >> will generate a binary with different performance characteristics and this >> feels really wrong. >> > > > This is an interesting observation, but IMO this should not be a problem > in practice: > > - Coverage testing and PGO users are not overlapping. They have completely > different objectives and expectations. For instance, coverage users care > about covering the cold/rarely executed paths and find ways to make them > appear in the path, and in the meantime reduce the 'hotness' of real hot > paths in order to reduce testing overhead, while PGO users will do the > opposite. > - Coverage testing and PGO are two different things. Using PGO > infrastructure for coverage is actually an implementation detail. This is > why it is better to let -fcoverage-mapping to turn on FE instrumentation > automatically without needing the user to know about this dependency/detail. >This is a very good point and I think is the right long-term direction. It was a mistake to make this implementation detail user-visible. -- Sean Silva> There is also an ASAN based coverage support. It would be nice to unify > various ways of doing coverage testing with unified use model, reporting > tools and interfaces etc. > - Longer term, we will add more advanced features in IR based PGO, so > unless we duplicate all the work also in FE based instrumentation, the > longer term picture is that IR based instrumentation will become the > default choice for PGO users, so it seems natural to simplify their use > models by making IR based instrumentation the default. On the other hand, > the proposed flip won't create additional complexity to coverage testing > users. > > > >> I would actually make the IRPGO mode completely incompatible with the >> -fcoverage-mapping flag. >> >> > yes, agree. > > thanks, > > David > > > >> Fred >> >> >> >>> >>> At its core I don't think -fprofile-instr-generate *implies* FE-based >>> instrumentation. So, I'd like to see the driver do this (on all platforms): >>> >>> * -fprofile-instr-generate: IR instrumentation >>> * -fprofile-instr-generate=IR: IR instrumentation >>> * -fprofile-instr-generate=FE: FE instrumentation >>> * -fprofile-instr-generate -fcoverage-mapping: FE + coverage >>> instrumentation >>> >>> It's a bit ugly because the meaning of -fprofile-instr-generate becomes >>> context-sensitive. But, (1) it doesn't break existing common workflows and >>> (2) it makes it easier to ship IRPGO. The big caveat here is that we'll >>> need to wait a bit and see if our internal users are OK with this. >>> >> >> Is there a reason to even have the possibility for FEPGO in the long run? >> From what I can tell, at most we would add a -fuse-the-old-pgo-because-i- >> want-to-merge-with-old-profiles option to hold people over until they >> can regenerate their profiles with the current compiler. We can add a flag >> to control what pre-instrumentation is done to retain the source-level >> robustness of FEPGO (e.g. -fpgo-no-simplify-before-instrumenting or >> something). >> >> >>> One alternative is to introduce a separate driver flag for IRPGO. This >>> might not work well for Sony's existing users. I'd be interested in any >>> feedback about this approach. >>> >> >> Personally, I would prefer to maintaining command line compatibility for >> PGO in Clang (i.e. users don't have to modify their build systems). >> >> >> -- Sean Silva >> >> >>> >>> >>> > I really don't like fragmenting things like this (e.g. if a >>> third-party tests "clang's" PGO they will get something different depending >>> on the platform), but I don't see another way given Apple's constraints. >>> > >>> > I'd like to see IRPGO to be the default as well, but the first thing >>> we need is a driver level option to make the switch (prof-gen) -- currently >>> we rely on -Xclang option to switch between two modes, which is less than >>> ideal. >>> > >>> > If the concern from Apple is that the old profile still need to work, >>> then this is problem already solved. The reason is that -fprofile-instr-use >>> can automatically detect the type of the profile and switch the mode. >>> >>> It's not just that. As Sean pointed out, we're concerned about old >>> profiles inter-operating poorly with new ones. >>> >>> thanks, >>> vedant >>> >>> >>> > - Pre-instrumentation passes >>> > >>> > Pre-instrumentation optimization has been critical for reducing the >>> overhead of PGO for the PS4 games we tested (as expected). However, in our >>> measurements (and we are glad to provide more info) the main benefit was >>> inlining (also as expected). A simple pass of inlining at threshold 100 >>> appeared to give all the benefits. Even inlining at threshold 0 gave almost >>> all the benefits. For example, the passes initially proposed in >>> http://reviews.llvm.org/D15828did not improve over just inlining with >>> threshold 100. >>> > >>> > (due to PR27299 we also need to add simplifycfg after inlining to >>> clean up, but this doesn't affect the instrumentation overhead in our >>> measurements) >>> > >>> > Bottom line: for our use cases, inlining does all the work, but we're >>> not opposed to having more passes, which might be beneficial for non-game >>> workloads (which is most code). >>> > >>> > >>> > >>> > Yes, Rong is re-collecting performance data before submitting the >>> patch. >>> > >>> > - Warnings >>> > >>> > We identified 3 classes of issues which manifest as spammy warnings >>> when applying profile data with IRPGO (these affect FEPGO also I believe, >>> but we looked in depth at IRPGO): >>> > >>> > 1. The main concerning one is that getPGOFuncName mangles the filename >>> into the counter name. This causes us to get >>> instrprof_error::unknown_function when the pgo-use build is done in a >>> different build directory from the training build (which is a reasonable >>> thing to support). In this situation, PGO data is useless for all `static` >>> functions (and as a byproduct results in a huge volume of warnings). >>> > >>> > This can be enhanced with an user option to override the behavior. Can >>> you help filing a tracking bug? >>> > >>> > >>> > 2. In different TU's, pre-instr inlining might make different inlining >>> decisions (for example, different functions may be available for inlining), >>> causing hash mismatch errors (instrprof_error::hash_mismatch). In building >>> a large game, we only saw 8 instance of this, so it is not as severe as 1, >>> but would be good to fix. >>> > >>> > >>> > Rong has a patch addressing that -- will submit after cleanup pass >>> change is done. >>> > >>> > >>> > 3. A .cpp file may be compiled and put into an archive, but then not >>> selected by the linker and will therefore not result in a counter in the >>> profraw. When compiling this file with pgo-use, >>> instrprof_error::unknown_function will result and a warning will be emitted. >>> > >>> > yes -- this is a common problem to other compilers as well. >>> > >>> > >>> > >>> > Case 1 can be fixed using a function hash or other unique identifier >>> instead of a file path. David, in D20195 you mentioned that Rong was >>> working on a patch that would fix 2; we are looking forward to that. >>> > >>> > >>> > Right. >>> > >>> > For 3, I unfortunately do not know of any solution. I don't think >>> there is a way for us to make this warning reliable in the face of this >>> circumstance. So my conclusion is that instrprof_error::unknown_function at >>> least must be defaulted to off unfortunately. >>> > >>> > yes, this can be annoying. If the warnings can be buffered, then the >>> compiler can check if this is due to missing profile for the whole file and >>> can reduce the warnings into one single warning (source file has no profile >>> data). Making it off by default sounds fine to me too if it is too noisy. >>> > >>> > thanks, >>> > >>> > David >>> > >>> > >>> > -- Sean Silva >>> >> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160602/678a0cba/attachment.html>