Bob Wilson
2013-Nov-19 17:19 UTC
[LLVMdev] bit code file incompatibility due to debug info changes
On Nov 18, 2013, at 11:36 PM, Eric Christopher <echristo at gmail.com> wrote:> On Mon, Nov 18, 2013 at 11:02 PM, Bob Wilson <bob.wilson at apple.com> wrote: >> In the bug report, Rafael commented that the compiler should never crash, even if the debug info metadata is old. I certainly agree, and we should fix that. I don’t think that really solves this problem, at least not by itself. For one thing, if the metadata format is really changing so rapidly, how do we ensure that we have fixed all the problems and that we don’t regress? Is there a plan for testing the behavior with a wide variety of old debug info? >> > > Not a bit. The idea behind the verifier is that if it sees something > unexpected it can do something about it whether it's emitting a > diagnostic (though we don't emit diagnostics from the back end), or > just removing the metadata in memory ala strip (not on disk - that > really would be unfriendly).I understand the idea behind the verifier. I'm asking about testing. If we're relying on the verifier for correctness, then we need to test that on a regular basis. I know we have a good number of tests with valid metadata. How do you propose that we test to make sure the verifier catches all invalid metadata? This seems like a weak point of the proposal to rely on the verifier rather than a version number.> >> What is the user experience in the case where the debug info is out of date? Not crashing is a pretty low bar. Might we end up emitting DWARF that is incorrect? If we’re just going to drop the old debug info, I would expect a warning diagnostic to report that. >> > > A warning is about the best you can expect. The metadata is basically > locked with assumptions on fields throughout the code. There's really > no way to continue reliably. > > -ericThis really is a pretty horrible user experience, even with a warning. I'd just like to comment that we're actually running into this problem at Apple. It is not an idle concern. We had a static library built with LTO using an older version of clang. When linking that with a recent version of clang, the compiler crashed. Obviously we need to fix the crash. Even if we don't crash, emitting incorrect debug info or silently removing the debug info is only somewhat better. Stripping the debug info and providing a warning would be a reasonable short-term workaround. I don't think we can get away with doing anything less than that. For the longer term, perhaps we ought to revisit the decision to not auto-upgrade debug info metadata, at least once the format stabilizes. I had not realized the implications of that for LTO at the time it was originally discussed.> >> On Nov 18, 2013, at 12:47 PM, Bob Wilson <bob.wilson at apple.com> wrote: >> >>> OK. That is fine with me. Saying "WONTFIX" seems to be a bit different than "let's carry on the discussion on the mailing list until we figure out what to do", though. I had intended to use that PR to track the fact that we have an issue here that I think needs to be addressed before the 3.4 release. I don't care where the discussion happens. >>> >>> On Nov 18, 2013, at 11:53 AM, Eric Christopher <echristo at gmail.com> wrote: >>> >>>> For the same reason that you asked me in two threads: because it's >>>> splitting the discussion up and making it more complicated. Let's keep >>>> it to one thread and then open a bug at the end when/if we have >>>> something that we can take action on. >>>> >>>> -eric >>>> >>>> On Mon, Nov 18, 2013 at 11:05 AM, Bob Wilson <bob.wilson at apple.com> wrote: >>>>> Eric, why did you move the PR back to resolved/won't-fix again after I >>>>> reopened it? We really need to do *something* here. I'm open to >>>>> discussing any of the possible solutions that have been offered but just >>>>> letting the compiler crash does not seem acceptable. >>>>> >>>>> On Nov 18, 2013, at 11:00 AM, Chandler Carruth <chandlerc at google.com> wrote: >>>>> >>>>> >>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie <dblaikie at gmail.com> wrote: >>>>>> >>>>>> It depends a bit, also, on what kind of guarantees we need to offer. If >>>>>> the guarantee when reading IR from disk is "will not crash" then there's >>>>>> nothing for it but to run full debug info verification. >>>>>> >>>>>> On the other hand, if we can assume that some specific metadata implies >>>>>> the correctness of some other metadata, then all we need to do is check a >>>>>> known debug info version number. If it's the current number, do nothing >>>>>> (assume the rest of the debug info is up to date and well formed), otherwise >>>>>> do all the work to find the debug info and dump it (no need to verify it - >>>>>> we just have to do the work to find it, so we don't go following bad links >>>>>> later on - that should be as easy as dropping the llvm.dbg.cu named node and >>>>>> removing all debug intrinsics and the instruction metadata line references). >>>>>> But this latter scheme isn't robust against arbitrary metadata (that could, >>>>>> coincidentally, have the right version number and arbitrary metadata that >>>>>> breaks all our debug info metadata assumptions) >>>>>> >>>>>> If the latter is sufficient for everyone's needs/principles, great. >>>>> >>>>> >>>>> This makes sense to me, but I see Eric's fundamental concern with upgrading >>>>> test cases. >>>>> >>>>> >>>>> One other possible idea: we could artificially force coarseness on metadata >>>>> while things are still iterating rapidly by just having a version number >>>>> that rotates every "release". (Even non-open-source-releases could be >>>>> handled by letting anyone, any time they need to rev it, update what the >>>>> current version is and update every test case to reference that version.) If >>>>> the version is nicely factored, it should at least be an obvious diff if >>>>> still huge. >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>> >>>>> >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>
Eric Christopher
2013-Nov-19 18:40 UTC
[LLVMdev] bit code file incompatibility due to debug info changes
On Tue, Nov 19, 2013 at 9:19 AM, Bob Wilson <bob.wilson at apple.com> wrote:> > On Nov 18, 2013, at 11:36 PM, Eric Christopher <echristo at gmail.com> wrote: > >> On Mon, Nov 18, 2013 at 11:02 PM, Bob Wilson <bob.wilson at apple.com> wrote: >>> In the bug report, Rafael commented that the compiler should never crash, even if the debug info metadata is old. I certainly agree, and we should fix that. I don’t think that really solves this problem, at least not by itself. For one thing, if the metadata format is really changing so rapidly, how do we ensure that we have fixed all the problems and that we don’t regress? Is there a plan for testing the behavior with a wide variety of old debug info? >>> >> >> Not a bit. The idea behind the verifier is that if it sees something >> unexpected it can do something about it whether it's emitting a >> diagnostic (though we don't emit diagnostics from the back end), or >> just removing the metadata in memory ala strip (not on disk - that >> really would be unfriendly). > > I understand the idea behind the verifier. I'm asking about testing. If we're relying on the verifier for correctness, then we need to test that on a regular basis. I know we have a good number of tests with valid metadata. How do you propose that we test to make sure the verifier catches all invalid metadata? This seems like a weak point of the proposal to rely on the verifier rather than a version number. >Every time we change it check it in as an additional "don't die" test? Right now there are a lot of asserts, moving those to being failures would probably work. How would you like to version it? Other than at bitcode load time I don't want to do anything with the version number in the debug info. The previous versioning scheme became quickly unmaintainable and I don't want to go back to that either.>> >>> What is the user experience in the case where the debug info is out of date? Not crashing is a pretty low bar. Might we end up emitting DWARF that is incorrect? If we’re just going to drop the old debug info, I would expect a warning diagnostic to report that. >>> >> >> A warning is about the best you can expect. The metadata is basically >> locked with assumptions on fields throughout the code. There's really >> no way to continue reliably. >> >> -eric > > This really is a pretty horrible user experience, even with a warning. > > I'd just like to comment that we're actually running into this problem at Apple. It is not an idle concern. We had a static library built with LTO using an older version of clang. When linking that with a recent version of clang, the compiler crashed. Obviously we need to fix the crash. Even if we don't crash, emitting incorrect debug info or silently removing the debug info is only somewhat better. Stripping the debug info and providing a warning would be a reasonable short-term workaround. I don't think we can get away with doing anything less than that. For the longer term, perhaps we ought to revisit the decision to not auto-upgrade debug info metadata, at least once the format stabilizes. I had not realized the implications of that for LTO at the time it was originally discussed. >I don't think we're really disagreeing with each other here. It's not a good experience and what we've said previously is that you really can't rely upon the metadata being compatible between versions - including and especially LTO. In the future when we're not trying to change the various formats and needing to expand things so that we can cover more debug information then I think finally locking down the format will be a good thing - that time just isn't now. If you have some specific ideas in mind for this I'm very glad to discuss them with you. -eric>> >>> On Nov 18, 2013, at 12:47 PM, Bob Wilson <bob.wilson at apple.com> wrote: >>> >>>> OK. That is fine with me. Saying "WONTFIX" seems to be a bit different than "let's carry on the discussion on the mailing list until we figure out what to do", though. I had intended to use that PR to track the fact that we have an issue here that I think needs to be addressed before the 3.4 release. I don't care where the discussion happens. >>>> >>>> On Nov 18, 2013, at 11:53 AM, Eric Christopher <echristo at gmail.com> wrote: >>>> >>>>> For the same reason that you asked me in two threads: because it's >>>>> splitting the discussion up and making it more complicated. Let's keep >>>>> it to one thread and then open a bug at the end when/if we have >>>>> something that we can take action on. >>>>> >>>>> -eric >>>>> >>>>> On Mon, Nov 18, 2013 at 11:05 AM, Bob Wilson <bob.wilson at apple.com> wrote: >>>>>> Eric, why did you move the PR back to resolved/won't-fix again after I >>>>>> reopened it? We really need to do *something* here. I'm open to >>>>>> discussing any of the possible solutions that have been offered but just >>>>>> letting the compiler crash does not seem acceptable. >>>>>> >>>>>> On Nov 18, 2013, at 11:00 AM, Chandler Carruth <chandlerc at google.com> wrote: >>>>>> >>>>>> >>>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie <dblaikie at gmail.com> wrote: >>>>>>> >>>>>>> It depends a bit, also, on what kind of guarantees we need to offer. If >>>>>>> the guarantee when reading IR from disk is "will not crash" then there's >>>>>>> nothing for it but to run full debug info verification. >>>>>>> >>>>>>> On the other hand, if we can assume that some specific metadata implies >>>>>>> the correctness of some other metadata, then all we need to do is check a >>>>>>> known debug info version number. If it's the current number, do nothing >>>>>>> (assume the rest of the debug info is up to date and well formed), otherwise >>>>>>> do all the work to find the debug info and dump it (no need to verify it - >>>>>>> we just have to do the work to find it, so we don't go following bad links >>>>>>> later on - that should be as easy as dropping the llvm.dbg.cu named node and >>>>>>> removing all debug intrinsics and the instruction metadata line references). >>>>>>> But this latter scheme isn't robust against arbitrary metadata (that could, >>>>>>> coincidentally, have the right version number and arbitrary metadata that >>>>>>> breaks all our debug info metadata assumptions) >>>>>>> >>>>>>> If the latter is sufficient for everyone's needs/principles, great. >>>>>> >>>>>> >>>>>> This makes sense to me, but I see Eric's fundamental concern with upgrading >>>>>> test cases. >>>>>> >>>>>> >>>>>> One other possible idea: we could artificially force coarseness on metadata >>>>>> while things are still iterating rapidly by just having a version number >>>>>> that rotates every "release". (Even non-open-source-releases could be >>>>>> handled by letting anyone, any time they need to rev it, update what the >>>>>> current version is and update every test case to reference that version.) If >>>>>> the version is nicely factored, it should at least be an obvious diff if >>>>>> still huge. >>>>>> _______________________________________________ >>>>>> LLVM Developers mailing list >>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>> >>>>>> >>>> >>>> _______________________________________________ >>>> LLVM Developers mailing list >>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>> >
Bob Wilson
2013-Nov-19 18:48 UTC
[LLVMdev] bit code file incompatibility due to debug info changes
I don't have a strong opinion on how we fix it, but I think it's really important to do something and get it done for the 3.4 release. Here's what I would do: - implement Quentin's previous proposal for back-end diagnostics (and add a diagnostic handler to Apple's linker), so that we can warn about out-of-date debug info - add a version number to the debug info metadata - have the bit code reader strip out any debug info that is missing the version number or has a different version than the current value On Nov 19, 2013, at 10:40 AM, Eric Christopher <echristo at gmail.com> wrote:> On Tue, Nov 19, 2013 at 9:19 AM, Bob Wilson <bob.wilson at apple.com> wrote: >> >> On Nov 18, 2013, at 11:36 PM, Eric Christopher <echristo at gmail.com> wrote: >> >>> On Mon, Nov 18, 2013 at 11:02 PM, Bob Wilson <bob.wilson at apple.com> wrote: >>>> In the bug report, Rafael commented that the compiler should never crash, even if the debug info metadata is old. I certainly agree, and we should fix that. I don’t think that really solves this problem, at least not by itself. For one thing, if the metadata format is really changing so rapidly, how do we ensure that we have fixed all the problems and that we don’t regress? Is there a plan for testing the behavior with a wide variety of old debug info? >>>> >>> >>> Not a bit. The idea behind the verifier is that if it sees something >>> unexpected it can do something about it whether it's emitting a >>> diagnostic (though we don't emit diagnostics from the back end), or >>> just removing the metadata in memory ala strip (not on disk - that >>> really would be unfriendly). >> >> I understand the idea behind the verifier. I'm asking about testing. If we're relying on the verifier for correctness, then we need to test that on a regular basis. I know we have a good number of tests with valid metadata. How do you propose that we test to make sure the verifier catches all invalid metadata? This seems like a weak point of the proposal to rely on the verifier rather than a version number. >> > > Every time we change it check it in as an additional "don't die" test? > Right now there are a lot of asserts, moving those to being failures > would probably work. How would you like to version it? Other than at > bitcode load time I don't want to do anything with the version number > in the debug info. The previous versioning scheme became quickly > unmaintainable and I don't want to go back to that either. > >>> >>>> What is the user experience in the case where the debug info is out of date? Not crashing is a pretty low bar. Might we end up emitting DWARF that is incorrect? If we’re just going to drop the old debug info, I would expect a warning diagnostic to report that. >>>> >>> >>> A warning is about the best you can expect. The metadata is basically >>> locked with assumptions on fields throughout the code. There's really >>> no way to continue reliably. >>> >>> -eric >> >> This really is a pretty horrible user experience, even with a warning. >> >> I'd just like to comment that we're actually running into this problem at Apple. It is not an idle concern. We had a static library built with LTO using an older version of clang. When linking that with a recent version of clang, the compiler crashed. Obviously we need to fix the crash. Even if we don't crash, emitting incorrect debug info or silently removing the debug info is only somewhat better. Stripping the debug info and providing a warning would be a reasonable short-term workaround. I don't think we can get away with doing anything less than that. For the longer term, perhaps we ought to revisit the decision to not auto-upgrade debug info metadata, at least once the format stabilizes. I had not realized the implications of that for LTO at the time it was originally discussed. >> > > I don't think we're really disagreeing with each other here. It's not > a good experience and what we've said previously is that you really > can't rely upon the metadata being compatible between versions - > including and especially LTO. In the future when we're not trying to > change the various formats and needing to expand things so that we can > cover more debug information then I think finally locking down the > format will be a good thing - that time just isn't now. If you have > some specific ideas in mind for this I'm very glad to discuss them > with you. > > -eric > > > >>> >>>> On Nov 18, 2013, at 12:47 PM, Bob Wilson <bob.wilson at apple.com> wrote: >>>> >>>>> OK. That is fine with me. Saying "WONTFIX" seems to be a bit different than "let's carry on the discussion on the mailing list until we figure out what to do", though. I had intended to use that PR to track the fact that we have an issue here that I think needs to be addressed before the 3.4 release. I don't care where the discussion happens. >>>>> >>>>> On Nov 18, 2013, at 11:53 AM, Eric Christopher <echristo at gmail.com> wrote: >>>>> >>>>>> For the same reason that you asked me in two threads: because it's >>>>>> splitting the discussion up and making it more complicated. Let's keep >>>>>> it to one thread and then open a bug at the end when/if we have >>>>>> something that we can take action on. >>>>>> >>>>>> -eric >>>>>> >>>>>> On Mon, Nov 18, 2013 at 11:05 AM, Bob Wilson <bob.wilson at apple.com> wrote: >>>>>>> Eric, why did you move the PR back to resolved/won't-fix again after I >>>>>>> reopened it? We really need to do *something* here. I'm open to >>>>>>> discussing any of the possible solutions that have been offered but just >>>>>>> letting the compiler crash does not seem acceptable. >>>>>>> >>>>>>> On Nov 18, 2013, at 11:00 AM, Chandler Carruth <chandlerc at google.com> wrote: >>>>>>> >>>>>>> >>>>>>> On Mon, Nov 18, 2013 at 10:55 AM, David Blaikie <dblaikie at gmail.com> wrote: >>>>>>>> >>>>>>>> It depends a bit, also, on what kind of guarantees we need to offer. If >>>>>>>> the guarantee when reading IR from disk is "will not crash" then there's >>>>>>>> nothing for it but to run full debug info verification. >>>>>>>> >>>>>>>> On the other hand, if we can assume that some specific metadata implies >>>>>>>> the correctness of some other metadata, then all we need to do is check a >>>>>>>> known debug info version number. If it's the current number, do nothing >>>>>>>> (assume the rest of the debug info is up to date and well formed), otherwise >>>>>>>> do all the work to find the debug info and dump it (no need to verify it - >>>>>>>> we just have to do the work to find it, so we don't go following bad links >>>>>>>> later on - that should be as easy as dropping the llvm.dbg.cu named node and >>>>>>>> removing all debug intrinsics and the instruction metadata line references). >>>>>>>> But this latter scheme isn't robust against arbitrary metadata (that could, >>>>>>>> coincidentally, have the right version number and arbitrary metadata that >>>>>>>> breaks all our debug info metadata assumptions) >>>>>>>> >>>>>>>> If the latter is sufficient for everyone's needs/principles, great. >>>>>>> >>>>>>> >>>>>>> This makes sense to me, but I see Eric's fundamental concern with upgrading >>>>>>> test cases. >>>>>>> >>>>>>> >>>>>>> One other possible idea: we could artificially force coarseness on metadata >>>>>>> while things are still iterating rapidly by just having a version number >>>>>>> that rotates every "release". (Even non-open-source-releases could be >>>>>>> handled by letting anyone, any time they need to rev it, update what the >>>>>>> current version is and update every test case to reference that version.) If >>>>>>> the version is nicely factored, it should at least be an obvious diff if >>>>>>> still huge. >>>>>>> _______________________________________________ >>>>>>> LLVM Developers mailing list >>>>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>>>>> >>>>>>> >>>>> >>>>> _______________________________________________ >>>>> LLVM Developers mailing list >>>>> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >>>>> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >>>> >>
Apparently Analagous Threads
- [LLVMdev] bit code file incompatibility due to debug info changes
- [LLVMdev] bit code file incompatibility due to debug info changes
- [LLVMdev] bit code file incompatibility due to debug info changes
- [LLVMdev] bit code file incompatibility due to debug info changes
- [LLVMdev] bit code file incompatibility due to debug info changes