On Sat, Jul 20, 2013 at 9:15 PM, Chris Lattner <clattner at apple.com> wrote:> Sorry, just getting caught up on an old thread. I haven't been involved in discussions of this. > > On Jul 17, 2013, at 8:53 AM, Bob Wilson <bob.wilson at apple.com> wrote: >> First, let me try to clarify my proposal, in case there was any confusion about that. LLVMContext already has a hook for diagnostics, setInlineAsmDiagnosticHandler() et al. I was suggesting that we rename those interfaces to be more generic, add a simple enumeration of whatever diagnostics can be produced from the backend, and add support in clang for mapping those enumeration values to the corresponding clang diagnostics. This would be a small amount of work and would also be consistent with everything you wrote above about reusing the standard and existing machinery for diagnostics in clang. > > Of all of the proposals discussed, I like this the best: > > 1) This is a really simple extension of what we already have. > > 2) The backend providing a set of enumerations for the classes of diagnostics it produces doesn't tie it to clang, and doesn't make it language specific. Clients should be able to completely ignore the enum if they want the current (unclassified) behavior, and if an unknown enum value comes through, it is easy to handle. > > 3) I don't see how something like the stack size diagnostic can be implemented by clang calling into the backend. First, the MachineFunction (and thus, MachineFrameInfo) is a transient datastructure used by the backend when a function is compiled. There is nothing persistent for clang to query. Second, clang would have to know about all of the LLVM IR functions generated, which is possible, but impractical to track for things like thunks and other implicitly generated entrypoints. > > What is the specific concern with this approach? I don't see how this couples the backend to the frontend or causes layering violation problems. >I've not talked with Chandler about this, but to sketch out the way I'd do it (which is similar): Have the backend vend diagnostics, this can be done either with a set of enums and messages like you mentioned, or just have a message and location struct ala: struct Msg { const char *Message; Location Loc; }; that the consumer of the message can use via a handler. Alternately a handler (and we should have a default handler) can be passed in from the printer of the message (the frontend in the case provided) and it can be called on the error message. Absolutely this should be done via the LLVMContext to deal with the case of parallel function passes. class Handler { void printDiagnostic(const char *Message, Location Loc); }; (Note that I didn't say this was a fleshed out design ;) I think I prefer the latter to the former and we'd just need an "diagnostic callback handler" on the context. Though we would need to keep a set of diagnostics that the backend handles. That said, that it provides diagnostics based on its input language seems to make the most sense. It can use the location metadata if it has it to produce a location, otherwise you get the function, etc. in some sort of nicely degraded quality. I think this scheme could also work as a way of dealing with the "Optimization Diary" sort of use that Hal is envisioning as well. Keeping the separation of concerns around where the front end handles diagnostics on what we'll call "source locations" is pretty important, however, I agree that not every warning can be expressed this way, i.e. the stack size diagnostic. However, leaving the printing up to the front end is the best way to deal with this and a generic diagnostic engine would probably help for things like llc/opt where the backend just deals with its input language - IR. The existing inline asm diagnostics are ... problematic and it would definitely be nice to get a generic interface for them. Though they're actually separated into two separate cases where, I think, we end up with our confusion: a) Front end diagnostics - This is an area that needs some work to be decent, but it involves the front end querying the back end for things like register size, valid immediates, etc and should be implemented by the front end with an expanded set of target queries. We could use this as a way to solidify the backend api for the MS inline asm support as well and use some of that when parsing GNU style inline asm. b) Back end diagnostics - This is the stuff that the front end has no hope of diagnosing. i.e. "ran out of registers", or "can't figure out how to split this up into this kind of vector register". The latter has always been a bit funny and I'm always unhappy with it, but I don't have any better ideas. A unified scheme of communicating "help help I'm being oppressed by the front end" in the backend would be, at the very least, a step forward. Thoughts? -eric
On Jul 21, 2013, at 8:00 PM, Eric Christopher <echristo at gmail.com> wrote:> On Sat, Jul 20, 2013 at 9:15 PM, Chris Lattner <clattner at apple.com> wrote: >> Sorry, just getting caught up on an old thread. I haven't been involved in discussions of this. >> >> On Jul 17, 2013, at 8:53 AM, Bob Wilson <bob.wilson at apple.com> wrote: >>> First, let me try to clarify my proposal, in case there was any confusion about that. LLVMContext already has a hook for diagnostics, setInlineAsmDiagnosticHandler() et al. I was suggesting that we rename those interfaces to be more generic, add a simple enumeration of whatever diagnostics can be produced from the backend, and add support in clang for mapping those enumeration values to the corresponding clang diagnostics. This would be a small amount of work and would also be consistent with everything you wrote above about reusing the standard and existing machinery for diagnostics in clang. >> >> Of all of the proposals discussed, I like this the best: >> >> 1) This is a really simple extension of what we already have. >> >> 2) The backend providing a set of enumerations for the classes of diagnostics it produces doesn't tie it to clang, and doesn't make it language specific. Clients should be able to completely ignore the enum if they want the current (unclassified) behavior, and if an unknown enum value comes through, it is easy to handle. >> >> 3) I don't see how something like the stack size diagnostic can be implemented by clang calling into the backend. First, the MachineFunction (and thus, MachineFrameInfo) is a transient datastructure used by the backend when a function is compiled. There is nothing persistent for clang to query. Second, clang would have to know about all of the LLVM IR functions generated, which is possible, but impractical to track for things like thunks and other implicitly generated entrypoints. >> >> What is the specific concern with this approach? I don't see how this couples the backend to the frontend or causes layering violation problems. >> > > I've not talked with Chandler about this, but to sketch out the way > I'd do it (which is similar): > > Have the backend vend diagnostics, this can be done either with a set > of enums and messages like you mentioned, or just have a message and > location struct ala: > > struct Msg { > const char *Message; > Location Loc; > }; > > that the consumer of the message can use via a handler.When the consumer is clang, it's important that we have diagnostic groups to control the warnings, so the enum is important. We don't want to be comparing the message strings to decide whether a particular warning is an instance of -Wstack-size (for example).> > Alternately a handler (and we should have a default handler) can be > passed in from the printer of the message (the frontend in the case > provided) and it can be called on the error message. Absolutely this > should be done via the LLVMContext to deal with the case of parallel > function passes. > > class Handler { > void printDiagnostic(const char *Message, Location Loc); > }; > > (Note that I didn't say this was a fleshed out design ;) I think I > prefer the latter to the former and we'd just need an "diagnostic > callback handler" on the context. Though we would need to keep a set > of diagnostics that the backend handles. That said, that it provides > diagnostics based on its input language seems to make the most sense. > It can use the location metadata if it has it to produce a location, > otherwise you get the function, etc. in some sort of nicely degraded > quality.This is pretty much the same as what Quentin proposed (with the addition of the enum), isn't it?> > I think this scheme could also work as a way of dealing with the > "Optimization Diary" sort of use that Hal is envisioning as well.Yes, I agree.> > Keeping the separation of concerns around where the front end handles > diagnostics on what we'll call "source locations" is pretty important, > however, I agree that not every warning can be expressed this way, > i.e. the stack size diagnostic. However, leaving the printing up to > the front end is the best way to deal with this and a generic > diagnostic engine would probably help for things like llc/opt where > the backend just deals with its input language - IR. > > The existing inline asm diagnostics are ... problematic and it would > definitely be nice to get a generic interface for them. Though they're > actually separated into two separate cases where, I think, we end up > with our confusion: > > a) Front end diagnostics - This is an area that needs some work to be > decent, but it involves the front end querying the back end for things > like register size, valid immediates, etc and should be implemented by > the front end with an expanded set of target queries. We could use > this as a way to solidify the backend api for the MS inline asm > support as well and use some of that when parsing GNU style inline > asm.Yes, I agree about this, too. As an example, we had a discussion a few months ago about warning about over-aligned stack variables when dynamic stack realignment is disabled, and we agreed to move that warning into the frontend. I keep poking at the frontend team every once in a while to get some help implementing that, but I haven't forgotten it.> > b) Back end diagnostics - This is the stuff that the front end has no > hope of diagnosing. i.e. "ran out of registers", or "can't figure out > how to split this up into this kind of vector register". The latter > has always been a bit funny and I'm always unhappy with it, but I > don't have any better ideas. A unified scheme of communicating "help > help I'm being oppressed by the front end" in the backend would be, at > the very least, a step forward. > > Thoughts?I don't have any better ideas for this. At least if we generalize the existing inline asm diagnostic handler, it will make it less of a special case.
----- Original Message -----> On Sat, Jul 20, 2013 at 9:15 PM, Chris Lattner <clattner at apple.com> > wrote: > > Sorry, just getting caught up on an old thread. I haven't been > > involved in discussions of this. > > > > On Jul 17, 2013, at 8:53 AM, Bob Wilson <bob.wilson at apple.com> > > wrote: > >> First, let me try to clarify my proposal, in case there was any > >> confusion about that. LLVMContext already has a hook for > >> diagnostics, setInlineAsmDiagnosticHandler() et al. I was > >> suggesting that we rename those interfaces to be more generic, > >> add a simple enumeration of whatever diagnostics can be produced > >> from the backend, and add support in clang for mapping those > >> enumeration values to the corresponding clang diagnostics. This > >> would be a small amount of work and would also be consistent with > >> everything you wrote above about reusing the standard and > >> existing machinery for diagnostics in clang. > > > > Of all of the proposals discussed, I like this the best: > > > > 1) This is a really simple extension of what we already have. > > > > 2) The backend providing a set of enumerations for the classes of > > diagnostics it produces doesn't tie it to clang, and doesn't make > > it language specific. Clients should be able to completely ignore > > the enum if they want the current (unclassified) behavior, and if > > an unknown enum value comes through, it is easy to handle. > > > > 3) I don't see how something like the stack size diagnostic can be > > implemented by clang calling into the backend. First, the > > MachineFunction (and thus, MachineFrameInfo) is a transient > > datastructure used by the backend when a function is compiled. > > There is nothing persistent for clang to query. Second, clang > > would have to know about all of the LLVM IR functions generated, > > which is possible, but impractical to track for things like thunks > > and other implicitly generated entrypoints. > > > > What is the specific concern with this approach? I don't see how > > this couples the backend to the frontend or causes layering > > violation problems. > > > > I've not talked with Chandler about this, but to sketch out the way > I'd do it (which is similar): > > Have the backend vend diagnostics, this can be done either with a set > of enums and messages like you mentioned, or just have a message and > location struct ala: > > struct Msg { > const char *Message; > Location Loc; > }; > > that the consumer of the message can use via a handler. > > Alternately a handler (and we should have a default handler) can be > passed in from the printer of the message (the frontend in the case > provided) and it can be called on the error message. Absolutely this > should be done via the LLVMContext to deal with the case of parallel > function passes. > > class Handler { > void printDiagnostic(const char *Message, Location Loc); > }; > > (Note that I didn't say this was a fleshed out design ;) I think I > prefer the latter to the former and we'd just need an "diagnostic > callback handler" on the context. Though we would need to keep a set > of diagnostics that the backend handles. That said, that it provides > diagnostics based on its input language seems to make the most sense. > It can use the location metadata if it has it to produce a location, > otherwise you get the function, etc. in some sort of nicely degraded > quality. > > I think this scheme could also work as a way of dealing with the > "Optimization Diary" sort of use that Hal is envisioning as well.I like "Optimization Diary" :) Obviously, we sometimes loose debug information on variables while optimizing, and so the trick is to make it degrade nicely. Nevertheless, I think that we can often be more expressive than just using a single location for everything. How about something like this: - The message string is text but a single kind of markup is allowed: <debug/>, for example: "We cannot vectorize <debug/> because <debug/> is an unfriendly variable" (where the first will be replaced by text derived from a DIScope and the second from a DIVariable). - The structure is this: struct Msg { const char *Message; Function *F; // If nothing else, we can extract a useful name from here, hopefully. SmallVector<DIDescriptor, 2> DIs; // Should be DIDescriptor* ? }; Then, in the backend, we can look for a DbgValueInst associated with the variable that we want (similar for scopes), and push a DIScope, DIVariable, etc. onto the array, one for every <debug/> in the message string. The frontend can then render these references in whatever way seems most appropriate (which may include things like making them into hyperlinks, doing some kind of source-code highlighting, etc.). Thanks again, Hal> > Keeping the separation of concerns around where the front end handles > diagnostics on what we'll call "source locations" is pretty > important, > however, I agree that not every warning can be expressed this way, > i.e. the stack size diagnostic. However, leaving the printing up to > the front end is the best way to deal with this and a generic > diagnostic engine would probably help for things like llc/opt where > the backend just deals with its input language - IR. > > The existing inline asm diagnostics are ... problematic and it would > definitely be nice to get a generic interface for them. Though > they're > actually separated into two separate cases where, I think, we end up > with our confusion: > > a) Front end diagnostics - This is an area that needs some work to be > decent, but it involves the front end querying the back end for > things > like register size, valid immediates, etc and should be implemented > by > the front end with an expanded set of target queries. We could use > this as a way to solidify the backend api for the MS inline asm > support as well and use some of that when parsing GNU style inline > asm. > > b) Back end diagnostics - This is the stuff that the front end has no > hope of diagnosing. i.e. "ran out of registers", or "can't figure out > how to split this up into this kind of vector register". The latter > has always been a bit funny and I'm always unhappy with it, but I > don't have any better ideas. A unified scheme of communicating "help > help I'm being oppressed by the front end" in the backend would be, > at > the very least, a step forward. > > Thoughts? > > -eric >-- Hal Finkel Assistant Computational Scientist Leadership Computing Facility Argonne National Laboratory
>> Have the backend vend diagnostics, this can be done either with a set >> of enums and messages like you mentioned, or just have a message and >> location struct ala: >> >> struct Msg { >> const char *Message; >> Location Loc; >> }; >> >> that the consumer of the message can use via a handler. > > When the consumer is clang, it's important that we have diagnostic groups to control the warnings, so the enum is important. We don't want to be comparing the message strings to decide whether a particular warning is an instance of -Wstack-size (for example). >I think, in this case, that I'd want the registration function to handle the "give me these sorts of warnings" rather than it be a part of the message. I.e. you'd register for each class of warnings out of the backend maybe?> > This is pretty much the same as what Quentin proposed (with the addition of the enum), isn't it? >Pretty close yeah.>> >> a) Front end diagnostics - This is an area that needs some work to be >> decent, but it involves the front end querying the back end for things >> like register size, valid immediates, etc and should be implemented by >> the front end with an expanded set of target queries. We could use >> this as a way to solidify the backend api for the MS inline asm >> support as well and use some of that when parsing GNU style inline >> asm. > > Yes, I agree about this, too. As an example, we had a discussion a few months ago about warning about over-aligned stack variables when dynamic stack realignment is disabled, and we agreed to move that warning into the frontend. I keep poking at the frontend team every once in a while to get some help implementing that, but I haven't forgotten it. >*nod* This should be pretty easy to implement out of the front end at a first thought - "if we're creating an alloca with an alignment greater than the target and we've turned off dynamic realignment then warn". The trick is getting the backend data for such things etc.>> >> b) Back end diagnostics - This is the stuff that the front end has no >> hope of diagnosing. i.e. "ran out of registers", or "can't figure out >> how to split this up into this kind of vector register". The latter >> has always been a bit funny and I'm always unhappy with it, but I >> don't have any better ideas. A unified scheme of communicating "help >> help I'm being oppressed by the front end" in the backend would be, at >> the very least, a step forward. >> >> Thoughts? > > I don't have any better ideas for this. At least if we generalize the existing inline asm diagnostic handler, it will make it less of a special case.Ah, the thoughts was an "in general", but if you'd had ideas I'd have totally been up for them :) -eric
>> I think this scheme could also work as a way of dealing with the >> "Optimization Diary" sort of use that Hal is envisioning as well. > > I like "Optimization Diary" :) > > Obviously, we sometimes loose debug information on variables while optimizing, and so the trick is to make it degrade nicely. Nevertheless, I think that we can often be more expressive than just using a single location for everything. How about something like this: > > - The message string is text but a single kind of markup is allowed: <debug/>, for example: > "We cannot vectorize <debug/> because <debug/> is an unfriendly variable" > (where the first will be replaced by text derived from a DIScope and the second from a DIVariable). > > - The structure is this: > struct Msg { > const char *Message; > Function *F; // If nothing else, we can extract a useful name from here, hopefully. > SmallVector<DIDescriptor, 2> DIs; // Should be DIDescriptor* ? > }; > > Then, in the backend, we can look for a DbgValueInst associated with the variable that we want (similar for scopes), and push a DIScope, DIVariable, etc. onto the array, one for every <debug/> in the message string. The frontend can then render these references in whatever way seems most appropriate (which may include things like making them into hyperlinks, doing some kind of source-code highlighting, etc.). >Yep, that's what I was going for. As a note you don't need to worry about the Function and DIs as a DebugLoc has a scope and can be traveled back to find a function that it's inside or any other enclosing scope. Though I've just posted another message to the thread with a design for a more informational set of callbacks that could be used by individual passes. I don't have a good design off the top of my head, but... -eric> Thanks again, > Hal > >> >> Keeping the separation of concerns around where the front end handles >> diagnostics on what we'll call "source locations" is pretty >> important, >> however, I agree that not every warning can be expressed this way, >> i.e. the stack size diagnostic. However, leaving the printing up to >> the front end is the best way to deal with this and a generic >> diagnostic engine would probably help for things like llc/opt where >> the backend just deals with its input language - IR. >> >> The existing inline asm diagnostics are ... problematic and it would >> definitely be nice to get a generic interface for them. Though >> they're >> actually separated into two separate cases where, I think, we end up >> with our confusion: >> >> a) Front end diagnostics - This is an area that needs some work to be >> decent, but it involves the front end querying the back end for >> things >> like register size, valid immediates, etc and should be implemented >> by >> the front end with an expanded set of target queries. We could use >> this as a way to solidify the backend api for the MS inline asm >> support as well and use some of that when parsing GNU style inline >> asm. >> >> b) Back end diagnostics - This is the stuff that the front end has no >> hope of diagnosing. i.e. "ran out of registers", or "can't figure out >> how to split this up into this kind of vector register". The latter >> has always been a bit funny and I'm always unhappy with it, but I >> don't have any better ideas. A unified scheme of communicating "help >> help I'm being oppressed by the front end" in the backend would be, >> at >> the very least, a step forward. >> >> Thoughts? >> >> -eric >> > > -- > Hal Finkel > Assistant Computational Scientist > Leadership Computing Facility > Argonne National Laboratory