On May 5, 2010, at 1:22 PM, Nathan Jeffords wrote:

> On Wed, May 5, 2010 at 11:15 AM, Chris Lattner <clattner at apple.com> wrote:
> > On May 4, 2010, at 11:03 AM, Nathan Jeffords wrote:
> > > ...
> >
> > We basically want one MCStreamer callback to correspond to one statement in the .s file. This makes it easier to handle from the compiler standpoint, but is also very important for the llvm-mc assembly parser itself.
>
> This is an assumption I question. From an evolutionary perspective I agree; given the existing code base I do see this as a logical transformation. As far as the assembly parser/streamer is concerned, it certainly simplifies their implementations. But I also think that this interface could evolve in a direction that simplifies the common case (compiler -> object file) at a small expense to handling assembly language files.

The logic to handle this has to go somewhere; putting it in the MCStreamer *implementation* that needs it is the most logical place. We also aim to implement an assembler, so it doesn't make sense to duplicate this logic in the compiler and the assembler parser.

> > > All fragments should be associated with a symbol. For assembler components, an unnamed "virtual" symbol can be used when there is no explicit label defined.
> >
> > What do you mean by fragment? Can you give me an analogy with what the syntax looks like in a .s file? I'm not sure exactly what you mean here.
>
> I use the term fragment to refer to the MCFragment class and its derivatives. I understand that to mean any entity representing data in the final linked and loaded form (something with an address).

Ok, MCFragment should definitely be formed behind the MCStreamer implementation. The .s printing implementation of MCStreamer, for example, has no use for it. With the current design, it would be a layering violation to make it earlier.

> > > Section assignment should be the responsibility of the object implementing the MCStreamer interface, with the caller given the ability to give hints as to what section to place the symbol into.
> >
> > Section assignment really needs to happen at a higher level. The TargetLoweringObjectFile interfaces are the ones responsible for mapping a global/function -> section. This interface (not mcstreamer) should handle this.
> >
> > The important point here is that the COFF MCSection needs to have the right level of semantic information. In fact, MCSection is the place that I'd start for COFF bringup.
>
> OK, I see that now. The current isolation between TargetLoweringObjectFile -> MCStreamer -> MCObjectWriter has proven somewhat problematic, mostly due to my lack of understanding. I guess MCSectionXXX was meant to provide communication between them. Should the same be true of MCSymbol, and their data counterparts?

Yes, somewhat. Currently, the COFF implementation of the assembler backend should maintain a DenseMap from MCSymbol* to whatever data you need to associate with a symbol. This is equivalent to embedding per-symbol stuff in the MCSymbol itself. MCSection should be subclassed and you should put COFF specific stuff in MCSectionCOFF.

> I had a problem with MCStreamer::EmitCommonSymbol & MCStreamer::EmitLocalCommonSymbol. When I implemented them I assumed this meant to put those symbols into the .bss segment. This required me to get a hold of the TLOF from the streamer. I now realize this is wrong after re-reading the description of the '.comm' directive a few times. I am not sure why an uninitialized global variable was being emitted using this; that seems wrong, since global variables in different compilation units with the same name would get merged together at link time. (This is using clang on a C source file.)

As others have pointed out, this is one of the many horrors of C :)

-Chris
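The merging behaviour discussed above can be illustrated with a toy model. In C, a file-scope variable declared without an initializer is a tentative definition, and compilers have traditionally emitted it as a common symbol ('.comm'); at link time, commons with the same name are folded into one allocation of the largest requested size. The sketch below is only a simulation of that rule: `CommonSymbol` and `mergeCommons` are hypothetical names, not part of LLVM or any linker, and alignment and the case where a real definition overrides a common are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical model of one `.comm name, size` request in an object file.
struct CommonSymbol {
  std::string name;
  std::uint64_t size;
};

// Toy link step: every name ends up with exactly one allocation whose size
// is the largest size requested by any object file.
std::map<std::string, std::uint64_t>
mergeCommons(const std::vector<std::vector<CommonSymbol>>& objects) {
  std::map<std::string, std::uint64_t> merged;
  for (const auto& object : objects) {
    for (const auto& sym : object) {
      std::uint64_t& slot = merged[sym.name];  // value-initialized to 0 on first use
      slot = std::max(slot, sym.size);
    }
  }
  return merged;
}
```

With two object files that both request `counter` (4 bytes in one, 8 bytes in the other), the merged result holds a single 8-byte `counter`, which is why the symbol cannot simply be placed in one file's .bss by the streamer.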
> The logic to handle this has to go somewhere; putting it in the MCStreamer *implementation* that needs it is the most logical place. We also aim to implement an assembler, so it doesn't make sense to duplicate this logic in the compiler and the assembler parser.

Assembly language has often been *the* intermediate form between compilers and object files/executables, but I don't think it's the most effective form. That said, I have limited experience writing code generators, so my opinions do not bear the wisdom of you and other developers of this library on this topic.

> > > > All fragments should be associated with a symbol. For assembler components, an unnamed "virtual" symbol can be used when there is no explicit label defined.
> > >
> > > What do you mean by fragment? Can you give me an analogy with what the syntax looks like in a .s file? I'm not sure exactly what you mean here.
> >
> > I use the term fragment to refer to the MCFragment class and its derivatives. I understand that to mean any entity representing data in the final linked and loaded form (something with an address).
>
> Ok, MCFragment should definitely be formed behind the MCStreamer implementation. The .s printing implementation of MCStreamer, for example, has no use for it. With the current design, it would be a layering violation to make it earlier.

I agree with this completely. I quite like that aspect of the design: the streamer puts fragments into sections and allows the assembler to combine it all, resolving fix-ups when it can and letting the writer deal with those it can't.

> Yes, somewhat. Currently, the COFF implementation of the assembler backend should maintain a DenseMap from MCSymbol* to whatever data you need to associate with a symbol. This is equivalent to embedding per-symbol stuff in the MCSymbol itself. MCSection should be subclassed and you should put COFF specific stuff in MCSectionCOFF.

I think this is an important detail I was missing. I can already see how this will help with COMDAT sections. Is there any reason for the difference between symbol and section in this respect?

> As others have pointed out, this is one of the many horrors of C :)

Another reason why I am attempting to develop my own language. :)

p.s. I posted my coff backend patch to llvm-commit, but that appears to be the wrong place. Where should I have posted it?

- Nathan
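A minimal sketch of the side-table idea quoted above, in which format-specific data is kept in a map keyed by symbol pointer instead of inside the symbol class. Everything here is an assumption for illustration: `Symbol` stands in for MCSymbol, `std::unordered_map` stands in for llvm::DenseMap so the example compiles without LLVM, and `CoffSymbolData`, `CoffBackend` and the field names are hypothetical rather than taken from the actual patch.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>

// Stand-in for MCSymbol (hypothetical, reduced to a name).
struct Symbol {
  std::string name;
};

// Hypothetical COFF-only attributes that the generic symbol knows nothing about.
struct CoffSymbolData {
  std::uint8_t storageClass = 0;
  std::uint16_t type = 0;
  std::int32_t sectionNumber = 0;
};

// The backend owns the per-symbol data; the key is the symbol's address,
// so the generic Symbol type needs no COFF-specific members.
class CoffBackend {
public:
  // Returns the record for sym, creating a default one on first use.
  CoffSymbolData& getOrCreate(const Symbol* sym) { return table_[sym]; }

  // True if a record has already been created for sym.
  bool has(const Symbol* sym) const { return table_.count(sym) != 0; }

private:
  std::unordered_map<const Symbol*, CoffSymbolData> table_;
};
```

The lookup costs one hash probe per access, but the object-format details stay entirely behind the backend, which is the layering the thread argues for.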
On May 5, 2010, at 5:22 PM, Nathan Jeffords wrote:

> > The logic to handle this has to go somewhere; putting it in the MCStreamer *implementation* that needs it is the most logical place. We also aim to implement an assembler, so it doesn't make sense to duplicate this logic in the compiler and the assembler parser.
>
> Assembly language has often been *the* intermediate form between compilers and object files/executables, but I don't think it's the most effective form. That said, I have limited experience writing code generators, so my opinions do not bear the wisdom of you and other developers of this library on this topic.

I completely agree, but it is a very important and effective form of communication :)

One nice fallout of the MCStreamer design is that once the COFF writer is available, we'll have a stand-alone coff assembler mostly "for free". In fact, developing this as a coff assembler (which can be accessed with 'llvm-mc foo.s -o foo.obj -filetype=obj') is easier in a lot of ways than dealing with the compiler!

> > Yes, somewhat. Currently, the COFF implementation of the assembler backend should maintain a DenseMap from MCSymbol* to whatever data you need to associate with a symbol. This is equivalent to embedding per-symbol stuff in the MCSymbol itself. MCSection should be subclassed and you should put COFF specific stuff in MCSectionCOFF.
>
> I think this is an important detail I was missing. I can already see how this will help with COMDAT sections. Is there any reason for the difference between symbol and section in this respect?

You'd have to ask Daniel about this. I don't recall if this is a short term thing that he'd like to fix or if this is an important design decision.

> > As others have pointed out, this is one of the many horrors of C :)
>
> Another reason why I am attempting to develop my own language. :)
>
> p.s. I posted my coff backend patch to llvm-commit, but that appears to be the wrong place. Where should I have posted it?

llvm-commits is a great place for it!

-Chris
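The "assembler for free" point can be sketched as follows: if the producer only talks to a streamer interface, with one callback per assembly statement, then the same calls can drive either a textual printer or an object-style writer. This is a simplified model under stated assumptions, not the real MCStreamer API; `Streamer`, `TextStreamer`, `ObjectStreamer`, `emitLabel`, `emitByte` and `emitExample` are all hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical interface: each callback corresponds to one .s statement.
class Streamer {
public:
  virtual ~Streamer() = default;
  virtual void emitLabel(const std::string& name) = 0;
  virtual void emitByte(std::uint8_t value) = 0;
};

// Prints each callback as a line of assembly text.
class TextStreamer : public Streamer {
public:
  void emitLabel(const std::string& name) override { out_ << name << ":\n"; }
  void emitByte(std::uint8_t value) override {
    out_ << "\t.byte " << static_cast<unsigned>(value) << "\n";
  }
  std::string str() const { return out_.str(); }

private:
  std::ostringstream out_;
};

// Accumulates raw bytes and records each label's offset, as an object writer would.
class ObjectStreamer : public Streamer {
public:
  void emitLabel(const std::string& name) override { symbols_[name] = bytes_.size(); }
  void emitByte(std::uint8_t value) override { bytes_.push_back(value); }
  const std::vector<std::uint8_t>& bytes() const { return bytes_; }
  const std::map<std::string, std::size_t>& symbols() const { return symbols_; }

private:
  std::vector<std::uint8_t> bytes_;
  std::map<std::string, std::size_t> symbols_;
};

// A producer (compiler or assembly parser) sees only the interface.
void emitExample(Streamer& s) {
  s.emitLabel("_answer");
  s.emitByte(42);
  s.emitLabel("_end");
}
```

Passing a `TextStreamer` to `emitExample` yields assembly text, while passing an `ObjectStreamer` yields bytes plus a symbol table; an assembly parser that issues the same callbacks would get the object output without any writer code of its own.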