Aaron Gray
2009-Mar-15 22:52 UTC
[LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine
> I like the idea of a generic MachineCodeWriter, although I prefer the
> name 'ObjectFileWriter'...

That's much more descriptive of the functionality.

> I think we need to take a hard look at which bits of the
> Writer/Emitter infrastructure are needed for what tasks (Object File
> Emission, JIT, etc.) and make sure that our abstractions are flexible
> enough...

I would suggest becoming very familiar with the current code for the JIT, the MachineCodeEmitter, and the X86 and other CodeEmitter code before jumping in :)

> As it stands at the moment, the Writer and Emitter classes
> could definitely be merged (at least from the perspective of object
> file generation).

I would not do this; their functionality is distinct. The MachineCodeEmitter is specifically used for the JIT. It works fine for now, and I think we should leave it alone!

I did a patch, not yet accepted, that deals with the GVStub methods by moving them into the JITEmitter class. It moved several anonymous-namespace classes into the llvm namespace and moved the main JITEmitter class into a header file, which gave it visibility too. NOTE: the doxygen API documentation does not show such anonymous-namespace classes.

I looked into using two MachineCodeEmitter objects in the JITEmitter class, with the second one dealing with stub generation instead, but this got messy.

Just parameterizing the X86CodeEmitter and the others gives us the base level of flexibility and means we do not have to disturb the existing JIT code too much.

As you probably know, object file emission is not working at all at present; the upper levels were removed from SVN some time ago.

> At the moment, the Writer and Emitter are declared friend, and the
> encapsulation is all broken anyhow... I'd like to rethink the whole
> model a little...

My inclination is to go down that route theoretically, then step back to where we are and look at incremental changes that do not disturb the status quo too much; otherwise we will not get our patches through.

> In general, I think that a TargetMachine should expose a
> 'getObjectFileWriter' method, which could be used to obtain an object
> file generator. An additional method should be available to allow
> users of the TargetMachine to query which types of Object Files the
> TargetMachine supports.

Okay with that.

> llc could then be simply re-written to use these generic functions
> instead of the hard-coded MachO and ELF ones.

Okay, this gives more flexibility and usability.

On Sun, Mar 15, 2009 at 10:39 PM, Aaron Gray <aaronngray.lists at googlemail.com> wrote:
>> Currently, the MachO and ELF Writers and MachineCodeEmitters are
>> hard-coded into LLVMTargetMachine and llc.
>
> I am also interested in working on this area and interested in writing a
> COFF file backend.
>
>> In other words, the 'object file generation' capabilities of the
>> Common Code Generator are not generic.
>
> I was looking at making a parallel class to MachineCodeEmitter,
> 'MachineCodeWriter', that can be used generically instead of
> MachineCodeEmitter to write to a supplied 'vector<byte>'. This would not
> introduce any overhead to the existing runtime code and would allow
> inlining of the writing functions in X86CodeEmitter and other emitters.
> They would have to be templated and the MCE member parameterized.
>
>> LLVMTargetMachine::addPassesToEmitFile explicitly checks whether the
>> derived backend TargetMachine implements one of getMachOWriterInfo or
>> getELFWriterInfo, and returns a corresponding FileModel enum value.
>>
>> llc's main function uses the resulting FileModel value to determine
>> which of the {AddMachOWriter, AddELFWriter} functions to call.
>>
>> This is limiting for a number of reasons:
>> 1. If a given platform (e.g. x86) may support both MachO and ELF,
>> MachO will be selected, as it is checked first. This is bad behaviour;
>> it should be up to the user to decide which object format he wants.
>> 2. Extension of the object file generation capabilities to include new
>> object file formats is difficult, and requires modifications to LLVM
>> code (not just a plugin).
>>
>> I suggest transforming the {getMachOWriterInfo, getELFWriterInfo}
>> functions (on TargetMachine) into a single (templated?)
>> getObjectFileWriterInfo function. Additionally, an addObjectFileWriter
>> member should be added to TargetMachine, taking the place of the
>> static {AddMachOWriter, AddELFWriter} functions.
>>
>> As I need this functionality (custom object file generation) for my
>> current target, I'd be happy to make the modifications to the LLVM
>> core. Before I do so, I'd like to get feedback on my proposed
>> solution.
>>
>> I've added a bug for this issue:
>> http://llvm.org/bugs/show_bug.cgi?id=3813
>
> Aaron

_______________________________________________
LLVM Developers mailing list
LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
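The interface proposed in this message (a 'getObjectFileWriter' method plus a way to query the supported formats) might look roughly like the sketch below. Every name in it (ObjectFormat, ObjectFileWriterSketch, TargetMachineSketch, supportedObjectFormats, pickWriter) is a hypothetical stand-in and not an existing LLVM class or function; it assumes C++14 for std::make_unique.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical names throughout; none of these are real LLVM classes.
enum class ObjectFormat { MachO, ELF, COFF };

// Stand-in for whatever interface an object file writer would expose.
struct ObjectFileWriterSketch {
  virtual ~ObjectFileWriterSketch() = default;
  virtual std::string formatName() const = 0;
};

struct ELFWriterSketch : ObjectFileWriterSketch {
  std::string formatName() const override { return "ELF"; }
};

struct MachOWriterSketch : ObjectFileWriterSketch {
  std::string formatName() const override { return "MachO"; }
};

// The target advertises its formats and hands out a writer on request.
class TargetMachineSketch {
public:
  virtual ~TargetMachineSketch() = default;
  virtual std::vector<ObjectFormat> supportedObjectFormats() const = 0;
  virtual std::unique_ptr<ObjectFileWriterSketch>
  getObjectFileWriter(ObjectFormat Format) const = 0;
};

class X86TargetMachineSketch : public TargetMachineSketch {
public:
  std::vector<ObjectFormat> supportedObjectFormats() const override {
    return {ObjectFormat::MachO, ObjectFormat::ELF};
  }
  std::unique_ptr<ObjectFileWriterSketch>
  getObjectFileWriter(ObjectFormat Format) const override {
    switch (Format) {
    case ObjectFormat::MachO:
      return std::make_unique<MachOWriterSketch>();
    case ObjectFormat::ELF:
      return std::make_unique<ELFWriterSketch>();
    default:
      return nullptr; // format not supported by this target
    }
  }
};

// What a driver such as llc might do: the user picks the format, and the
// driver only checks that the target supports it.
std::unique_ptr<ObjectFileWriterSketch>
pickWriter(const TargetMachineSketch &TM, ObjectFormat Requested) {
  for (ObjectFormat F : TM.supportedObjectFormats()) {
    if (F == Requested) {
      auto Writer = TM.getObjectFileWriter(F);
      assert(Writer && "target advertised a format it cannot write");
      return Writer;
    }
  }
  return nullptr;
}
```

With this shape the choice between MachO and ELF is no longer decided by which check runs first, and a new format only needs a new enum value and writer class in the target that supports it.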
Aaron Gray
2009-Mar-16 03:26 UTC
[LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine
On Sun, Mar 15, 2009 at 10:52 PM, Aaron Gray <aaronngray.lists at googlemail.com> wrote:
>> I like the idea of a generic MachineCodeWriter, although I prefer the
>> name 'ObjectFileWriter'...
>
> That's much more descriptive of the functionality.

Sorry, I disagree actually: the MachineCodeEmitter, or the 'MachineCodeWriter', does not do any file handling at all. Do look at the code for the MachineCodeWriter and you will see it only writes to memory, and if it reaches the end of the allotted memory I believe higher-level logic reallocates a larger buffer and starts again from scratch. This could be avoided if they generated fixups for absolute memory references that refer to locations within the emitted code. Then an alloc function could be called before outputting, say, a 4-byte int; it could realloc and copy the code, and once everything is finally written the fixups could be applied.

I am also wondering about the efficiency of std::vector: could we use that for the MachineCodeWriter, or should we write our own code output stream/buffering?

I still think this is where the crux of the problem lies. The upper logic is relatively simple compared to this, but looking at what you say, it is important to get it right.

Cheers,

Aaron
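As a rough illustration of the fixup idea above: if absolute references into the emitted code are recorded as offsets and patched at the end, the buffer can grow and move freely while code is being written. This is only a sketch under assumptions. CodeBufferSketch and all of its members are hypothetical names, a std::vector is assumed as the backing store, and references are assumed to be 32-bit little-endian.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical growable code buffer; not an existing LLVM class.
class CodeBufferSketch {
  struct Fixup {
    size_t Where;          // offset of the 4-byte placeholder in Bytes
    uint32_t TargetOffset; // offset within this buffer being referred to
  };
  std::vector<uint8_t> Bytes; // grows automatically, may reallocate
  std::vector<Fixup> Fixups;

public:
  void emitByte(uint8_t B) { Bytes.push_back(B); }

  // Emit a 32-bit value in little-endian byte order.
  void emitWordLE(uint32_t W) {
    for (int I = 0; I < 4; ++I)
      emitByte(static_cast<uint8_t>(W >> (8 * I)));
  }

  // Emit a placeholder for an absolute reference into this buffer and
  // remember where it is. No real address is needed yet.
  void emitAbsoluteRef(uint32_t TargetOffset) {
    Fixups.push_back({Bytes.size(), TargetOffset});
    emitWordLE(0);
  }

  // Once the final load address is known, patch every placeholder.
  void applyFixups(uint32_t BaseAddress) {
    for (const Fixup &F : Fixups) {
      assert(F.Where + 4 <= Bytes.size() && "fixup outside buffer");
      uint32_t Value = BaseAddress + F.TargetOffset;
      for (int I = 0; I < 4; ++I)
        Bytes[F.Where + I] = static_cast<uint8_t>(Value >> (8 * I));
    }
  }

  size_t size() const { return Bytes.size(); }
  const std::vector<uint8_t> &bytes() const { return Bytes; }
};
```

Because only offsets are stored, a reallocation inside push_back never invalidates anything the emitter has recorded, so there is no need to restart emission from scratch when the initial reservation turns out to be too small.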
Aaron Gray
2009-Mar-16 04:27 UTC
[LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine
On Mon, Mar 16, 2009 at 3:26 AM, Aaron Gray <aaronngray.lists at googlemail.com> wrote:
> On Sun, Mar 15, 2009 at 10:52 PM, Aaron Gray <aaronngray.lists at googlemail.com> wrote:
>
>>> I like the idea of a generic MachineCodeWriter, although I prefer the
>>> name 'ObjectFileWriter'...
>>
>> That's much more descriptive of the functionality.
>
> Sorry, I disagree actually: the MachineCodeEmitter, or the
> 'MachineCodeWriter', does not do any file handling at all. Do look at the
> code for the MachineCodeWriter and you will see it only writes to memory,
> and if it reaches the end of the allotted memory I believe higher-level
> logic reallocates a larger buffer and starts again from scratch. This could
> be avoided if they generated fixups for absolute memory references that
> refer to locations within the emitted code. Then an alloc function could be
> called before outputting, say, a 4-byte int; it could realloc and copy the
> code, and once everything is finally written the fixups could be applied.
>
> I am also wondering about the efficiency of std::vector: could we use that
> for the MachineCodeWriter, or should we write our own code output
> stream/buffering?
>
> I still think this is where the crux of the problem lies. The upper logic is
> relatively simple compared to this, but looking at what you say, it is
> important to get it right.

'ObjectCodeEmitter' looks like the right description to parallel the MachineCodeEmitter. It's emitting object code to a data stream (which is an object file section) and not directly to a file.

I will knock together an ObjectCodeEmitter that is call-compatible with the MachineCodeEmitter and writes to a std::vector<byte>, so it could replace the MachineCodeEmitter class generically in usage.

This needs a lot of thought to get things right, and the right incremental patches to get it accepted.

Cheers,

Aaron
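A call-compatible ObjectCodeEmitter combined with a templated target emitter could be sketched as follows. This is only an illustration, not the planned patch: FixedBufferEmitterSketch, ObjectCodeEmitterSketch and TargetCodeEmitterSketch are hypothetical names, and the only shared "interface" assumed here is an emitByte method. The two opcodes used are the x86 one-byte NOP (0x90) and near RET (0xC3).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a JIT-style emitter: writes into a fixed,
// caller-provided buffer and never grows it.
class FixedBufferEmitterSketch {
  uint8_t *Cur;
  uint8_t *End;

public:
  FixedBufferEmitterSketch(uint8_t *Begin, size_t Size)
      : Cur(Begin), End(Begin + Size) {}
  void emitByte(uint8_t B) {
    assert(Cur < End && "fixed buffer is full");
    *Cur++ = B;
  }
};

// Hypothetical object-code emitter: same call signature, but appends to a
// std::vector that represents one object file section.
class ObjectCodeEmitterSketch {
  std::vector<uint8_t> &Section;

public:
  explicit ObjectCodeEmitterSketch(std::vector<uint8_t> &S) : Section(S) {}
  void emitByte(uint8_t B) { Section.push_back(B); }
};

// Target code emitter with the emitter type as a template parameter.
// Calls resolve at compile time, so they can be inlined and the existing
// JIT path pays no virtual-call overhead.
template <class Emitter> class TargetCodeEmitterSketch {
  Emitter &MCE;

public:
  explicit TargetCodeEmitterSketch(Emitter &E) : MCE(E) {}
  void emitNop() { MCE.emitByte(0x90); } // x86 NOP
  void emitRet() { MCE.emitByte(0xC3); } // x86 near RET
};
```

The same TargetCodeEmitterSketch source is instantiated once per emitter type, which is what keeps the change to the existing JIT code small.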
someguy
2009-Mar-16 06:39 UTC
[LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine
> Sorry, I disagree actually: the MachineCodeEmitter, or the
> 'MachineCodeWriter', does not do any file handling at all. Do look at the
> code for the MachineCodeWriter and you will see it only writes to memory,
> and if it reaches the end of the allotted memory I believe higher-level
> logic reallocates a larger buffer and starts again from scratch. This could
> be avoided if they generated fixups for absolute memory references that
> refer to locations within the emitted code. Then an alloc function could be
> called before outputting, say, a 4-byte int; it could realloc and copy the
> code, and once everything is finally written the fixups could be applied.

IIRC the memory allocation is done in the MachineCodeEmitter, not at the higher level (see startFunction and finishFunction). The current implementation has startFunction allocate some (arbitrary) reserve size in the output vector, and if the emitter runs out of space, finishFunction returns a failure, causing the whole process to occur again. This is icky. It would be far better if the underlying buffer grew automatically as code is emitted to it (with an allocation method in the base class, as you suggested).

> 'ObjectCodeEmitter' looks like the right description to parallel the
> MachineCodeEmitter. It's emitting object code to a data stream (which
> is an object file section) and not directly to a file.

I can live with that.

Before you implement anything, can we try to define the responsibilities of the various classes?

We have MachineCodeEmitter, which is responsible for actually emitting bytes into a buffer for a function. Should it have methods for emitting instructions/operands, or should it only work at the byte, dword, etc. level?

ObjectCodeEmitter is responsible for emission of object 'files' into a memory buffer. This includes handling of all object headers, management of sections/segments, symbol and string tables, and relocations. The ObjectCodeEmitter should delegate all actual 'data emission' to the MachineCodeEmitter.

ObjectCodeEmitter is a MachineFunctionPass. It does 'object-wide' setup in doInitialization and finalizes the object in doFinalization. Each MachineFunction is emitted through the runOnMachineFunction method, which passes the MachineFunction to the MachineCodeEmitter.

The MachineCodeEmitter calls back to the ObjectCodeEmitter in order to look up sections/segments, add globals to an unresolved globals list, etc. I'm not too happy about the broken encapsulation here. I'd prefer to find a better way to model this.
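To make the proposed split of responsibilities a bit more concrete, here is a minimal sketch that does not use LLVM at all. All types are hypothetical stand-ins: MachineFunctionSketch pretends its instructions are already encoded, ByteEmitterSketch plays the byte-level MachineCodeEmitter role, and ObjectCodeEmitterSketch mimics the pass lifecycle (object-wide setup, per-function emission, finalization) while owning the section and symbol bookkeeping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a function whose code is already encoded.
struct MachineFunctionSketch {
  std::string Name;
  std::vector<uint8_t> EncodedBody;
};

// Byte-level emitter: knows nothing about sections or symbols.
class ByteEmitterSketch {
  std::vector<uint8_t> &Out;

public:
  explicit ByteEmitterSketch(std::vector<uint8_t> &O) : Out(O) {}
  size_t currentOffset() const { return Out.size(); }
  void emitByte(uint8_t B) { Out.push_back(B); }
};

// Object-level emitter: owns the text section and the symbol table, and
// delegates every byte write to the byte-level emitter.
class ObjectCodeEmitterSketch {
  std::vector<uint8_t> Text;
  std::vector<std::pair<std::string, size_t>> Symbols;
  bool Finalized = false;

public:
  // Object-wide setup.
  void doInitialization() {
    Text.clear();
    Symbols.clear();
    Finalized = false;
  }

  // Per-function emission: record the symbol, then hand the bytes over.
  void runOnMachineFunction(const MachineFunctionSketch &MF) {
    assert(!Finalized && "object already finalized");
    ByteEmitterSketch MCE(Text);
    Symbols.emplace_back(MF.Name, MCE.currentOffset());
    for (uint8_t B : MF.EncodedBody)
      MCE.emitByte(B);
  }

  // A real writer would lay out headers, string tables and relocations
  // here; this sketch only marks the object as complete.
  void doFinalization() { Finalized = true; }

  const std::vector<uint8_t> &text() const { return Text; }
  const std::vector<std::pair<std::string, size_t>> &symbols() const {
    return Symbols;
  }
};
```

In this arrangement the byte-level emitter never calls back into the object-level one, which is one possible way around the encapsulation concern; whether that is workable once relocations and unresolved globals are involved is left open here.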