thr3ads.net - llvm dev - [LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine [Mar 2009]

Aaron Gray

2009-Mar-15 22:52 UTC

[LLVMdev] MachO and ELF Writers/MachineCodeEmitters are hard-coded into LLVMTargetMachine

>I like the idea of a generic MachineCodeWriter, although I prefer the
>name 'ObjectFileWriter'...
That's much more descriptive of the functionality.
>I think we need to take a hard look at which bits of the
>Writer/Emitter infrastructure are needed for what tasks (Object File
>Emission, JIT, etc.) and make sure that our abstractions are flexible
>enough...
I would suggest becoming very familiar with the current code for the JIT,
the MachineCodeEmitter, and the X86 and other CodeEmitter code before
jumping in :)
> As it stands at the moment, the Writer and Emitter classes
>could definitely be merged (at least from the perspective of object
>file generation).
I would not do this; their functionality is distinct.

The MachineCodeEmitter is used specifically for the JIT. It works fine for
now, and I think we should leave it alone!

I did a patch, not yet accepted, that moves the GVStub methods into the
JITEmitter class. It moves several anonymous-namespace classes into the
llvm namespace and moves the main JITEmitter class into a header file,
which makes it visible too. NOTE: the doxygen API documentation does not
show such anonymous-namespace classes.

I looked into using two MachineCodeEmitter objects in the JITEmitter class,
with the second one dealing with stub generation instead, but this got
messy.

Just parameterizing the X86CodeEmitter and the others gives us the base level
of flexibility and means we do not have to disturb the existing JIT code too
much.
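
To illustrate the parameterization idea, here is a minimal sketch that is not
taken from the LLVM tree. FixedBufferEmitter, VectorEmitter and ToyCodeEmitter
are hypothetical names; the only assumption is that every emitter type exposes
an emitByte method. Because the emitter type is a template parameter, the
emitByte calls can be inlined and the JIT-style path pays no virtual-call cost.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a JIT-style emitter that writes into a fixed,
// pre-allocated buffer. This is not the real llvm::MachineCodeEmitter.
class FixedBufferEmitter {
public:
  FixedBufferEmitter(std::uint8_t *BufBegin, std::uint8_t *BufEnd)
      : Cur(BufBegin), End(BufEnd) {}

  // Writes one byte, or records an overflow if the buffer is full.
  void emitByte(std::uint8_t B) {
    if (Cur != End)
      *Cur++ = B;
    else
      Overflowed = true;
  }

  bool overflowed() const { return Overflowed; }

private:
  std::uint8_t *Cur;
  std::uint8_t *End;
  bool Overflowed = false;
};

// Hypothetical emitter that appends to a caller-supplied vector of bytes.
class VectorEmitter {
public:
  explicit VectorEmitter(std::vector<std::uint8_t> &Out) : Out(Out) {}
  void emitByte(std::uint8_t B) { Out.push_back(B); }

private:
  std::vector<std::uint8_t> &Out;
};

// Toy target code emitter, templated on the emitter type so the same
// encoding logic serves both the fixed buffer and the vector.
template <class CodeEmitter> class ToyCodeEmitter {
public:
  explicit ToyCodeEmitter(CodeEmitter &MCE) : MCE(MCE) {}

  // Emits a 32-bit immediate in little-endian byte order.
  void emitImm32(std::uint32_t V) {
    for (int I = 0; I != 4; ++I)
      MCE.emitByte(static_cast<std::uint8_t>(V >> (8 * I)));
  }

  // x86 "mov eax, imm32": opcode 0xB8 followed by the immediate.
  void emitMovEAXImm32(std::uint32_t V) {
    MCE.emitByte(0xB8);
    emitImm32(V);
  }

private:
  CodeEmitter &MCE;
};
```

ToyCodeEmitter<FixedBufferEmitter> and ToyCodeEmitter<VectorEmitter> share one
body of encoding code, which is the property the parameterization is after.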

As you probably know, object file emission is not working at all at present;
the upper levels were written out of SVN some time ago.
>At the moment, the Writer and Emitter are declared friend, and the
>encapsulation is all broken anyhow... I'd like to rethink the whole
>model a little...
My inclination is to go down that route theoretically, then step back to
where we are and look at incremental changes that do not disturb the status
quo too much; otherwise we will not get our patches through.
>In general, I think that a TargetMachine should expose a
>'getObjectFileWriter' method, which could be used to obtain an object
>file generator. An additional method should be available to allow
>users of the TargetMachine to query which types of Object Files the
>TargetMachine supports.
Okay with that.
>llc could then be simply re-written to use these generic functions
>instead of the hard-coded MachO and ELF ones.
Okay, this gives more flexibility and usability.
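
As a rough sketch of what the proposed interface could look like, assuming a
simple registry keyed by format name: every class and method below
(ObjectFileWriter, ToyTargetMachine, registerObjectFormat,
getSupportedObjectFormats, makeToyX86TargetMachine) is hypothetical and only
borrows the 'getObjectFileWriter' name from the proposal. None of it is
existing LLVM API.

```cpp
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical writer interface; a real one would also emit headers,
// sections, symbols and relocations.
class ObjectFileWriter {
public:
  virtual ~ObjectFileWriter() = default;
  virtual std::string formatName() const = 0;
};

class ToyELFWriter : public ObjectFileWriter {
public:
  std::string formatName() const override { return "elf"; }
};

class ToyMachOWriter : public ObjectFileWriter {
public:
  std::string formatName() const override { return "macho"; }
};

// Hypothetical target machine that lets the caller choose the format
// instead of hard-coding "check MachO first, then ELF".
class ToyTargetMachine {
public:
  using Factory = std::unique_ptr<ObjectFileWriter> (*)();

  void registerObjectFormat(const std::string &Name, Factory F) {
    Factories[Name] = F;
  }

  // Lets users query which object file formats this target supports.
  std::vector<std::string> getSupportedObjectFormats() const {
    std::vector<std::string> Names;
    for (const auto &Entry : Factories)
      Names.push_back(Entry.first);
    return Names;
  }

  // Returns a writer for the requested format, or null if unsupported.
  std::unique_ptr<ObjectFileWriter>
  getObjectFileWriter(const std::string &Name) const {
    auto It = Factories.find(Name);
    if (It == Factories.end())
      return nullptr;
    return It->second();
  }

private:
  std::map<std::string, Factory> Factories;
};

// A toy x86 target that supports both formats.
ToyTargetMachine makeToyX86TargetMachine() {
  ToyTargetMachine TM;
  TM.registerObjectFormat("elf", []() -> std::unique_ptr<ObjectFileWriter> {
    return std::unique_ptr<ObjectFileWriter>(new ToyELFWriter());
  });
  TM.registerObjectFormat("macho", []() -> std::unique_ptr<ObjectFileWriter> {
    return std::unique_ptr<ObjectFileWriter>(new ToyMachOWriter());
  });
  return TM;
}
```

A driver such as llc could then map a command-line option to the format name
and report an error when getObjectFileWriter returns null. A new format such
as COFF would only need another registerObjectFormat call.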

On Sun, Mar 15, 2009 at 10:39 PM, Aaron Gray
<aaronngray.lists at googlemail.com> wrote:
>> Currently, the MachO and ELF Writers and MachineCodeEmitters are
>> hard-coded into LLVMTargetMachine and llc.
>
> I am also interested in working on this area and interested in writing a
> COFF file backend.
>
>> In other words, the 'object file generation' capabilities of the
>> Common Code Generator are not generic.
>
> I was looking at making a parallel class to MachineCodeEmitter,
> 'MachineCodeWriter', that can be used generically instead of
> MachineCodeEmitter to write to a supplied 'vector<byte>'. This would not
> introduce any overhead to the existing runtime code and would allow
> inlining of the writing functions in X86CodeEmitter and other emitters.
> They would have to be templated and the MCE member parameterized.
>
>> LLVMTargetMachine::addPassesToEmitFile explicitly checks whether the
>> derived backend TargetMachine implements one of getMachOWriterInfo or
>> getELFWriterInfo, and returns a corresponding FileModel enum value.
>>
>> llc's main function uses the resulting FileModel value to determine
>> which of the {AddMachOWriter,AddELFWriter} functions to call.
>>
>> This is limiting for a number of reasons:
>> 1. If a given platform (e.g. x86) may support both MachO and ELF,
>> MachO will be selected, as it is checked first. This is bad behaviour,
>> it should be up to the user to decide which object format he wants.
>> 2. Extension of the object file generation capabilities to include new
>> object file formats is difficult, and requires modifications to LLVM
>> code (not just a plugin).
>>
>> I suggest transforming the {getMachOWriterInfo, getELFWriterInfo}
>> functions (on TargetMachine) into a single (templated?)
>> getObjectFileWriterInfo function. Additionally, an addObjectFileWriter
>> member should be added to TargetMachine, taking the place of the
>> static {AddMachOWriter, AddELFWriter} functions.
>>
>> As I need this functionality (custom object file generation) for my
>> current target, I'd be happy to make the modifications to the LLVM
>> core. Before I do so, I'd like to get feedback on my proposed
>> solution.
>>
>> I've added a bug for this issue: 
>> http://llvm.org/bugs/show_bug.cgi?id=3813
>
> Aaron

Aaron Gray

2009-Mar-16 03:26 UTC

On Sun, Mar 15, 2009 at 10:52 PM, Aaron Gray <
aaronngray.lists at googlemail.com> wrote:
>  I like the idea of a generic MachineCodeWriter, although I prefer the
>> name 'ObjectFileWriter'...
>>
>
> That's much more descriptive of the functionality.
>
Sorry, I disagree, actually: the MachineCodeEmitter or the
'MachineCodeWriter' does not do any file handling at all. Do look at the
code for the MachineCodeWriter and you will see it only writes to memory,
and if it reaches the end of the allotted memory I believe higher-order
logic reallocates a larger buffer and starts again from scratch. This could
be avoided if they generated fixups for absolute memory references referring
to locations within the outputted code. Then an alloc function could be
called before outputting, say, a 4-byte int; it could realloc and copy the
code, and once everything is finally written the fixups could be applied.
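
A minimal sketch of the fixup idea, under the assumption that absolute
references are 32-bit little-endian fields and that the final load address is
only known at the end. RelocatableBuffer, Fixup and emitAbsoluteRef32 are
hypothetical names for illustration, not LLVM classes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Records where an absolute address must be patched in, and which offset
// inside the emitted code it refers to.
struct Fixup {
  std::size_t FieldOffset;  // position of the 4-byte field to patch
  std::size_t TargetOffset; // offset of the referenced location
};

// Hypothetical growable code buffer. Because absolute references are kept
// as fixups, the buffer can grow (and move) without re-emitting anything.
class RelocatableBuffer {
public:
  std::size_t size() const { return Bytes.size(); }

  void emitByte(std::uint8_t B) { Bytes.push_back(B); }

  // Reserves a zeroed 4-byte field that will later hold the absolute
  // address of TargetOffset within this buffer.
  void emitAbsoluteRef32(std::size_t TargetOffset) {
    Fixups.push_back({Bytes.size(), TargetOffset});
    Bytes.insert(Bytes.end(), 4, 0);
  }

  // Produces the final image for a given base address by applying every
  // recorded fixup in little-endian byte order.
  std::vector<std::uint8_t> finalize(std::uint32_t BaseAddress) const {
    std::vector<std::uint8_t> Out = Bytes;
    for (const Fixup &F : Fixups) {
      std::uint32_t Addr =
          BaseAddress + static_cast<std::uint32_t>(F.TargetOffset);
      for (int I = 0; I != 4; ++I)
        Out[F.FieldOffset + I] = static_cast<std::uint8_t>(Addr >> (8 * I));
    }
    return Out;
  }

private:
  std::vector<std::uint8_t> Bytes;
  std::vector<Fixup> Fixups;
};
```

The same emitted bytes can be finalized for different base addresses, so a
reallocation only costs a copy rather than a second code generation pass.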

I am also wondering about the efficiency of std::vector: could we use
that for the MachineCodeWriter, or should we write our own code output
stream/buffering?

I still think this is where the crux of the problem lies. The upper logic is
relatively simple compared to this, but looking at what you say it is
important to get it right.

Cheers,

Aaron

Aaron Gray

2009-Mar-16 04:27 UTC

On Mon, Mar 16, 2009 at 3:26 AM, Aaron Gray <aaronngray.lists at
googlemail.com> wrote:
>  On Sun, Mar 15, 2009 at 10:52 PM, Aaron Gray <
> aaronngray.lists at googlemail.com> wrote:
>
>>  I like the idea of a generic MachineCodeWriter, although I prefer the
>>> name 'ObjectFileWriter'...
>>>
>>
>> That's much more descriptive of the functionality.
>>
>
> Sorry, I disagree, actually: the MachineCodeEmitter or the
> 'MachineCodeWriter' does not do any file handling at all. Do look at the
> code for the MachineCodeWriter and you will see it only writes to memory,
> and if it reaches the end of the allotted memory I believe higher-order
> logic reallocates a larger buffer and starts again from scratch. This could
> be avoided if they generated fixups for absolute memory references referring
> to locations within the outputted code. Then an alloc function could be
> called before outputting, say, a 4-byte int; it could realloc and copy the
> code, and once everything is finally written the fixups could be applied.
>
> I am also wondering about the efficiency of std::vector: could we use
> that for the MachineCodeWriter, or should we write our own code output
> stream/buffering?
>
> I still think this is where the crux of the problem lies. The upper logic is
> relatively simple compared to this, but looking at what you say it is
> important to get it right.
>

'ObjectCodeEmitter' looks like the right description to parallel the
MachineCodeEmitter. It's emitting object code to a data stream (which is an
object file section) and not directly to a file.

I will knock together an ObjectCodeEmitter that is call-compatible with the
MachineCodeEmitter and writes to a std::vector<byte>, so it could replace
the MachineCodeEmitter class generically in usage.

This needs a lot of thought to get things right and to provide the right
incremental patches to get this accepted.

Cheers,

Aaron

someguy

2009-Mar-16 06:39 UTC

> Sorry, I disagree, actually: the MachineCodeEmitter or the
> 'MachineCodeWriter' does not do any file handling at all. Do look at the
> code for the MachineCodeWriter and you will see it only writes to memory,
> and if it reaches the end of the allotted memory I believe higher-order
> logic reallocates a larger buffer and starts again from scratch. This could
> be avoided if they generated fixups for absolute memory references referring
> to locations within the outputted code. Then an alloc function could be
> called before outputting, say, a 4-byte int; it could realloc and copy the
> code, and once everything is finally written the fixups could be applied.
IIRC the memory allocation is done in the MachineCodeEmitter, not the
higher level (see startFunction and finishFunction). The current
implementation has startFunction allocate some (arbitrary) reserve
size in the output vector, and if the emitter runs out of space,
finishFunction returns a failure, causing the whole process to occur
again. This is icky.

It would be far better if the underlying buffer would grow
automatically (with an allocation method in the base class, as you
suggested), as code is emitted to it.
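
The difference between the two strategies can be sketched as follows. This is
a toy model, not the actual emitter code: RetryingEmitter, emitWithRetry and
emitGrowing are hypothetical, the "function" is just a run of 0x90 bytes, and
doubling the reserve on failure is an assumption made for the example.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of the retry scheme: a fixed reserve per attempt, with
// finishFunction reporting failure if the reserve was too small.
class RetryingEmitter {
public:
  void startFunction(std::size_t Reserve) {
    Buffer.assign(Reserve, 0);
    Used = 0;
    Overflowed = false;
  }

  void emitByte(std::uint8_t B) {
    if (Used < Buffer.size())
      Buffer[Used++] = B;
    else
      Overflowed = true;
  }

  // Returns true when the function did not fit and must be emitted again.
  bool finishFunction() const { return Overflowed; }

  std::size_t bytesUsed() const { return Used; }

private:
  std::vector<std::uint8_t> Buffer;
  std::size_t Used = 0;
  bool Overflowed = false;
};

// Emits FunctionSize bytes, restarting from scratch with a doubled reserve
// after each failure. Returns the number of full emission passes.
unsigned emitWithRetry(RetryingEmitter &E, std::size_t FunctionSize,
                       std::size_t InitialReserve) {
  unsigned Passes = 0;
  std::size_t Reserve = InitialReserve;
  for (;;) {
    ++Passes;
    E.startFunction(Reserve);
    for (std::size_t I = 0; I != FunctionSize; ++I)
      E.emitByte(0x90);
    if (!E.finishFunction())
      return Passes;
    Reserve = Reserve ? Reserve * 2 : 1; // avoid getting stuck at zero
  }
}

// The alternative: let the buffer grow as bytes are emitted, so a single
// pass always suffices. Returns the resulting buffer size.
std::size_t emitGrowing(std::vector<std::uint8_t> &Out,
                        std::size_t FunctionSize) {
  for (std::size_t I = 0; I != FunctionSize; ++I)
    Out.push_back(0x90);
  return Out.size();
}
```

With a 100-byte function and a 16-byte initial reserve, emitWithRetry makes
four passes (reserves of 16, 32, 64 and 128 bytes), whereas emitGrowing
emits each byte exactly once.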
> 'ObjectCodeEmitter' looks like the right description to parallel the
> MachineCodeEmitter. It's emitting object code to a data stream (which
> is an object file section) and not direct to a file.
I can live with that. Before you implement anything, can we try and
define the responsibilities of the various classes?

We have MachineCodeEmitter, which is responsible for actually emitting
bytes into a buffer for a function. Should it have methods for
emitting instructions/operands, or should it only work at the byte,
dword, etc. level?

ObjectCodeEmitter is responsible for emission of object 'files' into
a memory buffer. This includes handling of all object headers,
management of sections/segments, symbol and string tables and
relocations. The ObjectCodeEmitter should delegate all actual 'data
emission' to the MachineCodeEmitter.

ObjectCodeEmitter is a MachineFunctionPass. It does 'object wide'
setup in doInitialization and finalizes the object in doFinalize(?).
Each MachineFunction is emitted through the runOnFunction method,
which passes the MachineFunction to the MachineCodeEmitter. The
MachineCodeEmitter calls back to the ObjectCodeEmitter in order to
look up sections/segments, add globals to an unresolved globals list
etc.

I'm not too happy about the broken encapsulation here. I'd prefer to
find a better way to model this.
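
One possible shape for this division of responsibilities, with the callback
going through a narrow interface rather than friend access, is sketched below.
It is only an illustration: SectionLookup, ToyFunction, ToyMachineCodeEmitter
and ToyObjectCodeEmitter are hypothetical, the pass hooks are plain methods
rather than a real MachineFunctionPass, and the finalize hook is named
doFinalization here as an assumption.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// The only view of the object emitter that the byte-level emitter gets.
class SectionLookup {
public:
  virtual ~SectionLookup() = default;
  virtual std::vector<std::uint8_t> &getSection(const std::string &Name) = 0;
  virtual void addUnresolvedGlobal(const std::string &Name) = 0;
};

// Toy stand-in for a MachineFunction: pre-encoded bytes plus the globals
// the function refers to.
struct ToyFunction {
  std::string Name;
  std::vector<std::uint8_t> Code;
  std::vector<std::string> ReferencedGlobals;
};

// Byte-level emitter: does all actual data emission and calls back to its
// owner for section lookup and unresolved globals.
class ToyMachineCodeEmitter {
public:
  explicit ToyMachineCodeEmitter(SectionLookup &Owner) : Owner(Owner) {}

  void emitFunction(const ToyFunction &F) {
    std::vector<std::uint8_t> &Text = Owner.getSection(".text");
    Text.insert(Text.end(), F.Code.begin(), F.Code.end());
    for (const std::string &G : F.ReferencedGlobals)
      Owner.addUnresolvedGlobal(G);
  }

private:
  SectionLookup &Owner;
};

// Object-level emitter: owns sections and symbols, delegates the bytes.
class ToyObjectCodeEmitter : public SectionLookup {
public:
  ToyObjectCodeEmitter() : MCE(*this) {}
  ToyObjectCodeEmitter(const ToyObjectCodeEmitter &) = delete;

  // Object-wide setup.
  void doInitialization() {
    Sections.clear();
    Symbols.clear();
    Unresolved.clear();
    Finalized = false;
  }

  // Per-function work: record the symbol, then hand off to the emitter.
  void runOnFunction(const ToyFunction &F) {
    Symbols[F.Name] = getSection(".text").size();
    MCE.emitFunction(F);
  }

  // Object-wide finalization (headers and tables would be written here).
  void doFinalization() { Finalized = true; }

  std::vector<std::uint8_t> &getSection(const std::string &Name) override {
    return Sections[Name];
  }
  void addUnresolvedGlobal(const std::string &Name) override {
    Unresolved.push_back(Name);
  }

  std::size_t symbolOffset(const std::string &Name) const {
    return Symbols.at(Name);
  }
  const std::vector<std::string> &unresolvedGlobals() const {
    return Unresolved;
  }
  bool isFinalized() const { return Finalized; }

private:
  std::map<std::string, std::vector<std::uint8_t>> Sections;
  std::map<std::string, std::size_t> Symbols;
  std::vector<std::string> Unresolved;
  bool Finalized = false;
  ToyMachineCodeEmitter MCE;
};
```

The byte-level emitter never sees the symbol table or the headers, only the
two callbacks it needs, which is one way to avoid the friend declarations.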
