thr3ads.net - llvm dev - [LLVMdev] [lld] Representation of lld::Reference with a fake target [Feb 2015]

Rui Ueyama

2015-Feb-07 01:54 UTC

[LLVMdev] [lld] Representation of lld::Reference with a fake target

On Fri, Feb 6, 2015 at 5:42 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:
> On Fri, Feb 6, 2015 at 5:31 PM, Rui Ueyama <ruiu at google.com> wrote:
> > There are two questions.
> >
> > Firstly, do you think the on-disk format needs to be compatible with a C++
> > struct so that we can cast that memory buffer to the struct? That may be
> > super-fast but that also comes with many limitations. It's hard to extend,
> > for example. Every time we want to store variable-length objects we need to
> > define a string-table-like data structure. And I'm not very sure that it's
> > fastest -- because mmap'able objects are not very compact on disk, slow disk
> > IO could be a bottleneck, if we compare that with a more compact file format.
> > I believe Protobufs or Thrift are fast enough or even might be faster.
>
> I'm not sure here. Although I do question if the object files will
> even need to be read from disk in your standard edit/compile/debug
> loop or on a build server. I believe we'll need real data to determine
> this.
>
> > Secondly, do you know why we are dumping the post-linked object file to
> > Native format? If we want to have a different kind of *object* file format,
> > we would want to have a tool to convert an object file in an existing file
> > format (say, ELF) to "native", and teach LLD how to read from the file.
> > Currently we are writing a file in the middle of the linking process, which
> > doesn't make sense to me.
>
> This is an artifact of having the native format before we had any
> readers. I agree that it's weird and not terribly useful to write to
> native format in the middle of the link, although I have found it
> helpful to output yaml. There's no need to be able to read it back in
> and resume though.

Even for YAML it doesn't make much sense to write it to a file and read it
back from the file in the middle of the link, does it? I found that being
able to output YAML is useful too, but round-trip is a different thing. In
the middle of the process, we have a bunch of additional information that
doesn't exist in input files and doesn't have to be output to the link
result. The ability to serialize that intermediate result is not useful.

Shankar, you added these round-trip tests. Do you have any opinion?

> Ideally lld -r would be the tool we use to convert COFF/ELF/MachO to
> the native format.
>
> - Michael Spencer
>
> >
> > On Fri, Feb 6, 2015 at 5:02 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:
> >>
> >> On Fri, Feb 6, 2015 at 2:54 PM, Rui Ueyama <ruiu at google.com> wrote:
> >> > Can we remove Native format support? I'd like to get input from anyone
> >> > who wants to keep the current Native format in LLD.
> >>
> >> One of the original goals for LLD was to provide a new object file
> >> format for performance. The reason it is not used currently is because
> >> we've yet to teach llvm to generate it, and we haven't done that
> >> because it hasn't been finalized yet. The value it currently provides
> >> is catching stuff like this, so we can fix it now instead of down the
> >> road when we actually productize the native format.
> >>
> >> As for the specific implementation of the native format, I'm open to
> >> an extensible format, but only if the performance cost is low.
> >>
> >> - Michael Spencer
> >>
> >> >
> >> > On Thu, Feb 5, 2015 at 2:03 PM, Shankar Easwaran
> >> > <shankare at codeaurora.org> wrote:
> >> >>
> >> >> The only way currently is to create a new reference, unless we can
> >> >> think of adding some target specific metadata information in the
> >> >> Atom model.
> >> >>
> >> >> This has come up over and over again, we need something in the Atom
> >> >> model to store information that is target specific.
> >> >>
> >> >> Shankar Easwaran
> >> >>
> >> >>
> >> >> On 2/5/2015 2:22 PM, Simon Atanasyan wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> I need advice on the implementation of a very specific kind of
> >> >>> relocation used by the MIPS N64 ABI. As usual the main problem is
> >> >>> how to pass target specific data over the Native/YAML conversion
> >> >>> barrier.
> >> >>>
> >> >>> In this ABI the relocation record's r_info field in fact consists
> >> >>> of five subfields:
> >> >>> * r_sym   - symbol index
> >> >>> * r_ssym  - special symbol
> >> >>> * r_type3 - third relocation type
> >> >>> * r_type2 - second relocation type
> >> >>> * r_type  - first relocation type
> >> >>>
> >> >>> Up to three of these relocations are applied one by one. The first
> >> >>> relocation uses the addend from the relocation record. Each
> >> >>> subsequent relocation takes as its addend the result of the
> >> >>> previous operation. Only the final operation actually modifies the
> >> >>> location relocated. The first relocation uses as its reference the
> >> >>> symbol specified by the r_sym field. The third relocation assumes
> >> >>> a NULL symbol.
> >> >>>
> >> >>> The most interesting case is the second relocation. It uses the
> >> >>> special symbol value given by the r_ssym field. This field can
> >> >>> contain four predefined values:
> >> >>> * RSS_UNDEF - zero value
> >> >>> * RSS_GP    - value of gp symbol
> >> >>> * RSS_GP0   - gp0 value taken from the .MIPS.options or .reginfo
> >> >>>               section
> >> >>> * RSS_LOC   - address of location being relocated
> >> >>>
> >> >>> So the problem is how to store these four constants in the
> >> >>> lld::Reference object. RSS_UNDEF is obviously not a problem. To
> >> >>> represent the RSS_GP value I can set an AbsoluteAtom created for
> >> >>> "_gp" as the reference's target. But what about RSS_GP0 and
> >> >>> RSS_LOC? I am considering the following approaches but cannot
> >> >>> select the best one:
> >> >>>
> >> >>> a) Create an AbsoluteAtom for each of these cases and set them as
> >> >>>    the reference's target.
> >> >>>    The problem is that these atoms are fake and should not go to
> >> >>>    the symbol table.
> >> >>>    One more problem is to select unique names for these atoms.
> >> >>> b) Use the two high bits of the lld::Reference::_kindValue field
> >> >>>    to encode the RSS_xxx value.
> >> >>>    Then decode these bits in the RelocationHandler to calculate the
> >> >>>    result of the relocation.
> >> >>>    In that case the problem is how to represent a relocation kind
> >> >>>    value in YAML format.
> >> >>>    The simple xxxRelocationStringTable::kindStrings[] array will
> >> >>>    not satisfy us.
> >> >>> c) Add one more field to the lld::Reference class. Something like
> >> >>>    the DefinedAtom::CodeModel field.
> >> >>>
> >> >>> Any advice, ideas, and/or objections are much appreciated.
> >> >>>
> >> >>
> >> >>
> >> >> --
> >> >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
> >> >> hosted by the Linux Foundation
> >> >>
> >> >
> >> >
> >> > _______________________________________________
> >> > LLVM Developers mailing list
> >> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
> >> >
> >
> >
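
[Editorial sketch, not part of the original thread.] The chained N64 relocation that Simon describes in the quoted message can be illustrated with a small C++ sketch. Several things here are assumptions and do not come from the thread: the bit positions of the five subfields inside a logical 64-bit r_info value, the numeric values of the RSS_* constants, the convention that a relocation type of 0 marks an unused step, and every name in the code (N64RelInfo, SpecialValues, decodeRInfo, specialSymbolValue, applyChain, StepFn). It is not lld code.

```cpp
#include <cassert>
#include <cstdint>

// Special-symbol selectors named in the message. The numeric values are an
// assumption, not taken from the thread.
enum Rss : uint8_t { RSS_UNDEF = 0, RSS_GP = 1, RSS_GP0 = 2, RSS_LOC = 3 };

// The five r_info subfields listed in the message (hypothetical struct).
struct N64RelInfo {
  uint32_t sym;  // r_sym: symbol index
  uint8_t ssym;  // r_ssym: special symbol selector
  uint8_t type3; // r_type3: third relocation type
  uint8_t type2; // r_type2: second relocation type
  uint8_t type;  // r_type: first relocation type
};

// Split a logical 64-bit r_info value into its subfields.
// Assumed layout: sym in bits 63..32, then ssym, type3, type2, type,
// one byte each, down to bit 0.
constexpr N64RelInfo decodeRInfo(uint64_t rInfo) {
  return N64RelInfo{static_cast<uint32_t>(rInfo >> 32),
                    static_cast<uint8_t>(rInfo >> 24),
                    static_cast<uint8_t>(rInfo >> 16),
                    static_cast<uint8_t>(rInfo >> 8),
                    static_cast<uint8_t>(rInfo)};
}

// Values a relocation handler would need in order to resolve r_ssym.
struct SpecialValues {
  uint64_t gp;  // value of the gp symbol
  uint64_t gp0; // gp0 from .MIPS.options or .reginfo
  uint64_t loc; // address of the location being relocated
};

// Map an RSS_* selector to the value used as the second step's symbol.
inline uint64_t specialSymbolValue(uint8_t ssym, const SpecialValues &v) {
  switch (ssym) {
  case RSS_GP:  return v.gp;
  case RSS_GP0: return v.gp0;
  case RSS_LOC: return v.loc;
  default:      return 0; // RSS_UNDEF: zero value
  }
}

// Per-type calculation, supplied by the caller (hypothetical signature).
using StepFn = uint64_t (*)(uint8_t type, uint64_t symbol, uint64_t addend);

// Apply up to three steps. Step 1 uses the r_sym symbol value and the
// record's addend; step 2 uses the special symbol; step 3 uses a NULL (zero)
// symbol. Each later step takes the previous result as its addend. Only the
// returned final value would be written to the relocated location.
inline uint64_t applyChain(const N64RelInfo &info, uint64_t symValue,
                           uint64_t addend, const SpecialValues &v,
                           StepFn step) {
  const uint8_t types[3] = {info.type, info.type2, info.type3};
  const uint64_t symbols[3] = {symValue, specialSymbolValue(info.ssym, v), 0};
  uint64_t result = addend;
  for (int i = 0; i < 3; ++i) {
    if (types[i] == 0) // assumed: type 0 means "no further step"
      break;
    result = step(types[i], symbols[i], result);
  }
  return result;
}
```

A caller would decode r_info once, look up the r_sym symbol value, and pass a step function that implements the target's actual per-type arithmetic. The sketch deliberately leaves that arithmetic out.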
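
[Editorial sketch, not part of the original thread.] Option (b) in Simon's quoted message, encoding the RSS_xxx value in the two high bits of the reference kind value, might look roughly like this in C++. The sketch assumes the kind value is a 16-bit integer, which leaves 14 bits for the relocation kind itself. The helper names (packKind, relocKindOf, rssOf) and constants are hypothetical and are not lld APIs.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: the kind value is 16 bits wide, so the RSS selector occupies
// bits 15..14 and the relocation kind occupies bits 13..0.
constexpr unsigned kRssShift = 14;
constexpr uint16_t kKindMask = (1u << kRssShift) - 1; // 0x3FFF

// Combine a relocation kind with a two-bit RSS selector (0..3).
constexpr uint16_t packKind(uint16_t relocKind, uint8_t rss) {
  return static_cast<uint16_t>((relocKind & kKindMask) |
                               ((rss & 0x3u) << kRssShift));
}

// Recover the plain relocation kind from a packed value.
constexpr uint16_t relocKindOf(uint16_t packed) {
  return static_cast<uint16_t>(packed & kKindMask);
}

// Recover the RSS selector from a packed value.
constexpr uint8_t rssOf(uint16_t packed) {
  return static_cast<uint8_t>(packed >> kRssShift);
}
```

This also shows the YAML problem raised in the message: a packed value no longer corresponds to a single entry in a one-name-per-value string table, so a writer would have to split it with relocKindOf and rssOf and print both parts, and a reader would have to reassemble them.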

Michael Spencer

2015-Feb-07 01:58 UTC

On Fri, Feb 6, 2015 at 5:54 PM, Rui Ueyama <ruiu at google.com> wrote:
> Even for YAML it doesn't make much sense to write it to a file and read it
> back from the file in the middle of the link, does it? I found that being able
> to output YAML is useful too, but round-trip is a different thing. In the
> middle of the process, we have a bunch of additional information that doesn't
> exist in input files and doesn't have to be output to the link result.
> The ability to serialize that intermediate result is not useful.

Completely agree here. We should round-trip the input instead.

- Michael Spencer

Rui Ueyama

2015-Feb-07 02:05 UTC

On Fri, Feb 6, 2015 at 5:58 PM, Michael Spencer <bigcheesegs at gmail.com> wrote:
> Completely agree here. We should round-trip the input instead.
>
Let me remove the round-trip passes. I'll send a patch for review, so let's
discuss there.


Shankar Easwaran

2015-Feb-07 02:41 UTC

Doing it for every input file is not useful, as some of the input files are not
representable in YAML form. Examples are shared libraries.

The reason I made the YAML pass be called before the writer was that the
intermediate result is more complete, since all atoms have been resolved at
that point and the state of all atoms is much saner.

It was also easy to use the pass manager. The code needed was very small to
achieve what we are trying to do: ensure that all the information to the writer
is passed through references or atom properties.

Shankar Easwaran 



Rui Ueyama

2015-Feb-07 02:48 UTC


[LLVMdev] [lld] Representation of lld::Reference with a fake target

I think no one is opposing the idea of reading and writing YAML.

The question here is why we need to force all developers to write code
to serialize intermediate data in the middle of the link, which nothing
except the round-trip passes needs.

On Fri, Feb 6, 2015 at 6:41 PM, Shankar Easwaran <shankarke at gmail.com>
wrote:
> Doing it for every input file is not useful, as some of the input files
> are not representable in YAML form. Examples are shared libraries.
>
> The reason I made the YAML pass run before the writer was that the
> intermediate result was more complete, since all atoms have been resolved
> at that point and the state of all atoms is much saner.
>
> It was also easy to use the pass manager. The code needed was very small
> to achieve what we are trying to do: ensure that all the information for
> the writer is passed through references or atom properties.
>
> Shankar Easwaran
>
>
>
> On Feb 6, 2015, at 19:54, Rui Ueyama <ruiu at google.com> wrote:
>
> On Fri, Feb 6, 2015 at 5:42 PM, Michael Spencer <bigcheesegs at gmail.com>
> wrote:
>
>> On Fri, Feb 6, 2015 at 5:31 PM, Rui Ueyama <ruiu at google.com> wrote:
>> > There are two questions.
>> >
>> > Firstly, do you think the on-disk format needs to be compatible with a
>> > C++ struct so that we can cast that memory buffer to the struct? That
>> > may be super-fast, but it also comes with many limitations. It's hard
>> > to extend, for example. Every time we want to store variable-length
>> > objects we need to define a string-table-like data structure. And I'm
>> > not very sure that it's the fastest: because mmap'able objects are not
>> > very compact on disk, slow disk IO could be a bottleneck compared with
>> > a more compact file format. I believe Protobufs or Thrift are fast
>> > enough, or might even be faster.
>>
>> I'm not sure here. Although I do question if the object files will
>> even need to be read from disk in your standard edit/compile/debug
>> loop or on a build server. I believe we'll need real data to
>> determine this.
>>
>> >
>> > Secondly, do you know why we are dumping the post-linked object file
>> > to Native format? If we want to have a different kind of *object* file
>> > format, we would want to have a tool to convert an object file in an
>> > existing file format (say, ELF) to "native", and teach LLD how to read
>> > from the file. Currently we are writing a file in the middle of the
>> > linking process, which doesn't make sense to me.
>>
>> This is an artifact of having the native format before we had any
>> readers. I agree that it's weird and not terribly useful to write to
>> native format in the middle of the link, although I have found it
>> helpful to output yaml. There's no need to be able to read it back in
>> and resume though.
>>
>
> Even for YAML it doesn't make much sense to write it to a file and read it
> back from the file in the middle of the link, does it? I found that being
> able to output YAML is useful too, but round-trip is a different thing. In
> the middle of the process, we have a bunch of additional information that
> doesn't exist in input files and doesn't have to be output to the link
> result. The ability to serialize that intermediate result is not useful.
>
> Shankar, you added these round-trip tests. Do you have any opinion?
>
>> Ideally lld -r would be the tool we use to convert COFF/ELF/MachO to
>> the native format.
>>
>> - Michael Spencer
>>
>> >
>> > On Fri, Feb 6, 2015 at 5:02 PM, Michael Spencer
>> > <bigcheesegs at gmail.com> wrote:
>> >>
>> >> On Fri, Feb 6, 2015 at 2:54 PM, Rui Ueyama <ruiu at google.com> wrote:
>> >> > Can we remove Native format support? I'd like to get input from
>> >> > anyone who wants to keep the current Native format in LLD.
>> >>
>> >> One of the original goals for LLD was to provide a new object file
>> >> format for performance. The reason it is not used currently is because
>> >> we've yet to teach llvm to generate it, and we haven't done that
>> >> because it hasn't been finalized yet. The value it currently provides
>> >> is catching stuff like this, so we can fix it now instead of down the
>> >> road when we actually productize the native format.
>> >>
>> >> As for the specific implementation of the native format, I'm open to
>> >> an extensible format, but only if the performance cost is low.
>> >>
>> >> - Michael Spencer
>> >>
>> >> >
>> >> > On Thu, Feb 5, 2015 at 2:03 PM, Shankar Easwaran
>> >> > <shankare at codeaurora.org> wrote:
>> >> >>
>> >> >> The only way currently is to create a new reference, unless we can
>> >> >> think of adding some target-specific metadata information to the
>> >> >> Atom model.
>> >> >>
>> >> >> This has come up over and over again: we need something in the Atom
>> >> >> model to store information that is target specific.
>> >> >>
>> >> >> Shankar Easwaran
>> >> >>
>> >> >>
>> >> >> On 2/5/2015 2:22 PM, Simon Atanasyan wrote:
>> >> >>>
>> >> >>> Hi,
>> >> >>>
>> >> >>> I need advice on the implementation of a very specific kind of
>> >> >>> relocation used by the MIPS N64 ABI. As usual, the main problem is
>> >> >>> how to pass target-specific data over the Native/YAML conversion
>> >> >>> barrier.
>> >> >>>
>> >> >>> In this ABI the relocation record's r_info field in fact consists
>> >> >>> of five subfields:
>> >> >>> * r_sym   - symbol index
>> >> >>> * r_ssym  - special symbol
>> >> >>> * r_type3 - third relocation type
>> >> >>> * r_type2 - second relocation type
>> >> >>> * r_type  - first relocation type
>> >> >>>
>> >> >>> Up to three of these relocations are applied one by one. The first
>> >> >>> relocation uses an addend from the relocation record. Each
>> >> >>> subsequent relocation takes as its addend the result of the
>> >> >>> previous operation. Only the final operation actually modifies the
>> >> >>> location relocated. The first relocation uses as a reference the
>> >> >>> symbol specified by the r_sym field. The third relocation assumes
>> >> >>> a NULL symbol.
>> >> >>>
>> >> >>> The most interesting case is the second relocation. It uses the
>> >> >>> special symbol value given by the r_ssym field. This field can
>> >> >>> contain four predefined values:
>> >> >>> * RSS_UNDEF - zero value
>> >> >>> * RSS_GP    - value of gp symbol
>> >> >>> * RSS_GP0   - gp0 value taken from the .MIPS.options or .reginfo
>> >> >>>               section
>> >> >>> * RSS_LOC   - address of location being relocated
>> >> >>>
>> >> >>> So the problem is how to store these four constants in the
>> >> >>> lld::Reference object.
>> >> >>> RSS_UNDEF is obviously not a problem. To represent the RSS_GP value
>> >> >>> I can set an AbsoluteAtom created for "_gp" as the reference's
>> >> >>> target. But what about RSS_GP0 and RSS_LOC? I am considering the
>> >> >>> following approaches but cannot select the best one:
>> >> >>>
>> >> >>> a) Create an AbsoluteAtom for each of these cases and set them as
>> >> >>>    the reference's target.
>> >> >>>    The problem is that these atoms are fake and should not go into
>> >> >>>    the symbol table.
>> >> >>>    One more problem is selecting unique names for these atoms.
>> >> >>> b) Use the two high bits of the lld::Reference::_kindValue field to
>> >> >>>    encode the RSS_xxx value.
>> >> >>>    Then decode these bits in the RelocationHandler to calculate the
>> >> >>>    result of the relocation.
>> >> >>>    In that case the problem is how to represent a relocation kind
>> >> >>>    value in YAML format.
>> >> >>>    The simple xxxRelocationStringTable::kindStrings[] array will not
>> >> >>>    satisfy us.
>> >> >>> c) Add one more field to the lld::Reference class, something like
>> >> >>>    the DefinedAtom::CodeModel field.
>> >> >>>
>> >> >>> Any advice, ideas, and/or objections are much appreciated.
>> >> >>>
>> >> >>
>> >> >>
>> >> >> --
>> >> >> Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
>> >> >> hosted by the Linux Foundation
>> >> >>
>> >> >
>> >> >
>> >> > _______________________________________________
>> >> > LLVM Developers mailing list
>> >> > LLVMdev at cs.uiuc.edu         http://llvm.cs.uiuc.edu
>> >> > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
>> >> >
>> >
>> >
>>
>
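
The chained evaluation described in the quoted message (first relocation against r_sym with the record's addend, second against the r_ssym special value, third against a NULL symbol, each taking the previous result as its addend) can be sketched roughly as below. This is a sketch under stated assumptions, not lld code: the struct and function names are hypothetical, the numeric values of the RSS_xxx selectors are assumed, the r_info subfields are taken as already unpacked, and the ToyOp operations are simplified stand-ins for real R_MIPS_* relocation formulas.

```cpp
#include <cstdint>

// Special-symbol selectors named in the thread; numeric values are assumed.
enum SpecialSym : uint8_t { RSS_UNDEF = 0, RSS_GP = 1, RSS_GP0 = 2, RSS_LOC = 3 };

// Toy operations, hypothetical stand-ins for real relocation types.
enum ToyOp : uint8_t { OP_NONE = 0, OP_ADD = 1, OP_SUB = 2 };

// The five r_info subfields (already unpacked) plus the record's addend.
struct N64Reloc {
  uint32_t sym;
  SpecialSym ssym;
  ToyOp type3, type2, type;
  int64_t addend;
};

// Values the linker would know when applying the relocation.
struct LinkContext {
  uint64_t symValue; // value of the symbol selected by r_sym
  uint64_t gp;       // value of the gp symbol
  uint64_t gp0;      // gp0 from .MIPS.options or .reginfo
  uint64_t location; // address of the location being relocated
};

// Map an r_ssym selector to the value used by the second relocation.
uint64_t specialSymValue(SpecialSym s, const LinkContext &ctx) {
  switch (s) {
  case RSS_GP:    return ctx.gp;
  case RSS_GP0:   return ctx.gp0;
  case RSS_LOC:   return ctx.location;
  case RSS_UNDEF: break;
  }
  return 0;
}

// Apply one toy operation to symbol value s and addend a.
int64_t applyOp(ToyOp op, uint64_t s, int64_t a) {
  switch (op) {
  case OP_ADD:  return static_cast<int64_t>(s) + a;
  case OP_SUB:  return static_cast<int64_t>(s) - a;
  case OP_NONE: break;
  }
  return a; // no operation: pass the addend through unchanged
}

// Run the chain; only this final value would be written to the location.
int64_t evaluateChain(const N64Reloc &r, const LinkContext &ctx) {
  int64_t v = applyOp(r.type, ctx.symValue, r.addend);      // uses r_sym
  v = applyOp(r.type2, specialSymValue(r.ssym, ctx), v);    // uses r_ssym
  v = applyOp(r.type3, 0, v);                               // NULL symbol
  return v;
}
```

In this model the special value needed by the second step is looked up from link-time context rather than from a named atom, which is why RSS_GP0 and RSS_LOC have no natural atom to point a reference at.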
