On Oct 31, 2012, at 7:46 PM, Shankar Easwaran wrote:
> Hi Nick,
>>> The range of flags would be integers ranging from LOW_PROC ..
>>> HIGH_PROC.
>>>
>>> The Generic flags would be within the range less than LOW_PROC and
>>> greater than HIGH_PROC. Any value within the range LOW_PROC .. HIGH_PROC is
>>> os/platform specific.
>>>
>>> What I was thinking was there could be a uint32_t flags() in the
>>> definedAtom which returns the flags, and platforms can act accordingly on the
>>> meaning of the flags in their pieces of code.
>>>
>>> What do you think ?
>> You still have not given an example of what information is missing in
>> the current Atom model that is driving the need for this.
>>
>> It sounds like your flags() returns a value, not a set of bits, which
>> means it can only be used for one thing. What if you need two or more kinds of
>> information/attributes not in the Atom model? I don't see why LOW_PROC,
>> HIGH_PROC is needed. If we decide there are new kinds of information/attributes
>> that are general we would just define new methods on Atom, rather than define a
>> value to be returned by flags().
> There are two use cases that I can think of now:
>
> 1) flags: These are used to determine what the Atom contains in addition
> to the content; it could be that the Atom has
> a) follow on reference
> b) atom is part of a group that other atoms are also part of
>
> The flags could be used to determine if there is a follow-on atom or if the
> atom is part of a group.
> Both of them would be more useful than iterating through the reference list
> and figuring out whether there is a follow-on reference or the atom is
> part of a group.
I see layout constraints (follow-on) and grouping as a natural use for
References. It seems your concern is just performance. I would wait and see
whether searching References for special kinds is actually a bottleneck in
practice; then we can talk about ways to improve the performance.
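
A minimal sketch of what searching an atom's References for a special kind
might look like. The Atom, Reference and RefKind types below are hypothetical
stand-ins for illustration only, not lld's actual classes:

```cpp
#include <cstdint>
#include <vector>

// Hypothetical reference kinds; the real set of kinds is an assumption here.
enum class RefKind : uint8_t { Normal, LayoutAfter, InGroup };

struct Atom;

// Hypothetical reference: a kind plus the atom it points at.
struct Reference {
  RefKind kind;
  const Atom *target;
};

// Hypothetical atom that simply owns a list of references.
struct Atom {
  std::vector<Reference> refs;
};

// Linear scan for the first reference of the requested kind.
// Returns nullptr when the atom has no such reference.
inline const Reference *findReference(const Atom &atom, RefKind kind) {
  for (const Reference &ref : atom.refs)
    if (ref.kind == kind)
      return &ref;
  return nullptr;
}

// Answers "is there a follow-on atom?" without any separate flags word.
inline bool hasFollowOn(const Atom &atom) {
  return findReference(atom, RefKind::LayoutAfter) != nullptr;
}
```

The scan is linear in the number of references on one atom, which is why it
is worth measuring before adding a second place to store the same fact.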
>
> 2) Atom specific content types
>
> This is where LOW_PROC and HIGH_PROC come in: there are content types
> which are architecture specific.
>
> Currently there are many types defined within contentType which are
> operating system specific. As more environments start using lld, I feel that
> many architectures would want to add their own.
I've already added all the content types that darwin needs. I think it is
fine for you to add any that ELF needs.
>
> Example for GNU support would include
>
> a) checksum
> b) hash
> c) gnu prelink library list
These are actually generic attributes that could be made into real attributes
(methods) of DefinedAtom.
I've been thinking about adding something like a checksum for use in
coalescing by-content (for instance coalescing duplicate c-strings or other
constants). For that, having a checksum would speed up comparisons.
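
As a rough sketch of how a checksum could speed up by-content comparison: the
ConstantAtom struct, makeConstant() and sameContent() below are hypothetical
names, and FNV-1a is picked only because it is short, not because it is the
checksum that would actually be used:

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical constant atom: raw bytes plus a checksum computed once.
struct ConstantAtom {
  std::string bytes;
  uint64_t checksum;
};

// 64-bit FNV-1a over the content; any content hash would do here.
inline uint64_t fnv1a(const std::string &data) {
  uint64_t h = 14695981039346656037ull; // FNV offset basis
  for (unsigned char c : data) {
    h ^= c;
    h *= 1099511628211ull; // FNV prime
  }
  return h;
}

// Builds an atom and caches the checksum of its content.
inline ConstantAtom makeConstant(std::string bytes) {
  uint64_t sum = fnv1a(bytes);
  return ConstantAtom{std::move(bytes), sum};
}

// Different checksums prove the contents differ, so the byte compare is
// skipped. Equal checksums can still collide, so the full compare stays.
inline bool sameContent(const ConstantAtom &a, const ConstantAtom &b) {
  return a.checksum == b.checksum && a.bytes == b.bytes;
}
```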
I'm not sure if you mean a hash of the content or hash of the name. On the
name side, I've had thoughts of reworking Atom::name() to return some new
abstract type like SymbolName, instead of StringRef. The idea is that
SymbolName maintains a hash for the string so equality checks are fast. It can
also be used to help reduce the size of the new "native" format for
object files of C++ code. C++ (especially with namespaces) generates huge
symbol names. A more compact format would be to factor out all the common
substrings and use a dictionary coder. Thus in the native object file, each
symbol name is some data stream of chars and dictionary indices.
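
To make the cached-hash half of that idea concrete, here is a minimal sketch.
The SymbolName class below is hypothetical (nothing like it exists yet), it
uses std::string and std::hash in place of StringRef, and it leaves out the
dictionary-coded storage entirely:

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical SymbolName: owns the text and caches its hash.
class SymbolName {
public:
  // text_ is declared before hash_, so it is initialized first and the
  // hash is computed from the already-moved-in string.
  explicit SymbolName(std::string text)
      : text_(std::move(text)), hash_(std::hash<std::string>{}(text_)) {}

  const std::string &str() const { return text_; }
  std::size_t hash() const { return hash_; }

  // Different hashes prove the names differ; equal hashes still need a
  // character compare because hashes can collide.
  friend bool operator==(const SymbolName &a, const SymbolName &b) {
    return a.hash_ == b.hash_ && a.text_ == b.text_;
  }
  friend bool operator!=(const SymbolName &a, const SymbolName &b) {
    return !(a == b);
  }

private:
  std::string text_;
  std::size_t hash_;
};
```

With long mangled names that share long prefixes, the hash check rejects
most mismatches before any characters are compared.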
I'm not sure what "gnu prelink library list" has to do with
individual Atoms.
-Nick
> I believe both of them would be solvable by using a 64-bit unsigned
> integer, wherein the lower half is used by content type and the upper is used
> by flags. I don't think we would need more than 32 flags anytime soon. But
> at least there is a possibility of adding more flags.
>
> I think flags should be supported only by lld Core.
>
> Thanks
>
> Shankar Easwaran