Shankar and I discussed input file handling, and we came up with a design that may greatly simplify it while giving developers more flexibility to support complicated options, such as --{start,end}-group, -z rescan or -z rescan-now. It'd be worth pursuing, so here's the idea:

1. We probably wouldn't want to let Resolver handle the input graph directly, because we don't want it to implement the details of input graph control node behavior for --{start,end}-group, -z rescan{,-now}, or whatever you define as a new linker option. Implementing that in Resolver would make it messy.

2. We would instead add a new method nextFile() to LinkingContext, which returns a new file from which Resolver should try to resolve undefined symbols. Resolver wouldn't handle the input graph at all. Instead, Resolver calls nextFile() repeatedly to link until nextFile() returns a null value.

3. Resolver will notify Linking Context when (A) all undefined symbols are resolved (success), or (B) it detects it's in an infinite loop in which it cannot resolve any symbols anymore (failure). Linking Context will do whatever it thinks appropriate for the event notification.

So, with this mechanism, one can implement --{start,end}-group this way: Assume command line *--start-group a b c --end-group d*. nextFile() returns *a, b, c, a, b, c, a, ...* until it gets notified that Resolver cannot resolve symbols any longer. Once notified, the next call of nextFile() will return *d*. This is how "search the files in the group repeatedly until all possible references are resolved" is implemented.

"-z rescan", which is an option that lets the linker reprocess all files up to the option's position on the command line, can be implemented this way: Assume command line *a b -z rescan c*. nextFile() returns *a b a b c* in this order. That makes Resolver process files *a* and *b* twice.
The logic for --whole-archive can also be implemented: Linking Context should usually return a null value from nextFile() once it is notified that all symbols are resolved, in order to terminate Resolver. However, if the --whole-archive option is given, it should continue to return new files so that Resolver includes all files that appear on the command line.

Overall, this seems to be a clean API that is powerful enough to implement complex semantics.
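To make the --{start,end}-group example concrete, here is a minimal C++ sketch of a context that feeds files through nextFile(). Only the name nextFile() comes from the proposal; File, GroupContext, notifyNoProgress() and notifyAllResolved() are hypothetical names chosen for illustration, and the real lld classes may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an input file.
struct File {
  std::string name;
};

// Hypothetical context for "--start-group <group...> --end-group <tail...>".
class GroupContext {
public:
  GroupContext(std::vector<File> group, std::vector<File> tail)
      : group_(std::move(group)), tail_(std::move(tail)) {}

  // Returns the next file to link, or nullptr to terminate Resolver.
  const File *nextFile() {
    if (allResolved_)
      return nullptr;
    if (!groupDone_ && !group_.empty()) {
      assert(groupIndex_ < group_.size());
      const File *f = &group_[groupIndex_];
      groupIndex_ = (groupIndex_ + 1) % group_.size(); // cycle a, b, c, a, ...
      return f;
    }
    if (tailIndex_ < tail_.size())
      return &tail_[tailIndex_++];
    return nullptr;
  }

  // Event (B): Resolver reports that it cannot resolve any more symbols.
  void notifyNoProgress() { groupDone_ = true; }

  // Event (A): Resolver reports that all undefined symbols are resolved.
  void notifyAllResolved() { allResolved_ = true; }

private:
  std::vector<File> group_;
  std::vector<File> tail_;
  std::size_t groupIndex_ = 0;
  std::size_t tailIndex_ = 0;
  bool groupDone_ = false;
  bool allResolved_ = false;
};
```

With group {a, b, c} and tail {d}, repeated calls yield a, b, c, a, b, c, a, ... and the first call after notifyNoProgress() yields d, matching the sequence described above.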
Thanks, Ruiu, for putting this together.

I like this proposal very much, as it simplifies the Resolver a lot and leaves it to the LinkingContext to decide how symbol resolution should happen.

The Resolver will have a state:

    enum ResolverState : uint8_t {
      unChanged = 0x0,              // no change
      newWeakAtoms = 0x1,           // Found weak atoms
      newUndefinedAtoms = 0x4,      // Added new undefined atoms
      newDefinedAtoms = 0x8,        // Added new defined atoms
      newAbsoluteAtoms = 0x10,      // Added new absolute atoms
      newSharedLibraryAtoms = 0x20, // Added new shared library atoms
    };

nextFile() could be renamed to nextFiles() and return a range of input files, which makes it much easier to handle the case where there are no groups.

Thanks

Shankar Easwaran

On 9/20/2013 4:30 PM, Rui Ueyama wrote:
> Shankar and I discussed input file handling, and we came up with a design
> that may greatly simplify it while giving developers more flexibility to
> support complicated options, such as --{start,end}-group, -z rescan or
> -z rescan-now. It'd be worth pursuing, so here's the idea:
>
> 1. We probably wouldn't want to let Resolver handle the input graph
> directly, because we don't want it to implement the details of input graph
> control node behavior for --{start,end}-group, -z rescan{,-now}, or whatever
> you define as a new linker option. Implementing that in Resolver would make
> it messy.
>
> 2. We would instead add a new method nextFile() to LinkingContext, which
> returns a new file from which Resolver should try to resolve undefined
> symbols. Resolver wouldn't handle the input graph at all. Instead, Resolver
> calls nextFile() repeatedly to link until nextFile() returns a null value.
>
> 3. Resolver will notify Linking Context when (A) all undefined symbols are
> resolved (success), or (B) it detects it's in an infinite loop in which it
> cannot resolve any symbols anymore (failure). Linking Context will do
> whatever it thinks appropriate for the event notification.
>
> So, with this mechanism, one can implement --{start,end}-group this way:
> Assume command line *--start-group a b c --end-group d*. nextFile() returns
> *a, b, c, a, b, c, a, ...* until it gets notified that Resolver cannot
> resolve symbols any longer. Once notified, the next call of nextFile() will
> return *d*. This is how "search the files in the group repeatedly until all
> possible references are resolved" is implemented.
>
> "-z rescan", which is an option that lets the linker reprocess all files up
> to the option's position on the command line, can be implemented this way:
> Assume command line *a b -z rescan c*. nextFile() returns *a b a b c* in
> this order. That makes Resolver process files *a* and *b* twice.
>
> The logic for --whole-archive can also be implemented: Linking Context
> should usually return a null value from nextFile() once it is notified that
> all symbols are resolved, in order to terminate Resolver. However, if the
> --whole-archive option is given, it should continue to return new files so
> that Resolver includes all files that appear on the command line.
>
> Overall, this seems to be a clean API that is powerful enough to implement
> complex semantics.
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev

--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by the Linux Foundation
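Since the ResolverState values in Shankar's reply are distinct bits, several events can be combined into one running state. The following C++ sketch shows one way that could work. The enumerator names and values are copied from the message; addEvent() and hasEvent() are hypothetical helpers that do not appear in the thread.

```cpp
#include <cassert>
#include <cstdint>

// Flag names and values as given in the message above.
enum ResolverState : std::uint8_t {
  unChanged = 0x0,
  newWeakAtoms = 0x1,
  newUndefinedAtoms = 0x4,
  newDefinedAtoms = 0x8,
  newAbsoluteAtoms = 0x10,
  newSharedLibraryAtoms = 0x20,
};

// Hypothetical helper: ORs one more event into the running state.
inline std::uint8_t addEvent(std::uint8_t state, ResolverState event) {
  return static_cast<std::uint8_t>(state | event);
}

// Hypothetical helper: tests whether a given event was recorded.
inline bool hasEvent(std::uint8_t state, ResolverState event) {
  assert(event != unChanged && "unChanged means no event was recorded");
  return (state & event) != 0;
}
```

A Resolver could start each file at unChanged, call addEvent() whenever it adds an atom of some kind, and hand the accumulated value to the context afterwards.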
Rui,

I like this in general, but have a few questions.

On Sep 20, 2013, at 2:30 PM, Rui Ueyama <ruiu at google.com> wrote:
> 2. We would instead add a new method nextFile() to LinkingContext, which returns a new file from which Resolver should try to resolve undefined symbols. Resolver wouldn't handle the input graph at all. Instead, Resolver calls nextFile() repeatedly to link until nextFile() returns a null value.
>
> 3. Resolver will notify Linking Context when (A) all undefined symbols are resolved (success), or (B) it detects it's in an infinite loop in which it cannot resolve any symbols anymore (failure). Linking Context will do whatever it thinks appropriate for the event notification.

How does the Resolver detect an infinite loop? As in the example below, it is supposed to keep getting a,b,c,a,b,c… At some point, no more undefines are being fulfilled by a,b,c,a,b,c…, but how does the resolver know that the files are repeating, to know to tell the InputGraph to move on?

-Nick
Hi Nick,

On 9/20/2013 5:29 PM, Nick Kledzik wrote:
> Rui,
>
> I like this in general, but have a few questions.
>
> On Sep 20, 2013, at 2:30 PM, Rui Ueyama <ruiu at google.com> wrote:
>
>> 2. We would instead add a new method nextFile() to LinkingContext, which returns a new file from which Resolver should try to resolve undefined symbols. Resolver wouldn't handle the input graph at all. Instead, Resolver calls nextFile() repeatedly to link until nextFile() returns a null value.
>>
>> 3. Resolver will notify Linking Context when (A) all undefined symbols are resolved (success), or (B) it detects it's in an infinite loop in which it cannot resolve any symbols anymore (failure). Linking Context will do whatever it thinks appropriate for the event notification.
> How does the Resolver detect an infinite loop? As in the example below, it is supposed to keep getting a,b,c,a,b,c… At some point, no more undefines are being fulfilled by a,b,c,a,b,c…, but how does the resolver know that the files are repeating, to know to tell the InputGraph to move on?

nextFile() could be passed the current resolver state each time it is called, so that the LinkingContext can return the next file to be processed, as below:

nextFile(currentResolverState):

a) Returns the next file if the current node is not a group node.
b) Returns the next file in the group node, if the current node is a group node and the resolver state says undefined/weak/shared library atoms were added.
c) Returns the start file in the group node, if the resolver state says undefined/weak/shared library atoms were added.
d) If the state is unchanged (no symbols were added), exits the group and moves to the next node.

Thanks

Shankar Easwaran
On Sep 20, 2013, at 3:37 PM, Rui Ueyama <ruiu at google.com> wrote:
> On Fri, Sep 20, 2013 at 3:29 PM, Nick Kledzik <kledzik at apple.com> wrote:
>> Rui,
>>
>> I like this in general, but have a few questions.
>>
>> On Sep 20, 2013, at 2:30 PM, Rui Ueyama <ruiu at google.com> wrote:
>>
>>> 2. We would instead add a new method nextFile() to LinkingContext, which returns a new file from which Resolver should try to resolve undefined symbols. Resolver wouldn't handle the input graph at all. Instead, Resolver calls nextFile() repeatedly to link until nextFile() returns a null value.
>>>
>>> 3. Resolver will notify Linking Context when (A) all undefined symbols are resolved (success), or (B) it detects it's in an infinite loop in which it cannot resolve any symbols anymore (failure). Linking Context will do whatever it thinks appropriate for the event notification.
>>
>> How does the Resolver detect an infinite loop? As in the example below, it is supposed to keep getting a,b,c,a,b,c… At some point, no more undefines are being fulfilled by a,b,c,a,b,c…, but how does the resolver know that the files are repeating, to know to tell the InputGraph to move on?
>
> nextFile() should return the same file object for the same file, so Resolver can notice if it hasn't resolved any symbols since the last time it saw the same file.

That is not how groups work. The resolver has to keep checking each file in the group, in order, until there is one full pass of the group and none in the group provided any more content.
On Sep 20, 2013, at 3:42 PM, Shankar Easwaran <shankare at codeaurora.org> wrote:
> nextFile() could be passed the current resolver state each time it is called, so that the LinkingContext can return the next file to be processed, as below:
>
> nextFile(currentResolverState):
>
> a) Returns the next file if the current node is not a group node.
> b) Returns the next file in the group node, if the current node is a group node and the resolver state says undefined/weak/shared library atoms were added.
> c) Returns the start file in the group node, if the resolver state says undefined/weak/shared library atoms were added.
> d) If the state is unchanged (no symbols were added), exits the group and moves to the next node.

What causes the Resolver state to change? I understand the state of “there are undefines remaining”, but the “something was added” is a transient state. Each file processed changes it.

The InputGraph knows how to exit a group, but does not know if it should. The Resolver knows which files added content, but does not know which files are in what groups.

How about this for InputGraph:

    void fileAddedContent(File *);  // asserts that File* param is the same thing nextFile() last returned
    File *nextFile();

The Resolver also tells the InputGraph which files it used. When in a group, the InputGraph keeps a flag for the group recording whether it has contributed anything. The flag starts as false at the first entry in the group. If fileAddedContent() is called, the flag is set. When nextFile() is called and the previous entry was the last in the group, if the flag says nothing in that pass contributed, the InputGraph exits the group and moves on to the next file.

-Nick
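Nick's two-method InputGraph interface might be sketched in C++ roughly as follows. The method names fileAddedContent() and nextFile() come from his message; the File struct, the single group followed by trailing files, and the reset of the flag at the start of each new pass are assumptions made to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an input file.
struct File {
  std::string name;
};

// Hypothetical InputGraph holding one group followed by ordinary files.
class InputGraph {
public:
  InputGraph(std::vector<File> group, std::vector<File> tail)
      : group_(std::move(group)), tail_(std::move(tail)) {}

  // Resolver calls this when the file it just processed added content.
  void fileAddedContent(File *f) {
    assert(f != nullptr && f == last_ &&
           "must be the file nextFile() last returned");
    passContributed_ = true;
  }

  // Returns the next file, or nullptr when there is nothing left.
  File *nextFile() {
    if (inGroup_) {
      if (pos_ == group_.size()) { // previous entry was the last in the group
        if (passContributed_) {
          passContributed_ = false; // assumed: flag is reset for the new pass
        } else {
          inGroup_ = false; // a full pass added nothing: exit the group
        }
        pos_ = 0;
      }
      if (inGroup_)
        return last_ = &group_[pos_++];
    }
    if (pos_ < tail_.size())
      return last_ = &tail_[pos_++];
    return last_ = nullptr;
  }

private:
  std::vector<File> group_;
  std::vector<File> tail_;
  std::size_t pos_ = 0;
  File *last_ = nullptr;
  bool inGroup_ = true;
  bool passContributed_ = false;
};
```

In this sketch the group is left only after one complete pass in which fileAddedContent() was never called, which is the exit condition Nick describes.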
On Fri, Sep 20, 2013 at 3:59 PM, Nick Kledzik <kledzik at apple.com> wrote:
> On Sep 20, 2013, at 3:37 PM, Rui Ueyama <ruiu at google.com> wrote:
>> On Fri, Sep 20, 2013 at 3:29 PM, Nick Kledzik <kledzik at apple.com> wrote:
>>> Rui,
>>>
>>> I like this in general, but have a few questions.
>>>
>>> On Sep 20, 2013, at 2:30 PM, Rui Ueyama <ruiu at google.com> wrote:
>>>> 2. We would instead add a new method nextFile() to LinkingContext, which returns a new file from which Resolver should try to resolve undefined symbols. Resolver wouldn't handle the input graph at all. Instead, Resolver calls nextFile() repeatedly to link until nextFile() returns a null value.
>>>>
>>>> 3. Resolver will notify Linking Context when (A) all undefined symbols are resolved (success), or (B) it detects it's in an infinite loop in which it cannot resolve any symbols anymore (failure). Linking Context will do whatever it thinks appropriate for the event notification.
>>>
>>> How does the Resolver detect an infinite loop? As in the example below, it is supposed to keep getting a,b,c,a,b,c… At some point, no more undefines are being fulfilled by a,b,c,a,b,c…, but how does the resolver know that the files are repeating, to know to tell the InputGraph to move on?
>>
>> nextFile() should return the same file object for the same file, so Resolver can notice if it hasn't resolved any symbols since the last time it saw the same file.
>
> That is not how groups work. The resolver has to keep checking each file in the group, in order, until there is one full pass of the group and none in the group provided any more content.
>
> On Sep 20, 2013, at 3:42 PM, Shankar Easwaran <shankare at codeaurora.org> wrote:
>> nextFile() could be passed the current resolver state each time it is called, so that the LinkingContext can return the next file to be processed, as below:
>>
>> nextFile(currentResolverState):
>>
>> a) Returns the next file if the current node is not a group node.
>> b) Returns the next file in the group node, if the current node is a group node and the resolver state says undefined/weak/shared library atoms were added.
>> c) Returns the start file in the group node, if the resolver state says undefined/weak/shared library atoms were added.
>> d) If the state is unchanged (no symbols were added), exits the group and moves to the next node.
>
> What causes the Resolver state to change? I understand the state of “there are undefines remaining”, but the “something was added” is a transient state. Each file processed changes it.
>
> The InputGraph knows how to exit a group, but does not know if it should. The Resolver knows which files added content, but does not know which files are in what groups.
>
> How about this for InputGraph:
>
>     void fileAddedContent(File *);  // asserts that File* param is the same thing nextFile() last returned
>     File *nextFile();
>
> The Resolver also tells the InputGraph which files it used. When in a group, the InputGraph keeps a flag for the group recording whether it has contributed anything. The flag starts as false at the first entry in the group. If fileAddedContent() is called, the flag is set. When nextFile() is called and the previous entry was the last in the group, if the flag says nothing in that pass contributed, the InputGraph exits the group and moves on to the next file.

I don't want Resolver to hold a reference to the input graph. The point of this proposal is to separate input graph handling from Resolver and have Linking Context do that task instead. Resolver shouldn't know how to iterate over input files; it should focus only on core linking. Linking Context should implement the policy and "feed" input files to Resolver. Resolver should blindly consume an input file given by Linking Context, and report its status to Linking Context.

How about this: add integer parameters to nextFile() to report the current number of undefined and defined atoms. With that information, Linking Context is able to decide what file should be fed to Resolver next. For example, it can detect from the atom counts whether Resolver needs to exit the --{start,end}-group loop or not.
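The atom-count variant could look something like the C++ sketch below. It assumes that a group pass made no progress when both counts at the end of the pass equal the counts at its start; that comparison rule, the class name CountingContext and the File struct are illustrative assumptions, not something the thread specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an input file.
struct File {
  std::string name;
};

// Hypothetical context: one group followed by ordinary files.
class CountingContext {
public:
  CountingContext(std::vector<File> group, std::vector<File> tail)
      : group_(std::move(group)), tail_(std::move(tail)) {}

  // The counts describe Resolver's symbol table after the previous file.
  const File *nextFile(std::size_t undefinedAtoms, std::size_t definedAtoms) {
    const Counts now(undefinedAtoms, definedAtoms);
    if (inGroup_) {
      assert(pos_ <= group_.size());
      if (pos_ == 0) {
        passStart_ = now; // first call: remember counts at the start of the pass
      } else if (pos_ == group_.size()) {
        if (now == passStart_)
          inGroup_ = false; // assumed rule: unchanged counts mean no progress
        else
          passStart_ = now; // something changed: start another pass
        pos_ = 0;
      }
      if (inGroup_ && !group_.empty())
        return &group_[pos_++];
      inGroup_ = false;
    }
    if (pos_ < tail_.size())
      return &tail_[pos_++];
    return nullptr;
  }

private:
  using Counts = std::pair<std::size_t, std::size_t>;
  std::vector<File> group_;
  std::vector<File> tail_;
  Counts passStart_{0, 0};
  std::size_t pos_ = 0;
  bool inGroup_ = true;
};
```

Resolver stays unaware of groups here: it only reports two integers, and the policy of when to leave the group lives entirely in the context.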
Hi Nick,

On 9/20/2013 5:59 PM, Nick Kledzik wrote:
> On Sep 20, 2013, at 3:42 PM, Shankar Easwaran <shankare at codeaurora.org> wrote:
>> nextFile() could be passed the current resolver state each time it is called, so that the LinkingContext can return the next file to be processed, as below:
>>
>> nextFile(currentResolverState):
>>
>> a) Returns the next file if the current node is not a group node.
>> b) Returns the next file in the group node, if the current node is a group node and the resolver state says undefined/weak/shared library atoms were added.
>> c) Returns the start file in the group node, if the resolver state says undefined/weak/shared library atoms were added.
>> d) If the state is unchanged (no symbols were added), exits the group and moves to the next node.
> What causes the Resolver state to change? I understand the state of “there are undefines remaining”, but the “something was added” is a transient state. Each file processed changes it.

Sorry for the long mail. This should explain things better.

Here is an example with a state diagram showing how the above proposal works. The main idea is for the Resolver to keep a running state and for the LinkingContext to capture the resolver state for each input file in the group.

lld -flavor gnu main.o thread.o --start-group libc.a libpthread.a --end-group function.o

main.o has atoms
------------------------
main (defined)
printf (undefined)
fn (undefined)

thread.o has atoms
-----------------------------
pthread_create (undefined)

libc.a(printf.o) has atoms
------------------------------------
printf (defined)

libc.a(exit.o) has atoms
----------------------------------
exit (defined)

libpthread.a has atoms
---------------------------------
pthread_create (defined)
exit (undefined)

function.o has atoms
-------------------------------
fn (defined)

State diagram with time information

Each step lists what the Resolver does, the resolverState it passes to nextFile(), and what the Context does inside nextFile().

Step 1
  Resolver:       resolverState = initialState
                  nextFile(resolverState)
  resolverState:  initialState
  Context:        ELFContextState = processingFileNode, return main.o

Step 2
  Resolver:       resolverState = nochange
                  process(main.o)
                  state = definedAtoms/undefinedAtoms (reason: main/printf)
                  nextFile(resolverState)
  resolverState:  definedAtoms/undefinedAtoms
  Context:        ELFContextState = processingFileNode, return thread.o

Step 3
  Resolver:       resolverState = nochange
                  process(thread.o)
                  state = undefinedAtoms (reason: pthread_create)
                  nextFile(resolverState)
  resolverState:  undefinedAtoms
  Context:        ELFContextState = processingGroupNode, return libc.a

Step 4
  Resolver:       resolverState = nochange
                  process(libc.a)
                  process(printf.o)
                  state = definedAtoms (reason: printf)
                  nextFile(resolverState)
  resolverState:  definedAtoms
  Context:        ELFContextState = processingGroupNode,
                  state[libc.a] = definedAtoms, return libpthread.a

Step 5
  Resolver:       resolverState = nochange
                  process(libpthread.a)
                  process(pthread.o)
                  state = definedAtoms/undefinedAtoms (reason: pthread_create/exit)
                  nextFile(resolverState)
  resolverState:  definedAtoms/undefinedAtoms
  Context:        ELFContextState = processingGroupNode,
                  state[libpthread.a] = definedAtoms|undefinedAtoms,
                  return libc.a (returns the first file in the group)

*LinkingContext would exit the GroupNode only if the state of each file in the group is unchanged, or has only definedAtoms.*

*Here, LinkingContext finds that libc.a has definedAtoms, whereas libpthread.a has undefinedAtoms, so it traverses the group again.*

Step 6
  Resolver:       resolverState = nochange
                  process(libc.a)
                  process(exit.o)
                  state = definedAtoms (reason: exit)
                  nextFile(resolverState)
  resolverState:  definedAtoms
  Context:        ELFContextState = processingGroupNode,
                  state[libc.a] = definedAtoms, return libpthread.a

Step 7
  Resolver:       resolverState = nochange
                  process(libpthread.a)
                  state = nochange
                  nextFile(resolverState)
  resolverState:  nochange
  Context:        ELFContextState = processingGroupNode,
                  state[libpthread.a] = nochange

*LinkingContext finds that libc.a state has "definedAtoms", and libpthread.a has "nochange", so it exits the group.*

Step 8
  Resolver:       resolverState = nochange
                  process(function.o)
                  state = definedAtoms (reason: fn)

Exit.

Thanks

Shankar Easwaran
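The walk-through above can be replayed with a small C++ sketch in which the context records the reported state per file and leaves a group only when every recorded state is either unchanged or contains nothing but the defined-atoms bit. The flag values are taken from the ResolverState enum posted earlier in the thread; Node, PerFileStateContext and the name-keyed map are hypothetical and stand in for the ELFContextState bookkeeping, which the mail does not spell out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Bit values from the ResolverState enum posted earlier in the thread.
constexpr std::uint8_t unChanged = 0x0;
constexpr std::uint8_t newUndefinedAtoms = 0x4;
constexpr std::uint8_t newDefinedAtoms = 0x8;

// Hypothetical input graph node: a single file or a group of files.
struct Node {
  bool isGroup;
  std::vector<std::string> files;
};

class PerFileStateContext {
public:
  explicit PerFileStateContext(std::vector<Node> nodes)
      : nodes_(std::move(nodes)) {}

  // `state` is what processing the previously returned file added.
  const std::string *nextFile(std::uint8_t state) {
    if (lastInGroup_)
      fileState_[*last_] = state; // state[file] in the diagram
    lastInGroup_ = false;
    while (node_ < nodes_.size()) {
      const Node &n = nodes_[node_];
      assert(pos_ <= n.files.size());
      if (pos_ < n.files.size()) {
        lastInGroup_ = n.isGroup;
        last_ = &n.files[pos_++];
        return last_;
      }
      if (n.isGroup && !settled(n)) {
        pos_ = 0; // some member added more than defined atoms: rescan
        continue;
      }
      ++node_; // plain file consumed, or group settled: move on
      pos_ = 0;
    }
    return nullptr;
  }

private:
  // True if every file's last state is unchanged or defined-atoms only.
  bool settled(const Node &n) {
    return std::all_of(n.files.begin(), n.files.end(),
                       [this](const std::string &f) {
                         return (fileState_[f] & ~newDefinedAtoms) == 0;
                       });
  }

  std::vector<Node> nodes_;
  std::map<std::string, std::uint8_t> fileState_;
  std::size_t node_ = 0;
  std::size_t pos_ = 0;
  const std::string *last_ = nullptr;
  bool lastInGroup_ = false;
};
```

Feeding it the states from the eight steps of the walk-through yields main.o, thread.o, libc.a, libpthread.a, libc.a, libpthread.a, function.o and then a null pointer.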