Mani, Suresh via llvm-dev
2019-Jun-21 08:08 UTC
[llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)
Thanks for the info, Teresa.

Regards,
M Suresh

From: Teresa Johnson <tejohnson at google.com>
Sent: Thursday, June 20, 2019 7:15 PM
To: Mani, Suresh <Suresh.Mani at amd.com>
Cc: Rui Ueyama <ruiu at google.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)

I haven't had a chance to look, but as mentioned, the linker resolution for the symbol is exported, which explains the LTO side behavior. Someone from the linker side will probably need to see what changed after that patch in the symbol info they are giving LTO. If you want, you can debug lld's BitcodeCompiler::add to see what info is different in the Resols array for that symbol that gets passed to LTO, or what else is different in the Sym used to generate the resolution. Both of those are examined in LTO::addModuleToGlobalRes when we note that the symbol is external.

Teresa

On Thu, Jun 20, 2019 at 2:36 AM Mani, Suresh <Suresh.Mani at amd.com> wrote:

Hi Teresa,

Can you please let me know if there is any update on this issue?

Thanks
M Suresh

From: Teresa Johnson <tejohnson at google.com>
Sent: Tuesday, June 11, 2019 7:23 PM
To: Rui Ueyama <ruiu at google.com>
Cc: Mani, Suresh <Suresh.Mani at amd.com>; llvm-dev <llvm-dev at lists.llvm.org>
Subject: Re: [llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)

LTO can, but it is linker driven. I confirmed that when it is a common symbol the resolution indicates that the symbol is exported, and when I add an initializer so that it is a def we no longer think it is exported and are able to internalize. So this seems to be due to a change in what the linker is telling LTO. I would have to dig in the debugger to confirm, but perhaps lld is now indicating that it might be used by a regular obj? I.e. in BitcodeCompiler::add.

Teresa

On Tue, Jun 11, 2019 at 5:48 AM Rui Ueyama <ruiu at google.com> wrote:

Looks like this is indeed related to r360841.

In C, there are distinctions between declarations, definitions and tentative definitions. Global variables declared with "extern" are declarations. Global variables that don't have "extern" and have initializers are definitions. If global variables have neither "extern" nor initializers, they are called tentative definitions.

Common symbols represent tentative definitions.

Tentative definitions get special treatment in the linker. Usually, if you define the same symbol in two object files, the linker reports an error. However, common symbols are allowed to be duplicated. Two or more common symbols are merged and then placed in the .bss section, so that they will be zero-initialized at runtime.

So, a global variable defined as `struct Node* head` is actually a common symbol.

I'm not sure why LTO cannot internalize common symbols though. Teresa, is this expected?

On Mon, Jun 10, 2019 at 11:06 PM Teresa Johnson <tejohnson at google.com> wrote:

My guess is that it is due to lld change r360841 on that date (Introduce CommonSymbol). +Rui for comments.

On Mon, Jun 10, 2019 at 4:45 AM Mani, Suresh via llvm-dev <llvm-dev at lists.llvm.org> wrote:

Hi,

I have an issue during the LTO phase of the llvm compiler, which is as follows.

File t3.c
---------

#include <stdio.h>
#include <stdlib.h>

// A linked list node
struct Node {
    int data;
    struct Node* next;
    struct Node* prev;
};

struct Node* head;

/* Given a reference (pointer to pointer) to the head of a list
   and an int, inserts a new node on the front of the list. */
void push(struct Node** head_ref, int new_data)
{
    struct Node* new_node = (struct Node*)malloc(sizeof(struct Node));

    new_node->data = new_data;

    new_node->next = (*head_ref);
    new_node->prev = NULL;

    if ((*head_ref) != NULL)
        (*head_ref)->prev = new_node;

    (*head_ref) = new_node;
}

// This function prints contents of linked list starting from the given node
void printList(struct Node* node)
{
    // Initialized so that the reverse loop is well defined for an empty list
    struct Node* last = NULL;
    printf("\nTraversal in forward direction \n");
    while (node != NULL) {
        printf(" %d ", node->data);
        last = node;
        node = node->next;
    }

    printf("\nTraversal in reverse direction \n");
    while (last != NULL) {
        printf(" %d ", last->data);
        last = last->prev;
    }
}

/* Driver program to test above functions */
int main()
{
    head = NULL;
    push(&head, 7);
    push(&head, 1);
    push(&head, 4);

    printList(head);

    return 0;
}

Compiler invocation:
--------------------

clang -flto -fuse-ld=lld -O3 t3.c -o a.out

Expected behavior during LTO:
------------------------------

The compiler optimization during LTO needs to figure out that variable "head" is not referenced by any precompiled object or library.

Until May-16-2019 variable "head" had the internal attribute, as follows:

@head = internal global %struct.Node* null, align 8

And the compiler was rightly able to recognize that "head" is not referenced by any external precompiled object or library.

But after May-16-2019 the attribute of head changed as follows:

@head = common dso_local global %struct.Node* null, align 8

Not sure if this is the correct behavior?

If this is the correct behavior, can you please let me know how the compiler could figure out that variable "head" is not referenced by any external precompiled object or library?

Thanks
M Suresh
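The distinction Rui draws between declarations, tentative definitions and definitions can be sketched in a few lines of C. This is only an illustration: the variable names below (head_declared, head_tentative, head_defined, head_private) and the helper check_heads_zero_initialized are hypothetical and do not appear in t3.c. Whether a tentative definition is actually emitted as a common symbol depends on the compiler's -fcommon/-fno-common setting, so treat the comments as describing the -fcommon case discussed in this thread.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    int data;
    struct Node *next;
    struct Node *prev;
};

/* Declaration only: "extern" and no initializer, so no storage is
   reserved for it in this translation unit. */
extern struct Node *head_declared;

/* Tentative definition: neither "extern" nor an initializer. With
   -fcommon this is what ends up as a common symbol, like "head" in t3.c. */
struct Node *head_tentative;

/* Definition: the initializer makes it a regular defined symbol. This is
   the variant Teresa reports as internalizable by LTO. */
struct Node *head_defined = NULL;

/* Internal linkage: "static" keeps the symbol private to this file, so
   no linker resolution is needed at all (only an option when a single
   translation unit uses the variable). */
static struct Node *head_private;

/* All three objects with storage here start out as null pointers. */
void check_heads_zero_initialized(void)
{
    assert(head_tentative == NULL);
    assert(head_defined == NULL);
    assert(head_private == NULL);
}
```

For the reported test case, writing `struct Node* head = NULL;` or compiling with -fno-common would be ways to avoid the common symbol on the source side; neither changes what lld tells LTO about common symbols.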
Rui Ueyama via llvm-dev
2019-Jun-21 11:39 UTC
[llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)
Let me investigate.
Rui Ueyama via llvm-dev
2019-Jun-24 08:08 UTC
[llvm-dev] Expected behavior of lld during LTO for global symbols (Attr Internal/Common)
The direct cause of this issue is that lld previously converted common symbols to defined symbols before passing input files to LTO, whereas after r360841 they are passed to LTO as common symbols.

Making lld work as before is easy, since we can convert common symbols to defined symbols before LTO again. Here is a patch that does that; I confirmed that it restores the original behavior for the reported issue.

The question is why LTO cannot internalize common symbols under some conditions. It looks like LTO can internalize them if there are no files other than bitcode files, but it can't if there is another DSO file, even if the DSOs don't contain any symbols. I don't fully understand what is going on yet. I'll try to investigate tomorrow.

diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 008a6cd7954..d9deddbf357 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -1789,6 +1789,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
   if (!Config->Relocatable)
     Symtab->scanVersionScript();

+  // Replace common symbols with regular symbols, so that common
+  // symbols in input object files appear as regular symbols in .bss
+  // in the output.
+  replaceCommonSymbols();
+
   // Do link-time optimization if given files are LLVM bitcode files.
   // This compiles bitcode files into real object files.
   //
@@ -1798,6 +1803,11 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
   if (errorCount())
     return;

+  // LTO may have introduced new common symbols, so convert them
+  // to regular defined symbols.
+  if (!BitcodeFiles.empty())
+    replaceCommonSymbols();
+
   // If -thinlto-index-only is given, we should create only "index
   // files" and not object files. Index file creation is already done
   // in addCombinedLTOObject, so we are done if that's the case.
@@ -1879,7 +1889,6 @@ template <class ELFT> void LinkerDriver::link(opt::InputArgList &Args) {
   if (!Config->Relocatable)
     InputSections.push_back(createCommentSection());

-  // Replace common symbols with regular symbols.
-  replaceCommonSymbols();

   // Do size optimizations: garbage collection, merging of SHF_MERGE sections
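The ordering change in the patch above can be pictured with a toy model in C. This is a sketch under stated assumptions, not lld's code: the types toy_symbol and sym_kind and the functions toy_replace_common_symbols and toy_count_common are hypothetical stand-ins, and the model only shows that running the conversion pass before the LTO step means LTO no longer sees any common symbols. It does not explain why LTO declines to internalize common symbols, which the message leaves open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical symbol kinds for illustration; lld's real symbol classes
   are richer than this. */
enum sym_kind { SYM_UNDEFINED, SYM_COMMON, SYM_DEFINED };

struct toy_symbol {
    const char *name;
    enum sym_kind kind;
    int in_bss; /* 1 once the symbol has been given storage in .bss */
};

/* Toy analogue of the replaceCommonSymbols() call that the patch moves in
   front of the LTO step: every common symbol becomes a regular defined
   symbol placed in .bss. Other kinds are left untouched. */
void toy_replace_common_symbols(struct toy_symbol *syms, size_t n)
{
    assert(syms != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (syms[i].kind == SYM_COMMON) {
            syms[i].kind = SYM_DEFINED;
            syms[i].in_bss = 1;
        }
    }
}

/* Counts the symbols a later pass (LTO in this model) would still see as
   common. */
size_t toy_count_common(const struct toy_symbol *syms, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (syms[i].kind == SYM_COMMON)
            count++;
    }
    return count;
}
```

In this model, a symbol table holding "head" as SYM_COMMON reports one common symbol before the pass and none afterwards, which corresponds to the pre-r360841 situation where LTO only ever saw "head" as a defined symbol.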