Teresa Johnson via llvm-dev
2016-Apr-06 19:52 UTC
[llvm-dev] LTO renaming of constants with inline assembly
On Wed, Apr 6, 2016 at 11:16 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
> On Wed, Apr 6, 2016 at 10:49 AM, Teresa Johnson <tejohnson at google.com> wrote:
>> On Wed, Apr 6, 2016 at 10:46 AM, Peter Collingbourne <peter at pcc.me.uk> wrote:
>>> I suspect that the right way to do promotion/renaming of this sort is to
>>> rename at the MC layer just before writing the symbol table to the object
>>> file.
>>
>> I think that is too late - how would the symbols be distinguished in the
>> LTO case below after the IR is linked but before we renamed the duplicate?
>
> Sorry, wasn't fully awake. I think we could do something along the lines
> of the symbol renaming idea, but with directives that limit their scope to
> inline asm blocks.
>
> Specifically, we could teach the frontend to produce a mapping from symbol
> names to globalvalues, for any internal names with the used attribute, and
> attach that mapping to inline asm blocks.
>
> For example, if myvar were renamed to myvar.6, the IR would look like this:
>
>   @myvar.6 = global i8 [...]
>
>   [...]
>
>   call asm("movzbl myvar(%rip), ...", ..., "myvar")(..., i8* @myvar.6)
>
> The backend would produce assembly that would look like this:
>
>   .rename myvar, myvar.6
>   movzbl myvar(%rip), ...
>   .norename myvar
>
> The .rename and .norename directives would delimit the scope of the
> renaming.

That's an interesting idea, thanks. Are .rename and .norename standard directives? I did some web searches but couldn't find anything concrete on them (I did find a .rename in some IBM Power documentation, but it seemed to apply to string constants).

I think for ThinLTO purposes I will limit importing to/from modules with inline assembly for now to avoid the issue.

Teresa

> Peter
>
>> Teresa
>>
>>> Peter
>>>
>>> On Wed, Apr 6, 2016 at 10:37 AM, Teresa Johnson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>>>> I encountered an issue with ThinLTO handling of inline assembly, where
>>>> the inline assembly referenced a constant that was a local variable. The
>>>> local var was renamed because it was promoted in ThinLTO mode, but the
>>>> inline assembly copy was not renamed and we ended up with an undef at
>>>> link time.
>>>>
>>>> It looks like this is a general problem with inline assembly and LTO.
>>>> Wondering if it is a known issue. E.g. if I link in LTO mode two files
>>>> that have inline assembly referencing local constants with the same name,
>>>> the LTO linking will rename the second. However, the renaming doesn't
>>>> propagate to inline assembly, resulting in the wrong output.
>>>>
>>>> For example, let's say we have two modules with inline assembly that
>>>> writes a local constant var named "myvar" into the memory pointed to by
>>>> its parameter "v", and a simple main that calls each function:
>>>>
>>>> $ cat inlineasm1.c
>>>> static const unsigned char __attribute__((used)) __attribute__
>>>> ((aligned (1))) myvar = 1;
>>>>
>>>> void foo(unsigned long int *v) {
>>>>   __asm__ volatile("movzbl myvar(%%rip), %%eax\n\t"
>>>>                    "movq %%rax, %0\n\t"
>>>>                    : "=*m" (*v)
>>>>                    :
>>>>                    : "%eax"
>>>>                    );
>>>> }
>>>>
>>>> $ cat inlineasm2.c
>>>> static const unsigned char __attribute__((used)) __attribute__
>>>> ((aligned (1))) myvar = 2;
>>>>
>>>> void bar(unsigned long int *v) {
>>>>   __asm__ volatile("movzbl myvar(%%rip), %%eax\n\t"
>>>>                    "movq %%rax, %0\n\t"
>>>>                    : "=*m" (*v)
>>>>                    :
>>>>                    : "%eax"
>>>>                    );
>>>> }
>>>>
>>>> $ cat inlineasm.c
>>>> #include <stdio.h>
>>>> extern void foo(unsigned long int *v);
>>>> extern void bar(unsigned long int *v);
>>>> int main() {
>>>>   unsigned long int f,b;
>>>>   foo(&f);
>>>>   bar(&b);
>>>>   printf("%lu %lu\n", f, b);
>>>> }
>>>>
>>>> If compiled at -O2 (no LTO) this correctly prints out "1 2".
>>>>
>>>> However, when linked with LTO, the second copy of local "myvar" is
>>>> renamed to "myvar.6". But the inline assembly, which is still hidden
>>>> within a call that hasn't been lowered, still references "myvar" in that
>>>> second linked copy in bar(). The output is thus incorrect: "1 1" (or
>>>> "2 2" if the bar() copy was linked first).
>>>>
>>>> Is this a known issue? Any ideas on how we could handle this?
>>>>
>>>> Thanks,
>>>> Teresa
>>>>
>>>> --
>>>> Teresa Johnson | Software Engineer | tejohnson at google.com |
>>>> 408-460-2413
>>>>
>>>> _______________________________________________
>>>> LLVM Developers mailing list
>>>> llvm-dev at lists.llvm.org
>>>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
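Editorial sketch of the scoped-renaming proposal quoted above: the C helper below models the substitution that a .rename/.norename pair would imply for a single inline asm block, replacing whole-symbol occurrences of one name with another. It is only a model under stated assumptions. The names rename_in_asm and is_sym_char are hypothetical, the rule for what counts as a symbol character is simplified, and nothing here is LLVM code or an existing assembler feature.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Simplified assumption: a symbol name consists of letters, digits,
   '_', '.', and '$'. Real assemblers have more detailed rules. */
static int is_sym_char(char c) {
    return isalnum((unsigned char)c) || c == '_' || c == '.' || c == '$';
}

/* Hypothetical helper: copy asm_text into out, replacing every
   whole-symbol occurrence of `from` with `to`.
   Returns 0 on success, -1 if the arguments are unusable or out is
   too small to hold the result plus its terminating NUL. */
int rename_in_asm(const char *asm_text, const char *from, const char *to,
                  char *out, size_t out_size) {
    size_t from_len = strlen(from);
    size_t to_len = strlen(to);
    size_t used = 0;
    const char *p = asm_text;

    if (from_len == 0 || out_size == 0)
        return -1;

    while (*p) {
        /* A match only counts if it is not glued to a longer symbol. */
        int at_start = (p == asm_text) || !is_sym_char(p[-1]);
        if (at_start && strncmp(p, from, from_len) == 0 &&
            !is_sym_char(p[from_len])) {
            if (used + to_len >= out_size)
                return -1;
            memcpy(out + used, to, to_len);
            used += to_len;
            p += from_len;
        } else {
            if (used + 1 >= out_size)
                return -1;
            out[used++] = *p++;
        }
    }
    out[used] = '\0';
    return 0;
}
```

Given the asm text "movzbl myvar(%rip), %eax" with from = "myvar" and to = "myvar.6", the helper writes "movzbl myvar.6(%rip), %eax", while a longer name such as myvar2 is left untouched. Restricting the substitution to one block at a time is what keeps the two "myvar" references from the thread's example distinct after linking.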
Peter Collingbourne via llvm-dev
2016-Apr-06 20:10 UTC
[llvm-dev] LTO renaming of constants with inline assembly
On Wed, Apr 6, 2016 at 12:52 PM, Teresa Johnson <tejohnson at google.com> wrote:
> That's an interesting idea, thanks. Are .rename and .norename standard
> directives? I did some web searches but couldn't find anything concrete on
> them (I did find a .rename in some IBM Power documentation, but it seemed
> to apply to string constants).

These would be new directives that we'd need to implement. If they're already being used elsewhere, perhaps we can come up with a sufficiently unique name.

> I think for ThinLTO purposes I will limit importing to/from modules with
> inline assembly for now to avoid the issue.

Sounds reasonable.

Peter
Sergei Larin via llvm-dev
2016-Apr-13 15:02 UTC
[llvm-dev] LTO renaming of constants with inline assembly
I still wonder if this would be an issue in _standard_ (not thin) LTO? This test seems to be OK on my (slightly modified) standard LTO flow, but I do wonder for a more general case.

Sergei

---
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, hosted by The Linux Foundation
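Editorial sketch, relevant to both the ThinLTO and the regular LTO case discussed in this thread: the problem arises because "myvar" appears only as text inside the asm string, where the compiler cannot see it. One possible way to sidestep it at the source level, not proposed by anyone in the thread, is to pass the variable as an asm operand so that the compiler itself emits the symbol reference and any renaming is reflected there. The function name bar_operand is hypothetical, and the non-x86-64 branch exists only so the sketch builds elsewhere.

```c
/* Same constant as in the thread's inlineasm2.c; "used" keeps it
   from being discarded. */
static const unsigned char __attribute__((used)) myvar = 2;

/* Hypothetical variant of bar(): myvar is an "m" input operand
   instead of a name written in the asm string. The compiler
   substitutes the actual symbol for %1, whatever it ends up being
   called after linking. */
void bar_operand(unsigned long *v) {
#if defined(__x86_64__)
    unsigned long tmp;
    /* %k0 selects the 32-bit name of the output register; writing it
       zero-extends into the full 64-bit register. */
    __asm__ volatile("movzbl %1, %k0"
                     : "=r"(tmp)
                     : "m"(myvar));
    *v = tmp;
#else
    *v = myvar; /* fallback so the sketch compiles on other targets */
#endif
}
```

Calling bar_operand(&b) stores 2 in b. This only helps when the source can be changed; it does not address existing code that names local symbols directly in asm strings, which is the case the thread is concerned with.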