Hi all,

While working on alias support for the LLVM-ML project, I ran into a feature implemented back in 2010: default-null weak externals in COFF, a GNU extension.
https://reviews.llvm.org/rG17990d56907b
I'd like to disable this feature when targeting MSVC compatibility. Does anyone have more context on this, and why it'd be a terrible idea?

For context: This seems to be designed to let LLVM implement a GNU extension in COFF libraries. However, it leads to very different behavior than we see for cl.exe (and ml.exe) on Windows; for already-defined aliasees, it injects an alternate placeholder ".weak.<alias>.default.<uniquifier>" symbol which resolves back to the current location. I admit, I'm not quite sure how this helps. If anyone can explain the purpose, I'd really appreciate it!

In Windows PE/COFF files, aliases typically just resolve to their target symbol. For an example, see https://reviews.llvm.org/D87403#inline-811289.

Thanks,
- Eric

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20200923/0e0c108e/attachment.html>

On Wed, Sep 23, 2020 at 1:45 PM Eric Astor <epastor at google.com> wrote:

> I'd like to disable this feature when targeting MSVC compatibility. Does
> anyone have more context on this, and why it'd be a terrible idea?

Sadly, I don't recall why I made that change; I probably should have included that in the commit message. I do know that at the time I was focused on link.exe support, not mingw. I just looked through my commit history from that time and it doesn't really explain much.

I believe that as long as "compare weak symbol against null" does the right thing, any changes you make here are fine.

- Michael Spencer

Thanks, Michael!

However, comparing a weak symbol against null isn't actually supported by MSVC, to the best of my knowledge. If we want to maintain that in the PE/COFF world, I think we might have to accept some weird behavior.

I was wondering how much people will scream if I disable this feature in MSVC-compatible targets.

Best,
- Eric

On Wed, Sep 23, 2020 at 8:53 PM Michael Spencer <bigcheesegs at gmail.com> wrote:

> I believe that as long as the "compare weak symbol against null" does the
> right thing any changes you make here are fine.

Hi,

On Wed, 23 Sep 2020, Eric Astor via llvm-dev wrote:

> While working on alias support for the LLVM-ML project, I ran into a feature
> implemented back in 2010: default-null weak externals in COFF, a GNU
> extension.
> https://reviews.llvm.org/rG17990d56907b
> I'd like to disable this feature when targeting MSVC compatibility. Does
> anyone have more context on this, and why it'd be a terrible idea?
>
> For context: This seems to be designed to let LLVM implement a GNU extension
> in COFF libraries. However, it leads to very different behavior than we see
> for cl.exe (and ml.exe) on Windows; for already-defined aliasees, it injects
> an alternate placeholder ".weak.<alias>.default.<uniquifier>" symbol which
> resolves back to the current location. I admit, I'm not quite sure how this
> helps. If anyone can explain the purpose, I'd really appreciate it!

So, for the GNU extension, from the user's point of view, there are two potential use cases.

First, a translation unit can reference a function declared with __attribute__((weak)), with no implementation in that translation unit. The reference then evaluates either to NULL or to an actual implementation, if a non-weak definition exists in another object file at link time.

Secondly, multiple translation units may have function definitions that are marked with the weak attribute. You can have this in 0-N object files, and 0-1 object files containing a non-weak definition. If there's no non-weak definition, one of the weak definitions ends up picked, but if there is one, the non-weak one ends up used.

As all this is consumed via GNU style attributes (in MinGW environments), it shouldn't really matter in an MSVC context.

I recently worked on this to get the final details hooked up for COFF, so I'd be happy to have a look at any work touching this feature.

> In Windows PE/COFF files, aliases typically just resolve to their target
> symbol. For an example, see https://reviews.llvm.org/D87403#inline-811289.

For the cases where there already exists a symbol with a name that is unique in itself, just adding an alias directly to the target symbol sounds sensible. But for cases where it isn't set up as an alias, and the implementation itself is marked weak, the uniquifying symbol name is needed to allow multiple objects to provide the same thing.

Consider these two examples in GAS assembly form:

    .globl uniquename
    uniquename:
        ret

    .globl func
    func:
        ret

    .weak aliasname
    aliasname = func

This produces the following symbols, shown with llvm-objdump -t:

    [ 6](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 uniquename
    [ 7](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000001 func
    [ 8](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 aliasname
    AUX indx 10 srch 3 [pointing at .weak.aliasname.default.uniquename]
    [10](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000001 .weak.aliasname.default.uniquename

So here .weak.aliasname.default.uniquename is identical to func, and as func itself is non-weak, aliasname could just as well have pointed directly at func instead.

But for this case, the extra dance is necessary:

    .globl uniquename
    uniquename:
        ret

    .weak func
    .globl func
    func:
        ret

Producing:

    [ 6](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000000 uniquename
    [ 7](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 func
    AUX indx 9 srch 3
    [ 9](sec  1)(fl 0x00)(ty   0)(scl   2) (nx 0) 0x00000001 .weak.func.default.uniquename

Initially, the non-weak symbols were just named ".weak.func.default", but this caused clashes if multiple object files defined the same one. I tried fixing this in https://reviews.llvm.org/D71711 by making the non-weak symbols that the weak ones point at static, but MSVC tools error out if you have a weak symbol pointing at a non-external symbol (as "weak" in COFF actually is "weak external"). Therefore I reverted that attempt, and I later made https://reviews.llvm.org/D75989, which tries to make unique names for these symbols to avoid clashes.

// Martin
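
The two GNU use cases Martin describes (a weak reference that may evaluate to NULL, and several weak definitions with at most one non-weak one) can be sketched as a toy resolver. This is only an illustration under stated assumptions, not how any real linker is implemented: `Definition` and `resolve` are hypothetical names, and picking the first weak definition seen is an assumption, since the thread only says that "one of the weak definitions ends up picked".

```python
from typing import List, NamedTuple, Optional


class Definition(NamedTuple):
    """One definition of a symbol as seen by a toy linker (hypothetical type)."""
    name: str
    object_file: str
    weak: bool


def resolve(name: str, definitions: List[Definition]) -> Optional[Definition]:
    """Return the definition a reference to `name` binds to, or None (null)."""
    candidates = [d for d in definitions if d.name == name]
    strong = [d for d in candidates if not d.weak]
    if len(strong) > 1:
        # At most one object file may carry a non-weak definition.
        raise ValueError(f"multiple non-weak definitions of {name!r}")
    if strong:
        # A non-weak definition always wins over weak ones.
        return strong[0]
    # Assumption: with only weak definitions, take the first one seen.
    # With no definition at all, a weak reference evaluates to null.
    return candidates[0] if candidates else None


objects = [
    Definition("func", "a.o", weak=True),
    Definition("func", "b.o", weak=True),
]
only_weak = resolve("func", objects)
with_strong = resolve("func", objects + [Definition("func", "c.o", weak=False)])
undefined = resolve("optional_hook", objects)
print(only_weak.object_file, with_strong.object_file, undefined)
```

Running this prints `a.o c.o None`: a weak definition is used when nothing stronger exists, the non-weak definition in the hypothetical `c.o` takes over once it is present, and a weak reference with no definition anywhere resolves to null.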

Thanks, Martin!

My biggest question is around the behavior for alias-to-alias linkage. Using Microsoft tools (ml64.exe), if you define an external symbol t2, alias t4 to t2, and alias t7 to t4, you get exactly what you asked for:

    [ 8](sec  1)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000001 t2
    [ 9](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000001 t4
    AUX indx 8 srch 3
    [11](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000001 t7
    AUX indx 9 srch 3

Using LLVM, we instead get a second weak default-null reference pointing directly to t2, rather than to t4:

    [ 3](sec  1)(fl 0x00)(ty  20)(scl   2) (nx 0) 0x00000001 t2
    ...
    [ 7](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 t4
    AUX indx 9 srch 3
    [ 9](sec  1)(fl 0x00)(ty   0)(scl  69) (nx 0) 0x00000001 .weak.t4.default.t1
    ...
    [17](sec  0)(fl 0x00)(ty   0)(scl  69) (nx 1) 0x00000000 t7
    AUX indx 19 srch 3
    [19](sec  1)(fl 0x00)(ty   0)(scl  69) (nx 0) 0x00000001 .weak.t7.default.t1

Because the ".weak" intermediates we create duplicate the aliasee's current resolution, I think this can result in a different resolution for t7 than would happen with the Microsoft tools (say, in a context where t4 has a strong definition).

Maybe we should eliminate the ".weak" intermediates if the reference's target is already an external symbol? They seem unnecessary for that case.

Thanks,
- Eric

On Thu, Sep 24, 2020 at 3:49 AM Martin Storsjö <martin at martin.st> wrote:

> So here .weak.aliasname.default.uniquename is identical to func, and as
> func itself is non-weak, aliasname could just as well have pointed
> directly at func instead.
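
To make the alias-to-alias concern concrete, here is a minimal Python sketch of a toy resolver. It assumes a deliberately simplified rule, "a real definition wins, otherwise follow the weak external's default symbol", which is an assumption for illustration and not a description of what link.exe or lld actually implement. The names `resolve`, `link_t7`, and the strong definition of t4 at address 0x100 are hypothetical.

```python
from typing import Dict, Optional, Set


def resolve(name: str,
            defined: Dict[str, int],
            weak_defaults: Dict[str, str],
            _seen: Optional[Set[str]] = None) -> Optional[int]:
    """Toy rule: a real definition wins; otherwise follow the weak default."""
    seen = set() if _seen is None else _seen
    if name in defined:
        return defined[name]
    if name in weak_defaults and name not in seen:
        seen.add(name)  # guard against cycles of weak externals
        return resolve(weak_defaults[name], defined, weak_defaults, seen)
    return None


# ml64-style object: t7 defaults to t4, which defaults to t2.
ML64_DEFINED = {"t2": 0x1}
ML64_WEAK = {"t4": "t2", "t7": "t4"}

# LLVM-style object: each alias defaults to its own ".weak" intermediate,
# and every intermediate is defined at t2's location.
LLVM_DEFINED = {"t2": 0x1, ".weak.t4.default.t1": 0x1, ".weak.t7.default.t1": 0x1}
LLVM_WEAK = {"t4": ".weak.t4.default.t1", "t7": ".weak.t7.default.t1"}

# Hypothetical strong definition of t4 supplied by some other object file.
STRONG_T4 = {"t4": 0x100}


def link_t7(defined: Dict[str, int], weak: Dict[str, str],
            extra: Dict[str, int]) -> Optional[int]:
    """Resolve t7 after merging in definitions from another object."""
    merged = dict(defined)
    merged.update(extra)
    return resolve("t7", merged, weak)


print(hex(link_t7(ML64_DEFINED, ML64_WEAK, {})),
      hex(link_t7(LLVM_DEFINED, LLVM_WEAK, {})))
print(hex(link_t7(ML64_DEFINED, ML64_WEAK, STRONG_T4)),
      hex(link_t7(LLVM_DEFINED, LLVM_WEAK, STRONG_T4)))
```

Under this toy rule the two layouts agree while t4 has no strong definition (both resolve t7 to 0x1). Once the hypothetical strong t4 is added, the ml64-style chain follows t7 to t4 and yields 0x100, while the LLVM-style layout still yields 0x1 through `.weak.t7.default.t1`. Whether real linkers diverge in the same way is exactly the open question in this message.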