Hi folks
I noticed something interesting when debugging a program that uses llvm
for JIT compilation.
Running `ltrace` surfaced a number of `getenv("bar")` calls coming from
llvm. It turns out these are not "real" `getenv` calls, but a "do
nothing" escape hatch from the optimizer which has been in
`llvm/include/llvm/LinkAllPasses.h` for over 15 years [1] - apparently
as a way to prevent the compiler eliminating symbol references to
optimization pass initialization functions. I took a look at the code
and couldn't really work out what issue is being solved as the commit
messages from 2005 leave something to be desired ;)
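For readers who have not seen the pattern, here is a minimal sketch of the idiom described above. It is not the real header: the two `createHypotheticalPass*` functions and the `ReachedReferences` flag are made-up stand-ins for illustration, and only the `getenv("bar")` comparison mirrors what the thread describes.

```cpp
#include <cstdlib>

// Hypothetical stand-ins for pass-creation functions. In a real build these
// would live in separate object files inside static libraries.
int createHypotheticalPassA() { return 1; }
int createHypotheticalPassB() { return 2; }

namespace {
// A static object whose constructor mentions every function we want the
// toolchain to keep.
struct ForcePassLinkingSketch {
  bool ReachedReferences = false; // illustration only, not in the real code

  ForcePassLinkingSketch() {
    // In practice getenv never returns (char *)-1, so this always returns
    // early. The compiler cannot prove that, so it has to keep the calls
    // below, and with them the symbol references.
    if (std::getenv("bar") != (char *)-1)
      return;

    (void)createHypotheticalPassA();
    (void)createHypotheticalPassB();
    ReachedReferences = true;
  }
} ForcePassLinkingInstance;
} // namespace
```

The references after the early return never execute, but they are still emitted, and that is also why `ltrace` shows one `getenv("bar")` call per instance of such an object.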
I removed the whole function body from my local tree and `ninja check`
was happy in release mode (amd64-linux-gcc-10.2). Given its age, and the
fact that it's been through several iterations, I guess I've stumbled
upon a Chesterton's Fence and would appreciate some input on whether this
is still needed. I see the original commit was Windows only, and was
then updated to use `getenv` as a way to support this behaviour
cross-platform.
It's more weird than pernicious given that nothing is done with the
result, but to me it feels dirty and confusing to query the process
environment in this way. As such, I wonder 3 things:
    1. Is this still needed? I don't know enough about the original
       toolchains affected to know if the problem still exists, but my
       limited testing shows that it doesn't seem to affect Linux
       builds.
    2. If 1: Is there a better way, e.g. define our own function that
       can't be eliminated instead of `getenv`, or use features of newer
       language standards and toolchains introduced since 2005 that might
       make the original problem go away on its own (I don't know what
       these might be)?
    3. If 1 and not 2: could we make it more obvious that this comes
       from LLVM for those in my situation e.g.
       `getenv("LLVM_IGNORE_THIS_GETENV")` or similar instead of the
       unhelpful "bar" variable?
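One possible shape for option 2, sketched under assumptions: replace the `getenv` call with a read of a `volatile` flag that is always false. All names here (`LLVMForceLinkFlag`, `createExamplePassX`, `createExamplePassY`, `Referenced`) are hypothetical, and whether every toolchain, in particular MSVC with whole-program optimization, treats this the same way as the `getenv` call is exactly the open question.

```cpp
// Hypothetical stand-ins for the functions whose symbols must survive.
int createExamplePassX() { return 10; }
int createExamplePassY() { return 20; }

namespace {
// Always false at run time. Because it is volatile, the compiler has to
// perform the read and cannot fold the branch below away.
volatile bool LLVMForceLinkFlag = false;

struct ForceLinkingWithoutGetenv {
  int Referenced = 0; // illustration only: stays 0 because the flag is false

  ForceLinkingWithoutGetenv() {
    if (!LLVMForceLinkFlag)
      return;

    // Never executed, but the references are still emitted.
    Referenced += createExamplePassX();
    Referenced += createExamplePassY();
  }
} ForceLinkingWithoutGetenvInstance;
} // namespace
```

Unlike `getenv`, this does not show up in `ltrace` output and does not touch the process environment, which would address the "confusing" part even if the mechanism itself has to stay.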
If it's no longer needed in any case, I can post a removal patch.
Any input is appreciated.
All the Best
Luke
[1]
https://github.com/llvm/llvm-project/commit/00d5508496c1e#diff-7206f3725623127339dd17671577a6888ee3402d2e667ae9dd1457ea3600f4e7R3
-- 
Codeplay Software Ltd.
Company registered in England and Wales, number: 04567874
Registered office: Regent House, 316 Beulah Hill, London, SE19 3HF
Reid/Hans - you two happen to have any ideas what this device might've
been introduced to address on Windows 15 years ago? (possible that it was
a real issue back then, or even a misunderstanding that might've been
common/you might be aware of?)

On Mon, Nov 2, 2020 at 11:00 AM Luke Drummond via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I see the original commit was Windows only, and was
> then updated to use `getenv` as a way to support this behaviour
> cross-platform.
I am pretty sure this has nothing to do with Windows, but with static
linking.

When building an executable (opt, clang) we need to ensure that all
the symbols are available in the artifact so that a loaded plugin can
use them. Otherwise the linker may discard object files from .a
libraries that are not used by the executable itself, which only uses
a subset of the functionality. In particular, one wants to ensure that
all passes are available in the opt executable, even when no default
pass pipeline references a given pass, because it can still be added
using the cl::opt mechanism.

Michael

On Mon, Nov 2, 2020 at 13:00, Luke Drummond via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I see the original commit was Windows only, and was
> then updated to use `getenv` as a way to support this behaviour
> cross-platform.
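The static-linking concern can be illustrated with a small self-registration sketch. Everything in it (`passRegistry`, `HypotheticalPassRegistrar`, the pass name `hypothetical-dce`) is invented for illustration and is not LLVM's actual registration API. In a single translation unit, as here, the registration always runs. The problem appears only when the registering object file sits in a `.a` archive and nothing in the executable references any of its symbols: the linker then never pulls that member in, and the static initializer never exists in the final binary.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

using PassFactory = std::function<int()>;

// Function-local static so the registry is constructed before first use,
// regardless of static initialization order.
std::map<std::string, PassFactory> &passRegistry() {
  static std::map<std::string, PassFactory> Registry;
  return Registry;
}

// Registers a factory under a name when a static instance is constructed.
struct HypotheticalPassRegistrar {
  HypotheticalPassRegistrar(const std::string &Name, PassFactory Factory) {
    passRegistry().emplace(Name, std::move(Factory));
  }
};

// Imagine this line living alone in its own object file inside a static
// library. Without some reference from the executable, a linker is free to
// skip that archive member and this registration would silently vanish.
static HypotheticalPassRegistrar
    RegisterHypotheticalDCE("hypothetical-dce", [] { return 42; });

// What a command-line lookup by pass name would boil down to.
bool isPassAvailable(const std::string &Name) {
  return passRegistry().count(Name) != 0;
}
```

A forced reference from a header that the executable includes, such as the `getenv` construct under discussion, is one way to make sure the archive member is selected.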
Last time I removed the getenv calls, as an April Fools' joke btw [1],
I ended up with a compiler that was unable to do a stage 2 build, IIRC.
It might depend on the kind of build you do, e.g., static vs dynamic
libraries.

~ Johannes

[1] https://lists.llvm.org/pipermail/llvm-dev/2019-April/131466.html

On 11/2/20 12:49 PM, Luke Drummond via llvm-dev wrote:
> I removed the whole function body from my local tree and `ninja check`
> was happy in release mode (amd64-linux-gcc-10.2).
Oh, right, this stuff. I guess the non-windows solution might've been a
volatile read, for instance? So maybe not so much that the general
machinery isn't needed, but that perhaps MSVC does something interesting
with a volatile read or whatever other solution might've been used.

Hmm, not sure why the whole file was added only when MSVC support was
added - if it is a "static library object file selection" issue. Wouldn't
that have turned up on other platforms before that moment?

On Mon, Nov 2, 2020 at 11:54 AM Michael Kruse via llvm-dev
<llvm-dev at lists.llvm.org> wrote:
> I am pretty sure this has nothing to do with Windows, but with static
> linking.
On Mon Nov 2, 2020 at 7:56 PM GMT, Johannes Doerfert wrote:
> Last time I removed the getenv calls, as an April Fools' joke btw [1],
> I ended up with a compiler that was unable to do a stage 2 build, IIRC.
>
> [1] https://lists.llvm.org/pipermail/llvm-dev/2019-April/131466.html

Wonderful! When I asked about this internally, a colleague said they were
going to create a platform where `getenv` can return `-1`.
On Mon Nov 2, 2020 at 7:53 PM GMT, Michael Kruse wrote:
> I am pretty sure this has nothing to do with Windows, but with static
> linking.
>
> When building an executable (opt, clang) we need to ensure that all
> the symbols are available in the artifact so that a loaded plugin can
> use them. Otherwise the linker may discard object files from .a
> libraries that are not used by the executable itself, which only uses
> a subset of the functionality. In particular, one wants to ensure that
> all passes are available in the opt executable, even when no default
> pass pipeline references a given pass, because it can still be added
> using the cl::opt mechanism.

Right. Like a sort of static `-rdynamic` hack. I'm starting to suspect
it's not going to be straightforward to remove this cleanly.

Thanks for the insight.