Hayden Livingston via llvm-dev
2019-Jul-01 08:47 UTC
[llvm-dev] Generating completely position agnostic code
It is wholly self-contained. It's code that has no references to
anything beyond a set of pointers passed in as arguments to the
function. This piece of code doesn't do any OS work at all. It is
purely calling function pointers, doing math and allocating memory.

On Mon, Jul 1, 2019 at 12:57 AM Jorg Brown <jorg.brown at gmail.com> wrote:
>
> Qs for you:
>
> The code that is being loaded from disk... is it wholly self-contained,
> or is your executable potentially made up of several pieces that each
> need to be loaded from disk?
>
> What does it mean to use the STL but not have global variables?
> std::cout is a global variable, so you can't even do Hello World
> without globals.
>
> = =
>
> Architectures such as 68K and PowerPC and RISC-V have a dedicated
> register for accessing global variables, rather than the PC-relative
> globals used in other architectures. This makes them inherently more
> amenable to what you describe, since you can put the "array of function
> pointers" into global space, as part of setting up global space in
> general, and then load the code from disk, and go. There is no
> relocation needed since all access to globals is done via the global
> register, not relative to wherever the program was loaded. Of course,
> access to something like libc might normally need post-loading
> relocation, but if you do what you're talking about and use an "array
> of function pointers" to get to libc, no relocation would be needed.
>
> For what it's worth, the original 68K-based Macintosh used a scheme
> quite similar to this. The big difference for the Mac was that to get
> to the OS (the equivalent of libc), it didn't use an array of function
> pointers, per se; it used a certain range of illegal instructions,
> which generated exceptions when used, and the (highly optimized)
> exception handlers would recover from the exception by dispatching to
> an OS routine determined by the specific bits in the illegal
> instruction.
>
> On Sun, Jun 30, 2019 at 9:07 PM Hayden Livingston via llvm-dev
> <llvm-dev at lists.llvm.org> wrote:
>>
>> I'm on a mission to generate code that can be loaded from disk without
>> any modifications. This means no relocations can occur.
>>
>> Trying to see if this can be done for C++ code that uses STL but has
>> no global variables, and a single function, but of course Clang will
>> generate more functions for STL code.
>>
>> I want to provide an array of function pointers so that for all
>> interactions STL needs to do with LIBC that I'm able to just provide
>> it via indirect calls.
>>
>> Has anyone had success with such a thing in LLVM?
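
For illustration, here is a minimal sketch of the kind of entry point
described above: a function that reaches the outside world only through
a table of function pointers passed in as an argument. The names
HostApi, sum_of_squares and make_host_api are hypothetical and do not
come from the thread or from LLVM. The sketch shows only the calling
pattern; it does not by itself guarantee that the compiled object is
free of relocations, which would still have to be checked on the object
file (for example with readelf -r).

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical table of host-provided services (names are illustrative).
struct HostApi {
    void* (*alloc)(std::size_t bytes);
    void  (*release)(void* p);
};

// Entry point of the loadable snippet: it uses only its arguments, its own
// stack frame, and memory obtained through the table.
extern "C" long sum_of_squares(const HostApi* host, long n) {
    if (n <= 0) return 0;
    long* buf = static_cast<long*>(
        host->alloc(static_cast<std::size_t>(n) * sizeof(long)));
    if (buf == nullptr) return -1;  // host allocation failed
    long total = 0;
    for (long i = 0; i < n; ++i) {
        buf[i] = i * i;
        total += buf[i];
    }
    host->release(buf);
    return total;
}

// Host side: an ordinary program fills in the table before calling in.
static void* host_alloc(std::size_t bytes) { return std::malloc(bytes); }
static void  host_release(void* p) { std::free(p); }

HostApi make_host_api() { return HostApi{host_alloc, host_release}; }
```

The host would call sum_of_squares(&api, n) with a table it built
itself, so every libc interaction is an indirect call through that
table.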
Peter Smith via llvm-dev
2019-Jul-01 09:23 UTC
[llvm-dev] Generating completely position agnostic code
I'm not sure whether you want to modify LLVM to achieve your goal, or
just use the functionality that already exists. If you are willing to
make changes, there are a couple of options in the ARM backend, -fropi
and -frwpi, that are close, but unfortunately they don't support C++.
My understanding is that constant data such as vtables contains
pointers, and it would take quite a bit of work to turn these into
something that wouldn't require some kind of relocation. The initial
RFC has an explanation:
https://lists.llvm.org/pipermail/llvm-dev/2015-December/093022.html
It mentions a -fallow-unsupported option to allow C++ use, but I
expect that this would only work for a subset of C++.

I don't think that this is the same problem that you are trying to
solve here though. I'm guessing that you are providing a fixed-address
libc, external to the position-independent code, that you interface
with via a table of pointers? I have seen that being done. One way of
doing it is to provide the linker with the addresses of the libc
functions via absolute symbols; the table of function pointers uses
something like the linker's --wrap symbol option to do the indirection.

Peter
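
A rough sketch of the --wrap indirection mentioned above, under stated
assumptions: when linking with --wrap=malloc (for example
-Wl,--wrap=malloc through the compiler driver), undefined references to
malloc are resolved to __wrap_malloc, and the wrapper forwards through a
table of pointers. LibcTable, g_libc and set_libc_table are hypothetical
names invented for this example. The sketch keeps the table address in a
file-local variable, which is itself a global; a real build would have
to confirm that this access needs no load-time fixup, or obtain the
table address some other way.

```cpp
#include <cstddef>

// Hypothetical table of libc entry points supplied by whoever loads the code.
struct LibcTable {
    void* (*malloc_fn)(std::size_t);
    void  (*free_fn)(void*);
};

// Assumption: the loader stores the table address here before calling in.
static const LibcTable* g_libc = nullptr;

extern "C" void set_libc_table(const LibcTable* table) { g_libc = table; }

// With --wrap=malloc and --wrap=free on the link line, calls to malloc and
// free in the wrapped code are redirected to these functions.
extern "C" void* __wrap_malloc(std::size_t bytes) {
    return g_libc->malloc_fn(bytes);
}

extern "C" void __wrap_free(void* p) {
    g_libc->free_fn(p);
}
```

The wrapped code keeps calling malloc and free by name; only the link
step decides that those names go through the table.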
Jorg Brown via llvm-dev
2019-Jul-01 18:07 UTC
[llvm-dev] Generating completely position agnostic code
What architecture do you need this for?

Does the code in question ever use more than one thread?
Hayden Livingston via llvm-dev
2019-Jul-07 22:13 UTC
[llvm-dev] Generating completely position agnostic code
x86-64.
Hayden Livingston via llvm-dev
2019-Jul-07 22:15 UTC
[llvm-dev] Generating completely position agnostic code
I'd really like to not modify LLVM.

Conceptually, what I want is to use LLVM to generate machine code that
makes no data references outside a single stack frame; everything else
is reached through pointers. This is merely a snippet of assembly code
that will be invoked by other full C++ programs.