On Wed, Apr 18, 2018 at 4:49 PM, <dmitry.mikulin at sony.com> wrote:

> Hi Teresa,
>
> Thanks for the info!
> This example is my attempt to reduce the FreeBSD kernel to something more
> manageable :)
>
> I will take a look at why globals are not being imported in this case.
> What’s the best tool to look into ThinLTO objects and their summaries?
> Most dumping tools don’t seem to like ThinLTO bitcode files…

Sadly there isn't a really great way to dump the summaries. =( There was a
patch a while back by a GSoC student to dump in YAML format, but there was
resistance from some who preferred dumping to LLVM assembly via llvm-dis,
with support for reading the summary back in from LLVM assembly. It's been
on my list of things to do, but it hasn't yet risen high enough in priority
for me to work on it. For now, you have to use llvm-bcanalyzer -dump and
look at the raw format.

Teresa

> Hopefully Peter can chime in regarding CFI related issues.
>
> Thanks.
> Dmitry.
>
> On Apr 17, 2018, at 9:37 AM, Teresa Johnson <tejohnson at google.com> wrote:
>
> Hi Dmitry,
>
> Sorry for the late reply. For CFI specific code generation, pcc is a
> better person to answer. But on the issue of global variables being
> optimized, that hasn't happened yet. That would be great if you wanted to
> pick that up!
>
> In your original email example, it seems like the file static i=53 could
> be constant propagated since there are no other defs, and the code in
> get_fptr simplified during the compile step, but I assume this is part of
> a more complex example where it is not possible to do this? Also note that
> with r327254 we started importing global variables. Do you know why we
> don't import in your case? I wonder if it has to do with it being CFI
> inserted code?
>
> Teresa
>
> On Tue, Apr 17, 2018 at 9:17 AM <dmitry.mikulin at sony.com> wrote:
>
>> I watched Teresa’s talk on ThinLTO from last year’s CppCon, and it
>> sounded like adding global variable information to the summaries was in
>> the works, or at least in planning. Can someone (Teresa?) please share
>> the current status? If it’s part of future plans, are there any specific
>> proposals that can be picked up and worked on?
>>
>> Thanks!
>>
>> > On Apr 9, 2018, at 6:51 PM, via llvm-dev <llvm-dev at lists.llvm.org>
>> > wrote:
>> >
>> > Hi,
>> >
>> > I’m working on setting up ThinLTO+CFI for a C application which uses a
>> > lot of function pointers. While functionally it appears stable, its
>> > performance is significantly degraded, to the tune of double digit
>> > percentage points compared to regular LTO+CFI.
>> >
>> > Looking into possible causes I see that under ThinLTO+CFI iCall type
>> > checks almost always generate jump table entries for indirect calls,
>> > which creates another level of indirection for every such call. On top
>> > of that it breaks the link order layout because real function names
>> > point to jump table entries. It appears that I’m hitting a limitation
>> > in ThinLTO on how much information it can propagate across modules,
>> > particularly information about constants. In the example below, the
>> > fact that “i” is effectively a constant is lost under ThinLTO, and the
>> > inlined copy of b.c:get_fptr() in a.c does not eliminate the
>> > conditional, which, for CFI purposes, requires generating a type
>> > check/jump table.
>> >
>> > I was wondering if there was a way to mitigate this limitation.
>> >
>> > a.c
>> > ============================
>> > typedef int (*fptr_t) (void);
>> > fptr_t get_fptr();
>> > int main(int argc, char *argv[])
>> > {
>> >     fptr_t fp = get_fptr();
>> >     return fp();
>> > }
>> >
>> >
>> > b.c
>> > ============================
>> > typedef int (*fptr_t) (void);
>> > int foo(void) { return 11; }
>> > int bar(void) { return 22; }
>> >
>> > static fptr_t fptr = bar;
>> > static int i = 53;
>> >
>> > fptr_t get_fptr(void)
>> > {
>> >     if (i >= 0)
>> >         fptr = foo;
>> >     else
>> >         fptr = bar;
>> >
>> >     return fptr;
>> > }
>> >
>> > _______________________________________________
>> > LLVM Developers mailing list
>> > llvm-dev at lists.llvm.org
>> > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev

--
Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413
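
Editorial sketch, not part of the original thread: the point above about "i" having no other defs can be illustrated by collapsing the quoted a.c/b.c example into one translation unit and declaring "i" const. This is only a minimal sketch under that assumption. The const qualifier and the run_example() driver are illustrative additions that do not appear in the original code, and whether a given compiler actually removes the else branch depends on its version and optimization level.

```c
#include <assert.h>

typedef int (*fptr_t)(void);

int foo(void) { return 11; }
int bar(void) { return 22; }

static fptr_t fptr = bar;

/* Assumption: 'i' is never written anywhere, so it can be declared const.
   The original b.c uses a plain 'static int i = 53;'. */
static const int i = 53;

fptr_t get_fptr(void)
{
    /* With 'i' const, the condition is known at compile time, so an
       optimizing compiler may keep only the 'fptr = foo' branch. */
    if (i >= 0)
        fptr = foo;
    else
        fptr = bar;

    return fptr;
}

/* Hypothetical driver standing in for main() in a.c. */
int run_example(void)
{
    fptr_t fp = get_fptr();
    assert(fp != 0);   /* sanity check before the indirect call */
    return fp();
}
```

Calling run_example() yields 11, the same result as the original program, since 53 >= 0 always selects foo. The sketch says nothing about what ThinLTO itself propagates across modules; it only shows the single-module simplification being discussed.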
Teresa, Peter,

Thanks for your help!
I need to re-run my experiments, as the compiler I used did not have the
latest changes like r327254.

The fact that the decision about routing calls through jump table entries
is made early may be problematic. In my experiments with the FreeBSD kernel,
ThinLTO produced thousands of jump table entries compared to only dozens
with full LTO. As for re-ordering jump table entries, I don’t think it’s
going to work, as they are placed in the same section. Including *.cfi names
in a link order file will take care of re-ordering real functions routed
through jump table entries, but in our case we need to force some functions
to be on the same page. So not having jump table entries for the functions
that don't really need them would be ideal.

Thanks.
Dmitry.
Regarding the orderfile, yes, I was thinking more about ordering the real
functions. In that case it sounds like your best option may be to implement
the optimization pass to make direct calls go directly to the real function.
From a performance perspective I don't think it would make much difference
if there are unused jump table entries.

Peter
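
Editorial sketch, not part of the original thread: the indirection discussed above can be modeled in plain C. This is a conceptual model only. In a real CFI build the jump table is emitted by the compiler as a block of branch instructions, not as C functions, and the names foo_real, foo_entry and the call_* helpers are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>

typedef int (*fptr_t)(void);

/* Hypothetical stand-in for the real function body (the thread refers to
   these symbols by their *.cfi names). */
static int foo_real(void) { return 11; }

/* Hypothetical stand-in for a jump table entry: it only forwards to the
   real function, which is the extra hop described in the thread. */
static int foo_entry(void) { return foo_real(); }

/* Address-taken uses see the entry, so indirect calls go through it. */
static fptr_t address_of_foo(void) { return foo_entry; }

int call_indirect(void)
{
    fptr_t fp = address_of_foo();
    assert(fp != 0);
    return fp();              /* entry -> real function */
}

/* A direct call that is still routed through the entry. */
int call_direct_via_entry(void)
{
    return foo_entry();       /* entry -> real function */
}

/* The same direct call after the kind of optimization suggested above:
   it targets the real function and skips the entry. */
int call_direct_optimized(void)
{
    return foo_real();
}
```

All three helpers return the same value; the only difference is the number of hops. In this model, an order file that lists the real functions affects where foo_real lands, while the entries stay grouped together, which matches the concern raised earlier about functions that must share a page.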
On Wed, Apr 18, 2018 at 6:11 PM, Teresa Johnson <tejohnson at google.com> wrote:

> Sadly there isn't a really great way to dump the summaries. =( There was a
> patch awhile back by a GSOC student to dump in YAML format, but there was
> resistance from some who preferred dumping to llvm assembly via llvm-dis
> and support reading in the summary from llvm assembly. It's been on my list
> of things to do, hasn't yet risen high enough in priority to work on that.
> For now, you have to use llvm-bcanalyzer -dump and look at the raw format.

FYI I decided to take a stab at an LLVM assembly format today. I have hacked
some code to do the printing side, and am playing with formats. I'll send
out an RFC on the format, hopefully early next week.

Teresa