JF Bastien via llvm-dev
2019-May-13 16:55 UTC
[llvm-dev] Interprocedural DSE for -ftrivial-auto-var-init
> On May 10, 2019, at 8:59 PM, Vitaly Buka via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>
> Sorry for the delay, I was busy with other stuff.
> CTMark results.
>
> dse is the current DSE.
> dsem is my experimental module-level DSE.
> dsem runs after dse, so its count is the stores deleted in addition to dse's.
>
> -O3
> dse - Number of stores deleted 3033
> dsem - Number of deleted writes 3148
>
> -O3 -ftrivial-auto-var-init=pattern
> dse - Number of stores deleted 5618
> dsem - Number of deleted writes 3840
>
> -O3 -flto
> dse - Number of stores deleted 3985
> dsem - Number of deleted writes 3838
>
> -O3 -flto -ftrivial-auto-var-init=pattern
> dse - Number of stores deleted 6461
> dsem - Number of deleted writes 4215
>
> -Os
> dse - Number of stores deleted 1443
> dsem - Number of deleted writes 1517
>
> -Os -ftrivial-auto-var-init=pattern
> dse - Number of stores deleted 3951
> dsem - Number of deleted writes 2259
>
> -Oz
> dse - Number of stores deleted 1072
> dsem - Number of deleted writes 574
>
> -Oz -ftrivial-auto-var-init=pattern
> dse - Number of stores deleted 3420
> dsem - Number of deleted writes 1637

This looks great! Do you have a patch ready to go?

> From: Amara Emerson <aemerson at apple.com>
> Date: Tue, Apr 16, 2019 at 12:10 PM
> To: Vitaly Buka
> Cc: Alexander Potapenko, llvm-dev, Peter Collingbourne
>
> Can you post numbers for how many stores get eliminated from CTMark?
>
>> On Apr 16, 2019, at 11:45 AM, Vitaly Buka <vitalybuka at google.com> wrote:
>>
>> I tried -Os and the effect of the new approach increases significantly.
>> I run regular DSE and then immediately myDSE. With -Os, myDSE removes more than 50% of the number of stores that regular DSE removes.
>> That is expected, as -Os inlines less and regular DSE can't remove stores across a function call.
>>
>> On Tue, Apr 16, 2019 at 7:11 AM Alexander Potapenko <glider at google.com> wrote:
>> On Mon, Apr 15, 2019 at 11:02 PM Amara Emerson via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> >
>> > > On Apr 15, 2019, at 1:51 PM, Vitaly Buka via llvm-dev <llvm-dev at lists.llvm.org> wrote:
>> > >
>> > > Hi JF,
>> > >
>> > > I've heard that you are interested in DSE improvements, and maybe we need to be in sync.
>> > > So far I have experimented with the following DSE improvements:
>> > >
>> > > * Cross-block DSE. It eliminates an additional 7% of stores compared to the existing DSE, but it's not visible on benchmarks.
>> > I take it you couldn’t see any runtime impact? If there are code size improvements, that could also be useful. CTMark in the LLVM test suite is a useful subset of benchmarks to check this on (as a baseline, use -Os to compare code size).
>> >
>> > Thanks,
>> > Amara
>> > >
>> > > * Cross-block + interprocedural analysis to annotate each function argument with:
>> > >   - can read before write
>> > >   - will always write
>> > > These annotations get me 20% more stores deleted on top of the current DSE.
>> I believe we can only benefit from removing extra stores.
>> Hot functions in existing benchmarks are probably optimized well enough already, but speeding up the long tail is also important.
>> Also, at least the repro in https://bugs.llvm.org/show_bug.cgi?id=40527 has been extracted from a real kernel benchmark (hackbench), where this extra store cost us 0.45%.
>>
>> > > This is on the LLVM codebase with -ftrivial-auto-var-init=pattern.
>> > >
>> > > As is, it's less than I expected, so I would like to find a good benchmark to decide whether we should work on turning my experiment into production code.
>> > >
>> > > So now I am also planning to try to extend this to whole-program analysis.
>> > > I will clean up my code and upload it during this week, if anyone wants to try.
>> > >
>> > > Vitaly.
>>
>> --
>> Alexander Potapenko
>> Software Engineer
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
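
To make the CTMark counts quoted above easier to compare, here is a minimal sketch (not from the thread) that expresses each dsem count as a percentage of the corresponding dse count. The counts are copied from the message; the names `results`, `extra_percent` and `ratios` are made up for this illustration.

```python
# CTMark counts copied from the quoted message:
# flags -> (dse "stores deleted", dsem "deleted writes").
results = {
    "-O3": (3033, 3148),
    "-O3 -ftrivial-auto-var-init=pattern": (5618, 3840),
    "-O3 -flto": (3985, 3838),
    "-O3 -flto -ftrivial-auto-var-init=pattern": (6461, 4215),
    "-Os": (1443, 1517),
    "-Os -ftrivial-auto-var-init=pattern": (3951, 2259),
    "-Oz": (1072, 574),
    "-Oz -ftrivial-auto-var-init=pattern": (3420, 1637),
}


def extra_percent(dse, dsem):
    """dsem deletions as a percentage of dse deletions (dsem runs after dse)."""
    return 100.0 * dsem / dse


ratios = {flags: extra_percent(dse, dsem) for flags, (dse, dsem) in results.items()}

for flags, pct in ratios.items():
    print(f"{flags:<45} {pct:6.1f}%")
```

On these counts the ratio stays between roughly 48% (-Oz with pattern init) and 105% (-Os), so in every configuration the experimental pass deletes at least about half as many stores again as the existing DSE.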
Vitaly Buka via llvm-dev
2019-May-13 17:23 UTC
[llvm-dev] Interprocedural DSE for -ftrivial-auto-var-init
I have a dirty proof-of-concept patch. I am going to rewrite pieces of it during May, starting now.
Today it's a new pass which does cross-block DSE, module DSE, and global DSE.
So far the module DSE is the most useful, and it is probably easy to integrate into the existing DSE.

From: JF Bastien <jfbastien at apple.com>
Date: Mon, May 13, 2019 at 9:55 AM
To: Vitaly Buka
Cc: Amara Emerson, llvm-dev, Peter Collingbourne

> This looks great! Do you have a patch ready to go?
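
Earlier in the thread, the interprocedural experiment is described as annotating each function argument with "can read before write" and "will always write". The following is a toy model of how such summaries might be computed and used. It is not taken from the patch: functions are hypothetical straight-line lists of operations on a single pointer argument, and the model ignores access sizes, aliasing, control flow and exceptions, all of which a real pass would have to handle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Summary:
    can_read_before_write: bool
    will_always_write: bool


# Conservative answer for callees we cannot see or that are being analysed
# further up the call chain (recursion).
UNKNOWN = Summary(can_read_before_write=True, will_always_write=False)


def summarize(name, functions, cache=None, in_progress=None):
    """Summarize what function `name` does to its single pointer argument.

    Each body is a list of ops: "read", "write", or ("call", callee), where
    the call forwards the same pointer to `callee`.
    """
    cache = {} if cache is None else cache
    in_progress = set() if in_progress is None else in_progress
    if name in cache:
        return cache[name]
    if name not in functions or name in in_progress:
        return UNKNOWN
    in_progress.add(name)
    written = False
    reads_first = False
    for op in functions[name]:
        if op == "read":
            # A read is only a problem if nothing has overwritten the value yet.
            reads_first = reads_first or not written
        elif op == "write":
            written = True
        else:  # ("call", callee)
            callee = summarize(op[1], functions, cache, in_progress)
            if not written and callee.can_read_before_write:
                reads_first = True
            written = written or callee.will_always_write
    in_progress.discard(name)
    cache[name] = Summary(reads_first, written)
    return cache[name]


def init_store_is_dead(callee, functions):
    """True if a store made just before passing the pointer to `callee` is
    dead in this model: the callee overwrites it without ever reading it."""
    s = summarize(callee, functions)
    return s.will_always_write and not s.can_read_before_write


# Hypothetical module: names and bodies are invented for the example.
functions = {
    "fill": ["write"],
    "peek_then_fill": ["read", "write"],
    "wrapper": [("call", "fill"), "read"],
    "maybe": [("call", "external")],  # "external" has no visible body
}

for name in functions:
    print(name, summarize(name, functions), init_store_is_dead(name, functions))
```

In this model, a pattern-initializing store to a local that is immediately passed to `fill` or `wrapper` would be removable, while the same store before `peek_then_fill` or `maybe` has to stay.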
Vitaly Buka via llvm-dev
2019-May-14 02:32 UTC
[llvm-dev] Interprocedural DSE for -ftrivial-auto-var-init
https://reviews.llvm.org/D61879

From: Vitaly Buka <vitalybuka at google.com>
Date: Mon, May 13, 2019 at 10:23 AM
To: JF Bastien
Cc: Amara Emerson, llvm-dev, Peter Collingbourne

> I have a dirty proof-of-concept patch. I am going to rewrite pieces of it during May, starting now.
> Today it's a new pass which does cross-block DSE, module DSE, and global DSE.
> So far the module DSE is the most useful, and it is probably easy to integrate into the existing DSE.