Marcello Maggioni via llvm-dev
2015-Sep-15 22:19 UTC
[llvm-dev] RFC: speedups with instruction side-data (ADCE, perhaps others?)
Here the main problem seem to be that we don’t have a fast way to check for “is this in a certain set”. In my opinion here the problem is that we don’t have a good (read fast) way for checking that a certain condition applies to an llvm Value. I believe this bit is trying to address that. Maybe there is room for improvement in LLVM set-like data structures? Maybe there is a way to use a BitVector instead and generate from the Value/Instruction a unique linear ID that can be used with that? Marcello> On 15 Sep 2015, at 14:49, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > >> On Sep 15, 2015, at 2:16 PM, Owen Anderson <resistor at mac.com <mailto:resistor at mac.com>> wrote: >> >> >>> On Sep 14, 2015, at 5:02 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>> >>>> >>>> On Sep 14, 2015, at 2:58 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>> >>>> >>>>> On Sep 14, 2015, at 2:49 PM, Matt Arsenault via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>>> >>>>> On 09/14/2015 02:47 PM, escha via llvm-dev wrote: >>>>>> I would assume that it’s just considered to be garbage. I feel like any sort of per-pass side data like this should come with absolute minimal contracts, to avoid introducing any more inter-pass complexity. >>>>> I would think this would need to be a verifier error if it were ever non-0 >>>> +1 >>>> >>>> Otherwise every pass which ever needs this bit would have to first zero it out just to be safe, adding an extra walk over the whole functions. >>>> >>>> Of course otherwise the pass modifying it will have to zero it, which could also be a walk over the whole function. So either way you have lots iterate over, which is why i’m weary of this approach tbh. >>> >>> Every pass which ever uses this internally would have to set it to zero when it is done, adding an extra walk over the whole functions as you noticed. This goes against “you don’t pay for what you don’t use”, so definitively -1 for this. Better to cleanup before use. >>> I agree that the approach does not scale/generalize well, and we should try to find an alternative if possible. Now *if* it is the only way to improve performance significantly, we might have to weight the tradeoff. >> >> Does anyone have any concrete alternative suggestions to achieve the speedup demonstrated here? > Honestly, not an alternative to speedup ADCE. This appears to be the fastest way to improve that particular pass. > > My worry here comes from prior experience. I have done this exactly change years ago on a non-LLVM compiler. Adding one bit to each instruction was great, we used it in a few passes, then we found we needed a second bit, then a third. That quickly became unmanageable, and if we add this bit then we put ourselves on a repeat of that. > > I think we should weigh up what this change gets us in overall compile time. A 30% speedup in ADCE is great, and if a few other passes would benefit from this and get similar speedups, then even better. However, if thats only worth say 1% of overall compile time, then i’d argue its not worth it. > > Pete >> >> —Owen > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150915/f62be8ff/attachment.html>
escha via llvm-dev
2015-Sep-15 22:43 UTC
[llvm-dev] RFC: speedups with instruction side-data (ADCE, perhaps others?)
The problem then is that your “unique linear ID” becomes side-data you have to maintain with each instruction, effectively an IR version of a SlotIndex, right? —escha> On Sep 15, 2015, at 3:19 PM, Marcello Maggioni via llvm-dev <llvm-dev at lists.llvm.org> wrote: > > Here the main problem seem to be that we don’t have a fast way to check for “is this in a certain set”. > > In my opinion here the problem is that we don’t have a good (read fast) way for checking that a certain condition applies to an llvm Value. > I believe this bit is trying to address that. > > Maybe there is room for improvement in LLVM set-like data structures? > Maybe there is a way to use a BitVector instead and generate from the Value/Instruction a unique linear ID that can be used with that? > > Marcello >> On 15 Sep 2015, at 14:49, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> >>> On Sep 15, 2015, at 2:16 PM, Owen Anderson <resistor at mac.com <mailto:resistor at mac.com>> wrote: >>> >>> >>>> On Sep 14, 2015, at 5:02 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>> >>>>> >>>>> On Sep 14, 2015, at 2:58 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>>> >>>>> >>>>>> On Sep 14, 2015, at 2:49 PM, Matt Arsenault via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>>>> >>>>>> On 09/14/2015 02:47 PM, escha via llvm-dev wrote: >>>>>>> I would assume that it’s just considered to be garbage. I feel like any sort of per-pass side data like this should come with absolute minimal contracts, to avoid introducing any more inter-pass complexity. >>>>>> I would think this would need to be a verifier error if it were ever non-0 >>>>> +1 >>>>> >>>>> Otherwise every pass which ever needs this bit would have to first zero it out just to be safe, adding an extra walk over the whole functions. >>>>> >>>>> Of course otherwise the pass modifying it will have to zero it, which could also be a walk over the whole function. So either way you have lots iterate over, which is why i’m weary of this approach tbh. >>>> >>>> Every pass which ever uses this internally would have to set it to zero when it is done, adding an extra walk over the whole functions as you noticed. This goes against “you don’t pay for what you don’t use”, so definitively -1 for this. Better to cleanup before use. >>>> I agree that the approach does not scale/generalize well, and we should try to find an alternative if possible. Now *if* it is the only way to improve performance significantly, we might have to weight the tradeoff. >>> >>> Does anyone have any concrete alternative suggestions to achieve the speedup demonstrated here? >> Honestly, not an alternative to speedup ADCE. This appears to be the fastest way to improve that particular pass. >> >> My worry here comes from prior experience. I have done this exactly change years ago on a non-LLVM compiler. Adding one bit to each instruction was great, we used it in a few passes, then we found we needed a second bit, then a third. That quickly became unmanageable, and if we add this bit then we put ourselves on a repeat of that. >> >> I think we should weigh up what this change gets us in overall compile time. A 30% speedup in ADCE is great, and if a few other passes would benefit from this and get similar speedups, then even better. However, if thats only worth say 1% of overall compile time, then i’d argue its not worth it. >> >> Pete >>> >>> —Owen >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150915/83c135da/attachment.html>
Daniel Berlin via llvm-dev
2015-Sep-15 22:50 UTC
[llvm-dev] RFC: speedups with instruction side-data (ADCE, perhaps others?)
Chandler had a scheme to reorg basic blocks so instructions were in a vector rather than a list for memory reasons anyway. For code removal optimizations, this gives you a unique id (BB, position in the vector) for free. On Tue, Sep 15, 2015 at 3:43 PM, escha via llvm-dev <llvm-dev at lists.llvm.org> wrote:> The problem then is that your “unique linear ID” becomes side-data you have > to maintain with each instruction, effectively an IR version of a SlotIndex, > right? > > —escha > > On Sep 15, 2015, at 3:19 PM, Marcello Maggioni via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > Here the main problem seem to be that we don’t have a fast way to check for > “is this in a certain set”. > > In my opinion here the problem is that we don’t have a good (read fast) way > for checking that a certain condition applies to an llvm Value. > I believe this bit is trying to address that. > > Maybe there is room for improvement in LLVM set-like data structures? > Maybe there is a way to use a BitVector instead and generate from the > Value/Instruction a unique linear ID that can be used with that? > > Marcello > > On 15 Sep 2015, at 14:49, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org> > wrote: > > > On Sep 15, 2015, at 2:16 PM, Owen Anderson <resistor at mac.com> wrote: > > > On Sep 14, 2015, at 5:02 PM, Mehdi Amini via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > On Sep 14, 2015, at 2:58 PM, Pete Cooper via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > > On Sep 14, 2015, at 2:49 PM, Matt Arsenault via llvm-dev > <llvm-dev at lists.llvm.org> wrote: > > On 09/14/2015 02:47 PM, escha via llvm-dev wrote: > > I would assume that it’s just considered to be garbage. I feel like any sort > of per-pass side data like this should come with absolute minimal contracts, > to avoid introducing any more inter-pass complexity. > > I would think this would need to be a verifier error if it were ever non-0 > > +1 > > Otherwise every pass which ever needs this bit would have to first zero it > out just to be safe, adding an extra walk over the whole functions. > > Of course otherwise the pass modifying it will have to zero it, which could > also be a walk over the whole function. So either way you have lots iterate > over, which is why i’m weary of this approach tbh. > > > Every pass which ever uses this internally would have to set it to zero when > it is done, adding an extra walk over the whole functions as you noticed. > This goes against “you don’t pay for what you don’t use”, so definitively -1 > for this. Better to cleanup before use. > I agree that the approach does not scale/generalize well, and we should try > to find an alternative if possible. Now *if* it is the only way to improve > performance significantly, we might have to weight the tradeoff. > > > Does anyone have any concrete alternative suggestions to achieve the speedup > demonstrated here? > > Honestly, not an alternative to speedup ADCE. This appears to be the > fastest way to improve that particular pass. > > My worry here comes from prior experience. I have done this exactly change > years ago on a non-LLVM compiler. Adding one bit to each instruction was > great, we used it in a few passes, then we found we needed a second bit, > then a third. That quickly became unmanageable, and if we add this bit then > we put ourselves on a repeat of that. > > I think we should weigh up what this change gets us in overall compile time. > A 30% speedup in ADCE is great, and if a few other passes would benefit from > this and get similar speedups, then even better. However, if thats only > worth say 1% of overall compile time, then i’d argue its not worth it. > > Pete > > > —Owen > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev > > > > _______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >
Marcello Maggioni via llvm-dev
2015-Sep-15 22:53 UTC
[llvm-dev] RFC: speedups with instruction side-data (ADCE, perhaps others?)
I didn’t mention any idea about how to implement it. I was just mentioning the concept in the case there was a fast way of computing such a thing. But basically that is what your concept does. Creating a BitVector , but because we don’t have a good way of discerning a different Value from another to index a private bitvector for the pass (with the exclusion of its pointer) we spread the bitvector over the Values themselves. I was wondering if there was a good way to actually index a separate BitVector effectively instead. But probably more expert people on the implementation internals of LLVM Value can answer that. Marcello> On 15 Sep 2015, at 15:43, escha <escha at apple.com> wrote: > > The problem then is that your “unique linear ID” becomes side-data you have to maintain with each instruction, effectively an IR version of a SlotIndex, right? > > —escha > >> On Sep 15, 2015, at 3:19 PM, Marcello Maggioni via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >> >> Here the main problem seem to be that we don’t have a fast way to check for “is this in a certain set”. >> >> In my opinion here the problem is that we don’t have a good (read fast) way for checking that a certain condition applies to an llvm Value. >> I believe this bit is trying to address that. >> >> Maybe there is room for improvement in LLVM set-like data structures? >> Maybe there is a way to use a BitVector instead and generate from the Value/Instruction a unique linear ID that can be used with that? >> >> Marcello >>> On 15 Sep 2015, at 14:49, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>> >>> >>>> On Sep 15, 2015, at 2:16 PM, Owen Anderson <resistor at mac.com <mailto:resistor at mac.com>> wrote: >>>> >>>> >>>>> On Sep 14, 2015, at 5:02 PM, Mehdi Amini via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>>> >>>>>> >>>>>> On Sep 14, 2015, at 2:58 PM, Pete Cooper via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>>>> >>>>>> >>>>>>> On Sep 14, 2015, at 2:49 PM, Matt Arsenault via llvm-dev <llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org>> wrote: >>>>>>> >>>>>>> On 09/14/2015 02:47 PM, escha via llvm-dev wrote: >>>>>>>> I would assume that it’s just considered to be garbage. I feel like any sort of per-pass side data like this should come with absolute minimal contracts, to avoid introducing any more inter-pass complexity. >>>>>>> I would think this would need to be a verifier error if it were ever non-0 >>>>>> +1 >>>>>> >>>>>> Otherwise every pass which ever needs this bit would have to first zero it out just to be safe, adding an extra walk over the whole functions. >>>>>> >>>>>> Of course otherwise the pass modifying it will have to zero it, which could also be a walk over the whole function. So either way you have lots iterate over, which is why i’m weary of this approach tbh. >>>>> >>>>> Every pass which ever uses this internally would have to set it to zero when it is done, adding an extra walk over the whole functions as you noticed. This goes against “you don’t pay for what you don’t use”, so definitively -1 for this. Better to cleanup before use. >>>>> I agree that the approach does not scale/generalize well, and we should try to find an alternative if possible. Now *if* it is the only way to improve performance significantly, we might have to weight the tradeoff. >>>> >>>> Does anyone have any concrete alternative suggestions to achieve the speedup demonstrated here? >>> Honestly, not an alternative to speedup ADCE. This appears to be the fastest way to improve that particular pass. >>> >>> My worry here comes from prior experience. I have done this exactly change years ago on a non-LLVM compiler. Adding one bit to each instruction was great, we used it in a few passes, then we found we needed a second bit, then a third. That quickly became unmanageable, and if we add this bit then we put ourselves on a repeat of that. >>> >>> I think we should weigh up what this change gets us in overall compile time. A 30% speedup in ADCE is great, and if a few other passes would benefit from this and get similar speedups, then even better. However, if thats only worth say 1% of overall compile time, then i’d argue its not worth it. >>> >>> Pete >>>> >>>> —Owen >>> >>> _______________________________________________ >>> LLVM Developers mailing list >>> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >>> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev <http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev> >> >> _______________________________________________ >> LLVM Developers mailing list >> llvm-dev at lists.llvm.org <mailto:llvm-dev at lists.llvm.org> >> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150915/b43986f9/attachment.html>