Bardia Mahjour via llvm-dev
2019-May-15 16:54 UTC
[llvm-dev] Delinearization validity checks in DependenceAnalysis
Hi David,

Thank you very much for your response.

I also get correct results for my example (for a 64-bit target) if the upper bounds are changed to unsigned. The reason is simply that clang zero-extends `m` for address calculations but sign-extends it for the loop upper bound. This prevents SCEV from canceling out the `m` term from the difference expression, which looks like `(-3 + (sext i32 %m to i64) + (-1 * (zext i32 %m to i64))<nsw>)`. While we cannot reduce expressions of this form in general, it does pose a severe limitation for the vast majority of cases, where the loop upper bound is non-negative.

Regarding your example, I'm not sure I fully understand the concern and I would appreciate it if you could clarify that for me. My understanding is that if the intended shape of 'a' is in fact 'a[n][m]', then the example, as provided, has an out-of-bound access to start with. To avoid this out-of-bound access one would need to change the upper bound of the j-loop to 'm-2'. Interestingly, if we do that, the current validity checks start to pass and we get [0 2] as the dependence vector. Here's a slightly modified version of your example that I tried:

```
typedef unsigned long long TT;
void foo(TT n, TT m, int *a) {
  for (TT i = 0; i < n; ++i)
    for (TT j = 0; j < m-2; ++j) {
      a[i*m + j] = a[i*m + j+2];
    }
}
```

If the concern is that the actual intended shape of the array may have been something other than a[n][m], and that we are indexing it such that the accesses are in-bound with respect to the memory of the whole array but not with respect to individual dimensions, then I'm not sure we can reason about *any* delinearization statically (except for the limited cases where the bounds are compile-time known).

Am I misunderstanding the issue?
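To see numerically why the `m-2` bound makes the [0 2] vector sound, here is a small standalone sketch (my own illustration, not code from the thread; the helper names are hypothetical): with `j < m-2`, both the write `a[i*m + j]` and the read `a[i*m + j+2]` always land in row `i`, so the recovered subscripts are genuine.

```c
#include <assert.h>

/* Row touched by the linearized access a[i*m + j + off], for a conceptual
 * shape a[n][m]. (Illustrative helper, not from DependenceAnalysis.) */
static unsigned long long row_of(unsigned long long i, unsigned long long m,
                                 unsigned long long j, unsigned long long off) {
    return (i * m + j + off) / m;
}

/* With the corrected bound j < m-2, both the write a[i*m + j] and the
 * read a[i*m + j + 2] stay in row i, so delinearizing them to a[i][j]
 * and a[i][j+2] is exact and the dependence vector is [0 2]. */
static int accesses_stay_in_row(unsigned long long n, unsigned long long m) {
    for (unsigned long long i = 0; i < n; ++i)
        for (unsigned long long j = 0; j + 2 < m; ++j)
            if (row_of(i, m, j, 0) != i || row_of(i, m, j, 2) != i)
                return 0;
    return 1;
}
```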
Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336

From: David Green <David.Green at arm.com>
To: "llvm-dev at lists.llvm.org" <llvm-dev at lists.llvm.org>, Bardia Mahjour <bmahjour at ca.ibm.com>
Cc: nd <nd at arm.com>
Date: 2019/05/14 02:50 PM
Subject: [EXTERNAL] Re: [llvm-dev] Delinearization validity checks in DependenceAnalysis

Hello

Interestingly, the example you provide works for me so long as either it's a 32-bit target, or the array bounds (n and m) are changed to unsigned.

For a bit of history, DA used to have a different delinearisation method based on geps, but it was known to be invalid in some cases and was eventually removed. There was no delinearisation for a while, until these checks were added: enough to be correct, but not as powerful as they could be. I believe Polly's delinearisation is much more powerful, and can version loops with runtime conditions.

IIRC, the checks are there for cases like this:

```
void foo(unsigned n, unsigned m, int *a) {
  for (int i = 0; i < n; ++i)
    for (int j = 0; j < m; ++j) {
      a[i*m + j] = a[i*m + j+2];
    }
}
```

The "j+2" can overflow into the next i iteration's space. So the distance vector isn't [0 2] (and isn't even [= <]). Unfortunately this is perfectly valid in C (I think for the int a[][m] case too).

See this comment for why they were needed, and perhaps a better way to fix it:
https://github.com/llvm/llvm-project/commit/d143c65de3c884d09197da279d2f04f094efaf15#diff-57473b376037dd3637516348b9b02556L3274

Any improvements to the delinearisation code would be most welcome.
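A quick way to convince oneself of the row crossing David describes (a hypothetical check of my own, not from the thread): with the full bound `j < m`, the read `a[i*m + j + 2]` spills into row `i+1` for the last two values of `j`, so the access pair cannot safely be treated as `a[i][j]` / `a[i][j+2]`.

```c
#include <assert.h>

/* Row actually touched by the read a[i*m + j + 2] when a is viewed as
 * a[n][m]. For j = m-2 and j = m-1 this is i+1, not i: the read crosses
 * into the next row, so the distance vector cannot simply be [0 2]. */
static unsigned row_of_read(unsigned i, unsigned m, unsigned j) {
    return (i * m + j + 2) / m;
}
```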
Dave

From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Bardia Mahjour via llvm-dev <llvm-dev at lists.llvm.org>
Sent: 13 May 2019 14:48
To: llvm-dev at lists.llvm.org
Subject: [llvm-dev] Delinearization validity checks in DependenceAnalysis

Hi all,

I have been looking at the `DependenceAnalysis` pass in `llvm/include/llvm/Analysis/DependenceAnalysis.h`. In order for this analysis to produce accurate dependence vectors for multi-dimensional arrays in nested loops, it needs to "delinearize" array element accesses to recover the subscripts in each dimension of the array. I believe that the current implementation of delinearization is based on this paper: http://web.cse.ohio-state.edu/~pouchet.2/doc/ics-article.15a.pdf.

This paper describes how to delinearize the subscripts, and as a last step it requires certain conditions to be met in order to validate that the delinearized indexes are correct. The `tryDelinearize` function in `DependenceAnalysis.cpp` appears to be checking for cases where these conditions can be validated *at compile time*:

```
// Statically check that the array bounds are in-range. The first subscript we
// don't have a size for and it cannot overflow into another subscript, so is
// always safe. The others need to be 0 <= subscript[i] < bound, for both src
// and dst.
// FIXME: It may be better to record these sizes and add them as constraints
// to the dependency checks.
for (int i = 1; i < size; ++i) {
  if (!isKnownNonNegative(SrcSubscripts[i], SrcPtr))
    return false;

  if (!isKnownLessThan(SrcSubscripts[i], Sizes[i - 1]))
    return false;

  if (!isKnownNonNegative(DstSubscripts[i], DstPtr))
    return false;

  if (!isKnownLessThan(DstSubscripts[i], Sizes[i - 1]))
    return false;
}
```

The problem is that in a lot of cases these conditions cannot be proven statically, even though the delinearized indexes are in fact correct. For example, consider this simple loop:

```
void foo(int n, int m, int a[][m]) {
  for (int i = 0; i < n-1; ++i)
    for (int j = 2; j < m; ++j) {
      a[i][j] = a[i+1][j-2];
    }
}
```

Compiled and analyzed with:

```
clang test.c -c -O3 -S -Xclang -disable-llvm-passes -emit-llvm
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline -simplifycfg test.ll -S -o test.simp.ll
opt test.simp.ll -analyze -da
```

this will produce:

```
da analyze - none!
da analyze - anti [* *|<]!
da analyze - output [* *]!
```

If the validity checks were not present, the result would be a much more accurate dependence vector with accurate dependence distances:

```
da analyze - none!
da analyze - consistent anti [1 -2]!
da analyze - none!
```

In my experience the static validity checks tend to fail in many common cases (involving loops with symbolic bounds). Sometimes this happens because SCEV is not able to simplify the expressions due to the presence of type casts and sign/zero extensions, but other times the conditions are just not computable at compile time.

So far I haven't been able to come up with an example where the validity checks in the current implementation legitimately catch a case of invalid delinearization. I've also disabled these checks and run some tests without finding any failures (other than improvements in the dependence analysis LIT tests).

I also had a quick look at Polly, which benefits from delinearization. From what I saw, a much more optimistic approach is taken, where the validity checks seem to be avoided.

My questions are:
1. Does anyone have more information about why these validity checks are necessary in the current implementation, perhaps with some examples showing an incorrect delinearization that's possible without those checks?
2. Are there any concerns with putting these validity checks under an option so that we can more easily disable them and experiment? This could also help us improve LIT tests, since some of them have been pessimised to compensate for DA's inability to delinearize, and could fail to catch regressions as a result of bad changes to the data dependence analysis.

Looking forward to your help on this.

Thank you,

Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336
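For the `a[i][j] = a[i+1][j-2]` loop in the message above, the [1 -2] distance can be checked by hand: element (x, y) is read at iteration (x-1, y+2) and written at iteration (x, y). A small sketch (my own, with hypothetical helper names):

```c
#include <assert.h>

typedef struct { int di, dj; } Dist;

/* For a[i][j] = a[i+1][j-2]: element (x, y) is written at iteration (x, y)
 * and read earlier, at iteration (x-1, y+2). The anti-dependence distance
 * is the iteration gap from the reading iteration to the writing one. */
static Dist dep_distance(int x, int y) {
    int read_i = x - 1, read_j = y + 2;
    Dist d = { x - read_i, y - read_j };
    return d;
}
```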
Doerfert, Johannes via llvm-dev
2019-May-15 19:21 UTC
[llvm-dev] Delinearization validity checks in DependenceAnalysis
On 05/15, Bardia Mahjour via llvm-dev wrote:
> Regarding your example, I'm not sure I fully understand the concern and I
> would appreciate it if you could clarify that for me. My understanding is
> that if the intended shape of 'a' is in fact 'a[n][m]' then the example, as
> provided, has an out-of-bound access to start with. To avoid this out-of-bound
> access one would need to change the upper bound of the j-loop to be 'm-2'.
> Interestingly, if we do that, the current validity checks start to pass and
> we get [0 2] as the dependence vector. Here's a slightly modified version
> of your example that I tried:
>
> ```
> typedef unsigned long long TT;
> void foo(TT n, TT m, int *a) {
>   for (TT i = 0; i < n; ++i)
>     for (TT j = 0; j < m-2; ++j) {
>       a[i*m + j] = a[i*m + j+2];
>     }
> }
> ```
>
> If the concern is that the actual intended shape of the array may have been
> something other than a[n][m], and that we are indexing it such that the
> accesses are in-bound with respect to the memory of the whole array but not
> with respect to individual dimensions, then I'm not sure we can reason
> about *any* delinearization statically (except for the limited cases where
> the bounds are compile-time known).

In the example above, it works statically, doesn't it?

Generally, if loop bounds and access functions are "in-sync", it should work statically. It does not if you use different/multiple unknown values (parameters) or confuse Scalar Evolution (different types).

> Am I misunderstanding the issue?
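Johannes's "in-sync" condition can be illustrated with a toy round-trip check (my own sketch, not from the thread): when the j-loop bound is the same `m` used in the access function, div/mod recovers the subscripts; with an unrelated, larger bound it does not.

```c
#include <assert.h>

/* Returns 1 iff delinearizing the linear index i*m + j with row size m
 * recovers the subscripts (i, j). This holds exactly when j < m, i.e.
 * when the loop bound and the access function use the same parameter. */
static int recovers_subscripts(unsigned i, unsigned j, unsigned m) {
    unsigned idx = i * m + j;
    return idx / m == i && idx % m == j;
}
```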
--
Johannes Doerfert
Researcher
Argonne National Laboratory
Lemont, IL 60439, USA
jdoerfert at anl.gov
Bardia Mahjour via llvm-dev
2019-May-16 19:04 UTC
[llvm-dev] Delinearization validity checks in DependenceAnalysis
Hi Johannes,

> In the example above, it works statically, doesn't it?

Yes, that example works statically, and the delinearization appears correct to me. What I'm looking for, though, is an example showing that delinearization produces invalid subscripts that are rightfully caught by the validity checks.

> Generally, if loop bounds and access functions are "in-sync", it should work
> statically. It does not if you use different/multiple unknown values
> (parameters) or confuse Scalar Evolution (different types).

In practice, many (if not most) loops will have some sort of type conversion on 64-bit targets, which prevents the SCEV simplifications that are necessary to prove the validity conditions at compile time. That's a big problem for loop optimizations.

Bardia Mahjour
Compiler Optimizations
Hydra Squad Lead
IBM Toronto Software Lab
bmahjour at ca.ibm.com (905) 413-2336
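The type-cast mismatch Bardia describes (clang sign-extending the loop bound but zero-extending the address computation) can be reproduced numerically. A minimal sketch of my own, not from the thread: the two extensions of the same 32-bit value agree only when it is non-negative, which is exactly the missing fact SCEV would need in order to cancel the terms.

```c
#include <assert.h>
#include <stdint.h>

/* Sign- and zero-extension of the same 32-bit value to 64 bits, mirroring
 * the (sext i32 %m to i64) and (zext i32 %m to i64) terms in the SCEV. */
static int64_t  sext32(int32_t v) { return (int64_t)v; }
static uint64_t zext32(int32_t v) { return (uint64_t)(uint32_t)v; }

/* The two terms cancel in the difference expression only when m >= 0. */
static int terms_cancel(int32_t m) {
    return sext32(m) == (int64_t)zext32(m);
}
```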
David Green via llvm-dev
2019-May-16 19:50 UTC
[llvm-dev] Delinearization validity checks in DependenceAnalysis
Hello

Under the proviso that it's been a while since I looked into any of these things...

On 05/15, Bardia Mahjour via llvm-dev wrote:
> I also get correct results for my example (for a 64-bit target) if the upper
> bounds are changed to unsigned. The reason is simply because clang zero-extends
> `m` for address calculations but sign-extends it for the loop upper bound. This
> prevents SCEV from canceling out the 'm' term from the difference expression
> that looks like `(-3 + (sext i32 %m to i64) + (-1 * (zext i32 %m to i64))<nsw>)`.
> While we cannot reduce expressions of this form in general, it does pose a
> severe limitation for the vast majority of cases where the loop upper bound
> is non-negative.

There are already some functions like isKnownPredicate in DA that attempt to deal with some of this. Can we extend those for these cases? Perhaps with some extra information that we have that SCEV would in general not know, or do something extra that it would not compute. Or ideally, can we just improve SCEV to get this right?

> Regarding your example, I'm not sure I fully understand the concern and I
> would appreciate it if you could clarify that for me. My understanding is
> that if the intended shape of 'a' is in fact 'a[n][m]' then the example, as
> provided, has an out-of-bound access to start with. To avoid this out-of-bound
> access one would need to change the upper bound of the j-loop to be 'm-2'.
> Interestingly, if we do that, the current validity checks start to pass and
> we get [0 2] as the dependence vector. Here's a slightly modified version
> of your example that I tried:
>
> ```
> typedef unsigned long long TT;
> void foo(TT n, TT m, int *a) {
>   for (TT i = 0; i < n; ++i)
>     for (TT j = 0; j < m-2; ++j) {
>       a[i*m + j] = a[i*m + j+2];
>     }
> }
> ```

Sure, but it wouldn't be invalid C to use the "for (TT j = 0; j < m; ++j)" bound, so long as the size of the object passed through to "a" was large enough that it didn't overflow.
We don't have enough information to prove that won't happen, so unfortunately we need to be conservative. So this loop:

```
void test(int *A, signed N, int M) {
  for (int i = 1; i < N; i += 1) {
    for (int j = 0; j < M; j += 1) {
      A[i*M + j] = 2;
      A[i*M + j - 1] = 4;
    }
  }
}
```

needs to not get the dependency vector [0 >], otherwise unroll-and-jam might think it was safe to reorder the loops and miscompile the code.
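The aliasing in the `test` loop above can be checked numerically (an illustrative sketch of my own, not from the thread): at j = 0 the second store writes index i*M - 1, which belongs to row i-1, so the two stores touch the same element in different i iterations and [0 >] would be wrong.

```c
#include <assert.h>

/* Row (first subscript) touched by the store A[i*M + j - 1] when A is
 * viewed as A[N][M]; at j = 0 this is row i-1, not row i. */
static int row_of_store(int i, int M, int j) {
    int idx = i * M + j - 1;
    return idx < 0 ? -1 : idx / M;  /* -1: before the start of the array */
}
```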
> > IIRC, the checks are there for cases like this: > void foo(unsigned n, unsigned m, int *a) { > for (int i = 0; i < n; ++i) > for (int j = 0; j < m; ++j) { > a[i*m + j] = a[i*m + j+2]; > } > } > > The "j-2" can underflow into the previous i iterations space. So the > distance vector isn't [0 2] (and isn't even [= <]). Unfortunately this is > perfectly valid in C (I think for the int a[][m] case too). > > See this comment for why they were needed and perhaps a better way to fix > it: > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_llvm_llvm-2Dproject_commit_d143c65de3c884d09197da279d2f04f094efaf15-23diff-2D57473b376037dd3637516348b9b02556L3274&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=aihobyOnVzXW7OPSK1-NiSYQkq7oP3ZSUVc4BemvrVo&m=46eKxI_sFjjeBzn7X-OLXSEUwHN-HVCD16TF9OuyIqc&s=adPvJDhPtFMlaTWihZmvWjXqFUFHDnzcV84oaDGlryM&e> > > Any improvements to the delinearisation code would be most welcome. > Dave > > > > > From: llvm-dev <llvm-dev-bounces at lists.llvm.org> on behalf of Bardia > Mahjour via llvm-dev <llvm-dev at lists.llvm.org> > Sent: 13 May 2019 14:48 > To: llvm-dev at lists.llvm.org > Subject: [llvm-dev] Delinearization validity checks in DependenceAnalysis > > Hi all, > > I have been looking at the `DependenceAnalysis` pass in > `llvm/include/llvm/Analysis/DependenceAnalysis.h`. > In order for this analysis to produce accurate dependence vectors for > multi-dimensional arrays in nested loops, it needs to "delinearize" array > element accesses to recover the subscripts in each dimension of the array. > I believe that the current implementation of delinearization is based on > this paper: > https://urldefense.proofpoint.com/v2/url?u=http-3A__web.cse.ohio-2Dstate.edu_-7Epouchet.2_doc_ics-2Darticle.15a.pdf&d=DwIFAw&c=jf_iaSHvJObTbx-siA1ZOg&r=aihobyOnVzXW7OPSK1-NiSYQkq7oP3ZSUVc4BemvrVo&m=46eKxI_sFjjeBzn7X-OLXSEUwHN-HVCD16TF9OuyIqc&s=nraVp9R56UlYi0We27kmGSjZGnM296r0HFpbRR62Fzs&e> . 
This paper describes how to delinearize the subscripts, and as a last step it requires certain conditions to be met in order to validate that the delinearized indexes are correct. The `tryDelinearize` function in `DependenceAnalysis.cpp` appears to be checking for cases where these conditions can be validated *at compile time*:

```
// Statically check that the array bounds are in-range. The first subscript we
// don't have a size for and it cannot overflow into another subscript, so is
// always safe. The others need to be 0 <= subscript[i] < bound, for both src
// and dst.
// FIXME: It may be better to record these sizes and add them as constraints
// to the dependency checks.
for (int i = 1; i < size; ++i) {
  if (!isKnownNonNegative(SrcSubscripts[i], SrcPtr))
    return false;

  if (!isKnownLessThan(SrcSubscripts[i], Sizes[i - 1]))
    return false;

  if (!isKnownNonNegative(DstSubscripts[i], DstPtr))
    return false;

  if (!isKnownLessThan(DstSubscripts[i], Sizes[i - 1]))
    return false;
}
```

The problem is that in a lot of cases these conditions cannot be proven statically, even though the delinearized indexes are in fact correct. For example, consider this simple loop:

```
void foo(int n, int m, int a[][m]) {
  for (int i = 0; i < n-1; ++i)
    for (int j = 2; j < m; ++j) {
      a[i][j] = a[i+1][j-2];
    }
}
```

Compiled and analyzed with:

```
clang test.c -c -O3 -S -Xclang -disable-llvm-passes -emit-llvm
opt -mem2reg -instcombine -indvars -loop-simplify -loop-rotate -inline -simplifycfg test.ll -S -o test.simp.ll
opt test.simp.ll -analyze -da
```

this will produce:

```
da analyze - none!
da analyze - anti [* *|<]!
da analyze - output [* *]!
```

If the validity checks were not present, the result would be a much more accurate dependence vector with accurate dependence distances:

```
da analyze - none!
da analyze - consistent anti [1 -2]!
da analyze - none!
```

In my experience the static validity checks tend to fail in many common cases (involving loops with symbolic bounds). Sometimes this happens because SCEV is not able to simplify the expressions due to the presence of type casts and sign/zero extensions, but other times the conditions are just not computable at compile time.

So far I haven't been able to come up with an example where the validity checks in the current implementation legitimately catch a case of invalid delinearization. I've also disabled these checks and run some tests without finding any failures (other than improvements in the dependence analysis LIT tests).

I also had a quick look at Polly, which benefits from delinearization. From what I saw, a much more optimistic approach is taken, where the validity checks seem to be avoided.

My questions are:
1. Does anyone have more information about why these validity checks are necessary in the current implementation, perhaps with some examples showing an incorrect delinearization that's possible without those checks?
2. Are there any concerns with putting these validity checks under an option so that we can more easily disable them and experiment? This could also help us improve LIT tests, since some of them have been pessimised to compensate for DA's inability to delinearize, and could fail to catch regressions as a result of bad changes to the data dependence analysis.

Looking forward to your help on this.
Thank you,

Bardia Mahjour

--
Johannes Doerfert
Researcher
Argonne National Laboratory
Lemont, IL 60439, USA
jdoerfert at anl.gov

-------------- next part --------------
A non-text attachment was scrubbed...
Name: da3.c
Type: text/x-csrc
Size: 525 bytes
Desc: da3.c
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20190516/b2b50a68/attachment.c>