Preston Briggs
2012-Mar-29 04:34 UTC
[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
Hi Sanjoy & Hal, Looking at LoopDependenceAnalysis::analyzeZIV ... I don't understand much about SCEV, but the code seems inadequate. Consider these three examples: int a = ...; // we can't be sure of the relationship between a and b int b = ...; // perhaps they are parameters, or the result of I/O for (int i = 0; i < n; i++) { v[a][i] = ...; v[a + 1][i] = ...; } for (int i = 0; i < n; i++) { v[a][i] = ...; v[a][i] = ...; } for (int i = 0; i < n; i++) { v[a][i] = ...; v[b][i] = ...; } In the first loop, we can be confident that the two stores are to different locations and there is no dependence. In the second loop, we can be sure that the two references are to the same location and that there's a consistent loop-independent output dependence between the two stores. In the third loop, we can't be sure what's going on and must therefore assume an inconsistent loop-independent output dependence. The ZIVanalysis code doesn't do the right thing. I believe, but am not certain, that it will get the 1st two loops right, but will return Independent for the 3rd case. Better code might look something like: SCEV *delta = a - b if (delta is zero) { consistant &= true return Dependent } else if (delta is a known constant) return Independent } else { // delta is some symbolic term consistent = false return Dependent } By the way, when I assert all these things about dependence analysis, I'm not always correct. If you disagree, please let me know and we'll discuss it. Thanks, Preston -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20120328/a4bc2d0d/attachment.html>
Sanjoy Das
2012-Apr-01 20:54 UTC
[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
Hi Preston, Thanks for the feedback!> In LoopDependenceAnalysis::AnalyzePair, what's going to happen if we > have something like this > > for (i = 0; i < n; i++) > for (j = 0; j < n; j++) > A[i][j]++; > > versus > > for (i = 0; i < n; i++) > for (j = 0; j < n; j++) > A[j][i]++;I think this can be fixed by ordering the subscripts on the loop nesting. This should be doable, since the add recurrences remember the loop whose induction variable they're representing. Secondly, I think we can specialize on dealing only with linear add recurrences and constants (symbolic or actual constants) at this point. Since LLVM represents indices like for (i = 0; i < n; i++) for (j = 0; j < n; j++) a[i + j]++ as quite nicely as add recurrences of add recurrences we should be able to get quite far along with this approach.> And what's going to happen here? > > for (i = 0; i < n; i++) > A[i+1][i+2] = A[i][i];I left out handling coupled subscripts at this point because ignoring them only results in more conservative safer inferences about dependence. More sophisticated analysis like this can be added incrementally, I think.> And here? > > for (i = 0; i < n; i++) > for (j = 0; j < n; j++) > A[i] = A[i] + B[i][j];I'm not too clear about this. Assuming A and B point to two different locations, there seems to be only one loop independent dependence -- from A[i] to A[i]. You are right about the ZIV issue, it needs to be fixed. I think the following would be a good start at modelling dependencies between two Values: class Dependence { public: struct Level { enum { LT = 1, EQ = 2, GT = 4, ALL = 7 } direction; bool scalar; const SCEV *distance; // NULL implies no distance available const Loop *loop; }; enum Kind { Flow, Anti, Output, Input }; Kind getKind() const { return kind; } const Value *getSource() const { return source; } const Value *getDestination() const { return destination; } typedef SmallVector<const Level *, 4>::const_iterator const_iterator; const_iterator begin() const { return levels.begin(); } const_iterator end() const { return levels.end(); } private: Value *source, *destination; Kind kind; SmallVector<const Level *, 4> levels; friend class LoopDependenceAnalysis; }; More methods, like `isConsistent` and `isConfused` can be added easily to this interface, since the raw data required to compute them is available. Given two Values, then, we can first break them up into their respective indices. We can then order them based on their loop nesting. To actually compute the dependence, we can traverse down the subscript lists of the two Values and keep checking individual subscript pairs. I'm not sure on how to deal with cases such as these: for (i = 0; i < n; i++) { for (j = 0; j < n; j++) { t = a[i][j]; a[i + 1][j + 1] = t; } for (j = 0; j < n; j++) { t = a[i][j]; a[i + 1][j + 1] = t; } } My gut feeling is that we should stop checking individual subscripts when the loop nesting have diverged. But don't know how to concretely represent such dependencies. I'm assuming in cases such as these: for (i = 0; i < n; i++) for (j = 0; j < n; j++) a[i][0] = a[j][i] we can have a dependence edge [S U | 0]. Let me know what you think! -- Sanjoy Das. http://playingwithpointers.com
Preston Briggs
2012-Apr-03 05:25 UTC
[LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
Hi Sanjoy, I wondered:>> In LoopDependenceAnalysis::AnalyzePair, what's going to happen if we >> have something like this >> >> for (i = 0; i < n; i++) >> for (j = 0; j < n; j++) >> A[i][j]++; >> >> versus >> >> for (i = 0; i < n; i++) >> for (j = 0; j < n; j++) >> A[j][i]++;> I think this can be fixed by ordering the subscripts on the loop > nesting. This should be doable, since the add recurrences remember > the loop whose induction variable they're representing.Yes, we need to record the results of the individual subscript tests in a vector indexed by the loop-nesting depth. The order of the subscripts is irrelevant.> Secondly, I think we can specialize on dealing only with linear add > recurrences and constants (symbolic or actual constants) at this > point. Since LLVM represents indices like > > for (i = 0; i < n; i++) > for (j = 0; j < n; j++) > a[i + j]++ > > as quite nicely as add recurrences of add recurrences we should be > able to get quite far along with this approach.Yes, this will be fine. Indeed, I don't think we'll ever be able to do more.>> And what's going to happen here? >> >> for (i = 0; i < n; i++) >> A[i+1][i+2] = A[i][i]; > > I left out handling coupled subscripts at this point because ignoring > them only results in more conservative safer inferences about > dependence. More sophisticated analysis like this can be added > incrementally, I think.Yes, though my point was that you are currently recording two different results, when there's actually only one loop level.>> And here? >> >> for (i = 0; i < n; i++) >> for (j = 0; j < n; j++) >> A[i] = A[i] + B[i][j]; > > I'm not too clear about this. Assuming A and B point to two different > locations, there seems to be only one loop independent dependence -- > from A[i] to A[i].Again, I'm trying to point out that the number and order of subscripts has nothing to do with the loop levels. What we want to find here is that there are 3 dependences: 1) a loop-carried anti dependence from the load of A to the store, with a direction vector of [= S|<] 2) a loop-carried flow dependence from the store to the load of A, with direction vector [= S|0] 3) a loop-carried output dependence from the store to the store, with direction vector [= S|0] (see Section 8.2.3 in Allen & Kennedy)> I think the following would be a good start at modelling dependencies > between two Values: > > class Dependence { > public: > struct Level { > enum { LT = 1, EQ = 2, GT = 4, ALL = 7 } direction; > bool scalar; > const SCEV *distance; // NULL implies no distance available > const Loop *loop; > };What is the loop value used for? Seems redundant, in that we can get to the statements given the source and destination pointers, and from there to the loop nest. The relative position in the loop nest is given by the index of the levels vector.> enum Kind { Flow, Anti, Output, Input }; > > Kind getKind() const { > return kind; > } > > const Value *getSource() const { > return source; > } > > const Value *getDestination() const { > return destination; > } > > typedef SmallVector<const Level *, 4>::const_iterator const_iterator; > const_iterator begin() const { return levels.begin(); } > const_iterator end() const { return levels.end(); } > > private: > Value *source, *destination; > Kind kind; > SmallVector<const Level *, 4> levels;I'd malloc an ordinary vector of the appropriate length, since we know the length at allocation time.> friend class LoopDependenceAnalysis; > }; > > More methods, like `isConsistent` and `isConfused` can be added easily > to this interface, since the raw data required to compute them is > available.You'll also need flags for loop-independent, consistent, and confused. I thought earlier that the latter two could be computed, but I was wrong. Can't compute them when the source or destination isn't in the loop. When a dependence is completely confused, we needn't represent the direction vector (what you call levels), so we can perhaps save a bit of storage in that common case.> Given two Values, then, we can first break them up into their > respective indices. We can then order them based on their loop > nesting. To actually compute the dependence, we can traverse down the > subscript lists of the two Values and keep checking individual > subscript pairs.Sort of. I would break them up into two lists if subscripts. Don't sort these lists! Instead, find the common nesting depth for the 2 statements and allocate distance vectors of the appropriate length. Then, as we get results from testing individual subscript pairs, we record them in the direction vector at the appropriate level. Initialize each level in the direction vector with ALL. Set the Scalar flag for each level. Imagine we're starting simple, with no attempt to take advantage of coupled subscripts. Examine each pair of subscripts in sequence. If a pair is ZIV, we test hoping to prove independence. If we can't, we move on to the next subscript. If a pair is MIV, we ignore it and move on to the next subscript. If a pair is SIV, we check to see if it's something we can handle (strongSIV, weakzeroSIV, ...). If not, skip to next subscript. If we can handle it, compute direction (if StrongSIV, compute distance). Find the appropriate level based on the index variable, turn off the Scalar flag, and intersect the new direction with whatever is already stored there. If the result is NONE, then there's no dependence and we're done. If we have a distance, set it (if a distance already exists, compare to see if they are the same. If not, then no dependence exists, and we're done). Probably don't bother to record symbolic distances. Move on to the next subscript. That's the general idea. I'll spend some time over the next week or so trying to get it written up in more detail on my website: https://sites.google.com/site/parallelizationforllvm/dependence-test Does this make sense. If not, please get back to me.> I'm not sure on how to deal with cases such as these: > > for (i = 0; i < n; i++) { > for (j = 0; j < n; j++) { > t = a[i][j]; > a[i + 1][j + 1] = t; > } > for (j = 0; j < n; j++) { > t = a[i][j]; > a[i + 1][j + 1] = t; > } > } > > My gut feeling is that we should stop checking individual subscripts > when the loop nesting have diverged. But don't know how to concretely > represent such dependencies.We can compute dependence for imperfectly nested loops like this; indeed, we must if we want to try and parallelize the outer loop. But there's also the possibility of fusing the two inner loops. If we want to do that (and we often do), then we need to look for dependences between them. Generally, dependences between 2 statements only have to do with the common loops. So dependences between these two inner loops should pay attention to subscripts that mention "i". So we'd say that values flow from store in the first loop to load in the second loop, with a distance vector [1|<] However, in this special case, where we have two loops next to each other with identical loop bounds, we'd like to fuse the loops, if it's legal. Often very useful as a way to save loads and cache misses. Now someother part of the compiler would notice the potential for fusing (the adjacent loops with identical headers), but it would rely on the dependence analyzer to make sure fusion was legal, i.e., that no dependences would be violated. To do so, we need to pretend that the fusion has already taken place, and check subscripts for both the i and j loops. Does this make sense? I'd arrange for it by providing an extra entry point with an extra parameter specifying how many loop levels to check.> I'm assuming in cases such as these: > > for (i = 0; i < n; i++) > for (j = 0; j < n; j++) > a[i][0] = a[j][i] > > we can have a dependence edge [S U | 0].Scalar directions occur when a loop level isn't mentioned in any subscript. For example, for (i = 0; i < n; i++) for (j = 0; j < n; j++) a[i + 1] = a[i] + 1; Here we see two dependences 1) a loop-carried flow dependence from the store to A and the load, with dv of [1 S|0] 2) a loop-carried output dependence from the store to the store, with dv of [0 S|0] I don't think there's an anti dependence in this case. Your example is more complicated. Let's look at it in detail for (i = 0; i < n; i++) for (j = 0; j < n; j++) a[i][0] = a[j][i] Comparing the store against itself, we find a loop-carried output dependence with dv of [0 S|0]. Do you see how it happens? We have two stores with subscripts [i][0] and [i][0]. The j loop isn't mentioned, so its level is S. The i loop is mentioned in the first subscript, so that level is 0 (or =). And since we're talking about the same instruction, it can't be a loop-independent dependence. Comparing the load against the store, we see [j][i] and [i][0]. The subscripts are coupled, with an MIV and an SIV. Looking at the SIV, we see that there can only be a dependence when i = 0. Propagating that fact into the other subscript pair, we find that the dependence can only exist when j = i = 0. So there can't be a loop-carried dependence, so no flow dependence. On the other hand, we can have an inconsistent loop-independent anti dependence from the load to the store, with dv [0 0|<] Preston
Maybe Matching Threads
- [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
- [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
- [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
- [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch
- [LLVMdev] SIV tests in LoopDependence Analysis, Sanjoy's patch