Chris Lattner via llvm-dev
2019-Aug-01 06:23 UTC
[llvm-dev] [RFC] A new multidimensional array indexing intrinsic
On Jul 29, 2019, at 1:30 PM, Michael Kruse <llvmdev at meinersbur.de> wrote:

>> Have you been following what is happening on the MLIR side of the world? It directly supports multi-dimensional arrays with both value and buffer semantics (e.g. load, store, and DMA operations). It is specifically focused on solving these sorts of problems, and is the current proposed direction for the flang compiler IR (as well as a bunch of HPC applications that haven’t gone public yet). Even if it isn’t directly relevant to this work, it is worth checking out to see if some of the ideas can be borrowed.
>
> I have been following MLIR and even pointed the flang team towards it
> since this kind of access is more important in Fortran than in C/C++.

Oh cool.

> IMHO the issue is representation, since LLVM-IR does not have the
> primitives that MLIR has. Specifically, there is no MemRef type that --
> in contrast to LLVM-IR pointer types -- is multi-dimensional, and --
> in contrast to LLVM-IR array types -- can have a dependent and dynamic shape.
>
> Adding a MemRef type would go quite deep into LLVM-IR
> fundamentals. Do you think it would be worth it?

That’s the challenge, and no, I don’t think that will make sense in LLVM IR. Similarly, I’m very concerned about the other high-level abstraction concepts being proposed for LLVM IR - this muddles the concerns and makes the system more complicated and less predictable, and it isn’t clear that the end result of this work will ever be “truly great”.

LLVM doesn’t have the same abstractions as MLIR in multiple ways: it doesn’t have the other things you need to do higher-level loop transformations (e.g. affine abstractions or a layout-independent memory buffer). Without those, it is unclear whether we’ll get much benefit in practice from (e.g.) a multidimensional index. You can’t even really do SoA-to-AoS transformations without truly heroic work (e.g. like my PhD thesis :).

The places I’ve seen higher-level transformations be successful in the LLVM context are in places like Polly (and MLIR), which have higher-level abstractions, or in higher-level IRs that lower to LLVM (e.g. XLA, SIL, and many, many other domain-specific examples).

This is why I ask whether it makes sense to add this to LLVM IR: if you want HPC-style loop transformations, I don’t think that LLVM IR itself will ever be great, even with this. This might make some narrow set of cases slightly better, but it is far from a solution, and isn’t on a path to a real solution.

That said, this is just my opinion - I have no data to back it up. Adding ‘experimental’ intrinsics is cheap and easy, so if you think this direction has promise, I’d recommend starting with that, building out the optimizers you expect to work and measuring them in practice.

-Chris
Michael Kruse via llvm-dev
2019-Aug-02 15:57 UTC
[llvm-dev] [RFC] A new multidimensional array indexing intrinsic
On Thu, Aug 1, 2019 at 01:23, Chris Lattner <clattner at nondot.org> wrote:

>> IMHO the issue is representation, since LLVM-IR does not have the
>> primitives that MLIR has. Specifically, there is no MemRef type that --
>> in contrast to LLVM-IR pointer types -- is multi-dimensional, and --
>> in contrast to LLVM-IR array types -- can have a dependent and dynamic shape.
>>
>> Adding a MemRef type would go quite deep into LLVM-IR
>> fundamentals. Do you think it would be worth it?
>
> That’s the challenge, and no, I don’t think that will make sense in LLVM IR. Similarly, I’m very concerned about the other high-level abstraction concepts being proposed for LLVM IR - this muddles the concerns and makes the system more complicated and less predictable, and it isn’t clear that the end result of this work will ever be “truly great”.
>
> LLVM doesn’t have the same abstractions as MLIR in multiple ways: it doesn’t have the other things you need to do higher-level loop transformations (e.g. affine abstractions or a layout-independent memory buffer). Without those, it is unclear whether we’ll get much benefit in practice from (e.g.) a multidimensional index. You can’t even really do SoA-to-AoS transformations without truly heroic work (e.g. like my PhD thesis :). The places I’ve seen higher-level transformations be successful in the LLVM context are in places like Polly (and MLIR), which have higher-level abstractions, or in higher-level IRs that lower to LLVM (e.g. XLA, SIL, and many, many other domain-specific examples).
>
> This is why I ask whether it makes sense to add this to LLVM IR: if you want HPC-style loop transformations, I don’t think that LLVM IR itself will ever be great, even with this. This might make some narrow set of cases slightly better, but it is far from a solution, and isn’t on a path to a real solution.

I agree that memory layout transformations are heroic to do in LLVM-IR. This already follows from the C/C++ linking model, where the compiler cannot see/modify all potential accesses to a data structure.

However, I think loop transformations are something we can do reasonably (as Polly shows), if necessary guarded by runtime conditions to e.g. rule out aliasing, out-of-bounds accesses, and integer overflow. The required lowering of multi-dimensional accesses on dynamically-sized arrays to one-dimensional ones for GEP loses crucial information. At the moment we have ScalarEvolution::delinearize, which tries to recover this information but unfortunately is not very robust. Carrying this information from the front-end into the IR is a much more reliable way, especially if the source language already has the semantics.

> That said, this is just my opinion - I have no data to back it up. Adding ‘experimental’ intrinsics is cheap and easy, so if you think this direction has promise, I’d recommend starting with that, building out the optimizers you expect to work and measuring them in practice.

The other feedback in this thread was mostly against using an intrinsic. Would you prefer starting with an intrinsic, or would the other suggested approaches be fine as well?

Michael
Chris Lattner via llvm-dev
2019-Aug-02 18:47 UTC
[llvm-dev] [RFC] A new multidimensional array indexing intrinsic
On Aug 2, 2019, at 8:57 AM, Michael Kruse <llvmdev at meinersbur.de> wrote:

>> This is why I ask whether it makes sense to add this to LLVM IR: if you want HPC-style loop transformations, I don’t think that LLVM IR itself will ever be great, even with this. This might make some narrow set of cases slightly better, but it is far from a solution, and isn’t on a path to a real solution.
>
> I agree that memory layout transformations are heroic to do in
> LLVM-IR. This already follows from the C/C++ linking model, where the
> compiler cannot see/modify all potential accesses to a data structure.

Right.

> However, I think loop transformations are something we can do
> reasonably (as Polly shows), if necessary guarded by runtime conditions
> to e.g. rule out aliasing, out-of-bounds accesses, and integer
> overflow. The required lowering of multi-dimensional accesses on
> dynamically-sized arrays to one-dimensional ones for GEP loses crucial
> information. At the moment we have ScalarEvolution::delinearize, which
> tries to recover this information but unfortunately is not very
> robust. Carrying this information from the front-end into the IR is a
> much more reliable way, especially if the source language already has
> the semantics.

Are you interested in the C family, or another language (e.g. Fortran) that has multidimensional arrays as a first-class concept? I can see how this is useful in the latter case, but in the former case you’d just be changing where you do the pattern matching, right?

>> That said, this is just my opinion - I have no data to back it up. Adding ‘experimental’ intrinsics is cheap and easy, so if you think this direction has promise, I’d recommend starting with that, building out the optimizers you expect to work and measuring them in practice.
>
> The other feedback in this thread was mostly against using an
> intrinsic. Would you prefer starting with an intrinsic, or would the
> other suggested approaches be fine as well?

Which other approach are you referring to? I’d pretty strongly prefer *not* to add an instruction for this. It is generally better to start things out as experimental intrinsics, get experience with them, and then, if they make sense, promote them to instructions. Was there another alternative?

-Chris