Hal Finkel
2012-Oct-02 05:37 UTC
[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)
On Mon, 01 Oct 2012 21:26:54 -0700 Chris Lattner <clattner at apple.com> wrote:

> On Oct 1, 2012, at 6:16 PM, greened at obbligato.org wrote:
>
> > Sanjoy Das <sanjoy at playingwithpointers.com> writes:
> >
> >> In short, I propose an intrinsic-based approach which hinges on the
> >> concept of a "parallel map". The immediate effect of using
> >> intrinsics is that we no longer have to worry about missing
> >> metadata. Moreover, we are still free to lower the intrinsics in
> >> a variety of ways -- including vectorizing them or lowering them
> >> to calls to an actual OpenMP backend.
> >
> > I'll re-ask here since this is in its own thread.
> >
> > Why can't we just make ordinary function calls to runtime routines?
>
> I agree. I can't imagine any practical way that a metadata-based
> approach could be preserved by optimizers.

Regarding the metadata approach, it depends on what you mean by preserved. The trick is to make sure that transformations that don't understand the metadata can't cause miscompiles. The specific scheme that I proposed used a combination of procedurization and cross-referencing metadata, such that invalidated parallel metadata can be detected and the entire enclosing parallel region can be dropped.

The proposal from Intel, which uses intrinsics more heavily, has other advantages, but will require more modifications to existing passes to realize its potential optimization benefits.

-Hal

--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory
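The detect-and-drop idea in the reply above can be illustrated with a small toy model. This is only a sketch under stated assumptions: plain Python dicts stand in for IR metadata, the names `valid_parallel_regions`, `regions`, and `loops` are hypothetical, and nothing here reflects the actual proposal's encoding or any LLVM API. It assumes each parallel region lists its loops and each loop points back at its region, so a transformation that drops or rewrites either side breaks the cross-reference.

```python
# Toy model only: plain dicts stand in for IR metadata; nothing here is LLVM API.

def valid_parallel_regions(regions, loops):
    """Return the names of regions whose cross-references are all intact.

    regions: region name -> list of loop names the region claims to contain
    loops:   loop name   -> region name the loop points back to, or None if
             some transformation dropped the loop's metadata
    """
    valid = []
    for region, members in regions.items():
        # A region survives only if every member loop still exists and still
        # points back at it; otherwise the whole region is dropped.
        if members and all(loops.get(loop) == region for loop in members):
            valid.append(region)
    return valid


regions = {"r0": ["loop_a", "loop_b"], "r1": ["loop_c"]}
loops = {"loop_a": "r0", "loop_b": None, "loop_c": "r1"}  # loop_b lost its metadata

surviving = valid_parallel_regions(regions, loops)
print(surviving)  # ['r1']
```

In this model a region with any broken link is discarded as a whole, which matches the stated goal: a pass that does not understand the metadata can cost parallelism, but it cannot cause a miscompile.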
Chris Lattner
2012-Oct-02 05:56 UTC
[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)
On Oct 1, 2012, at 10:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:

> On Mon, 01 Oct 2012 21:26:54 -0700
> Chris Lattner <clattner at apple.com> wrote:
>
>> On Oct 1, 2012, at 6:16 PM, greened at obbligato.org wrote:
>>
>>> Sanjoy Das <sanjoy at playingwithpointers.com> writes:
>>>
>>>> In short, I propose an intrinsic-based approach which hinges on the
>>>> concept of a "parallel map". The immediate effect of using
>>>> intrinsics is that we no longer have to worry about missing
>>>> metadata. Moreover, we are still free to lower the intrinsics in
>>>> a variety of ways -- including vectorizing them or lowering them
>>>> to calls to an actual OpenMP backend.
>>>
>>> I'll re-ask here since this is in its own thread.
>>>
>>> Why can't we just make ordinary function calls to runtime routines?
>>
>> I agree. I can't imagine any practical way that a metadata-based
>> approach could be preserved by optimizers.
>
> Regarding the metadata approach, it depends on what you mean by
> preserved. The trick is to make sure that transformations that don't
> understand the metadata can't cause miscompiles. The specific scheme
> that I proposed used a combination of procedurization and
> cross-referencing metadata such that invalidated parallel metadata can
> be detected and the entire enclosing parallel region can be dropped.
>
> The proposal from Intel, which more-heavily uses intrinsics, has other
> advantages, but will require more modifications to existing passes to
> realize its potential optimization benefits.

My comment was mostly in response to the Intel proposal, which effectively translates OpenMP pragmas directly into LLVM intrinsics + metadata. I can't imagine a way to make this work *correctly* without massive changes to the optimizer.

-Chris
Hal Finkel
2012-Oct-03 02:29 UTC
[LLVMdev] [RFC] Parallelization metadata and intrinsics in LLVM (for OpenMP, etc.)
On Mon, 01 Oct 2012 22:56:50 -0700 Chris Lattner <clattner at apple.com> wrote:

> On Oct 1, 2012, at 10:37 PM, Hal Finkel <hfinkel at anl.gov> wrote:
>
> > On Mon, 01 Oct 2012 21:26:54 -0700
> > Chris Lattner <clattner at apple.com> wrote:
> >
> >> On Oct 1, 2012, at 6:16 PM, greened at obbligato.org wrote:
> >>
> >>> Sanjoy Das <sanjoy at playingwithpointers.com> writes:
> >>>
> >>>> In short, I propose an intrinsic-based approach which hinges on
> >>>> the concept of a "parallel map". The immediate effect of using
> >>>> intrinsics is that we no longer have to worry about missing
> >>>> metadata. Moreover, we are still free to lower the intrinsics in
> >>>> a variety of ways -- including vectorizing them or lowering them
> >>>> to calls to an actual OpenMP backend.
> >>>
> >>> I'll re-ask here since this is in its own thread.
> >>>
> >>> Why can't we just make ordinary function calls to runtime
> >>> routines?
> >>
> >> I agree. I can't imagine any practical way that a metadata-based
> >> approach could be preserved by optimizers.
> >
> > Regarding the metadata approach, it depends on what you mean by
> > preserved. The trick is to make sure that transformations that don't
> > understand the metadata can't cause miscompiles. The specific scheme
> > that I proposed used a combination of procedurization and
> > cross-referencing metadata such that invalidated parallel metadata
> > can be detected and the entire enclosing parallel region can be
> > dropped.
> >
> > The proposal from Intel, which more-heavily uses intrinsics, has
> > other advantages, but will require more modifications to existing
> > passes to realize its potential optimization benefits.
>
> My comment was mostly in response to the Intel proposal, which
> effectively translates OpenMP pragmas directly into llvm intrinsics +
> metadata. I can't imagine a way to make this work *correctly*
> without massive changes to the optimizer.

Also, I should mention that Sanjoy's recommendation, which is to move the parallelization state into an analysis pass, might make sense here. If not all intermediate passes preserve the analysis, then the state will be lost, and no parallelization will occur. In the context of OpenMP, where parallelization is essentially optional, I think this should be fine.

In any case, if we mark the intrinsics as having unknown side effects, then they'll serve as barriers for code motion. I *think* that this would also inhibit loop restructuring (or could be made to do so), so loop annotations could be kept properly associated with the intended code, but this would need to be checked.

-Hal
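The "state is lost, so no parallelization occurs" behavior described above can be sketched with a toy pipeline. This is a minimal illustration under stated assumptions, not LLVM's pass manager: `run_pipeline`, `lower`, the pass names, and the per-pass "preserves" flag are all hypothetical stand-ins for a pass declaring that it keeps an analysis valid.

```python
# Toy model only: a hypothetical pass pipeline, not LLVM's pass manager.

def run_pipeline(passes, parallel_info):
    """Run passes in order; return the parallel info that survives, or None.

    passes: list of (name, preserves_parallel_info) pairs
    parallel_info: any object describing the parallel regions
    """
    for _name, preserves in passes:
        if parallel_info is not None and not preserves:
            # The pass did not promise to keep the analysis valid: discard it.
            parallel_info = None
    return parallel_info


def lower(parallel_info):
    # With no surviving state, the code is simply emitted serially.
    return "parallel" if parallel_info is not None else "serial"


info = {"loop_a": "parallel for"}
aware = [("pass_a", True), ("pass_b", True)]
unaware = [("pass_a", True), ("pass_x", False), ("pass_b", True)]

print(lower(run_pipeline(aware, info)))    # parallel
print(lower(run_pipeline(unaware, info)))  # serial
```

The fallback is always the serial form, which is acceptable only because, as noted above, parallelization under OpenMP is essentially optional.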