thr3ads.net - llvm dev - [llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy [Aug 2017]

If this information is useful, please help other people find it:
Share via:

Daniel Neilson via llvm-dev

2017-Aug-17 15:32 UTC

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

Hi all,
 We somewhat recently created/updated element-wise unordered-atomic versions of
the memcpy, memmove, and memset memory intrinsics:
Memcpy: https://reviews.llvm.org/rL305558
Memmove: https://reviews.llvm.org/rL307796
Memset: https://reviews.llvm.org/rL307854

 These intrinsics are semantically similar to the regular versions. The main
difference is that the underlying operation is performed entirely with unordered
atomic loads & stores of a given size.

 Our ultimate purpose is to enable exploitation of the memory intrinsic idioms
and optimizations in languages like Java that make extensive use of
unordered-atomic loads & stores. To this end, we will eventually have to add
these new intrinsics to the instruction class hierarchy, and teach passes how to
deal with them; talking about how to do this is the purpose of this RFC.

 We have started adding canary tests to some passes, and will continue to do so
in preparation for adding the element atomic intrinsics to the instruction class
hierarchy. These canary tests are simply placeholders that show that the pass in
question currently does nothing with the new element atomic memory instrinics,
and could/should generally start failing once the unordered-atomic memory
intrinsics are added to the instruction class hierarchy — telling us which
passes need to be updated. For example: https://reviews.llvm.org/rL308247

 For adding the new intrinsics to the instruction class hierarchy, the plan will
be to add them one at a time — add the intrinsic to the hierarchy, and flush out
all problems before adding the next. We’re thinking that it would be best to
start with memset, then memcpy, then memmove; memset is kind of it’s own thing,
and there are some passes that change memcpy/memmove into a memset so it would
be beneficial to have that in place before memcpy/memmove, and there are also
passes that change memmove into memcpy but I haven’t found any that go the other
way (I can’t imagine why you’d want to) so it would be good to have memcpy
squared away before doing memmove.

 There are a few options that we have thought of for adding the new memory
intrinsics to the instruction class hierarchy. We have a preference for the
first option, but would appreciate thoughts/opinions/discussion on the best way
to insert the element atomic intrinsics into the hierarchy. For background, the
relevant portion of the current hierarchy looks like:

MemIntrinsic
* MemSetInst
* MemTransferInst
 ** MemCpyInst
 ** MemMoveInst

Option 1)
-------------
 Do not add any new classes to the hierarchy. Instead, teach each class to
recognize the unordered-atomic variant as belonging to the class (ex: element
unordered-atomic memcpy is recognized as a MemCpyInst, MemTransferInst, and a
MemIntrinsic), and add a query method like ‘isUnorderedAtomic’ that will tell
you whether the underlying intrinsic is one of the element atomic variants.
Alternatively, the method could be a more generic ‘getOrdering()’ that would
return an AtomicOrdering; though a memory intrinsic that is required to be
implemented with ordered-atomic instructions strikes me as unlikely to ever be
desired.

 There is precedent for encoding the intrinsic property as a query method both
within the current memory intrinsic hierarchy itself (the isVolatile() method)
and the load/store instruction classes.

 The way forward with this option would be to:
  (i) add the method to MemInstrinsic (returning false); then
  (ii) update all passes, one at a time, to bail-out if isUnorderedAtomic() is
true; then
  (iii) add an intrinsic to the class hierarchy; then
  (iv) update passes to exploit isUnorderedAtomic() one at a time
  (v) repeat (iii) & (iv) for each unordered-atomic intrinsic

Pros:
 Significant reduction in copy/paste code over other options; passes that use
visitors would require some refactoring or some copy/paste to mirror behaviour
if the unordered-atomic intrinsics are separate classes.
 Lends itself to a clear & simple incremental path-forward; each pass can be
updated independently of the others which will make reviews more achievable due
to not requiring people with working knowledge much beyond the single pass.
 Easy to leverage new work on memory intrinsic optimization for the
unordered-atomic variants; even near-zero cost for passes that won’t be affected
by unordered-atomicity.
Cons:
 One more attribute to check when implementing a pass that handles a memory
intrinsic; could lead to some unexpected bugs where we change an
unordered-atomic intrinsic into a regular intrinsic.

 The con here is pretty concerning. We risk making it easy for a developer to
forget about the element-atomic variants when introducing an optimization that
converts a memmove/memcpy/memset. Such an optimization could, conceivably,
result in an unordered-atomic intrinsic being erroneously converted to drop its
atomicity; such a bug would only exhibit as a race condition in the resulting
program, and could be an absolute beast to debug. The only way to prevent such
bugs would be very careful code reviews.

Option 2)
-------------
 Add separate classes to the hierarchy for each intrinsic. There are a few
different ways that this could be organized (note: UA* = unordered-atomic
version of intrinsic, to save typing):

a) Completely disjoint hierarchy
UAMemIntrinsic
* UAMemSetInst
* UAMemTransferInst
 ** UAMemCpyInst
 ** UAMemMoveInst
MemIntrinsic
* MemSetInst
* MemTransferInst
 ** MemCpyInst
 ** MemMoveInst

 This organization has the benefit of not interacting/interfering with the
regular memory intrinsics in any way. That’s also a big downside; most/many
passes are going to be able to function on unordered-atomic memory intrinsics
the same as they do for the regular memory intrinsics. In some cases this will
just mean adding another few isa<> checks at strategic positions, but in
many cases this will mean code clones and/or clever refactoring (ex: for passes
that incorporate visitors).
 Being completely disjoint also makes this option super easy to add in
incrementally; no pass will know what to do with an unordered-atomic intrinsic
until explicitly taught. Of course that’s also a downside as it means that
optimizing the unordered-atomic memory intrinsics always has to be explicitly
added — nothing comes for free from the regular memory intrinsic work.

b) Disjoint sub-hierarchy of MemIntrinsic
MemIntrinsic
* MemSetInst
* UAMemSetInst
* MemTransferInst
 ** MemCpyInst
 ** MemMoveInst
* UAMemTransferInst
 ** UAMemCpyInst
 ** UAMemMoveInst

 Similar to the (2a) option in a lot of ways, but with some added complications
for even the regular memory intrinsics. The hierarchy immediately below
MemIntrinsic becomes much wider, so it is no longer sufficient to check
isa<MemTransferInst> to narrow down memcpy/memmove vs memset. Also,
similar to (2a), there’s the potential for a lot of code clones in passes that
incorporate visitors.

 The way forward with this option would be similar to 2a with the exception that
we’d have to go through the whole codebase to make sure that "if (!
isa<MemTransferInst>) then Inst is memset” and equivalent checks resolve
correctly.

c) Child classes of the regular intrinsics
MemIntrinsic
* MemSetInst
 ** UAMemSetInst
* MemTransferInst
 ** MemCpyInst
  *** UAMemCpyInst
 ** MemMoveInst
   *** UAMemMoveInst

 There’s not a tonne of difference between this option and option (1), above.
All of the existing visitors would recognize unordered-atomic and regular memory
intrinsics, so there wouldn’t need to be any code clones to support the new
intrinsics.
 However, passes that care about the atomicity of the intrinsic would have to
add isa<> checks to ensure that they’re not acting on the unordered-atomic
memory intrinsic when they don’t want to be. It would also be trickier than
option (1) to check, at the level of MemIntrinsic/MemTransferInst, whether the
intrinsic we have is unordered-atomic; we’d have to explicitly check all of the
isa<>’s rather than querying the isUnorderedAtomic() property. We could
add the isUnorderedAtomic() query method to this option, but then it’s not much
different from option (1).

 The way forward for this option would be very similar to that for option 1.

d) Direct parent classes of the regular intrinsics
MemIntrinsic
* UAMemSetInst
 ** MemSetInst
* MemTransferInst
 ** UAMemCpyInst
  *** MemCpyInst
 ** UAMemMoveInst
  *** MemMoveInst

  This is an idea that came up in our discussions. A benefit is that it leaves
the existing regular memory intrinsics as leaf classes in the hierarchy; so, if
we currently check for, say, isa<MemCpyInst> then we don’t have to change
that code because it can’t be unordered-atomic and have that test be true. The
benefit is that it would make it harder for a pass to accidentally convert an
unordered-atomic intrinsic into a non-atomic intrinsic. It does complicate code
that checks whether an intrinsic is a memset by checking for not
isa<MemTransferInst>, and other code like that.

e) Multiple inheritance

 This is a variant on options 2b to 2d. To make it possible to easily use
isa<> to query for whether or not a memory intrinsic is unordered-atomic
or not we could add an interface-like class that all unordered-atomic memory
intrinsic classes would multiply inherit from. I’m not sure if there is
precedence for this sort of thing in the instruction class hierarchy, though; I
haven’t reviewed the entire hierarchy, but all of the parts that I have looked
at appear to be single inheritance.

==
Note about the align parameter attribute.

 When creating the unordered-atomic memory intrinsics we decided to use the
align attribute on pointer parameters, rather than supplying a single alignment
parameter. This adheres to the way that LLVM wants to move going forward, but is
different from the way that the memcpy, memmove, and memset intrinsics are
currently implemented. This could, potentially, be a complication when
integrating the unordered-atomic memory intrinsics into the class hierarchy. 
Ideally, it would be good to resurrect the 2.5 yr old work to remove the
alignment parameter from memcpy, memmove, and memset:
commit 8b170f7f290843dc3849eaa75b6f74a87a7a2de6
Author: Pete Cooper <peter_cooper at apple.com<mailto:peter_cooper at
apple.com>>
Date:   Wed Nov 18 22:17:24 2015 +0000

That was reverted due to bot failures, but the history on exactly what those
failures are appears, to me, to be lost:

commit 6d024c616a2a4959f8dfe5c64d27f89b394cf042
Author: Pete Cooper <peter_cooper at apple.com<mailto:peter_cooper at
apple.com>>
Date:   Thu Nov 19 05:56:52 2015 +0000

 Baring that, there is still a way forward for integrating the unordered-atomic
memory intrinsics into the class hierarchy.  The getAlignment() method of
MemIntrinsic would have to return the minimum of the pointer-arg alignment
attributes for unordered-atomic intrinsics, and the setAlignment() method would
have to set both parameter attributes to the given value.

==
 Thoughts? Opinions?

Thanks,
 Daniel

---
Daniel Neilson, Ph.D.
Azul Systems
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170817/396037f1/attachment.html>

Philip Reames via llvm-dev

2017-Aug-17 17:56 UTC

head link

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

I'll comment that Daniel and I have discussed this pretty extensively 
offline.  I am now convinced that option 1 is our best choice out of a 
set of not entirely great choices.  If no one has any comments, that's 
the approach we'll end up pursuing, but we wanted to give the community 
a chance to chime in here.  Maybe someone can come up with a better API 
design than we've managed. :)

Philip


On 08/17/2017 08:32 AM, Daniel Neilson via llvm-dev
wrote:>
> Hi all,
>  We somewhat recently created/updated element-wise unordered-atomic 
> versions of the memcpy, memmove, and memset memory intrinsics:
> Memcpy: https://reviews.llvm.org/rL305558
> Memmove: https://reviews.llvm.org/rL307796
> Memset: https://reviews.llvm.org/rL307854
>
>  These intrinsics are semantically similar to the regular versions. 
> The main difference is that the underlying operation is performed 
> entirely with unordered atomic loads & stores of a given size.
>
>  Our ultimate purpose is to enable exploitation of the memory 
> intrinsic idioms and optimizations in languages like Java that make 
> extensive use of unordered-atomic loads & stores. To this end, we will 
> eventually have to add these new intrinsics to the instruction class 
> hierarchy, and teach passes how to deal with them; talking about how 
> to do this is the purpose of this RFC.
>
>  We have started adding canary tests to some passes, and will continue 
> to do so in preparation for adding the element atomic intrinsics to 
> the instruction class hierarchy. These canary tests are simply 
> placeholders that show that the pass in question currently does 
> nothing with the new element atomic memory instrinics, and 
> could/should generally start failing once the unordered-atomic memory 
> intrinsics are added to the instruction class hierarchy — telling us 
> which passes need to be updated. For example: 
> https://reviews.llvm.org/rL308247
>
>  For adding the new intrinsics to the instruction class hierarchy, the 
> plan will be to add them one at a time — add the intrinsic to the 
> hierarchy, and flush out all problems before adding the next. We’re 
> thinking that it would be best to start with memset, then memcpy, then 
> memmove; memset is kind of it’s own thing, and there are some passes 
> that change memcpy/memmove into a memset so it would be beneficial to 
> have that in place before memcpy/memmove, and there are also passes 
> that change memmove into memcpy but I haven’t found any that go the 
> other way (I can’t imagine why you’d want to) so it would be good to 
> have memcpy squared away before doing memmove.
>
>  There are a few options that we have thought of for adding the new 
> memory intrinsics to the instruction class hierarchy. We have a 
> preference for the first option, but would appreciate 
> thoughts/opinions/discussion on the best way to insert the element 
> atomic intrinsics into the hierarchy. For background, the relevant 
> portion of the current hierarchy looks like:
>
> MemIntrinsic
> * MemSetInst
> * MemTransferInst
>  ** MemCpyInst
>  ** MemMoveInst
>
> Option 1)
> -------------
>  Do not add any new classes to the hierarchy. Instead, teach each 
> class to recognize the unordered-atomic variant as belonging to the 
> class (ex: element unordered-atomic memcpy is recognized as a 
> MemCpyInst, MemTransferInst, and a MemIntrinsic), and add a query 
> method like ‘isUnorderedAtomic’ that will tell you whether the 
> underlying intrinsic is one of the element atomic variants. 
> Alternatively, the method could be a more generic ‘getOrdering()’ that 
> would return an AtomicOrdering; though a memory intrinsic that is 
> required to be implemented with ordered-atomic instructions strikes me 
> as unlikely to ever be desired.
>
>  There is precedent for encoding the intrinsic property as a query 
> method both within the current memory intrinsic hierarchy itself (the 
> isVolatile() method) and the load/store instruction classes.
>
>  The way forward with this option would be to:
>   (i) add the method to MemInstrinsic (returning false); then
>   (ii) update all passes, one at a time, to bail-out if 
> isUnorderedAtomic() is true; then
>   (iii) add an intrinsic to the class hierarchy; then
>   (iv) update passes to exploit isUnorderedAtomic() one at a time
>   (v) repeat (iii) & (iv) for each unordered-atomic intrinsic
>
> Pros:
>  Significant reduction in copy/paste code over other options; passes 
> that use visitors would require some refactoring or some copy/paste to 
> mirror behaviour if the unordered-atomic intrinsics are separate classes.
>  Lends itself to a clear & simple incremental path-forward; each pass 
> can be updated independently of the others which will make reviews 
> more achievable due to not requiring people with working knowledge 
> much beyond the single pass.
>  Easy to leverage new work on memory intrinsic optimization for the 
> unordered-atomic variants; even near-zero cost for passes that won’t 
> be affected by unordered-atomicity.
> Cons:
>  One more attribute to check when implementing a pass that handles a 
> memory intrinsic; could lead to some unexpected bugs where we change 
> an unordered-atomic intrinsic into a regular intrinsic.
>
>  The con here is pretty concerning. We risk making it easy for a 
> developer to forget about the element-atomic variants when introducing 
> an optimization that converts a memmove/memcpy/memset. Such an 
> optimization could, conceivably, result in an unordered-atomic 
> intrinsic being erroneously converted to drop its atomicity; such a 
> bug would only exhibit as a race condition in the resulting program, 
> and could be an absolute beast to debug. The only way to prevent such 
> bugs would be very careful code reviews.
>
> Option 2)
> -------------
>  Add separate classes to the hierarchy for each intrinsic. There are a 
> few different ways that this could be organized (note: UA* = 
> unordered-atomic version of intrinsic, to save typing):
>
> a) Completely disjoint hierarchy
> UAMemIntrinsic
> * UAMemSetInst
> * UAMemTransferInst
>  ** UAMemCpyInst
>  ** UAMemMoveInst
> MemIntrinsic
> * MemSetInst
> * MemTransferInst
>  ** MemCpyInst
>  ** MemMoveInst
>
>  This organization has the benefit of not interacting/interfering with 
> the regular memory intrinsics in any way. That’s also a big downside; 
> most/many passes are going to be able to function on unordered-atomic 
> memory intrinsics the same as they do for the regular memory 
> intrinsics. In some cases this will just mean adding another few
isa<>
> checks at strategic positions, but in many cases this will mean code 
> clones and/or clever refactoring (ex: for passes that incorporate 
> visitors).
>  Being completely disjoint also makes this option super easy to add in 
> incrementally; no pass will know what to do with an unordered-atomic 
> intrinsic until explicitly taught. Of course that’s also a downside as 
> it means that optimizing the unordered-atomic memory intrinsics always 
> has to be explicitly added — nothing comes for free from the regular 
> memory intrinsic work.
>
> b) Disjoint sub-hierarchy of MemIntrinsic
> MemIntrinsic
> * MemSetInst
> * UAMemSetInst
> * MemTransferInst
>  ** MemCpyInst
>  ** MemMoveInst
> * UAMemTransferInst
>  ** UAMemCpyInst
>  ** UAMemMoveInst
>
>  Similar to the (2a) option in a lot of ways, but with some added 
> complications for even the regular memory intrinsics. The hierarchy 
> immediately below MemIntrinsic becomes much wider, so it is no longer 
> sufficient to check isa<MemTransferInst> to narrow down
memcpy/memmove
> vs memset. Also, similar to (2a), there’s the potential for a lot of 
> code clones in passes that incorporate visitors.
>
>  The way forward with this option would be similar to 2a with the 
> exception that we’d have to go through the whole codebase to make sure 
> that "if (! isa<MemTransferInst>) then Inst is memset” and
equivalent
> checks resolve correctly.
>
> c) Child classes of the regular intrinsics
> MemIntrinsic
> * MemSetInst
>  ** UAMemSetInst
> * MemTransferInst
>  ** MemCpyInst
>   *** UAMemCpyInst
>  ** MemMoveInst
>    *** UAMemMoveInst
>
>  There’s not a tonne of difference between this option and option (1), 
> above. All of the existing visitors would recognize unordered-atomic 
> and regular memory intrinsics, so there wouldn’t need to be any code 
> clones to support the new intrinsics.
>  However, passes that care about the atomicity of the intrinsic would 
> have to add isa<> checks to ensure that they’re not acting on the 
> unordered-atomic memory intrinsic when they don’t want to be. It would 
> also be trickier than option (1) to check, at the level of 
> MemIntrinsic/MemTransferInst, whether the intrinsic we have is 
> unordered-atomic; we’d have to explicitly check all of the isa<>’s 
> rather than querying the isUnorderedAtomic() property. We could add 
> the isUnorderedAtomic() query method to this option, but then it’s not 
> much different from option (1).
>
>  The way forward for this option would be very similar to that for 
> option 1.
>
> d) Direct parent classes of the regular intrinsics
> MemIntrinsic
> * UAMemSetInst
>  ** MemSetInst
> * MemTransferInst
>  ** UAMemCpyInst
>   *** MemCpyInst
>  ** UAMemMoveInst
>   *** MemMoveInst
>
>   This is an idea that came up in our discussions. A benefit is that 
> it leaves the existing regular memory intrinsics as leaf classes in 
> the hierarchy; so, if we currently check for, say, isa<MemCpyInst> 
> then we don’t have to change that code because it can’t be 
> unordered-atomic and have that test be true. The benefit is that it 
> would make it harder for a pass to accidentally convert an 
> unordered-atomic intrinsic into a non-atomic intrinsic. It does 
> complicate code that checks whether an intrinsic is a memset by 
> checking for not isa<MemTransferInst>, and other code like that.
>
> e) Multiple inheritance
>
>  This is a variant on options 2b to 2d. To make it possible to easily 
> use isa<> to query for whether or not a memory intrinsic is 
> unordered-atomic or not we could add an interface-like class that all 
> unordered-atomic memory intrinsic classes would multiply inherit from. 
> I’m not sure if there is precedence for this sort of thing in the 
> instruction class hierarchy, though; I haven’t reviewed the entire 
> hierarchy, but all of the parts that I have looked at appear to be 
> single inheritance.
>
> ==>
> *Note about the align parameter attribute.*
>
>  When creating the unordered-atomic memory intrinsics we decided to 
> use the align attribute on pointer parameters, rather than supplying a 
> single alignment parameter. This adheres to the way that LLVM wants to 
> move going forward, but is different from the way that the memcpy, 
> memmove, and memset intrinsics are currently implemented. This could, 
> potentially, be a complication when integrating the unordered-atomic 
> memory intrinsics into the class hierarchy.  Ideally, it would be good 
> to resurrect the 2.5 yr old work to remove the alignment parameter 
> from memcpy, memmove, and memset:
> commit 8b170f7f290843dc3849eaa75b6f74a87a7a2de6
> Author: Pete Cooper <peter_cooper at apple.com 
> <mailto:peter_cooper at apple.com>>
> Date:   Wed Nov 18 22:17:24 2015 +0000
>
> That was reverted due to bot failures, but the history on exactly what 
> those failures are appears, to me, to be lost:
>
> commit 6d024c616a2a4959f8dfe5c64d27f89b394cf042
> Author: Pete Cooper <peter_cooper at apple.com 
> <mailto:peter_cooper at apple.com>>
> Date:   Thu Nov 19 05:56:52 2015 +0000
>
>  Baring that, there is still a way forward for integrating the 
> unordered-atomic memory intrinsics into the class hierarchy.  The 
> getAlignment() method of MemIntrinsic would have to return the minimum 
> of the pointer-arg alignment attributes for unordered-atomic 
> intrinsics, and the setAlignment() method would have to set both 
> parameter attributes to the given value.
>
> ==>
>  Thoughts? Opinions?
>
> Thanks,
>  Daniel
>
> ---
> Daniel Neilson, Ph.D.
> Azul Systems
>
>
> _______________________________________________
> LLVM Developers mailing list
> llvm-dev at lists.llvm.org
> http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170817/304e6a02/attachment.html>

Sanjoy Das via llvm-dev

2017-Aug-19 20:13 UTC

head link

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

Hi Daniel,

On Thu, Aug 17, 2017 at 8:32 AM, Daniel Neilson via llvm-dev
<llvm-dev at lists.llvm.org> wrote:> Cons:
>  One more attribute to check when implementing a pass that handles a memory
> intrinsic; could lead to some unexpected bugs where we change an
> unordered-atomic intrinsic into a regular intrinsic.
>
>  The con here is pretty concerning. We risk making it easy for a developer
> to forget about the element-atomic variants when introducing an
optimization
> that converts a memmove/memcpy/memset. Such an optimization could,
> conceivably, result in an unordered-atomic intrinsic being erroneously
> converted to drop its atomicity; such a bug would only exhibit as a race
> condition in the resulting program, and could be an absolute beast to
debug.
> The only way to prevent such bugs would be very careful code reviews.
The con is compounded by the fact that since clang will not (?)
generate atomic memX instructions, it will be possible to validate a
change to LLVM across an arbitrarily large corpus of C and C++
programs and not notice this sort of bug.  In theory this isn't a
problem (since code reviews can catch bugs like this), but IMO there
is non-trivial value in avoiding a situation that allows bug in the
Java frontends that don't manifest in clang.
> Option 2)
> -------------
>  Add separate classes to the hierarchy for each intrinsic. There are a few
> different ways that this could be organized (note: UA* = unordered-atomic
> version of intrinsic, to save typing):
>
> a) Completely disjoint hierarchy
> UAMemIntrinsic
> * UAMemSetInst
> * UAMemTransferInst
>  ** UAMemCpyInst
>  ** UAMemMoveInst
> MemIntrinsic
> * MemSetInst
> * MemTransferInst
>  ** MemCpyInst
>  ** MemMoveInst
>
> e) Multiple inheritance
>
>  This is a variant on options 2b to 2d. To make it possible to easily use
> isa<> to query for whether or not a memory intrinsic is
unordered-atomic or
> not we could add an interface-like class that all unordered-atomic memory
> intrinsic classes would multiply inherit from. I’m not sure if there is
> precedence for this sort of thing in the instruction class hierarchy,
> though; I haven’t reviewed the entire hierarchy, but all of the parts that
I
> have looked at appear to be single inheritance.
There is precedent for something like this:  OverflowingBinaryOperator
will return true for both isa<Instruction> and isa<ConstantExpr>.  I
also think this combined with (a) is probably the cleanest path
forward:

AtomicMemIntrinsic
* AtomicMemSetInst
* AtomicMemTransferInst
 ** AtomicMemCpyInst
 ** AtomicMemMoveInst
PlainMemIntrinsic
* PlainMemSetInst
* PlainMemTransferInst
 ** PlainMemCpyInst
 ** PlainMemMoveInst

MemIntrinsic : PlainMemIntrinsic, AtomicMemIntrinsic
MemSetInst : PlainMemSetInst, AtomicMemSetInst
MemTransferInst : PlainMemTransferInst, AtomicMemTransferInst
MemCpyInst : PlainMemCpyInst, AtomicMemCpyInst
MemMoveInst : PlainMemMoveInst, AtomicMemMoveInst

with the existing uses of MemIntrinsic changed to PlainMemIntrinsic etc.

What do you think?

-- Sanjoy

Daniel Neilson via llvm-dev

2017-Aug-21 14:57 UTC

head link

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

Hi Sanjoy,
 Response/thoughts below...


On Aug 19, 2017, at 3:13 PM, Sanjoy Das <sanjoy at
playingwithpointers.com<mailto:sanjoy at playingwithpointers.com>>
wrote:

Hi Daniel,

On Thu, Aug 17, 2017 at 8:32 AM, Daniel Neilson via llvm-dev
<llvm-dev at lists.llvm.org<mailto:llvm-dev at lists.llvm.org>>
wrote:
Cons:
One more attribute to check when implementing a pass that handles a memory
intrinsic; could lead to some unexpected bugs where we change an
unordered-atomic intrinsic into a regular intrinsic.

The con here is pretty concerning. We risk making it easy for a developer
to forget about the element-atomic variants when introducing an optimization
that converts a memmove/memcpy/memset. Such an optimization could,
conceivably, result in an unordered-atomic intrinsic being erroneously
converted to drop its atomicity; such a bug would only exhibit as a race
condition in the resulting program, and could be an absolute beast to debug.
The only way to prevent such bugs would be very careful code reviews.

The con is compounded by the fact that since clang will not (?)
generate atomic memX instructions, it will be possible to validate a
change to LLVM across an arbitrarily large corpus of C and C++
programs and not notice this sort of bug.  In theory this isn't a
problem (since code reviews can catch bugs like this), but IMO there
is non-trivial value in avoiding a situation that allows bug in the
Java frontends that don't manifest in clang.

Option 2)
-------------
Add separate classes to the hierarchy for each intrinsic. There are a few
different ways that this could be organized (note: UA* = unordered-atomic
version of intrinsic, to save typing):

a) Completely disjoint hierarchy
UAMemIntrinsic
* UAMemSetInst
* UAMemTransferInst
** UAMemCpyInst
** UAMemMoveInst
MemIntrinsic
* MemSetInst
* MemTransferInst
** MemCpyInst
** MemMoveInst

e) Multiple inheritance

This is a variant on options 2b to 2d. To make it possible to easily use
isa<> to query for whether or not a memory intrinsic is unordered-atomic
or
not we could add an interface-like class that all unordered-atomic memory
intrinsic classes would multiply inherit from. I’m not sure if there is
precedence for this sort of thing in the instruction class hierarchy,
though; I haven’t reviewed the entire hierarchy, but all of the parts that I
have looked at appear to be single inheritance.

There is precedent for something like this:  OverflowingBinaryOperator
will return true for both isa<Instruction> and isa<ConstantExpr>.  I
also think this combined with (a) is probably the cleanest path
forward:

AtomicMemIntrinsic
* AtomicMemSetInst
* AtomicMemTransferInst
** AtomicMemCpyInst
** AtomicMemMoveInst
PlainMemIntrinsic
* PlainMemSetInst
* PlainMemTransferInst
** PlainMemCpyInst
** PlainMemMoveInst

MemIntrinsic : PlainMemIntrinsic, AtomicMemIntrinsic
MemSetInst : PlainMemSetInst, AtomicMemSetInst
MemTransferInst : PlainMemTransferInst, AtomicMemTransferInst
MemCpyInst : PlainMemCpyInst, AtomicMemCpyInst
MemMoveInst : PlainMemMoveInst, AtomicMemMoveInst

with the existing uses of MemIntrinsic changed to PlainMemIntrinsic etc.

What do you think?

It’s an interesting idea. I kind of like it — it’s an interesting mix of options
1, 2a, & 2e. An advantage over option 1 is that this also exposes an easy
way for a pass to not even have to think about the atomic memory intrinsics —
just key off of the PlainMemIntrinsic classes instead of the MemIntrinsic
classes.

We still have the same con as option 1, though I could virtually eliminate that
con by starting the work by submitting an NFC that renames the current
MemIntrinsic & derived class to PlainMemIntrinsic, etc. That would make the
path forward something like:
1) NFC to rename MemIntrinsic classes to PlainMemIntrinsic
2) Patch to add AtomicMemIntrinsic classes & MemIntrinsic classes derived
from Plain* & Atomic*
3) Individual patches to update passes to handle Atomic + Plain as desired.

 The problem that I see with that, though, is any third party stuff that uses
the MemIntrinsic classes as they are today will break between steps (1) and (2),
and then they will start doing the wrong things with the unordered-atomic memory
intrinsics if they’re presented with one — though, that last bit is no different
from option 1.

 Thinking out loud… If we go the multiple inheritance route, then what about
something like:
MemIntrinsic
* MemSetInst
* MemTransferInst
** MemCpyInst
** MemMoveInst
AtomicMemIntrinsic
* AtomicMemSetInst
* AtomicMemTransferInst
** AtomicMemCpyInst
** AtomicMemMoveInst

AnyMemIntrinsic : MemIntrinsic, AtomicMemIntrinsic
AnyMemSetInst : MemSetInst, AtomicMemSetInst
AnyMemTransferInst : MemTranferInst, AtomicMemTransferInst
… etc.

 That would leave the meanings of the current MemIntrinsic classes completely
unchanged; which should be good for 3rd parties. Updating a pass to work with
unordered-atomic memory intrinsics would just mean having it use the AnyMem*
classes instead of the Mem* classes, and then adding some isa<> checks to
do the right thing with the atomic variants. We don’t get all of the memory
intrinsic optimizing passes for free on the atomic variants, but we get the ones
we want at a reasonably low cost, and there’s no risk to the behaviour of the
existing non-atomic memory intrinsics...

-Daniel

---
Daniel Neilson, Ph.D.
Azul Systems

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
<http://lists.llvm.org/pipermail/llvm-dev/attachments/20170821/1f09c413/attachment.html>

Apparently Analagous Threads

Search for more maybe matching threads

llvm dev - Aug 2017 - [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

[llvm-dev] [RFC] Injecting new element atomic memory intrinsics into MemIntrinsic class hierarchy

Apparently Analagous Threads