thr3ads.net - llvm dev - [llvm-dev] [RFC] Clean up the way we store optional Function data [Oct 2015]

If this information is useful, please help other people find it:
Share via:

Vedant Kumar via llvm-dev

2015-Oct-10 02:20 UTC

[llvm-dev] [RFC] Clean up the way we store optional Function data

Function's have three kinds of optional data: prefix data, prologue data,
and
personalities. We don't have a consistent way of storing this data, IMO.
This
RFC discusses a new way of managing optional data that makes llvm::Function
cleaner, more consistent, and a little smaller.


What do we do currently?
=======================
Prefix and prologue data are attached to Functions via DenseMaps in
LLVMContextImpl:

    typedef DenseMap<const Function *, ReturnInst *> FunctionDataMapTy;
    FunctionDataMapTy PrefixDataMap;
    FunctionDataMapTy PrologueDataMap;

To attach prefix data to a Function, we create an orphan ReturnInst:

    RI = ReturnInst::Create(F->getContext(), PrefixData);
    PrefixDataMap[F] = RI;

Personalities are stored as ``optional`` Function operands. We actually always
allocate the space for this ``optional`` operand: there's a FIXME in the
destructor for Function about this.


What's wrong with this?
======================
1. Functions without personalities pay the space-cost for them.

2. We handle personalities differently from prefix and prologue data.

3. Suppose N functions share the same prefix or prologue data. We create N
   different ReturnInsts to point to the same thing. I think that's
wasteful.

4. These hidden ReturnInsts effectively make a Function an indirect user of
   its optional data. This relationship should be made explicit to avoid bugs.


Proposal: Introduce a new Constant, llvm::IndirectUser
=====================================================
This is my take on Duncan's proposed solution to PR24755 [1].

It involves creating a new type of Constant. A sketch:

    struct llvm::IndirectUser : llvm::Constant {
        Constant *Data;     // Operand 0
        unsigned  RefCount; // Metadata (not an operand)

        /// Get or create an IndirectUser for C.
        static IndirectUser *GetOrCreateIndirectUser(Constant *C);
        
        /// Drop a reference to C's IndirectUser.
        static void ReleaseIndirectUser(Constant *C);
    };

Here's how to attach prefix data to a Function:

    IU = IndirectUser::GetOrCreateIndirectUser(PrefixData);
    PrefixDataMap[F] = IU;

Prologue data and personality functions work the same way (though we'll need
to
add a PersonalityFnMap).

If two functions share prefix/prologue data, or personalities, they get the
same IndirectUser (with a bumped refcount).

This addresses all the points in the previous section.


Implementation details
=====================
When an IndirectUser is constructed, it adds itself to its Constant's users
list. When the IndirectUser's refcount hits 0, we can invoke
destroyConstant().
The only tricky part is handling ReplaceAllUsesWith.

There are two RAUW cases. Case 1 is easy:

    <IndirectUser 1> = { Data = A, RefCount = X }
    RAUW(A, B)

Result:

    <IndirectUser 1> = { Data = B, RefCount = X }

Case 2 is a bit more involved:

    <IndirectUser 1> = { Data = A, RefCount = X }
    <IndirectUser 2> = { Data = B, RefCount = Y }
    RAUW(A, B)

Result:

    <IndirectUser 2> = { Data = B, RefCount = X + Y }
    <IndirectUser 1> = { Data = <IndirectUser 2>, RefCount = 0 }

I've left out a sub-case where <IndirectUser 2> is already part of a
daisy
chain. There are strict asserts in place to keep the daisy-chaining sane.

I have this mostly working. There's just one outstanding issue in clang. The
PersonalityHasOnlyCXXUses() routine needs to look up all the Function objects
which refer to a particular personality function.  I plan on supporting this by
defining a ``Function::getPersonalityFnUsers(Fn)`` routine.

I'm hoping to have patches out for review by Monday. I thought I'd send
this
out now to get feedback on my approach -- plus, I didn't want to send stuff
in
out of the blue.

Let me know what you think!

vedant

[1] https://llvm.org/bugs/show_bug.cgi?id=24755

Sanjoy Das via llvm-dev

2015-Oct-10 06:07 UTC

head link

[llvm-dev] [RFC] Clean up the way we store optional Function data

I don't know how prologue and prefix data is used -- is it correct to
say that you're basically trying to give `llvm::Function` s some
"optional" operands, and that you know during construction of an
`llvm::Function` how many optional operands the `llvm::Function` will
need[1]?  If so, you might want to consider using the "descriptor"
functionality that was added recently[2].  It allows classes deriving
from `llvm::User` to allocate an arbitrary number of bookkeeping bytes
before the normal `Use &` array.  Instructions with operand bundles
use these bytes to remember which where the individual operand bundles
start and end, and an `llvm::Function` could easily use the extra
storage to remember if / where optional operands like the personality
function are in the `Use &` array.  And by making the operands part of
the regular `Use &` array, functionality like RAUW that use Def-Use
chains should Just Work(TM).

Keep in mind, I'm biased here, so take with a grain of salt! :)

[1]: I see there are methods like `setPrologueData`, but they're only
   used from files that I'd expect to be creating `llvm::Function`
   instances from scratch.

[2]: See `DescBytes` in `User::User`

-- Sanjoy


Vedant Kumar wrote:
 > Function's have three kinds of optional data: prefix data, prologue
data, and
 > personalities. We don't have a consistent way of storing this data,
IMO. This
 > RFC discusses a new way of managing optional data that makes
llvm::Function
 > cleaner, more consistent, and a little smaller.
 >
 >
 > What do we do currently?
 > ======================= >
 > Prefix and prologue data are attached to Functions via DenseMaps in
 > LLVMContextImpl:
 >
 >      typedef DenseMap<const Function *, ReturnInst *> 
FunctionDataMapTy;
 >      FunctionDataMapTy PrefixDataMap;
 >      FunctionDataMapTy PrologueDataMap;
 >
 > To attach prefix data to a Function, we create an orphan ReturnInst:
 >
 >      RI = ReturnInst::Create(F->getContext(), PrefixData);
 >      PrefixDataMap[F] = RI;
 >
 > Personalities are stored as ``optional`` Function operands. We actually
always
 > allocate the space for this ``optional`` operand: there's a FIXME in
the
 > destructor for Function about this.
 >
 >
 > What's wrong with this?
 > ====================== >
 > 1. Functions without personalities pay the space-cost for them.
 >
 > 2. We handle personalities differently from prefix and prologue data.
 >
 > 3. Suppose N functions share the same prefix or prologue data. We create N
 >     different ReturnInsts to point to the same thing. I think that's
wasteful.
 >
 > 4. These hidden ReturnInsts effectively make a Function an indirect user
of
 >     its optional data. This relationship should be made explicit to avoid
bugs.
 >
 >
 > Proposal: Introduce a new Constant, llvm::IndirectUser
 > ===================================================== >
 > This is my take on Duncan's proposed solution to PR24755 [1].
 >
 > It involves creating a new type of Constant. A sketch:
 >
 >      struct llvm::IndirectUser : llvm::Constant {
 >          Constant *Data;     // Operand 0
 >          unsigned  RefCount; // Metadata (not an operand)
 >
 >          /// Get or create an IndirectUser for C.
 >          static IndirectUser *GetOrCreateIndirectUser(Constant *C);
 >
 >          /// Drop a reference to C's IndirectUser.
 >          static void ReleaseIndirectUser(Constant *C);
 >      };
 >
 > Here's how to attach prefix data to a Function:
 >
 >      IU = IndirectUser::GetOrCreateIndirectUser(PrefixData);
 >      PrefixDataMap[F] = IU;
 >
 > Prologue data and personality functions work the same way (though
we'll need to
 > add a PersonalityFnMap).
 >
 > If two functions share prefix/prologue data, or personalities, they get
the
 > same IndirectUser (with a bumped refcount).
 >
 > This addresses all the points in the previous section.
 >
 >
 > Implementation details
 > ===================== >
 > When an IndirectUser is constructed, it adds itself to its Constant's
users
 > list. When the IndirectUser's refcount hits 0, we can invoke
destroyConstant().
 > The only tricky part is handling ReplaceAllUsesWith.
 >
 > There are two RAUW cases. Case 1 is easy:
 >
 >      <IndirectUser 1>  = { Data = A, RefCount = X }
 >      RAUW(A, B)
 >
 > Result:
 >
 >      <IndirectUser 1>  = { Data = B, RefCount = X }
 >
 > Case 2 is a bit more involved:
 >
 >      <IndirectUser 1>  = { Data = A, RefCount = X }
 >      <IndirectUser 2>  = { Data = B, RefCount = Y }
 >      RAUW(A, B)
 >
 > Result:
 >
 >      <IndirectUser 2>  = { Data = B, RefCount = X + Y }
 >      <IndirectUser 1>  = { Data =<IndirectUser 2>, RefCount =
0 }
 >
 > I've left out a sub-case where<IndirectUser 2>  is already part
of a daisy
 > chain. There are strict asserts in place to keep the daisy-chaining sane.
 >
 > I have this mostly working. There's just one outstanding issue in
clang. The
 > PersonalityHasOnlyCXXUses() routine needs to look up all the Function
objects
 > which refer to a particular personality function.  I plan on supporting
this by
 > defining a ``Function::getPersonalityFnUsers(Fn)`` routine.
 >
 > I'm hoping to have patches out for review by Monday. I thought I'd
send this
 > out now to get feedback on my approach -- plus, I didn't want to send
stuff in
 > out of the blue.
 >
 > Let me know what you think!
 >
 > vedant
 >
 > [1] https://llvm.org/bugs/show_bug.cgi?id=24755

Vedant Kumar via llvm-dev

2015-Oct-12 16:01 UTC

head link

[llvm-dev] [RFC] Clean up the way we store optional Function data

Hi Sanjoy,

> I don't know how prologue and prefix data is used -- is it correct to
> say that you're basically trying to give `llvm::Function` s some
> "optional" operands, and that you know during construction of an
> `llvm::Function` how many optional operands the `llvm::Function` will
> need[1]?
Yep. Though not operands exactly, since they wouldn't be in Function's
use list.

>  If so, you might want to consider using the "descriptor"
> functionality that was added recently[2].  It allows classes deriving
> from `llvm::User` to allocate an arbitrary number of bookkeeping bytes
> before the normal `Use &` array.  Instructions with operand bundles
> use these bytes to remember which where the individual operand bundles
> start and end, and an `llvm::Function` could easily use the extra
> storage to remember if / where optional operands like the personality
> function are in the `Use &` array.  And by making the operands part of
> the regular `Use &` array, functionality like RAUW that use Def-Use
> chains should Just Work(TM).
That's a neat idea. To summarize: make Function have 3 optional operands.
(For context -- Function currently has 1 optional operand, and my proposal is to
move to 0.)

Could someone else chime in on what they'd like to see?

vedant

Maybe Matching Threads

Search for more reasonably related threads

llvm dev - Oct 2015 - [RFC] Clean up the way we store optional Function data

[llvm-dev] [RFC] Clean up the way we store optional Function data

[llvm-dev] [RFC] Clean up the way we store optional Function data

[llvm-dev] [RFC] Clean up the way we store optional Function data

Maybe Matching Threads