Right now the MemSDNode keeps a volatile bit in the SubclassData to mark volatile memory operations. We have some changes we'd like to push back that adds a NonTemporal flag to MemSDNode to mark instructions where movnt (on x86) and other goodness can happen (we'll also add the TableGen patterns to properly select movnt). In our tree we simply added another flag to the MemSDNode constructor and embedded it in SubclassData: MemSDNode(unsigned Opc, DebugLoc dl, SDVTList VTs, MVT MemoryVT, const Value *srcValue, int SVOff, unsigned alignment, bool isvolatile, bool NonTemporal); This is ugly for a variety of reasons and also doesn't scale as we want to add more of this kind of information. So what if we replace Volatile/NonTemporal with a single bitvector? There's not a lot of room in SubclassData (it also holds alignment information) so maybe this should be another member on MemSDNode. That will increase the size of MemSDNode, so we need to consider that. The constructor could look something like this: MemSDNode(unsigned Opc, DebugLoc dl, SDVTList VTs, MVT MemoryVT, const Value *srcValue, int SVOff, unsigned alignment, unsigned flags); and called like this: new (N) LoadSDNode(..., isVolatile|isNonTemporal); Thoughts? -Dave
On Jul 31, 2009, at 11:26 AM, David Greene wrote:> Right now the MemSDNode keeps a volatile bit in the SubclassData to > mark > volatile memory operations. > > We have some changes we'd like to push back that adds a NonTemporal > flag > to MemSDNode to mark instructions where movnt (on x86) and other > goodness > can happen (we'll also add the TableGen patterns to properly select > movnt). > > In our tree we simply added another flag to the MemSDNode constructor > and embedded it in SubclassData: > > MemSDNode(unsigned Opc, DebugLoc dl, SDVTList VTs, MVT MemoryVT, > const Value *srcValue, int SVOff, > unsigned alignment, bool isvolatile, bool NonTemporal); > > This is ugly for a variety of reasons and also doesn't scale as we > want to add more of this kind of information. > > So what if we replace Volatile/NonTemporal with a single bitvector? > There's not a lot of room in SubclassData (it also holds alignment > information) so maybe this should be another member on MemSDNode. > That will increase the size of MemSDNode, so we need to consider > that.LoadSDNode, which inherits from MemSDNode is the largest SDNode. With the current SDNode allocation strategy, making it bigger will increase the allocation needed for all nodes.> > The constructor could look something like this: > > MemSDNode(unsigned Opc, DebugLoc dl, SDVTList VTs, MVT MemoryVT, > const Value *srcValue, int SVOff, > unsigned alignment, unsigned flags); > > and called like this: > > new (N) LoadSDNode(..., isVolatile|isNonTemporal); > > Thoughts?This sounds reasonable. I'd suggest using names like MemRefFlags rather than isVolatileisNonTemporal, but that's just a detail :-). My biggest concern is how this is encoded in the SDNode. It'd be good to avoid making MemSDNode bigger, but there are a variety of ways that exiting bits can be made available. NodeType doesn't need all 16 of its bits, for example, and OperandsNeedDelete could be merged with OperandList with a PointerIntPair if needed. How are you representing movnt in LLVM IR, with an intrinsic? Dan
I also want a way to add target specific flag to a SDNode (which should be transferred to MachineInstr). For example, on x86 lots of opcodes have *lock* variants. Right now, these are separate instructions. I'd prefer to make it into a target specific flag that can be toggled by some sort of post-isel action routine. One way to handle this might be to expand the use of SubclassData. There are 15 bits to play with and only the bottom 6-7 are in use (if I am reading it right). We can also reduce the width of following field NodeId (do we need 32-bit for it?) and widen SubclassData. Also, NumOperands and NumValues can be changed to take up fewer bits. I don't think we need 16-bits for each. So NodeType, OperandsNeedDelete, SubclassData, NodeId, NumOperands, and NumValues together are using 96 bits (assuming int is 32-bit and short is 16-bit). I think there is opportunity here to re-pack them and find quite a few bits for target specific data. Evan On Jul 31, 2009, at 11:26 AM, David Greene wrote:> Right now the MemSDNode keeps a volatile bit in the SubclassData to > mark > volatile memory operations. > > We have some changes we'd like to push back that adds a NonTemporal > flag > to MemSDNode to mark instructions where movnt (on x86) and other > goodness > can happen (we'll also add the TableGen patterns to properly select > movnt). > > In our tree we simply added another flag to the MemSDNode constructor > and embedded it in SubclassData: > > MemSDNode(unsigned Opc, DebugLoc dl, SDVTList VTs, MVT MemoryVT, > const Value *srcValue, int SVOff, > unsigned alignment, bool isvolatile, bool NonTemporal); > > This is ugly for a variety of reasons and also doesn't scale as we > want to add more of this kind of information. > > So what if we replace Volatile/NonTemporal with a single bitvector? > There's not a lot of room in SubclassData (it also holds alignment > information) so maybe this should be another member on MemSDNode. > That will increase the size of MemSDNode, so we need to consider > that. > > The constructor could look something like this: > > MemSDNode(unsigned Opc, DebugLoc dl, SDVTList VTs, MVT MemoryVT, > const Value *srcValue, int SVOff, > unsigned alignment, unsigned flags); > > and called like this: > > new (N) LoadSDNode(..., isVolatile|isNonTemporal); > > Thoughts? > > -Dave > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Saturday 01 August 2009 15:12, Dan Gohman wrote:> LoadSDNode, which inherits from MemSDNode is the largest > SDNode. With the current SDNode allocation strategy, making it > bigger will increase the allocation needed for all nodes.Ok.> > new (N) LoadSDNode(..., isVolatile|isNonTemporal); > > > > Thoughts? > > This sounds reasonable. I'd suggest using names like MemRefFlags rather > than isVolatileisNonTemporal, but that's just a detail :-).There's a pipe there. :)> My biggest concern is how this is encoded in the SDNode. It'd be > good to avoid making MemSDNode bigger, but there are a variety of ways > that exiting bits can be made available. NodeType doesn't need all 16 > of > its bits, for example, and OperandsNeedDelete could be merged with > OperandList with a PointerIntPair if needed.*shudder* Undefined behavior? No thanks. Right now, the lower five bits of SubclassData are used to encode various things for memory SDNodes. One of those is the volatile bit. This leaves 10 bits for alignment information, meaning we can represent alignments up to 2^1024, which seems like a bit much. :) What do you think about carving four more bits out of the alignment field so we have five flag/semantic bits and can represent alignments up to 2^64? That seems like plenty of alignment headroom.> How are you representing movnt in LLVM IR, with an intrinsic?Yes, it's an intrinsic inserted just before the load or store of interest. I'm not particularly happy with this solution because I'm not sure all transformation passes will preserve the ordering of intrinsic-followed-by-operation. Is there a better way to do this? -Dave
On Sunday 02 August 2009 20:57, Evan Cheng wrote:> I also want a way to add target specific flag to a SDNode (which > should be transferred to MachineInstr). For example, on x86 lots of > opcodes have *lock* variants. Right now, these are separate > instructions. I'd prefer to make it into a target specific flag that > can be toggled by some sort of post-isel action routine.That's a good idea.> One way to handle this might be to expand the use of SubclassData. > There are 15 bits to play with and only the bottom 6-7 are in use (if > I am reading it right). We can also reduce the width of following > field NodeId (do we need 32-bit for it?) and widen SubclassData.Yep. See my response to Dan. I don't know how many target-specific flags you need, but we should be able to carve out four more bits from SubclassData without changing anything else.> Also, NumOperands and NumValues can be changed to take up fewer bits. > I don't think we need 16-bits for each.Probably true.> So NodeType, OperandsNeedDelete, SubclassData, NodeId, NumOperands, > and NumValues together are using 96 bits (assuming int is 32-bit and > short is 16-bit). I think there is opportunity here to re-pack them > and find quite a few bits for target specific data.I agree. If you and Dan agree, I'll take a crack at this. -Dave