On Saturday 01 August 2009 15:12, Dan Gohman wrote:> LoadSDNode, which inherits from MemSDNode is the largest > SDNode. With the current SDNode allocation strategy, making it > bigger will increase the allocation needed for all nodes.Ok.> > new (N) LoadSDNode(..., isVolatile|isNonTemporal); > > > > Thoughts? > > This sounds reasonable. I'd suggest using names like MemRefFlags rather > than isVolatileisNonTemporal, but that's just a detail :-).There's a pipe there. :)> My biggest concern is how this is encoded in the SDNode. It'd be > good to avoid making MemSDNode bigger, but there are a variety of ways > that exiting bits can be made available. NodeType doesn't need all 16 > of > its bits, for example, and OperandsNeedDelete could be merged with > OperandList with a PointerIntPair if needed.*shudder* Undefined behavior? No thanks. Right now, the lower five bits of SubclassData are used to encode various things for memory SDNodes. One of those is the volatile bit. This leaves 10 bits for alignment information, meaning we can represent alignments up to 2^1024, which seems like a bit much. :) What do you think about carving four more bits out of the alignment field so we have five flag/semantic bits and can represent alignments up to 2^64? That seems like plenty of alignment headroom.> How are you representing movnt in LLVM IR, with an intrinsic?Yes, it's an intrinsic inserted just before the load or store of interest. I'm not particularly happy with this solution because I'm not sure all transformation passes will preserve the ordering of intrinsic-followed-by-operation. Is there a better way to do this? -Dave
On Aug 3, 2009, at 9:30 AM, David Greene wrote:> On Saturday 01 August 2009 15:12, Dan Gohman wrote: > >> LoadSDNode, which inherits from MemSDNode is the largest >> SDNode. With the current SDNode allocation strategy, making it >> bigger will increase the allocation needed for all nodes. > > Ok. > > >>> new (N) LoadSDNode(..., isVolatile|isNonTemporal); >>> >>> Thoughts? >> >> This sounds reasonable. I'd suggest using names like MemRefFlags >> rather >> than isVolatileisNonTemporal, but that's just a detail :-). > > There's a pipe there. :)Oops. I blame my font ;-).> >> My biggest concern is how this is encoded in the SDNode. It'd be >> good to avoid making MemSDNode bigger, but there are a variety of >> ways >> that exiting bits can be made available. NodeType doesn't need all >> 16 >> of >> its bits, for example, and OperandsNeedDelete could be merged with >> OperandList with a PointerIntPair if needed. > > *shudder* Undefined behavior? No thanks.What undefined behavior? Is it PointerIntPair that's making you shudder? That's implementation-defined behavior. Very different :-). And it's checked by asserts.> > Right now, the lower five bits of SubclassData are used to encode > various > things for memory SDNodes. One of those is the volatile bit. This > leaves > 10 bits for alignment information, meaning we can represent > alignments up to > 2^1024, which seems like a bit much. :) > > What do you think about carving four more bits out of the alignment > field > so we have five flag/semantic bits and can represent alignments up > to 2^64? > That seems like plenty of alignment headroom.Yes, that's fine too. If someone really wanted an alignment greater than 1<<64, they would probably have other issues that would dwarf this one.> >> How are you representing movnt in LLVM IR, with an intrinsic? > > Yes, it's an intrinsic inserted just before the load or store of > interest. > I'm not particularly happy with this solution because I'm not sure all > transformation passes will preserve the ordering of > intrinsic-followed-by-operation. Is there a better way to do this?Crazy idea: how about an intrinsic just after the load or store of interest? It would be a sort of opposite of prefetch, saying "i'm not going to be using the memory at this address for a while". It could use IntrWriteArgMem like the regular prefetch does, which is usually sufficient to keep it from wandering off, though it is a little stronger than is strictly necessary. The other alternative is something like llvm.x86.sse.movnt.ps and friends. Dan
On Monday 03 August 2009 12:38, Dan Gohman wrote:> > *shudder* Undefined behavior? No thanks. > > What undefined behavior? Is it PointerIntPair that's making you > shudder? > That's implementation-defined behavior. Very different :-). And it's > checked > by asserts.Well, it's still non-portable either way.> > What do you think about carving four more bits out of the alignment > > field > > so we have five flag/semantic bits and can represent alignments up > > to 2^64? > > That seems like plenty of alignment headroom.> Yes, that's fine too. If someone really wanted an alignment greater > than 1<<64, they would probably have other issues that would dwarf > this one.But see Evan's note. I can see a need for more bits if we take target-specific needs into account. I'll work on a solution.> >> How are you representing movnt in LLVM IR, with an intrinsic? > > > > Yes, it's an intrinsic inserted just before the load or store of > > interest. > > I'm not particularly happy with this solution because I'm not sure all > > transformation passes will preserve the ordering of > > intrinsic-followed-by-operation. Is there a better way to do this? > > Crazy idea: how about an intrinsic just after the load or store of > interest? > It would be a sort of opposite of prefetch, saying "i'm not going to be > using the memory at this address for a while". It could use > IntrWriteArgMem like the regular prefetch does, which is usually > sufficient to keep it from wandering off, though it is a little > stronger than > is strictly necessary.Does before vs. after matter for IntrWriteArgMem? What does that do anyway?> The other alternative is something like llvm.x86.sse.movnt.ps and > friends.I don't want to go that route because this kind of information is useful on multiple targets. Better to have one way to do it. We ain't coding Perl here. :) -Dave