It looks like I need to be able to intercept GEP lowering (in SelectionDAGLowering::visitGetElementPtr) and insert something other than the shifts and adds.

The basic problem is that CellSPU loads and stores on 16-byte boundaries. Consequently, the SPU backend has to do loads and stores differently from most architectures, which have byte-addressable operations. Unfortunately, once a GEP has been lowered there is no unambiguous way to tell whether an add is really an add or was generated by GEP lowering. Hence the need to custom lower GEP.

From reading the code, hijacking SelectionDAGLowering::visitGetElementPtr() appears to be the only way to pull this off. Is there a better way? If not, how receptive would the community be to:

a) Creating a GEP DAG node
b) Sending the new GEP node through the legalize/custom/promote switch

I'm sure that this will spark a raging debate, with a few comments about how to refactor the whole legalize/custom/promote switch, etc. But all I really want to do is customize GEP processing for Cell.

-scooter

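For reference, the "shifts and adds" mentioned above amount to ordinary address arithmetic. The following is a minimal sketch in C++, assuming an array whose element size is a power of two; the name lowerArrayGEP is hypothetical and is not part of LLVM.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration (not LLVM code): the byte address of element
// `index` in an array whose elements are 2^log2ElemSize bytes wide.
// This is the shift-and-add form that an array GEP reduces to.
std::uint64_t lowerArrayGEP(std::uint64_t base, std::uint64_t index,
                            unsigned log2ElemSize) {
    assert(log2ElemSize < 64 && "shift amount must be smaller than the word size");
    return base + (index << log2ElemSize);
}
```

For example, lowerArrayGEP(0x1000, 3, 2) addresses element 3 of an array of 4-byte values. Once only the add remains, nothing in the result marks it as having come from a GEP, which is the ambiguity described above.
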
On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
> It looks like I need to be able to intercept GEP lowering (in
> SelectionDAGLowering::visitGetElementPtr) and insert something else
> other than the shifts and adds. The basic problem is that CellSPU
> loads and stores on 16-byte boundaries. Consequently, the SPU backend
> has to do the load or store differently than most normal
> architectures that have byte-addressable operations.

In TOT, load and store instructions have an alignment attribute which is useful for addressing similar needs on other architectures. For example, this attribute is used on x86, which also has a bunch of instructions which require 16-byte alignment. x86 uses it quite late, after legalize, and I don't know if that's appropriate for the CellSPU target, but wherever you're doing the lowering, could you use the load and store alignment attribute?

The alignment attribute can be set by LLVM IR producers (front-ends); however, instcombine also automatically sets alignments on load and store instructions by looking through GEPs and casts and examining the underlying storage. There's room for improvement, but it gets the common cases.

Dan

--
Dan Gohman, Cray Inc.

On Aug 28, 2007, at 7:02 AM, Dan Gohman wrote:
> On Mon, Aug 27, 2007 at 07:26:55PM -0700, Scott Michel wrote:
>> It looks like I need to be able to intercept GEP lowering (in
>> SelectionDAGLowering::visitGetElementPtr) and insert something else
>> other than the shifts and adds. The basic problem is that CellSPU
>> loads and stores on 16-byte boundaries. Consequently, the SPU backend
>> has to do the load or store differently than most normal
>> architectures that have byte-addressable operations.
>
> In TOT, load and store instructions have an alignment attribute
> which is useful for addressing similar needs on other architectures.
> For example, this attribute is used on x86, which also has a bunch of
> instructions which require 16-byte alignment. x86 uses it quite late,
> after legalize, and I don't know if that's appropriate for the CellSPU
> target, but wherever you're doing the lowering, could you use the load
> and store alignment attribute?

I'm aware of this attribute, but it doesn't help. The underlying problem is that CellSPU does not know how to natively perform byte-level addressing. For example, here's an indexed stack instruction to load register $3:

    ldq $3, 4($sp)

In reality, the "4($sp)" doesn't mean what you think it means in the PPC and x86 worlds: that's 4 x 16, because load quadword (ldq) appends four zero bits to the right of the offset. To get at the 4th byte requires loading from 0($sp) and some vector shuffling. (Dan: Think about older Cray hardware... you'll immediately understand!)

I could try custom lowering loads and stores as an interim step and detect if one of the operands is really a frameindex (or global variable or external variable or ... <insert exhaustive list of edge cases here>) added to some offset. Ultimately, custom lowering GEPs is probably the better idea (albeit a lot more work).

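To make the arithmetic concrete, here is a minimal sketch in C++ of the addressing described above. It assumes 32-bit addresses and models only the address computation, not the vector shuffle itself; the names scaledAddress, splitByteAddress and QuadAccess are hypothetical and come from neither LLVM nor the SPU toolchain.

```cpp
#include <cstdint>

// Hypothetical model of a quadword access (not LLVM or SPU toolchain code).
struct QuadAccess {
    std::uint32_t quadAddr;    // 16-byte-aligned address that is actually loaded
    std::uint32_t byteInQuad;  // position (0..15) of the wanted byte in that quadword
};

// Address reached when the offset field is scaled by 16, i.e. when four
// zero bits are appended to the right of the offset.
std::uint32_t scaledAddress(std::uint32_t base, std::int32_t offsetField) {
    return base + static_cast<std::uint32_t>(offsetField) * 16u;
}

// Split a byte address into the aligned quadword to load and the byte
// position that a later shuffle would have to extract.
QuadAccess splitByteAddress(std::uint32_t byteAddr) {
    return QuadAccess{byteAddr & ~15u, byteAddr & 15u};
}
```

With a 16-byte-aligned stack pointer of 0x1000, scaledAddress(0x1000, 4) lands 64 bytes past the stack pointer, while splitByteAddress(0x1004) says to load the quadword at 0x1000 and pick out byte 4. That second step is the part a plain add cannot express.
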
If I go ahead and shuffle around some code (no pun intended), would it be worth my while to prototype some refactoring of the legalize/promote/custom mess, since I'll have to touch it anyway for custom GEP lowering?

-scooter