Daniel Sanders
2013-Aug-12 14:06 UTC
[LLVMdev] [global-isel] Type-independence of load/store
> > Other big-endian targets may have similar issues, but I know virtually
> > nothing about them.
>
> ARM's is an interesting implementation of big-endian vectors. AFAIK, other
> architectures go all in and use both big-endian lanes and elements. That
> makes the problem go away, and you only need one load instruction.

The recently published MIPS SIMD Architecture (MSA) has the same issue for big-endian vectors. There's a small non-functional benefit to accounting for this in little-endian too. For little-endian mode, the emitted code is a bit easier to understand if the 'correct' loads and stores are used.

Daniel Sanders
Leading Software Design Engineer, MIPS Processor IP
Imagination Technologies Limited
www.imgtec.com
Jakob Stoklund Olesen
2013-Aug-12 17:20 UTC
[LLVMdev] [global-isel] Type-independence of load/store
On Aug 12, 2013, at 7:06 AM, Daniel Sanders <Daniel.Sanders at imgtec.com> wrote:

>>> Other big-endian targets may have similar issues, but I know virtually
>>> nothing about them.
>>
>> ARM's is an interesting implementation of big-endian vectors. AFAIK, other
>> architectures go all in and use both big-endian lanes and elements. That
>> makes the problem go away, and you only need one load instruction.
>
> The recently published MIPS SIMD Architecture (MSA) has the same issue for
> big-endian vectors. There's a small non-functional benefit to accounting for
> this in little-endian too. For little-endian mode, the emitted code is a bit
> easier to understand if the 'correct' loads and stores are used.

AltiVec is an implementation of big-endian vectors that doesn’t require multiple load instructions or shuffling bitcasts. See section 4.2 of http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf

I can’t tell if MIPS and ARM are doing the same thing, or if they need different models. I don’t think either has ever been attempted in LLVM. I suspect that some tinkering is required at the IR level as well to make it work.

But it seems like we’ll probably need to allow the vector shape to influence load/store instruction selection.

Thanks,
/jakob
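
As a rough illustration of what "letting the vector shape influence load/store instruction selection" could mean, here is a minimal Python sketch that picks a full-vector store mnemonic from the element width of an LLVM-style vector type name. The mnemonics are the MSA and NEON element-sized stores. The `select_store` helper, the lookup table and the type-name parsing are hypothetical and are not taken from LLVM or from this thread.

```python
import re

# Element-sized full-vector stores. The table layout is an assumption made
# for illustration only; it is not how any LLVM backend is organised.
STORE_BY_ELEMENT_BITS = {
    "msa": {8: "st.b", 16: "st.h", 32: "st.w", 64: "st.d"},
    "neon": {8: "vst1.8", 16: "vst1.16", 32: "vst1.32", 64: "vst1.64"},
}


def select_store(target, vector_type):
    """Hypothetical helper: choose a store from the vector's element width.

    vector_type is an LLVM-style name such as 'v8i16' or 'v2f64'.
    """
    match = re.fullmatch(r"v(\d+)[if](\d+)", vector_type)
    if match is None:
        raise ValueError(f"not a vector type name: {vector_type!r}")
    element_bits = int(match.group(2))
    return STORE_BY_ELEMENT_BITS[target][element_bits]


print(select_store("msa", "v8i16"))   # st.h
print(select_store("neon", "v2i64"))  # vst1.64
```

The point of the sketch is only that v8i16 and v2i64 occupy the same 128-bit register yet map to different stores, so the selection cannot be type-independent on such targets.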
Daniel Sanders
2013-Aug-13 09:52 UTC
[LLVMdev] [global-isel] Type-independence of load/store
I believe we are doing the same thing for normal loads and stores. In big-endian mode, our st.h stores using the byte-order 10325476 and our st.d stores 76543210. This is the same as the vst1.16 and vst1.64 that Tim described.

I'm actually working on MSA at the moment. I've just started upstreaming my work now that the spec has been published.

I've had two main problems with my implementation of MSA so far. The first is that the type system is rather awkward and doesn't cope very well with multiple types in the same register class. I ended up splitting up my register classes according to the number of bits in the element. The second is that on MIPS32 with MSA the v2i64 type is legal but i64 is not. I'm finding that the legalizer and dag combiner sometimes generate SelectionDAG nodes with illegal types (e.g. a build_vector containing i64's).

I believe both of these issues would be solved using the proposed global-isel.

> -----Original Message-----
> From: Jakob Stoklund Olesen [mailto:stoklund at 2pi.dk]
> Sent: 12 August 2013 18:20
> To: Daniel Sanders
> Cc: Tim Northover; LLVM Developers Mailing List
> Subject: Re: [LLVMdev] [global-isel] Type-independence of load/store
>
> On Aug 12, 2013, at 7:06 AM, Daniel Sanders <Daniel.Sanders at imgtec.com>
> wrote:
>
> >>> Other big-endian targets may have similar issues, but I know
> >>> virtually nothing about them.
> >>
> >> ARM's is an interesting implementation of big-endian vectors. AFAIK,
> >> other architectures go all in and use both big-endian lanes and
> >> elements. That makes the problem go away, and you only need one load
> >> instruction.
> >
> > The recently published MIPS SIMD Architecture (MSA) has the same issue
> > for big-endian vectors. There's a small non-functional benefit to
> > accounting for this in little-endian too. For little-endian mode, the
> > emitted code is a bit easier to understand if the 'correct' loads and
> > stores are used.
>
> AltiVec is an implementation of big-endian vectors that doesn't require
> multiple load instructions or shuffling bitcasts. See section 4.2 of
> http://www.freescale.com/files/32bit/doc/ref_manual/ALTIVECPIM.pdf
>
> I can't tell if MIPS and ARM are doing the same thing, or if they need
> different models. I don't think either has ever been attempted in LLVM. I
> suspect that some tinkering is required at the IR level as well to make it
> work.
>
> But it seems like we'll probably need to allow the vector shape to influence
> load/store instruction selection.
>
> Thanks,
> /jakob
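
The byte orders quoted above for st.h (10325476) and st.d (76543210) can be reproduced with a small model. This is a minimal Python sketch, assuming that register bytes are numbered from the least significant byte, that a big-endian element-wise store keeps lanes in address order and reverses the bytes inside each element, and that only 8 bytes are shown as in the message (the real registers are wider). The function names are hypothetical.

```python
def store_byte_order(num_bytes, elem_bytes):
    """Register byte index written to each successive memory address."""
    order = []
    for base in range(0, num_bytes, elem_bytes):
        # Lanes stay in address order; bytes inside one element are reversed.
        order.extend(reversed(range(base, base + elem_bytes)))
    return order


def store(reg, elem_bytes):
    """Model an element-wise big-endian store of a register to memory."""
    return [reg[i] for i in store_byte_order(len(reg), elem_bytes)]


def load(mem, elem_bytes):
    """Model the matching load: the exact inverse of store()."""
    reg = [None] * len(mem)
    for addr, i in enumerate(store_byte_order(len(mem), elem_bytes)):
        reg[i] = mem[addr]
    return reg


reg = list(range(8))
print(store(reg, 2))           # [1, 0, 3, 2, 5, 4, 7, 6]  (st.h)
print(store(reg, 8))           # [7, 6, 5, 4, 3, 2, 1, 0]  (st.d)
print(load(store(reg, 2), 8))  # [6, 7, 4, 5, 2, 3, 0, 1]
```

Under these assumptions, storing with one element size and loading with another does not return the original register contents: the last line shows the halfwords coming back in reversed lane order. That is the sense in which a bitcast between vector shapes, if defined as a store followed by a load, would need a shuffle on such a target.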