Jose Fonseca
2011-Nov-29 20:24 UTC
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
----- Original Message -----
> "Rotem, Nadav" <nadav.rotem at intel.com> writes:
>
> > David,
> >
> > Thanks for the support! I sent a detailed email with the overall
> > plan. But just to reiterate, the GEP would look like this:
> >
> > %PV = getelementptr <4 x i32*> %base, <4 x i32> <i32 1, i32 2, i32 3, i32 4>
> >
> > Where the index of the GEP is a vector of indices. I am not against
> > having multiple indices. I just want to start with a basic set of
> > features.
>
> Ah, I see. I actually think multiple indices as in multiple vectors
> of indices to the GEP above would be pretty rare.

Nadav, David,

I'd like to understand a bit better the final role of these pointer vector types on 64-bit architectures, where the pointers are often bigger than the elements stored/fetched (e.g., 32-bit floats/ints).

Will 64-bit backends be forced to operate on 64-bit pointer vectors all the time? Or will they be able to retain operations on base + 32-bit offsets as such?

In particular, an important use case for 3D software rendering is to be able to gather <4 x i32> values from an i32* scalar base pointer in a 64-bit address space, indexed by <N x i32> offsets. [1] And it is important that the intermediate <N x i32*> pointer vector is never actually materialized, as it wouldn't fit in a single hardware SIMD register, and therefore would require two gather operations.

It would be nice to see how this use case would look in the proposed IR, and get assurance that backends will be able to emit efficient code (i.e., a single gather instruction) from that IR.

Jose

[1] http://lists.cs.uiuc.edu/pipermail/llvmdev/2011-June/040825.html
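To make the register-width concern concrete, here is a minimal sketch in C of the arithmetic involved. The helper name `regs_needed` is hypothetical, and the register widths passed in (128 bits for SSE-style registers, 256 bits for AVX-style registers) are assumptions chosen for illustration, not something the thread specifies.

```c
#include <assert.h>

/* Number of SIMD registers needed to hold `lanes` elements of
 * `elem_bits` bits each, given registers of `simd_bits` bits.
 * Hypothetical helper, for illustration only. */
static unsigned regs_needed(unsigned lanes, unsigned elem_bits,
                            unsigned simd_bits) {
    assert(simd_bits > 0);
    /* Round up: a partially filled register still occupies a register. */
    return (lanes * elem_bits + simd_bits - 1) / simd_bits;
}
```

Under these assumptions, eight 32-bit offsets fit in one 256-bit register (`regs_needed(8, 32, 256)` is 1), while eight 64-bit pointers need two (`regs_needed(8, 64, 256)` is 2), which is the doubling the message is worried about.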
Rotem, Nadav
2011-Nov-29 21:48 UTC
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
Hi Jose,

The proposed IR change neither helps nor hinders the use case you mentioned. The case of a base + vector-index should be easily addressed by an intrinsic. The pointer-vector proposal is meant to support full scatter/gather instructions (such as the AVX2 gather instructions).

Nadav
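For readers unfamiliar with the proposal, the vector GEP quoted earlier in the thread can be read lane by lane: each lane takes its own base pointer and adds its own index. A minimal sketch of that reading in C follows; the function name `vector_gep_i32` is hypothetical and this models only the single-index form discussed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Lane-wise model of
 *   %PV = getelementptr <N x i32*> %base, <N x i32> %idx
 * Each output lane is base[i] advanced by idx[i] elements.
 * Hypothetical helper, for illustration only. */
static void vector_gep_i32(int32_t *const base[], const int32_t idx[],
                           int32_t *out[], size_t lanes) {
    assert(base != NULL && idx != NULL && out != NULL);
    for (size_t i = 0; i < lanes; ++i) {
        /* C pointer arithmetic scales by sizeof(int32_t), like GEP on i32*. */
        out[i] = base[i] + idx[i];
    }
}
```

Note that the result is a vector of full-width pointers, one per lane, which is exactly the intermediate value the earlier question asks about.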
Jose Fonseca
2011-Nov-30 15:59 UTC
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
Yes, indeed I can always fall back to intrinsics. But still, I believe that the case I described is in essence quite commonplace, so it should be a first-class citizen in the LLVM IR.

AVX2 is the target ISA I'm thinking of too, BTW.

Let's forget 3D, and imagine something as trivial as a vectorized i32 => float table lookup. I'd expect that the IR would look something like:

    ; Look-up table with precomputed values
    declare float* @lut;

    define <8 x float> @foo(<8 x i32> %indices) {
      %pointer = getelementptr float* @lut, <8 x i32> %indices
      %values = load <8 x float*> %pointer
      ret <8 x float> %values
    }

And the final AVX2 code I'd expect would consist of a single VGATHERDPS, in both 64-bit and 32-bit addressing modes:

    foo:
      VPCMPEQB ymm1, ymm1, ymm1 ; generate all ones
      VGATHERDPS ymm0, DWORD PTR [ymm0 * 4 + lut], ymm1
      RET

Jose
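The intended behaviour of the table lookup above can be stated as a scalar reference loop. The sketch below is in C rather than IR; the function name `gather_f32` and the explicit `lut_len` bounds check are additions for illustration and do not appear in the original message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar reference for the gather: out[i] = lut[idx[i]] for every lane.
 * The address of each element is lut + idx[i] * sizeof(float), which
 * corresponds to the [index * 4 + lut] addressing form in the assembly.
 * Hypothetical helper, for illustration only. */
static void gather_f32(const float *lut, size_t lut_len,
                       const int32_t idx[], float out[], size_t lanes) {
    assert(lut != NULL && idx != NULL && out != NULL);
    for (size_t i = 0; i < lanes; ++i) {
        /* The hardware gather does no bounds check; this one is for safety. */
        assert(idx[i] >= 0 && (size_t)idx[i] < lut_len);
        out[i] = lut[idx[i]];
    }
}
```

Only the scalar base `lut` and the 32-bit indices are inputs here; no per-lane pointer is ever stored, which is the property the message wants the generated code to keep.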
David A. Greene
2011-Dec-01 17:08 UTC
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
Jose Fonseca <jfonseca at vmware.com> writes:

> I'd like to understand a bit better the final role of these pointer
> vector types in 64bit architectures, where the pointers are often
> bigger than the elements stored/fetch (e.g, 32bits floats/ints).

The pointers are addresses. On a 64-bit address machine they will be 64 bits. On a 32-bit address machine they will be 32 bits.

For a situation like PTX that has multiple address sizes, we will need additional LLVM support. Right now a pointer can only have one size per target.

> Will 64bits backends be forced to actually operate with 64bit pointer
> vectors all the time? Or will they be able to retain operations on
> base + 32bit offsets as such?

Are you talking about 32-bit pointers? If so, Nadav has talked about vector inttoptr and ptrtoint instructions which I think can address the need you're getting at. But I'm a little unclear on what you want.

> In particular, an important use case for 3D software rendering is to
> be able to gather <4 x i32> values, from a i32* scalar base pointer in
> a 64bit address space, indexed by <N x i32> offsets. [1] And it is
> important that the intermediate <N x i32*> pointer vectors is actually
> never instanced, as it wouldn't fit in the hardware SIMD registers,
> and therefore would require two gather operations.

By "fit" are you worried about vector length? If so, legalize would have to break up the <N x i32*> vector into two or more smaller vectors.

If you are worried about element size (there are only 32-bit elements) then inttoptr/ptrtoint should handle it, I think.

-Dave
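The splitting that legalization would perform can be pictured as follows. This is a rough model in C, not a description of LLVM's actual legalizer: the names `gather_ptrs_half` and `gather_ptrs_8`, and the choice of 8 lanes split into two halves of 4, are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { HALF_LANES = 4 }; /* assumed: 4 pointer-sized lanes fit one register */

/* One "native width" gather: loads through 4 full-width pointers. */
static void gather_ptrs_half(const int32_t *const *ptrs, int32_t *out) {
    assert(ptrs != NULL && out != NULL);
    for (size_t i = 0; i < HALF_LANES; ++i)
        out[i] = *ptrs[i];
}

/* An 8-lane gather over a pointer vector that is too wide for one
 * register, expressed as two 4-lane gathers over its low and high halves. */
static void gather_ptrs_8(const int32_t *const *ptrs, int32_t *out) {
    gather_ptrs_half(ptrs, out);                           /* lanes 0..3 */
    gather_ptrs_half(ptrs + HALF_LANES, out + HALF_LANES); /* lanes 4..7 */
}
```

The two calls in `gather_ptrs_8` correspond to the "two gather operations" that the quoted question hopes to avoid.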
Jose Fonseca
2011-Dec-05 18:02 UTC
[LLVMdev] [llvm-commits] Vectors of Pointers and Vector-GEP
----- Original Message -----
> By "fit" are you worried about vector length? If so, legalize would
> have to break up the <N x i32*> vector into two or more smaller
> vectors.
>
> If you are worried about element size (there are only 32-bit
> elements) then inttoptr/ptrtoint should handle it, I think.

I was referring to gathering a vector of sparse 32-bit words, all relative to a scalar base pointer in a 64-bit address space, where the offsets are in a 32-bit integer vector. My other reply gave a more detailed and concrete example.

Anyway, from Nadav's and your other replies on this thread it is now clear to me that even if the IR doesn't express scalar base pointers with vector indices directly, the backend can always match and emit the most efficient machine instruction. This addresses my main concern.

Jose
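The conclusion rests on two formulations computing the same result, so that a backend is free to pick the cheaper one. A minimal sketch in C of that equivalence follows; both function names are hypothetical, the 16-lane cap is an arbitrary limit for the sketch, and real backends match on IR structure rather than on runtime values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { MAX_LANES = 16 }; /* arbitrary cap for this sketch */

/* Form A: materialize a vector of full-width pointers, then load. */
static void gather_via_pointers(const int32_t *base, const int32_t idx[],
                                int32_t out[], size_t lanes) {
    const int32_t *ptrs[MAX_LANES]; /* one pointer-sized element per lane */
    assert(lanes <= MAX_LANES);
    for (size_t i = 0; i < lanes; ++i)
        ptrs[i] = base + idx[i];    /* same base in every lane, plus idx[i] */
    for (size_t i = 0; i < lanes; ++i)
        out[i] = *ptrs[i];
}

/* Form B: keep the scalar base and 32-bit indices separate until the load. */
static void gather_base_index(const int32_t *base, const int32_t idx[],
                              int32_t out[], size_t lanes) {
    for (size_t i = 0; i < lanes; ++i)
        out[i] = base[idx[i]];
}
```

For any in-bounds indices the two functions produce identical output, which is why recognizing Form A and emitting code shaped like Form B is a valid transformation.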