The ARM Neon load, store and shuffle operations that I've been implementing recently with LLVM intrinsics do not care about the distinction between vectors with i32 and f32 elements -- only the size matters. But, because we have only MVT::fAny and MVT::iAny types, I've been having to define separate intrinsics for the operations with floating-point vector elements. It didn't bother me when there were only a few intrinsics like this, but now there are more, and I realized this weekend that I still need to add more for the load/store lane operations.

I had been thinking about trying to bitcast my way out of this, but it struck me that it would make a lot more sense to have a new MVT::vAny type that TableGen would match to any vector type. That would more accurately reflect the type constraints on these intrinsics.

It seems like since these "*Any" types are confined to TableGen, it should be pretty easy to add another one. I looked at the places using iAny and fAny and they seem pretty easy to extend to handle a new vAny type. Does this seem like a good idea? Any objections?

I'd like to get the Neon intrinsics finalized before the 2.6 release, since it may be harder to change them later.
Hi Bob,

An alternative would be to model the operations as regular shuffle, load, and store operators, combined to describe the actual instructions. This would make them easier for target-independent code to understand.

Dan

On Aug 8, 2009, at 11:47 PM, Bob Wilson <bob.wilson at apple.com> wrote:
> The ARM Neon load, store and shuffle operations that I've been
> implementing recently with LLVM intrinsics do not care about the
> distinction between vectors with i32 and f32 elements -- only the size
> matters. [...]
>
> _______________________________________________
> LLVM Developers mailing list
> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
On Aug 8, 2009, at 11:47 PM, Bob Wilson wrote:
> The ARM Neon load, store and shuffle operations that I've been
> implementing recently with LLVM intrinsics do not care about the
> distinction between vectors with i32 and f32 elements -- only the size
> matters. But, because we have only MVT::fAny and MVT::iAny types,
> I've been having to define separate intrinsics for the operations with
> floating-point vector elements. It didn't bother me when there were
> only a few intrinsics like this, but now there are more, and I
> realized this weekend that I still need to add more for the load/store
> lane operations.

Hi Bob,

I really do think that bitcast is the right way to go here. I ran into a couple of similar problems when bringing up the altivec port. For example, at one time we'd get "all zero vectors" of different MVTs, which would not be CSEd.

The fix for this was to be really disciplined about what types to make things in, and use bitcasts to convert the types when appropriate. For example, the all zeros vector is now only created as a <4 x i32> (IIRC) and bitcasted to the desired type. If this is impacting intrinsics, it seems that the front-end could do a similar thing to canonicalize the intrinsics early. As you know, we do prefer to have as few intrinsics as possible.

Can you describe a bit more about what fAny would do for you, maybe with an example? I'm sorry that I don't know much at all about neon...

-Chris
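To make the canonicalize-then-bitcast idea concrete, here is a minimal sketch in Python. It is a toy model of the discipline Chris describes, not LLVM code: vectors are plain lists, and every name in it (`bitcast_f32_to_i32`, `reverse_lanes_i32`, `zero_vector`, and so on) is hypothetical. The point is that a size-only operation is defined once on the canonical integer-lane type, and the floating-point variant is obtained purely by reinterpreting bits on the way in and out.

```python
import struct


def bitcast_f32_to_i32(lanes):
    """Reinterpret each float32 lane's bit pattern as an unsigned 32-bit integer."""
    n = len(lanes)
    return list(struct.unpack(f"<{n}I", struct.pack(f"<{n}f", *lanes)))


def bitcast_i32_to_f32(lanes):
    """Reinterpret each unsigned 32-bit lane's bit pattern as a float32."""
    n = len(lanes)
    return list(struct.unpack(f"<{n}f", struct.pack(f"<{n}I", *lanes)))


def reverse_lanes_i32(lanes):
    """Hypothetical size-only operation, defined once on the canonical i32 lanes."""
    return lanes[::-1]


def reverse_lanes_f32(lanes):
    """Float variant: no second operation, just bitcast in, operate, bitcast out."""
    return bitcast_i32_to_f32(reverse_lanes_i32(bitcast_f32_to_i32(lanes)))


def zero_vector(kind):
    """All-zeros vector, always created as 4 x i32 and bitcast when floats are wanted."""
    zeros = [0, 0, 0, 0]  # the single canonical form
    return zeros if kind == "i32" else bitcast_i32_to_f32(zeros)
```

Because the operation only moves 32-bit lanes around, the float wrapper never inspects the values; that is the property that would let a front-end canonicalize to one intrinsic per lane size.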
On Aug 9, 2009, at 6:29 AM, Dan Gohman wrote:
> Hi Bob,
>
> An alternative would be to model the operations as regular shuffle,
> load, and store operators, combined to describe the actual
> instructions. This would make them easier for target-independent code
> to understand.

Yes, I have tried to do that as much as possible. There are still a number of operations where we've ended up using intrinsics, for varying reasons.

For example, I had been planning to have the front-end translate the VTRN, VZIP, and VUZP builtins to vector shuffles, since that is exactly what they are. But, after discussing it with Evan, I changed these to intrinsics because we couldn't figure out a good way to handle them as shuffles. They take two vector operands and shuffle them in place, producing two vector results. I had been translating these to shuffles that produced double-wide vectors, e.g., shuffle two <8 x i8> vectors producing one <16 x i8> vector. That made it hard for the optimizer to deal with the results, since they are really two separate vectors, and some simple experiments made me think we won't get very good code from that approach.

The load/store multiple with element (de)interleaving operations also worked out best as intrinsics. Maybe we can talk about these in person if you want the gory details.
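For readers unfamiliar with these instructions, the following Python sketch models the lane permutations as I understand them from the Neon instruction descriptions. It is an illustration only: vectors are lists, and `shuffle`, `vzip`, `vuzp`, and `vtrn` are hypothetical helper names, not LLVM or compiler APIs. `vzip` and `vuzp` go through a double-wide intermediate (the 16-lane shuffle mentioned above for 8-lane inputs) and then split it, while `vtrn` is written as two separate shuffles to show the two-result form.

```python
def shuffle(a, b, mask):
    """Two-input shuffle: each mask entry indexes into a followed by b."""
    both = list(a) + list(b)
    return [both[i] for i in mask]


def vzip(a, b):
    """Interleave: a0 b0 a1 b1 ..., returned as two vectors of the input width."""
    n = len(a)
    wide = shuffle(a, b, [j for i in range(n) for j in (i, n + i)])
    return wide[:n], wide[n:]  # the double-wide result is really two vectors


def vuzp(a, b):
    """De-interleave: even-indexed lanes of a:b, then odd-indexed lanes."""
    n = len(a)
    mask = list(range(0, 2 * n, 2)) + list(range(1, 2 * n, 2))
    wide = shuffle(a, b, mask)
    return wide[:n], wide[n:]


def vtrn(a, b):
    """Transpose 2x2 lane blocks, written as two independent shuffles."""
    n = len(a)
    first = [j for i in range(0, n, 2) for j in (i, n + i)]
    second = [j for i in range(1, n, 2) for j in (i, n + i)]
    return shuffle(a, b, first), shuffle(a, b, second)
```

With `a = [0, 1, ..., 7]` and `b = [10, 11, ..., 17]`, `vzip(a, b)` yields `[0, 10, 1, 11, 2, 12, 3, 13]` and `[4, 14, 5, 15, 6, 16, 7, 17]`, and `vuzp` applied to that pair recovers `a` and `b`.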
On Aug 9, 2009, at 8:37 AM, Chris Lattner wrote:
> I really do think that bitcast is the right way to go here. I ran
> into a couple of similar problems when bringing up the altivec port.
> For example, at one time we'd get "all zero vectors" of different
> MVTs, which would not be CSEd.
>
> The fix for this was to be really disciplined about what types to make
> things in, and use bitcasts to convert the types when appropriate.
> For example, the all zeros vector is now only created as a <4 x i32>
> (IIRC) and bitcasted to the desired type.

Yes, I used that approach, at least to some extent, for Neon. There may be more to do to make sure things are getting CSEd the way we want.

> If this is impacting
> intrinsics, it seems that the front-end could do a similar thing to
> canonicalize the intrinsics early. As you know, we do prefer to have
> as few intrinsics as possible.

That is exactly what I'm trying to accomplish here (fewer intrinsics). I think I can do it with bitcasts, though.

> Can you describe a bit more about what fAny would do for you, maybe
> with an example? I'm sorry that I don't know much at all about
> neon...

It doesn't do anything fundamental. It just seems like a better fit. Neon has vectors of both integers and floats. Currently my choices for describing the type constraints for a Neon intrinsic are iAny or fAny, but those also allow scalars. vAny would more accurately indicate that only vector types are allowed, and it would also avoid the need for bitcasting.

It sounds like it is not a popular idea, so I'll let it rest.
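The difference between the constraints can be shown with a small toy model in Python. This is not how TableGen or MVT is implemented; the `Ty` class and `matches` function are hypothetical, and the only behavior taken from the thread is that iAny and fAny accept scalars as well as vectors, while the proposed vAny would accept any vector regardless of element kind.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ty:
    """Toy value type: element kind, element width in bits, and lane count."""
    elem: str       # "i" for integer, "f" for floating point
    bits: int
    lanes: int = 1  # 1 means scalar


def matches(constraint, ty):
    """Return True if ty satisfies the named overloaded-type constraint."""
    if constraint == "iAny":
        return ty.elem == "i"   # any integer type, scalar or vector
    if constraint == "fAny":
        return ty.elem == "f"   # any floating-point type, scalar or vector
    if constraint == "vAny":
        return ty.lanes > 1     # proposed: any vector, element kind ignored
    raise ValueError(f"unknown constraint: {constraint}")
```

Under this model a `<4 x i32>` and a `<4 x float>` both satisfy vAny, so a single intrinsic declaration would cover them, whereas neither iAny nor fAny alone accepts both.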