Bob Wilson writes:> On Sep 21, 2010, at 9:33 AM, Renato Golin wrote: > > I was checking NEON instructions this week and the vector types seem > > to be inside structures. If vector types are considered proper types > > in LLVM, why pack them inside structures? > > Because that is what ARM has specified? They define the vector types > that are used with their NEON intrinsics as "containerized vectors". > Perhaps someone on the list from ARM can explain why they did it that > way."Containerized Vector" in the ARM AAPCS refers to fundamental data types (machine types), it's the class of machine types comprising the 64-bit and 128-bit NEON machine types. The AAPCS defines how to pass containerized vectors and it defines how the NEON user types map on to them. Also it defines how to mangle the NEON user types. So it defines how to use NEON user types in a public binary interface. It also says that arm_neon.h "defines a set of internal structures that describe the short vector types" which I guess could be read as saying they are packed inside structures - but I don't think this is the intention and it doesn't match implementations. The arm_neon.h implementation in the ARM compiler defines the user types in terms of C structs called __simd64_int8_t etc. and the mangling originated as an artifact of this. But the C structs aren't wrapped vectors; they wrap double or a pair of doubles, to get the size and alignment. Their only purpose is to be recognized by name by the front end and turned into a native register type. In gcc's arm_neon.h the user types aren't structs at all, they're defined using vector_size and the mangling is done as a special case. So I think there's no need to wrap these types in LLVM. Al -- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Tue, Sep 21, 2010 at 11:07 PM, Alasdair Grant <Alasdair.Grant at arm.com> wrote:> Bob Wilson writes: >> On Sep 21, 2010, at 9:33 AM, Renato Golin wrote: >> > I was checking NEON instructions this week and the vector types seem >> > to be inside structures. If vector types are considered proper types >> > in LLVM, why pack them inside structures? >> >> Because that is what ARM has specified? They define the vector types >> that are used with their NEON intrinsics as "containerized vectors". >> Perhaps someone on the list from ARM can explain why they did it that >> way. > > "Containerized Vector" in the ARM AAPCS refers to fundamental data > types (machine types), it's the class of machine types comprising > the 64-bit and 128-bit NEON machine types. > > The AAPCS defines how to pass containerized vectors and it defines > how the NEON user types map on to them. Also it defines how to > mangle the NEON user types. So it defines how to use NEON user > types in a public binary interface. > > It also says that arm_neon.h "defines a set of internal structures > that describe the short vector types" which I guess could be read > as saying they are packed inside structures - but I don't think this > is the intention and it doesn't match implementations. > The arm_neon.h implementation in the ARM compiler defines the user > types in terms of C structs called __simd64_int8_t etc. and the > mangling originated as an artifact of this. But the C structs aren't > wrapped vectors; they wrap double or a pair of doubles, to get the > size and alignment. Their only purpose is to be recognized by name > by the front end and turned into a native register type. > In gcc's arm_neon.h the user types aren't structs at all, they're > defined using vector_size and the mangling is done as a special case. > > So I think there's no need to wrap these types in LLVM.They are defined as structures. The table in A.2 defines the exact structure names. There is a requirement to mangle them as those structures in A.2.1. The fields of the structure may be different in this implementation, but the net effect here is that llvm-gcc and clang avoid having to magically recognize NEON types and substitute the proper mangling for them the way GCC does. deep
On Sep 21, 2010, at 4:33 PM, Sandeep Patel wrote:> On Tue, Sep 21, 2010 at 11:07 PM, Alasdair Grant <Alasdair.Grant at arm.com> wrote: >> Bob Wilson writes: >>> On Sep 21, 2010, at 9:33 AM, Renato Golin wrote: >>>> I was checking NEON instructions this week and the vector types seem >>>> to be inside structures. If vector types are considered proper types >>>> in LLVM, why pack them inside structures? >>> >>> Because that is what ARM has specified? They define the vector types >>> that are used with their NEON intrinsics as "containerized vectors". >>> Perhaps someone on the list from ARM can explain why they did it that >>> way. >> >> "Containerized Vector" in the ARM AAPCS refers to fundamental data >> types (machine types), it's the class of machine types comprising >> the 64-bit and 128-bit NEON machine types. >> >> The AAPCS defines how to pass containerized vectors and it defines >> how the NEON user types map on to them. Also it defines how to >> mangle the NEON user types. So it defines how to use NEON user >> types in a public binary interface. >> >> It also says that arm_neon.h "defines a set of internal structures >> that describe the short vector types" which I guess could be read >> as saying they are packed inside structures - but I don't think this >> is the intention and it doesn't match implementations. >> The arm_neon.h implementation in the ARM compiler defines the user >> types in terms of C structs called __simd64_int8_t etc. and the >> mangling originated as an artifact of this. But the C structs aren't >> wrapped vectors; they wrap double or a pair of doubles, to get the >> size and alignment. Their only purpose is to be recognized by name >> by the front end and turned into a native register type. >> In gcc's arm_neon.h the user types aren't structs at all, they're >> defined using vector_size and the mangling is done as a special case. >> >> So I think there's no need to wrap these types in LLVM. > > They are defined as structures. The table in A.2 defines the exact > structure names. There is a requirement to mangle them as those > structures in A.2.1. The fields of the structure may be different in > this implementation, but the net effect here is that llvm-gcc and > clang avoid having to magically recognize NEON types and substitute > the proper mangling for them the way GCC does.Right. The contents of the struct don't matter -- the spec is pretty clear about that -- so llvm uses vector types instead of doubles, but your spec definitely shows them being defined as structs. Beyond that, if you want any sort of cross-compiler portability, you don't want to write code for GCC's implementation. GCC lets you freely intermix vector types, or at least integer vector types, as long as they have the same total size. Since ARM's definition says they are structs, if you want portable NEON code, you have to assume that your intrinsic arguments are compatible based on struct type compatibility, i.e., they have to match exactly, even down to signed vs. unsigned element types. This is a huge hassle. If you take code written for GCC, you typically end up inserting vreinterpret calls all over the place. This was such a problem for llvm-gcc that we had to implement an optional GCC-compatibility mode, and we're planning to do something similar for clang using overloaded intrinsics.
> They are defined as structures. The table in A.2 defines the exact > structure names. There is a requirement to mangle them as those > structures in A.2.1.The mangling requirement doesn't require you to meet it in any particular way as long as you end up with the right strings. I.e. the mangling requirement places no requirements at all on the implementation, outside of mangled names. The controversial statement is where A.2 requires the user types to map on to the __simd64 structures. But that still isn't an argument for wrapping. This sentence might be significant: "The structures have 64-bit alignment and map directly onto the containerized vector fundamental data types." So the structures map _directly_ on to the vector types - not on to wrappers around the vector types.> The fields of the structure may be different in > this implementation, but the net effect here is that llvm-gcc and > clang avoid having to magically recognize NEON types and substitute > the proper mangling for them the way GCC does.But mangling routines are self-contained and have to deal with this sort of target issue anyway, e.g. gcc for ARM deals with 16-bit floats as well as NEON types. Saving 20 or so lines in a mangling routine makes no sense as an argument for a particular implementation strategy for arm_neon.h and a front end, let alone for inflating bitcode files with lots of wrapping around vector operations. Al -- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.