Something like: index 6db7f68..68564cb 100644 --- a/lib/Transforms/Vectorize/LoopVectorize.cpp +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp @@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr Type *DataTy = VectorType::get(ScalarDataTy, VF); Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand(); unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment(); + if (Alignment == 0) + Alignment = 1; unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; Should fix this. On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote:> Nadav, > > I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere. > > v/r, > Josh > > [1] http://pastebin.com/kc95WtGG > [2] http://pastebin.com/VY3ZLVJK > > > On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> wrote: > > On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: >> Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! >> >> I think that's a fair question, and it's about safety. If you're getting this on the JIT, means we may be generating unsafe transformations on the vectorizer. >> >> Arnold, Nadav, I don't remember seeing code to generate any run-time alignment checks on the incoming pointer, is there such a thing? If not, shouldn't we add one? > > > If the the vectorizer generates aligned memory accesses to unaligned addresses then this is a serious bug. But I don’t think that Josh said that the vectorizer generated aligned accesses to unaligned pointers. > > There is no point in LLVM checking for alignment because if the memory is unaligned then the program will crash. Users who want to crash with a readable error message can simply write code that checks the pointer (by masking the high bits and comparing to zero). > >
----- Original Message -----> From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > To: "Joshua Klontz" <josh.klontz at gmail.com> > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > Sent: Friday, November 15, 2013 4:05:53 PM > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > Something like: > > index 6db7f68..68564cb 100644 > --- a/lib/Transforms/Vectorize/LoopVectorize.cpp > +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp > @@ -1208,6 +1208,8 @@ void > InnerLoopVectorizer::vectorizeMemoryInstruction(Instr > Type *DataTy = VectorType::get(ScalarDataTy, VF); > Value *Ptr = LI ? LI->getPointerOperand() : > SI->getPointerOperand(); > unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment(); > + if (Alignment == 0) > + Alignment = 1;That may be conservatively correct, but I don't think it is really right. An alignment of 0 is supposed to mean the platform default alignment (IIRC, DataLayout::getPrefTypeAlignment tells you what it is). -Hal> unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); > unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); > unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; > > Should fix this. > > On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> > wrote: > > > Nadav, > > > > I believe aligned accesses to unaligned pointers is precisely the > > issue. Consider the function `add_u8S` before[1] and after[2] the > > loop vectorizer pass. There is no alignment assumption associated > > with %kernel_data prior to vectorization. I can't tell if it's the > > loop vectorizer or the codegen at fault, but the alignment > > assumption seems to sneak in somewhere. > > > > v/r, > > Josh > > > > [1] http://pastebin.com/kc95WtGG > > [2] http://pastebin.com/VY3ZLVJK > > > > > > On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> > > wrote: > > > > On Nov 15, 2013, at 12:36 PM, Renato Golin > > <renato.golin at linaro.org> wrote: > > > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> > >> wrote: > >> Agreed, is there a pass that will insert a runtime alignment > >> check? Also, what's the easiest way to get at > >> TargetTransformInfo::getRegisterBitWidth() so I don't have to > >> hard code 32? Thanks! > >> > >> I think that's a fair question, and it's about safety. If you're > >> getting this on the JIT, means we may be generating unsafe > >> transformations on the vectorizer. > >> > >> Arnold, Nadav, I don't remember seeing code to generate any > >> run-time alignment checks on the incoming pointer, is there such > >> a thing? If not, shouldn't we add one? > > > > > > If the the vectorizer generates aligned memory accesses to > > unaligned addresses then this is a serious bug. But I don’t think > > that Josh said that the vectorizer generated aligned accesses to > > unaligned pointers. > > > > There is no point in LLVM checking for alignment because if the > > memory is unaligned then the program will crash. Users who want > > to crash with a readable error message can simply write code that > > checks the pointer (by masking the high bits and comparing to > > zero). > > > > > > > _______________________________________________ > LLVM Developers mailing list > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev >
Just confirmed this (conservatively correct) patch fixed the issue for me, thanks! On Fri, Nov 15, 2013 at 5:12 PM, Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- > > From: "Arnold Schwaighofer" <aschwaighofer at apple.com> > > To: "Joshua Klontz" <josh.klontz at gmail.com> > > Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> > > Sent: Friday, November 15, 2013 4:05:53 PM > > Subject: Re: [LLVMdev] Limit loop vectorizer to SSE > > > > > > Something like: > > > > index 6db7f68..68564cb 100644 > > --- a/lib/Transforms/Vectorize/LoopVectorize.cpp > > +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp > > @@ -1208,6 +1208,8 @@ void > > InnerLoopVectorizer::vectorizeMemoryInstruction(Instr > > Type *DataTy = VectorType::get(ScalarDataTy, VF); > > Value *Ptr = LI ? LI->getPointerOperand() : > > SI->getPointerOperand(); > > unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment(); > > + if (Alignment == 0) > > + Alignment = 1; > > That may be conservatively correct, but I don't think it is really right. > An alignment of 0 is supposed to mean the platform default alignment (IIRC, > DataLayout::getPrefTypeAlignment tells you what it is). > > -Hal > > > unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); > > unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); > > unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; > > > > Should fix this. > > > > On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> > > wrote: > > > > > Nadav, > > > > > > I believe aligned accesses to unaligned pointers is precisely the > > > issue. Consider the function `add_u8S` before[1] and after[2] the > > > loop vectorizer pass. There is no alignment assumption associated > > > with %kernel_data prior to vectorization. I can't tell if it's the > > > loop vectorizer or the codegen at fault, but the alignment > > > assumption seems to sneak in somewhere. > > > > > > v/r, > > > Josh > > > > > > [1] http://pastebin.com/kc95WtGG > > > [2] http://pastebin.com/VY3ZLVJK > > > > > > > > > On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> > > > wrote: > > > > > > On Nov 15, 2013, at 12:36 PM, Renato Golin > > > <renato.golin at linaro.org> wrote: > > > > > >> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> > > >> wrote: > > >> Agreed, is there a pass that will insert a runtime alignment > > >> check? Also, what's the easiest way to get at > > >> TargetTransformInfo::getRegisterBitWidth() so I don't have to > > >> hard code 32? Thanks! > > >> > > >> I think that's a fair question, and it's about safety. If you're > > >> getting this on the JIT, means we may be generating unsafe > > >> transformations on the vectorizer. > > >> > > >> Arnold, Nadav, I don't remember seeing code to generate any > > >> run-time alignment checks on the incoming pointer, is there such > > >> a thing? If not, shouldn't we add one? > > > > > > > > > If the the vectorizer generates aligned memory accesses to > > > unaligned addresses then this is a serious bug. But I don’t think > > > that Josh said that the vectorizer generated aligned accesses to > > > unaligned pointers. > > > > > > There is no point in LLVM checking for alignment because if the > > > memory is unaligned then the program will crash. Users who want > > > to crash with a readable error message can simply write code that > > > checks the pointer (by masking the high bits and comparing to > > > zero). > > > > > > > > > > > > _______________________________________________ > > LLVM Developers mailing list > > LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu > > http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20131115/60a39f50/attachment.html>
Yes, I was just about to send out: DL->getABITypeAlignment(ScalarDataTy); The question is: “… ABI alignment for the target …" is that getPrefTypeAlignment or getABITypeAlignment I would have thought the latter. On Nov 15, 2013, at 4:12 PM, Hal Finkel <hfinkel at anl.gov> wrote:> ----- Original Message ----- >> From: "Arnold Schwaighofer" <aschwaighofer at apple.com> >> To: "Joshua Klontz" <josh.klontz at gmail.com> >> Cc: "LLVM Dev" <llvmdev at cs.uiuc.edu> >> Sent: Friday, November 15, 2013 4:05:53 PM >> Subject: Re: [LLVMdev] Limit loop vectorizer to SSE >> >> >> Something like: >> >> index 6db7f68..68564cb 100644 >> --- a/lib/Transforms/Vectorize/LoopVectorize.cpp >> +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp >> @@ -1208,6 +1208,8 @@ void >> InnerLoopVectorizer::vectorizeMemoryInstruction(Instr >> Type *DataTy = VectorType::get(ScalarDataTy, VF); >> Value *Ptr = LI ? LI->getPointerOperand() : >> SI->getPointerOperand(); >> unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment(); >> + if (Alignment == 0) >> + Alignment = 1; > > That may be conservatively correct, but I don't think it is really right. An alignment of 0 is supposed to mean the platform default alignment (IIRC, DataLayout::getPrefTypeAlignment tells you what it is). > > -Hal > >> unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); >> unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); >> unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; >> >> Should fix this. >> >> On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> >> wrote: >> >>> Nadav, >>> >>> I believe aligned accesses to unaligned pointers is precisely the >>> issue. Consider the function `add_u8S` before[1] and after[2] the >>> loop vectorizer pass. There is no alignment assumption associated >>> with %kernel_data prior to vectorization. I can't tell if it's the >>> loop vectorizer or the codegen at fault, but the alignment >>> assumption seems to sneak in somewhere. >>> >>> v/r, >>> Josh >>> >>> [1] http://pastebin.com/kc95WtGG >>> [2] http://pastebin.com/VY3ZLVJK >>> >>> >>> On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> >>> wrote: >>> >>> On Nov 15, 2013, at 12:36 PM, Renato Golin >>> <renato.golin at linaro.org> wrote: >>> >>>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> >>>> wrote: >>>> Agreed, is there a pass that will insert a runtime alignment >>>> check? Also, what's the easiest way to get at >>>> TargetTransformInfo::getRegisterBitWidth() so I don't have to >>>> hard code 32? Thanks! >>>> >>>> I think that's a fair question, and it's about safety. If you're >>>> getting this on the JIT, means we may be generating unsafe >>>> transformations on the vectorizer. >>>> >>>> Arnold, Nadav, I don't remember seeing code to generate any >>>> run-time alignment checks on the incoming pointer, is there such >>>> a thing? If not, shouldn't we add one? >>> >>> >>> If the the vectorizer generates aligned memory accesses to >>> unaligned addresses then this is a serious bug. But I don’t think >>> that Josh said that the vectorizer generated aligned accesses to >>> unaligned pointers. >>> >>> There is no point in LLVM checking for alignment because if the >>> memory is unaligned then the program will crash. Users who want >>> to crash with a readable error message can simply write code that >>> checks the pointer (by masking the high bits and comparing to >>> zero). >>> >>> >> >> >> _______________________________________________ >> LLVM Developers mailing list >> LLVMdev at cs.uiuc.edu http://llvm.cs.uiuc.edu >> http://lists.cs.uiuc.edu/mailman/listinfo/llvmdev
You are right! Good catch. On Nov 15, 2013, at 2:05 PM, Arnold Schwaighofer <aschwaighofer at apple.com> wrote:> > Something like: > > index 6db7f68..68564cb 100644 > --- a/lib/Transforms/Vectorize/LoopVectorize.cpp > +++ b/lib/Transforms/Vectorize/LoopVectorize.cpp > @@ -1208,6 +1208,8 @@ void InnerLoopVectorizer::vectorizeMemoryInstruction(Instr > Type *DataTy = VectorType::get(ScalarDataTy, VF); > Value *Ptr = LI ? LI->getPointerOperand() : SI->getPointerOperand(); > unsigned Alignment = LI ? LI->getAlignment() : SI->getAlignment(); > + if (Alignment == 0) > + Alignment = 1; > unsigned AddressSpace = Ptr->getType()->getPointerAddressSpace(); > unsigned ScalarAllocatedSize = DL->getTypeAllocSize(ScalarDataTy); > unsigned VectorElementSize = DL->getTypeStoreSize(DataTy)/VF; > > Should fix this. > > On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote: > >> Nadav, >> >> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to vectorization. I can't tell if it's the loop vectorizer or the codegen at fault, but the alignment assumption seems to sneak in somewhere. >> >> v/r, >> Josh >> >> [1] http://pastebin.com/kc95WtGG >> [2] http://pastebin.com/VY3ZLVJK >> >> >> On Fri, Nov 15, 2013 at 3:58 PM, Nadav Rotem <nrotem at apple.com> wrote: >> >> On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: >> >>> On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: >>> Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! >>> >>> I think that's a fair question, and it's about safety. If you're getting this on the JIT, means we may be generating unsafe transformations on the vectorizer. >>> >>> Arnold, Nadav, I don't remember seeing code to generate any run-time alignment checks on the incoming pointer, is there such a thing? If not, shouldn't we add one? >> >> >> If the the vectorizer generates aligned memory accesses to unaligned addresses then this is a serious bug. But I don’t think that Josh said that the vectorizer generated aligned accesses to unaligned pointers. >> >> There is no point in LLVM checking for alignment because if the memory is unaligned then the program will crash. Users who want to crash with a readable error message can simply write code that checks the pointer (by masking the high bits and comparing to zero). >> >> >