thr3ads.net - search: "unalign"

Displaying 20 results from an estimated 697 matches for "unalign".

Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60

2010 Feb 05

When writing to a newly created btrfs (vanilla 2.6.33-rc6, sparc64) the following messages are printed:
[28617.650231] Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60 [btrfs]
[28617.745783] Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60 [btrfs]
[28654.589492] Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60 [btrfs]
[28654.685036] Kernel unaligned access at TPC[10101f18]...
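Warnings of this kind are typically raised when code accesses a multi-byte value at an address that is not a multiple of the access size. The sketch below is a userspace illustration only, not the btrfs code and not a diagnosis of this report: `store_le32` and `load_le32` are hypothetical helper names that read and write a 32-bit little-endian value one byte at a time, so they work at any address.

```c
#include <stdint.h>

/* Hypothetical helper: store a 32-bit value in little-endian byte order.
 * Only single-byte stores are used, so dst may have any alignment. */
static void store_le32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v & 0xffu);
    dst[1] = (unsigned char)((v >> 8) & 0xffu);
    dst[2] = (unsigned char)((v >> 16) & 0xffu);
    dst[3] = (unsigned char)((v >> 24) & 0xffu);
}

/* Hypothetical helper: load a 32-bit little-endian value from any address. */
static uint32_t load_le32(const unsigned char *src)
{
    return (uint32_t)src[0]
         | ((uint32_t)src[1] << 8)
         | ((uint32_t)src[2] << 16)
         | ((uint32_t)src[3] << 24);
}
```

Because the helpers never form a `uint32_t *` to the buffer, a call such as `store_le32(buf + 1, value)` is safe even on strict-alignment machines, at the cost of four byte accesses instead of one word access.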

ipp2p: Unaligned access in search_all_ed2k on sparc64

2007 Dec 02

Hey guys, I've just built a sparc64 (Ultra/5) based firewall with ipp2p compiled as a module and I'm constantly getting the following message in my logs:
Kernel unaligned access at TPC[100f8490] search_all_edk+0x20/0x4c [ipt_ipp2p]
I'm running the following versions:
- Kernel 2.6.22
- ipp2p 0.8.2-r4
- iptables 1.3.8-r1
Any thoughts?

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 19

What is the proper solution to disable auto-vectorization for unaligned data? I have an out of tree target and I added this: bool OpusTargetLowering::allowsUnalignedMemoryAccesses(EVT VT, bool *Fast) const { if (VT.isVector()) return false; .... } After that, I could see that vectorization is still done on unaligned data except that llvm will copy the data b...

[V4]fix ocfs2 aio/dio writing process hang

2012 Jun 27

V4 changes:
- add Acked-by: Joel Becker <jlbec at evilplan.org>
V3 changes:
- add Cc: stable at vger.kernel.org in the patch header to align with stable rules
- add Acked-by: Jeff Moyer <jmoyer at redhat.com>
V2 changes:
- update the patch header of the first patch to make it more clear.
This patch list fixes an issue about ocfs2 aio/dio write process hang. The call trace is like

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

-----Original Message----- From: Bob Wilson [mailto:bob.wilson at apple.com] Sent: Thursday, September 06, 2012 3:39 PM To: David Peixotto Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Unaligned vector memory access for ARM/NEON. On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > Hi Pete, > > We ran into the same issue with generating vector loads/stores for > vectors with less than word alignment. It seems we took a similar > appr...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

...ote: > -----Original Message----- > From: Bob Wilson [mailto:bob.wilson at apple.com] > Sent: Thursday, September 06, 2012 3:39 PM > To: David Peixotto > Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Unaligned vector memory access for ARM/NEON. > > > On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > >> Hi Pete, >> >> We ran into the same issue with generating vector loads/stores for >> vectors with less than word alignment. I...

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 18

On my (out-of-tree) target I have 16 128-bit registers. Unaligned load/store are illegal. (must 16-bytes aligned) 8 of those registers are defined as callee-saved and 8 caller-saved. The default stack size is 4 bytes. The target implements dynamic stack realign to make sure the stack will always be aligned correctly when necessary. Yet I am still getting unal...

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

...essage ----- > From: "Hal Finkel" <hfinkel at anl.gov> > To: "Francois Pichet" <pichet2000 at gmail.com> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Monday, November 18, 2013 2:45:53 PM > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > ----- Original Message ----- > > From: "Francois Pichet" <pichet2000 at gmail.com> > > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > > Sent: Monday, November 18, 2013 2:26:30...

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

...2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > Hi Pete, > > We ran into the same issue with generating vector loads/stores for vectors > with less than word alignment. It seems we took a similar approach to > solving the problem by modifying the logic in allowsUnalignedMemoryAccesses. > > As you and Jim mentioned, it looks like the vld1/vst1 instructions should > support element aligned access for any armv7 implementation (I'm looking at > Table A3-1 ARM Architecture Reference Manual - ARM DDI 0406C). > > Right now I do not think we have...

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

...e generating unsafe transformations on the vectorizer. > > Arnold, Nadav, I don't remember seeing code to generate any run-time alignment checks on the incoming pointer, is there such a thing? If not, shouldn't we add one? If the vectorizer generates aligned memory accesses to unaligned addresses then this is a serious bug. But I don’t think that Josh said that the vectorizer generated aligned accesses to unaligned pointers. There is no point in LLVM checking for alignment because if the memory is unaligned then the program will crash. Users who want to crash with a readable...
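For readers unfamiliar with the idea, a run-time alignment check on an incoming pointer of the kind asked about above could look roughly like the sketch below. This is not LLVM code and does not reflect what the vectorizer emits; `is_aligned` is a hypothetical helper, and it assumes the alignment is a power of two and that converting a pointer to `uintptr_t` preserves the low address bits, which holds on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return nonzero if p is a multiple of align.
 * Assumes align is a nonzero power of two. */
static int is_aligned(const void *p, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    /* For a power-of-two alignment, the low bits of the address
     * must all be zero. */
    return ((uintptr_t)p & (align - 1)) == 0;
}
```

A caller could use such a check to choose between an aligned fast path and a scalar or unaligned fallback, for example `if (is_aligned(ptr, 16)) { ... } else { ... }`.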

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

> -----Original Message----- > From: Bob Wilson [mailto:bob.wilson at apple.com] > Sent: Friday, September 07, 2012 10:57 AM > To: David Peixotto > Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Unaligned vector memory access for ARM/NEON. > > > On Sep 6, 2012, at 4:40 PM, David Peixotto <dpeixott at codeaurora.org> > wrote: > > > -----Original Message----- > > From: Bob Wilson [mailto:bob.wilson at apple.com] > > Sent: Thursday, September 06, 2012 3:39 PM...

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 18

----- Original Message ----- > From: "Francois Pichet" <pichet2000 at gmail.com> > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Monday, November 18, 2013 2:26:30 PM > Subject: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > > > On my (out-of-tree) target I have 16 128-bit registers. > Unaligned load/store are illegal. (must 16-bytes aligned) > > > > 8 of those registers are defined as callee-saved and 8 caller-saved. > The default...

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

...anl.gov> > Cc: "Chad Rosier" <mcrosier at codeaurora.org>, "Jakob Stoklund Olesen" <jolesen at apple.com>, "LLVM Developers Mailing > List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, November 21, 2013 2:36:06 PM > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > > BTW I managed to get around this problem by flagging all the 128-bit > registers as caller saved only. > > On my system, vector registers are more likely to be used on leaf > functions anyway. > > Sounds good; ho...

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 21

Ok any quick workaround to limit vectorization to 16-byte aligned 128-bit data then? All the memory copying done by ExpandUnalignedStore/ExpandUnalignedLoad is just too expensive. On Sat, Jul 20, 2013 at 12:52 PM, Arnold Schwaighofer < aschwaighofer at apple.com> wrote: > > On Jul 19, 2013, at 3:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote: > > > > > What is the proper solution to...

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

...om: "Hal Finkel" <hfinkel at anl.gov> > > To: "Francois Pichet" <pichet2000 at gmail.com> > > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > > Sent: Monday, November 18, 2013 2:45:53 PM > > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit > registers > > > > ----- Original Message ----- > > > From: "Francois Pichet" <pichet2000 at gmail.com> > > > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > > > Sent: Mond...

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 19

On Fri, Jul 19, 2013 at 1:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote: > > What is the proper solution to disable auto-vectorization for unaligned > data? Why are you trying to do this? If auto-vectorization is making a given loop slower on your target, that means the cost metrics are off, and we should fix them. If code size is an issue, you should tell the optimizer that you want to optimize for size. -Eli > I have an out of tr...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

Hi, Yes. On Sandybridge 256-bit loads/stores are double pumped. This means that they go in one after the other in two cycles. On Haswell the memory ports are wide enough to allow a 256bit memory operation in one cycle. So, on Sandybridge we split unaligned memory operations into two 128bit parts to allow them to execute in two separate ports. This is also what GCC and ICC do. It is very possible that the decision to split the wide vectors causes a regression. If the memory ports are busy it is better to double-pump them and save the cost of the...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads on AVX. 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a single instruction (details below). In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which seems to be due to this. Any ideas why this changed? Thanks! Zach LLVM Code: define...

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

...01s # time ./harness33 real 0m0.730s user 0m0.725s sys 0m0.001s If you look at kernel33.s, it has a register spill/reload in the inner loop. This doesn't appear in the llvm 3.2 version and disappears from the 3.3 version if you remove the "align 8"s from kernel.ll which are making it unaligned. Do the two-instruction unaligned loads increase register pressure? Or is something else going on? Zach On Tue, Jul 9, 2013 at 11:33 PM, Zach Devito <zdevito at stanford.edu> wrote: > Thanks for all the info! I'm still in the process of narrowing down > the performance dif...

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 21

...r code -> estimate cost based on scalar instructions -> vectorize -> vectorized code -> ... -> instcombine (calls ComputeMaskedBits) which computes better alignment for pointer accesses like “aligned_ptr += 128bit”. I will have to work on this soon as ARM also has pretty inefficient unaligned vector loads. On Jul 21, 2013, at 9:29 AM, Francois Pichet <pichet2000 at gmail.com> wrote: > Ok any quick workaround to limit vectorization to 16-byte aligned 128-bit data then? > > All the memory copying done by ExpandUnalignedStore/ExpandUnalignedLoad is just too expensive....
