thr3ads.net - search: "unalignment"

Displaying 20 results from an estimated 698 matches for "unalignment".

Did you mean: alignment

Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60

2010 Feb 05

Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60

When writing to a newly created btrfs (vanilla 2.6.33-rc6, sparc64) the following messages are printed: [28617.650231] Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60 [btrfs] [28617.745783] Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60 [btrfs] [28654.589492] Kernel unaligned access at TPC[10101f18] btrfs_csum_final+0x38/0x60 [btrfs] [28654.685036] Kernel

ipp2p: Unaligned access in search_all_ed2k on sparc64

2007 Dec 02

ipp2p: Unaligned access in search_all_ed2k on sparc64

Hey guys, I''ve just built a sparc64 (Ultra/5) based firewall with ipp2p compiled as a module and I''m constantly getting the following message in my logs: Kernel unaligned access at TPC[100f8490] search_all_edk+0x20/0x4c [ipt_ipp2p] I''m running the following versions: - Kernel 2.6.22 - ipp2p 0.8.2-r4 - iptables 1.3.8-r1 Any thoughts?

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 19

[LLVMdev] Disable vectorization for unaligned data

What is the proper solution to disable auto-vectorization for unaligned data? I have an out of tree target and I added this: bool OpusTargetLowering::allowsUnalignedMemoryAccesses(EVT VT, bool *Fast) const { if (VT.isVector()) return false; .... } After that, I could see that vectorization is still done on unaligned data except that llvm will copy the data back and forth from the source

[V4]fix ocfs2 aio/dio writing process hang

2012 Jun 27

[V4]fix ocfs2 aio/dio writing process hang

V4 changes: add Acked-by: Joel Becker <jlbec at evilplan.org> V3 changes: - add Cc: stable at vger.kernel.org in the patch header to align with stable rules - add Acked-by: Jeff Moyer <jmoyer at redhat.com> V2 changes: - update the patch header of the first patch to make it more clear. This patch list fixes an issue about ocfs2 aio/dio write process hang. The call trace is like

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

[LLVMdev] Unaligned vector memory access for ARM/NEON.

-----Original Message----- From: Bob Wilson [mailto:bob.wilson at apple.com] Sent: Thursday, September 06, 2012 3:39 PM To: David Peixotto Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Unaligned vector memory access for ARM/NEON. On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote:

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

[LLVMdev] Unaligned vector memory access for ARM/NEON.

On Sep 6, 2012, at 4:40 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > -----Original Message----- > From: Bob Wilson [mailto:bob.wilson at apple.com] > Sent: Thursday, September 06, 2012 3:39 PM > To: David Peixotto > Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Unaligned vector

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 18

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

On my (out-of-tree) target I have 16 128-bit registers. Unaligned load/store are illegal. (must 16-bytes aligned) 8 of those registers are defined as callee-saved and 8 caller-saved. The default stack size is 4 bytes. The target implements dynamic stack realign to make sure the stack will always be aligned correctly when necessary. Yet I am still getting unaligned load/store when running this

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

----- Original Message ----- > From: "Hal Finkel" <hfinkel at anl.gov> > To: "Francois Pichet" <pichet2000 at gmail.com> > Cc: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Monday, November 18, 2013 2:45:53 PM > Subject: Re: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > ----- Original

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 06

[LLVMdev] Unaligned vector memory access for ARM/NEON.

On Sep 6, 2012, at 2:48 PM, David Peixotto <dpeixott at codeaurora.org> wrote: > Hi Pete, > > We ran into the same issue with generating vector loads/stores for vectors > with less than word alignment. It seems we took a similar approach to > solving the problem by modifying the logic in allowsUnalignedMemoryAccesses. > > As you and Jim mentioned, it looks like the

[LLVMdev] Limit loop vectorizer to SSE

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

On Nov 15, 2013, at 12:36 PM, Renato Golin <renato.golin at linaro.org> wrote: > On 15 November 2013 20:24, Joshua Klontz <josh.klontz at gmail.com> wrote: > Agreed, is there a pass that will insert a runtime alignment check? Also, what's the easiest way to get at TargetTransformInfo::getRegisterBitWidth() so I don't have to hard code 32? Thanks! > > I think

[LLVMdev] Unaligned vector memory access for ARM/NEON.

2012 Sep 07

[LLVMdev] Unaligned vector memory access for ARM/NEON.

> -----Original Message----- > From: Bob Wilson [mailto:bob.wilson at apple.com] > Sent: Friday, September 07, 2012 10:57 AM > To: David Peixotto > Cc: 'Peter Couperus'; 'Jim Grosbach'; 'Jakob Olesen'; llvmdev at cs.uiuc.edu > Subject: Re: [LLVMdev] Unaligned vector memory access for ARM/NEON. > > > On Sep 6, 2012, at 4:40 PM, David Peixotto

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 18

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

----- Original Message ----- > From: "Francois Pichet" <pichet2000 at gmail.com> > To: "LLVM Developers Mailing List" <llvmdev at cs.uiuc.edu> > Sent: Monday, November 18, 2013 2:26:30 PM > Subject: [LLVMdev] Unaligned load/store for callee-saved 128-bit registers > > > > On my (out-of-tree) target I have 16 128-bit registers. >

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

----- Original Message ----- > From: "Francois Pichet" <pichet2000 at gmail.com> > To: "Hal Finkel" <hfinkel at anl.gov> > Cc: "Chad Rosier" <mcrosier at codeaurora.org>, "Jakob Stoklund Olesen" <jolesen at apple.com>, "LLVM Developers Mailing > List" <llvmdev at cs.uiuc.edu> > Sent: Thursday, November

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 21

[LLVMdev] Disable vectorization for unaligned data

Ok any quick workaround to limit vectorization to 16-byte aligned 128-bit data then? All the memory copying done by ExpandUnalignedStore/ExpandUnalignedLoad is just too expensive. On Sat, Jul 20, 2013 at 12:52 PM, Arnold Schwaighofer < aschwaighofer at apple.com> wrote: > > On Jul 19, 2013, at 3:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote: > > > > >

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

2013 Nov 21

[LLVMdev] Unaligned load/store for callee-saved 128-bit registers

BTW I managed to get around this problem by flagging all the 128-bit registers as caller saved only. On my system, vector registers are more likely to be used on leaf functions anyway. On Thu, Nov 21, 2013 at 3:24 PM, Hal Finkel <hfinkel at anl.gov> wrote: > ----- Original Message ----- > > From: "Hal Finkel" <hfinkel at anl.gov> > > To: "Francois

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 19

[LLVMdev] Disable vectorization for unaligned data

On Fri, Jul 19, 2013 at 1:14 PM, Francois Pichet <pichet2000 at gmail.com> wrote: > > What is the proper solution to disable auto-vectorization for unaligned > data? Why are you trying to do this? If auto-vectorization is making a given loop slower on your target, that means the cost metrics are off, and we should fix them. If code size is an issue, you should tell the optimizer

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

[LLVMdev] unaligned AVX store gets split into two instructions

Hi, Yes. On Sandybridge 256-bit loads/stores are double pumped. This means that they go in one after the other in two cycles. On Haswell the memory ports are wide enough to allow a 256bit memory operation in one cycle. So, on Sandybridge we split unaligned memory operations into two 128bit parts to allow them to execute in two separate ports. This is also what GCC and ICC do. It is very

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

[LLVMdev] unaligned AVX store gets split into two instructions

I'm seeing a difference in how LLVM 3.3 and 3.2 emit unaligned vector loads on AVX. 3.3 is splitting up an unaligned vector load but in 3.2, it was emitted as a single instruction (details below). In a matrix-matrix inner-kernel, I see a ~25% decrease in performance, which seems to be due to this. Any ideas why this changed? Thanks! Zach LLVM Code: define <4 x double> @vstore(<4 x

[LLVMdev] unaligned AVX store gets split into two instructions

2013 Jul 10

[LLVMdev] unaligned AVX store gets split into two instructions

I've narrowed this down to a single kernel (kernel.ll), which does a fixed-size matrix-matrix multiply: # ~/llvm-32-final/bin/llc kernel.ll -o kernel32.s # ~/llvm-33-final/bin/llc kernel.ll -o kernel33.s # ~/llvm-32-final/bin/clang++ harness.cpp kernel32.s -o harness32 # ~/llvm-32-final/bin/clang++ harness.cpp kernel33.s -o harness33 # time ./harness32 real 0m0.584s user 0m0.581s sys 0m0.001s

[LLVMdev] Disable vectorization for unaligned data

2013 Jul 21

[LLVMdev] Disable vectorization for unaligned data

No, I am afraid not without computing alignment based on the scalar code. In order to limit vectorization to 16-byte aligned data we need to know that data is 16-byte aligned. The way we vectorize we won’t know that until after we have vectorized. As you have observed we will pass “4” to getMemoryOpCost in the loop vectorizer (as that is the only thing that can be inferred from a consecutive

search for: unalignment