thr3ads.net - search: "movupd"

Displaying 13 results from an estimated 13 matches for "movupd".

[LLVMdev] Use of movupd instead of movapd for x86

2011 Feb 28

...know this is stupid, but I want to try to pass a <4 x float>* as parameter of a routine and at the call site I want to pass a misaligned pointer. Since LLVM is generating a movapd instruction, it will raise an exception (SEGFAULT). I just want to know if there is a way to enforce generation of a movupd instruction instead of movapd.
Seb
> -----Original Message-----
> From: David A. Greene [mailto:greened at obbligato.org]
> Sent: Friday, February 25, 2011 5:13 PM
> To: Sebastien DELDON-GNB
> Cc: llvmdev at cs.uiuc.edu
> Subject: Re: [LLVMdev] Use of movupd instead of movapd fo...

[LLVMdev] Use of movupd instead of movapd for x86

2011 Feb 25

Hi all, Is there a way to force llc to generate movupd instruction instead of movapd for x86 target? I know that movapd is more performant, but I would like to measure degradation when alignment constraints are not met. Best Regards Seb

[LLVMdev] Use of movupd instead of movapd for x86

2011 Mar 01

...this is stupid, but I want to try to pass a <4 x float>* as parameter of a routine and at the call site I want to pass a misaligned pointer. Since LLVM is generating movapd instruction it will raise an exception (SEGFAULT), I just want to know if there is a way to enforce
> generation of movupd instruction instead of movapd.
If llvm is generating movapd then it believes the pointer is aligned. Without having more information it's impossible to tell what the issue is.
Evan
>
> Seb
>
>> -----Original Message-----
>> From: David A. Greene [mailto:greened at obbl...

[LLVMdev] Use of movupd instead of movapd for x86

2011 Feb 25

Sebastien DELDON-GNB <sebastien.deldon at st.com> writes:
> Hi all,
>
> Is there a way to force llc to generate movupd instruction instead of movapd for x86 target ?
>
> I know that movapd is more performant, but I would like to measure degradation when alignment constraints are not met.
On modern processors a movupd on aligned data is going to be indistinguishable in performance from a movapd....
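As a rough illustration of the distinction discussed in this thread, the C sketch below contrasts an aligned and an unaligned SSE2 load. It assumes an x86 target with SSE2 and uses the _mm_load_pd and _mm_loadu_pd intrinsics from <emmintrin.h>. The function names are hypothetical and not taken from the thread, and the compiler remains free to choose the exact instruction (movupd, movups, or a folded memory operand).

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; x86 only (assumption) */
#include <stdint.h>

/* Returns 1 when p is 16-byte aligned, i.e. when an aligned load is safe. */
int is_aligned16(const void *p) {
    return ((uintptr_t)p % 16) == 0;
}

/* Sums p[0] + p[1] with an aligned load (the movapd-style access).
   The pointer must be 16-byte aligned; otherwise the load may fault. */
double sum_pair_aligned(const double *p) {
    double out[2];
    assert(is_aligned16(p));
    _mm_storeu_pd(out, _mm_load_pd(p));
    return out[0] + out[1];
}

/* Sums p[0] + p[1] with an unaligned load (the movupd-style access).
   No alignment assumption is made about p. */
double sum_pair_unaligned(const double *p) {
    double out[2];
    _mm_storeu_pd(out, _mm_loadu_pd(p));
    return out[0] + out[1];
}
```

Calling sum_pair_unaligned on a pointer offset by one double from a 16-byte-aligned buffer exercises the misaligned case without faulting, which is the kind of measurement the original question asks about.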

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 15

...The hot function in BH is “gravsub”. The vectorized IR looks fine and the assembly looks fine, but for some reason Instruments reports that the first vector-subtract instruction takes 18% of the time. The regression happens both with the VEX prefix and without. I suspected that the problem is the movupd's that load xmm0 and xmm1. I started looking at some performance counters on Friday, but I did not find anything suspicious yet.

  +0x00 movupd 16(%rsi), %xmm0
  +0x05 movupd 16(%rsp), %xmm1
  +0x0b subpd %xmm1, %xmm0   <---- 18% of the runtime of bh ?
  +...

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 23

...The hot function in BH is “gravsub”. The vectorized IR looks fine and the assembly looks fine, but for some reason Instruments reports that the first vector-subtract instruction takes 18% of the time. The regression happens both with the VEX prefix and without. I suspected that the problem is the movupd's that load xmm0 and xmm1. I started looking at some performance counters on Friday, but I did not find anything suspicious yet.
>
> +0x00 movupd 16(%rsi), %xmm0
> +0x05 movupd 16(%rsp), %xmm1
> +0x0b subpd %xmm1, %xmm0   <---- 18% of t...

[LLVMdev] Another memory alignment issue with SSE operations

2013 Jul 20

...ice that the offset is aligned though. The crash occurs on the first instance of addpd applied to the stack (which, as I understand it, is what ESP is used for). This also raises the question of whether it would be worth requiring alignment of the function stack to improve performance (assuming movapd is faster than movupd). I'm not expecting LLVM to recognize this (although it would be neat), but is this something worth setting ourselves, knowing we're going to be using mostly SSE instructions? And how would we do that? -- Peter N
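For the closing question about aligning stack data, one option at the C level, assuming a C11 compiler and an x86 target with SSE2, is to request the alignment per variable with _Alignas. The sketch below is only an illustration with a hypothetical function name. It does not show how the poster's own front end should emit IR, where the comparable knob is the alignment given on an alloca.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86 only (assumption) */

/* Adds a[0..1] and b[0..1] through 16-byte-aligned stack temporaries.
   Because ta and tb are declared with _Alignas(16), the aligned loads
   (_mm_load_pd) are safe regardless of how a and b themselves are aligned. */
void add_pairs_via_aligned_stack(const double *a, const double *b, double *out) {
    _Alignas(16) double ta[2] = {a[0], a[1]};
    _Alignas(16) double tb[2] = {b[0], b[1]};
    __m128d sum = _mm_add_pd(_mm_load_pd(ta), _mm_load_pd(tb));
    _mm_storeu_pd(out, sum);  /* out may be unaligned, so store unaligned */
}
```

Whether the extra copies pay off depends on the target; the thread's other replies suggest that on newer processors an unaligned access to aligned data costs about the same as an aligned one.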

[LLVMdev] Codegen for vector float->double cast fails on x86 above SSE3

2011 Dec 28

...load <2 x float>* %in, align 8
  %1 = fpext <2 x float> %0 to <2 x double>
  store <2 x double> %1, <2 x double>* %out, align 1
  ret void
}

The code should load a <2 x float> vector from %in, fpext cast it to a <2 x double>, and do an unaligned store (movupd) of the result to %out. This works as expected on earlier SSE targets, generating this with llc -mcpu=core2:

  movss (%rdi), %xmm1
  movss 4(%rdi), %xmm0
  cvtss2sd %xmm0, %xmm0
  cvtss2sd %xmm1, %xmm1
  unpcklpd %xmm0, %xmm1 ## xmm1 = xmm1[0],xmm0[0]
  movupd %xmm1, (%rsi)
  ret

Load both, cast flo...
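The operation described here (load two floats, extend them to doubles, store the pair without an alignment assumption) can be approximated in C with SSE2 intrinsics. This is a sketch under assumptions, not the poster's code: the function name is hypothetical, an x86 target with SSE2 is assumed, and _mm_cvtps_pd performs the conversion in one packed step instead of the two scalar cvtss2sd instructions in the listing above.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; x86 only (assumption) */

/* Extends in[0] and in[1] from float to double and stores both results to
   out with an unaligned store, so out needs no particular alignment. */
void fpext2_store_unaligned(const float *in, double *out) {
    /* _mm_set_ps takes its arguments from the highest lane to the lowest,
       so in[0] lands in lane 0 and in[1] in lane 1. */
    __m128 f = _mm_set_ps(0.0f, 0.0f, in[1], in[0]);
    __m128d d = _mm_cvtps_pd(f);   /* converts the two low floats */
    _mm_storeu_pd(out, d);         /* unaligned store of both doubles */
}
```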

[LLVMdev] buildbot with -vectorize

2012 Jun 28

...Tobias Grosser<tobias at grosser.es> wrote:
> [..]
> Also, since you're running these on an x86_64 machine, and I think they don't have unaligned vector load/stores, you should probably add -mllvm -bb-vectorize-aligned-only to the target flags.
What about MOVUPS and MOVUPD?
Tobi

[LLVMdev] buildbot with -vectorize

2012 Jun 28

...> wrote:
> > > [..]
> > > Also, since you're running these on an x86_64 machine, and I think they don't have unaligned vector load/stores, you should probably add -mllvm -bb-vectorize-aligned-only to the target flags.
> >
> > What about MOVUPS and MOVUPD?
Good point. Never mind. I suppose those can be used for the integer vectors too.
Thanks again,
Hal
> > Tobi
--
Hal Finkel
Postdoctoral Appointee
Leadership Computing Facility
Argonne National Laboratory

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 15

On Jul 13, 2013, at 11:30 PM, Nadav Rotem <nrotem at apple.com> wrote:
> Hi,
>
> LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in a straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP

[LLVMdev] buildbot with -vectorize

2012 Jun 28

On Sun, 24 Jun 2012 14:44:45 +0200 Tobias Grosser <tobias at grosser.es> wrote: > On 06/24/2012 02:42 PM, Hal Finkel wrote: > > On Sun, 24 Jun 2012 08:17:32 +0200 > > Tobias Grosser<tobias at grosser.es> wrote: > > > >> On 06/24/2012 05:42 AM, Hal Finkel wrote: > >>> On Thu, 21 Jun 2012 16:25:13 +0200 > >>> Tobias

[LLVMdev] Enabling the SLP vectorizer by default for -O3

2013 Jul 14

Hi, LLVM’s SLP-vectorizer is a new pass that combines similar independent instructions in a straight-line code. It is currently not enabled by default, and people who want to experiment with it can use the clang command line flag “-fslp-vectorize”. I ran LLVM’s test suite with and without the SLP vectorizer on a Sandybridge mac (using SSE4, w/o AVX). Based on my performance measurements
