thr3ads.net - similar to: "[LLVMdev] Unaligned SSE Memop Support Patch"

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] Unaligned SSE Memop Support Patch"

2012 Feb 24

[LLVMdev] [RFC] Remat Enhancements

Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: > That's great, but I really wish you would discuss the design of these > things publicly, and not develop features on long-running secret > branches. If you secretly start out in the wrong direction, you could > be wasting a lot of your time. I don't have a choice. I have to get patches approved after I already have

2012 Feb 27

[LLVMdev] [RFC] Remat Enhancements

dag at cray.com (David A. Greene) writes: >>> The change requires that live interval analysis be able to determine >>> whether an instruction is a load and whether an instruction writes to >>> memory. >> >> Just use MI->mayLoad(), MI->mayStore(). > > Does this also account for arithmetic instructions with memops? These > interfaces

2009 Jun 05

[LLVMdev] SSE Scalar Convert Intrinsics

On Friday 05 June 2009 15:19, Dan Gohman wrote: > > Do we need two intrinsics for these scalar converts, one to satisfy > > the > > (arguably broken) GCC interface and one to really reflect the > > operation > > as specified by the ISA? > > That's what's done for most other instructions, unfortunately. > For cvtsd2si, there's currently no

2009 Jun 05

[LLVMdev] SSE Scalar Convert Intrinsics

On Jun 5, 2009, at 3:16 PM, David Greene wrote: > On Friday 05 June 2009 15:19, Dan Gohman wrote: > >> One thing we'd like to do at some point is have front-ends lower >> intrinsics for scalar instructions into >> extractelement+op+insertelement, so that we don't need two >> versions of each of the instructions. Doing this for everything >> will

2012 Feb 27

[LLVMdev] [RFC] Remat Enhancements

On Feb 27, 2012, at 9:51 AM, David A. Greene wrote: > dag at cray.com (David A. Greene) writes: > >>>> The change requires that live interval analysis be able to determine >>>> whether an instruction is a load and whether an instruction writes to >>>> memory. >>> >>> Just use MI->mayLoad(), MI->mayStore(). >> >>

2012 Feb 23

[LLVMdev] [RFC] Remat Enhancements

On Feb 23, 2012, at 8:14 AM, David Greene <dag at cray.com> wrote: > I have a set of changes that enhances rematerialization to handle more > kinds of loads, specifically loads with multiple address registers. > This is a big win for some codes on x86. That's great, but I really wish you would discuss the design of these things publicly, and not develop features on

2014 Dec 14

[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets

Hi, I think that def FeatureVectorUAMem : SubtargetFeature<"vector-unaligned-mem", "HasVectorUAMem", "true", "Allow unaligned memory operands on vector/SIMD instructions">; should be switched-ON on AVX and AVX-512 instructions because: According to the AVX spec: "Most arithmetic and

2012 Dec 06

[PATCH] memop: adjust error checking in populate_physmap()

Checking that multi-page allocations are permitted is unnecessary for PoD population operations. Instead, the (loop invariant) check added for addressing XSA-31 can be moved here. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -99,7 +99,8 @@ static void populate_physmap(struct memo

2014 Dec 15

[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets

AFAIK, there is no additional penalty for AMD processors. From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth Sent: Monday, December 15, 2014 3:57 AM To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets FWIW, this makes sense to me. I'd be interested to hear from

2016 Mar 05

Enable / Disable a processor feature

I'm trying to enable/disable a target feature through clang. Here is what my target looks like // Esencia subtarget features //===----------------------------------------------------------------------===// def FeatureMul : SubtargetFeature<"mul", "HasMul", "true", "Enable hardware multiplier">; def FeatureDiv

2016 Oct 12

Dragon egg not recognizing Target ARM machine

Hello Team, Good Morning!! This is Vishnu Prasanth doing my master's thesis on improving llvm compiler optimization. Currently I am trying to build dragon egg and when I gave take, it is not getting recognized for ARM machine. Can you please help me with. Below are the errors when I gave the below command inside dragon egg directory GCC=GCC_DIR/gcc

2009 Jun 05

[LLVMdev] SSE Scalar Convert Intrinsics

On Jun 5, 2009, at 8:51 AM, David Greene wrote: > I have a question about the SSE scalar convert intrinsics. > > cvtsd2si is defined thusly: > > def int_x86_sse2_cvtsd2si64 : > GCCBuiltin<"__builtin_ia32_cvtsd2si64">, > Intrinsic<[llvm_i64_ty, llvm_v2f64_ty], [IntrNoMem]>; > > This matches the signature of the GCC intrinsic. The

2009 Apr 30

[LLVMdev] RFC: AVX Feature Specification

I've been working on adding AVX to LLVM and have run across a number of questions. Here's the first one. In some ways AVX is "just another" SSE level. Having AVX implies you have SSE1-SSE4.2. However AVX is very different from SSE and there are a number of sub-features which may or may not be available on various implementations. So right now I've done this: def

2012 Sep 06

[LLVMdev] Unaligned vector memory access for ARM/NEON.

On Sep 5, 2012, at 4:58 PM, Jim Grosbach <grosbach at apple.com> wrote: > Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) > > I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: > extend: @ @extend > @ BB#0: > vldr d16,

2017 Jul 29

ISelDAGToDAG breaks node ordering

Hi, During instruction selection, I have the following code for certain LOAD instructions: const LoadSDNode *LD = cast<LoadSDNode>(N); SDNode* LDW = CurDAG->getMachineNode(AVR::LDWRdPtr, SDLoc(N), VT, PtrVT, MVT::Other, LD->getBasePtr(), LD->getChain()); // Honestly, I have no idea what this does, but other memory // accessing instructions

2008 Oct 04

[LLVMdev] mem2reg optimization

On Oct 4, 2008, at 2:51 PM, Chris Lattner wrote: >>> I like your approach of using the use lists but I'm not sure the >>> ordering >>> is guaranteed. If it is, your approach is superior. >> >> I got my patch updated to work with TOT. Here it is. Comments >> welcome. > > Hi Dave, > > Great. I'd like to get this in, but would

2009 Jun 05

[LLVMdev] SSE Scalar Convert Intrinsics

I have a question about the SSE scalar convert intrinsics. cvtsd2si is defined thusly: def int_x86_sse2_cvtsd2si64 : GCCBuiltin<"__builtin_ia32_cvtsd2si64">, Intrinsic<[llvm_i64_ty, llvm_v2f64_ty], [IntrNoMem]>; This matches the signature of the GCC intrinsic. The fact that the GCC intrinsic has a type mismatch on the input (vector rather than scalar) is

2012 Sep 05

[LLVMdev] Unaligned vector memory access for ARM/NEON.

Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: extend: @ @extend @ BB#0: vldr d16, [r0] vmovl.s16 q8, d16 vstmia r1, {d16, d17} vldr d16, [r0, #8] add r0, r1, #16 vmovl.s16 q8, d16 vstmia

2012 Feb 23

[LLVMdev] [RFC] Remat Enhancements

I have a set of changes that enhances rematerialization to handle more kinds of loads, specifically loads with multiple address registers. This is a big win for some codes on x86. I plan to send these up ASAP but I want to solicit a bit of guidance first. The change requires that live interval analysis be able to determine whether an instruction is a load and whether an instruction writes to

2012 Sep 05

[LLVMdev] Unaligned vector memory access for ARM/NEON.

Hello Jim, Thank you for the response. I may be confused about the alignment rules here. I had been looking at the ARM RVCT Assembler Guide, which seems to indicate vld1.16 operates on 16-bit aligned data, unless I am misinterpreting their table (Table 5-11 in ARM DUI 0204H, pg 5-70,5-71). Prior to the table, it does mention the accesses need to be "element" aligned, where I took
