similar to: [LLVMdev] Unaligned SSE Memop Support Patch

Displaying 20 results from an estimated 3000 matches similar to: "[LLVMdev] Unaligned SSE Memop Support Patch"

2012 Feb 24
2
[LLVMdev] [RFC] Remat Enhancements
Jakob Stoklund Olesen <stoklund at 2pi.dk> writes: > That's great, but I really wish you would discuss the design of these > things publicly, and not develop features on long-running secret > branches. If you secretly start out in the wrong direction, you could > be wasting a lot of your time. I don't have a choice. I have to get patches approved after I already have
2012 Feb 27
0
[LLVMdev] [RFC] Remat Enhancements
dag at cray.com (David A. Greene) writes: >>> The change requires that live interval analysis be able to determine >>> whether an instruction is a load and whether an instruction writes to >>> memory. >> >> Just use MI->mayLoad(), MI->mayStore(). > > Does this also account for arithmetic instructions with memops? These > interfaces
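The load/store classification discussed in this thread can be sketched outside of LLVM. The block below is a minimal, self-contained illustration and not code from the patch: `MockInstr`, `isCandidateLoad`, and `candidateLoads` are hypothetical names, and the mock only mirrors the two query names mentioned above, `mayLoad()` and `mayStore()`. The filter itself ("reads memory, never writes it") is an assumption made for the sketch; a real rematerialization pass would need further checks.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a machine instruction. Only the two queries
// named in the thread are modelled; the flags are set by hand here.
struct MockInstr {
  std::string name;
  bool loads;   // instruction may read memory
  bool stores;  // instruction may write memory

  bool mayLoad() const { return loads; }
  bool mayStore() const { return stores; }
};

// Assumed filter for this sketch: an instruction is a candidate load if it
// may read memory and can never write it. An arithmetic instruction with a
// folded memory source operand passes, because its `loads` flag is set.
inline bool isCandidateLoad(const MockInstr &MI) {
  return MI.mayLoad() && !MI.mayStore();
}

// Collect the names of all candidate loads in a straight-line block.
inline std::vector<std::string>
candidateLoads(const std::vector<MockInstr> &Block) {
  std::vector<std::string> Names;
  for (const MockInstr &MI : Block)
    if (isCandidateLoad(MI))
      Names.push_back(MI.name);
  return Names;
}
```

With a block holding a plain load, an add with a memory source, an add with a memory destination, and a register-only add, `candidateLoads` would keep only the first two under these assumptions.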
2009 Jun 05
2
[LLVMdev] SSE Scalar Convert Intrinsics
On Friday 05 June 2009 15:19, Dan Gohman wrote: > > Do we need two intrinsics for these scalar converts, one to satisfy > > the > > (arguably broken) GCC interface and one to really reflect the > > operation > > as specified by the ISA? > > That's what's done for most other instructions, unfortunately. > For cvtsd2si, there's currently no
2009 Jun 05
0
[LLVMdev] SSE Scalar Convert Intrinsics
On Jun 5, 2009, at 3:16 PM, David Greene wrote: > On Friday 05 June 2009 15:19, Dan Gohman wrote: > >> One thing we'd like to do at some point is have front-ends lower >> intrinsics for scalar instructions into >> extractelement+op+insertelement, so that we don't need two >> versions of each of the instructions. Doing this for everything >> will
2012 Feb 27
1
[LLVMdev] [RFC] Remat Enhancements
On Feb 27, 2012, at 9:51 AM, David A. Greene wrote: > dag at cray.com (David A. Greene) writes: > >>>> The change requires that live interval analysis be able to determine >>>> whether an instruction is a load and whether an instruction writes to >>>> memory. >>> >>> Just use MI->mayLoad(), MI->mayStore(). >> >>
2012 Feb 23
0
[LLVMdev] [RFC] Remat Enhancements
On Feb 23, 2012, at 8:14 AM, David Greene <dag at cray.com> wrote: > I have a set of changes that enhances rematerialization to handle more > kinds of loads, specifically loads with multiple address registers. > This is a big win for some codes on x86. That's great, but I really wish you would discuss the design of these things publicly, and not develop features on
2014 Dec 14
2
[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets
Hi, I think that def FeatureVectorUAMem : SubtargetFeature<"vector-unaligned-mem", "HasVectorUAMem", "true", "Allow unaligned memory operands on vector/SIMD instructions">; should be switched on for AVX and AVX-512 instructions because, according to the AVX spec: "Most arithmetic and
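The effect of a feature such as `HasVectorUAMem` can be illustrated with a small predicate that decides whether a vector load may be folded into an instruction's memory operand. This is a hedged sketch, not the in-tree implementation: `SubtargetModel` and `canFoldVectorLoad` are hypothetical names, and the 16-byte threshold is the usual alignment requirement for legacy SSE memory operands, assumed here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the one subtarget flag discussed in the message.
struct SubtargetModel {
  bool HasVectorUAMem;  // unaligned vector memory operands are allowed
};

// Sketch of the folding rule: a vector load can become a memory operand if
// the subtarget tolerates unaligned operands, or if the load is known to be
// at least 16-byte aligned (assumed SSE requirement).
inline bool canFoldVectorLoad(const SubtargetModel &ST,
                              std::uint32_t alignBytes) {
  // Alignments are expected to be non-zero powers of two.
  assert(alignBytes != 0 && (alignBytes & (alignBytes - 1)) == 0);
  return ST.HasVectorUAMem || alignBytes >= 16;
}
```

Under this model, turning the feature on for AVX-capable subtargets would let a 4-byte-aligned load fold, while an SSE-only subtarget would still need a separate unaligned load instruction.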
2012 Dec 06
1
[PATCH] memop: adjust error checking in populate_physmap()
Checking that multi-page allocations are permitted is unnecessary for PoD population operations. Instead, the (loop invariant) check added for addressing XSA-31 can be moved here. Signed-off-by: Jan Beulich <jbeulich@suse.com> --- a/xen/common/memory.c +++ b/xen/common/memory.c @@ -99,7 +99,8 @@ static void populate_physmap(struct memo
2014 Dec 15
2
[LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets
AFAIK, there is no additional penalty for AMD processors. From: llvmdev-bounces at cs.uiuc.edu [mailto:llvmdev-bounces at cs.uiuc.edu] On Behalf Of Chandler Carruth Sent: Monday, December 15, 2014 3:57 AM To: Demikhovsky, Elena Cc: llvmdev at cs.uiuc.edu Subject: Re: [LLVMdev] Memory alignment model on AVX, AVX2 and AVX-512 targets FWIW, this makes sense to me. I'd be interested to hear from
2016 Mar 05
2
Enable / Disable a processor feature
I'm trying to enable/disable a target feature through clang. Here is what my target looks like // Esencia subtarget features //===----------------------------------------------------------------------===// def FeatureMul : SubtargetFeature<"mul", "HasMul", "true", "Enable hardware multiplier">; def FeatureDiv
2016 Oct 12
3
Dragon egg not recognizing Target ARM machine
Hello Team, Good morning! I am Vishnu Prasanth, and I am doing my master's thesis on improving LLVM compiler optimization. I am currently trying to build DragonEgg, but when I run make, the ARM target machine is not recognized. Can you please help me with this? Below are the errors I get when I run the following command inside the DragonEgg directory: GCC=GCC_DIR/gcc
2009 Jun 05
0
[LLVMdev] SSE Scalar Convert Intrinsics
On Jun 5, 2009, at 8:51 AM, David Greene wrote: > I have a question about the SSE scalar convert intrinsics. > > cvtsd2si is defined thusly: > > def int_x86_sse2_cvtsd2si64 : > GCCBuiltin<"__builtin_ia32_cvtsd2si64">, > Intrinsic<[llvm_i64_ty, llvm_v2f64_ty], [IntrNoMem]>; > > This matches the signature of the GCC intrinsic. The
2009 Apr 30
2
[LLVMdev] RFC: AVX Feature Specification
I've been working on adding AVX to LLVM and have run across a number of questions. Here's the first one. In some ways AVX is "just another" SSE level. Having AVX implies you have SSE1-SSE4.2. However AVX is very different from SSE and there are a number of sub-features which may or may not be available on various implementations. So right now I've done this: def
2012 Sep 06
1
[LLVMdev] Unaligned vector memory access for ARM/NEON.
On Sep 5, 2012, at 4:58 PM, Jim Grosbach <grosbach at apple.com> wrote: > Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) > > I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: > extend: @ @extend > @ BB#0: > vldr d16,
2017 Jul 29
2
ISelDAGToDAG breaks node ordering
Hi, During instruction selection, I have the following code for certain LOAD instructions: const LoadSDNode *LD = cast<LoadSDNode>(N); SDNode* LDW = CurDAG->getMachineNode(AVR::LDWRdPtr, SDLoc(N), VT, PtrVT, MVT::Other, LD->getBasePtr(), LD->getChain()); // Honestly, I have no idea what this does, but other memory // accessing instructions
2008 Oct 04
5
[LLVMdev] mem2reg optimization
On Oct 4, 2008, at 2:51 PM, Chris Lattner wrote: >>> I like your approach of using the use lists but I'm not sure the >>> ordering >>> is guaranteed. If it is, your approach is superior. >> >> I got my patch updated to work with TOT. Here it is. Comments >> welcome. > > Hi Dave, > > Great. I'd like to get this in, but would
2009 Jun 05
5
[LLVMdev] SSE Scalar Convert Intrinsics
I have a question about the SSE scalar convert intrinsics. cvtsd2si is defined thusly: def int_x86_sse2_cvtsd2si64 : GCCBuiltin<"__builtin_ia32_cvtsd2si64">, Intrinsic<[llvm_i64_ty, llvm_v2f64_ty], [IntrNoMem]>; This matches the signature of the GCC intrinsic. The fact that the GCC intrinsic has a type mismatch on the input (vector rather than scalar) is
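The signature mismatch raised in this message can be made concrete with a portable model. The sketch below is an illustration only: `V2F64`, `cvtsd2si64Model`, and `cvtsd2si64Scalar` are hypothetical names, and `std::llrint` is used as a stand-in for the conversion, on the assumption that the default round-to-nearest mode is in effect. It shows a vector-typed entry point that reads only the low element, next to the scalar form that reflects what the operation actually consumes.

```cpp
#include <array>
#include <cmath>
#include <cstdint>

// Two-element double vector, standing in for the v2f64 operand type.
using V2F64 = std::array<double, 2>;

// Scalar form: what the conversion actually operates on. std::llrint rounds
// according to the current rounding mode (assumed default: to nearest).
inline std::int64_t cvtsd2si64Scalar(double x) {
  return static_cast<std::int64_t>(std::llrint(x));
}

// Vector-typed form, mirroring the builtin-style signature: it accepts a
// whole vector but only the low element influences the result.
inline std::int64_t cvtsd2si64Model(const V2F64 &v) {
  return cvtsd2si64Scalar(v[0]);
}
```

Calling `cvtsd2si64Model` with two vectors that differ only in their upper element gives the same result, which is the redundancy the vector-typed signature carries.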
2012 Sep 05
0
[LLVMdev] Unaligned vector memory access for ARM/NEON.
Hmmm. Well, it's entirely possible that it's LLVM that's confused about the alignment requirements here. :) I think I see, in general, where. I twiddled the IR to give it higher alignment (16 bytes) and get: extend: @ @extend @ BB#0: vldr d16, [r0] vmovl.s16 q8, d16 vstmia r1, {d16, d17} vldr d16, [r0, #8] add r0, r1, #16 vmovl.s16 q8, d16 vstmia
2012 Feb 23
2
[LLVMdev] [RFC] Remat Enhancements
I have a set of changes that enhances rematerialization to handle more kinds of loads, specifically loads with multiple address registers. This is a big win for some codes on x86. I plan to send these up ASAP but I want to solicit a bit of guidance first. The change requires that live interval analysis be able to determine whether an instruction is a load and whether an instruction writes to
2012 Sep 05
3
[LLVMdev] Unaligned vector memory access for ARM/NEON.
Hello Jim, Thank you for the response. I may be confused about the alignment rules here. I had been looking at the ARM RVCT Assembler Guide, which seems to indicate vld1.16 operates on 16-bit aligned data, unless I am misinterpreting their table (Table 5-11 in ARM DUI 0204H, pg 5-70,5-71). Prior to the table, it does mention that the accesses need to be "element" aligned, where I took
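The difference between "element" alignment and full vector alignment can be checked with a small address test. This is a generic sketch under stated assumptions, not a statement about what any particular NEON instruction requires: `isAlignedTo` is a hypothetical helper, and the 2-byte and 16-byte figures simply correspond to a 16-bit element and a 128-bit vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Returns true if the address p is a multiple of alignBytes.
// alignBytes must be a non-zero power of two.
inline bool isAlignedTo(const void *p, std::size_t alignBytes) {
  assert(alignBytes != 0 && (alignBytes & (alignBytes - 1)) == 0);
  return (reinterpret_cast<std::uintptr_t>(p) & (alignBytes - 1)) == 0;
}
```

For a buffer declared with `alignas(16)`, the address two bytes past its start satisfies `isAlignedTo(p, 2)` (element aligned for 16-bit data) but not `isAlignedTo(p, 16)`, which is the gap between the two readings of the alignment rule.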