search for: vstr

Displaying 19 results from an estimated 19 matches for "vstr".

2013 Oct 21
1
[LLVMdev] MI scheduler produce badly code with inline function
Hi Andy, I'm working on defining a new machine model for my target, but I don't understand how to define an in-order machine (reservation tables) in the new model. For example, if the target has IF, ID, EX, and WB stages, should I do: let BufferSize=0 in { def IF: ProcResource<1>; def ID: ProcResource<1>; def EX: ProcResource<1>; def WB: ProcResource<1>; } def :
2011 May 26
2
[LLVMdev] LLVM CodeGen Engineer job opening with Apple's compiler team
Hi all, the LLVM CodeGen and Tools team at Apple is looking for exceptional compiler engineers. This is a great opportunity to work with many of the leaders in the LLVM community. If you are interested in this position, please send your resume / CV and relevant information to evan.cheng at apple.com Thanks, Evan Job description The Apple compiler team is seeking an engineer who is strongly
2011 May 27
1
[LLVMdev] Question about ARM/vfp/NEON code generation
...r7, sp sub sp, sp, #36 str r0, [r7, #-4] vmov s0, r0 str r1, [r7, #-8] vmov s1, r1 str r2, [r7, #-12] vmov s2, r2 vldr.32 s3, [r7, #-4] vldr.32 s4, [r7, #-8] vmul.f32 s3, s3, s4 vstr.32 s3, [r7, #-16] vldr.32 s4, [r7, #-12] vcmpe.f32 s3, s4 vmrs apsr_nzcv, fpscr vstr.32 s0, [sp, #16] vstr.32 s2, [sp, #12] vstr.32 s1, [sp, #8] ble LBB20_2 @ BB#1: @ %bb vldr.32 s0, [r7, #-...
2011 Feb 16
2
create a data frame with the given column names
How do I create a data frame with column names that are _NOT KNOWN IN ADVANCE_? That is, I have a vector of strings for names and I want to get an _EMPTY_ data frame with these column names. Is it at all possible? -- Sam Steingold (http://sds.podval.org/) on CentOS release 5.3 (Final) http://openvotingconsortium.org http://pmw.org.il http://memri.org http://mideasttruth.com
2018 Feb 26
0
Suggentions on modeling a micro architecture with per-operand machine model
Hi everyone, I would like to know how to model an instruction that waits for a pipeline unit to be empty for some number of cycles. For example, I have a vstr that waits for the FP pipelines to be empty for at most 3 cycles. I set FP instructions to use a resource unit called FPPipe with resourceCycle=3, and vstr to use FPPipe with resourceCycle=0, so the scheduler will know that a vstr will wait 3 cycles if it is scheduled right after an FP instruction. However, this way will...
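The setup described in this snippet can be sketched as a toy in-order issue model. This is only an illustration under stated assumptions, not LLVM's scheduler or its machine-model API: the function name `issue_cycles`, the labels `fp`, `vstr`, and `add`, and the one-instruction-per-cycle issue rate are all hypothetical. The only constraint modeled is that a vstr cannot issue until 3 cycles after the most recent FP instruction issued; FP instructions are assumed not to block each other.

```python
def issue_cycles(instrs, fp_busy_cycles=3):
    """Return (op, issue_cycle) pairs for a toy in-order, single-issue machine.

    Hypothetical model: an "fp" op keeps FPPipe busy for fp_busy_cycles
    cycles from its issue cycle, and a "vstr" stalls until FPPipe is empty.
    Every other op issues as soon as the previous op has issued.
    """
    cycle = 0        # earliest cycle at which the next op may issue
    fp_free_at = 0   # first cycle at which FPPipe is empty again
    issued = []
    for op in instrs:
        if op == "vstr":
            # vstr waits for FPPipe to drain before it can issue.
            cycle = max(cycle, fp_free_at)
        issued.append((op, cycle))
        if op == "fp":
            fp_free_at = cycle + fp_busy_cycles
        cycle += 1
    return issued


if __name__ == "__main__":
    print(issue_cycles(["fp", "vstr"]))
    print(issue_cycles(["fp", "add", "add", "add", "vstr"]))
```

In this sketch a vstr placed right after an FP instruction is held back, while one separated from it by enough independent instructions issues without a stall.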
2013 Oct 15
0
[LLVMdev] MI scheduler produce badly code with inline function
On Oct 14, 2013, at 3:27 AM, Zakk <zakk0610 at gmail.com> wrote: > Hi all, > I hit this problem when compiling the STREAM benchmark (http://www.cs.virginia.edu/stream/FTP/Code/) with enable-misched > > The small function is scheduled into good code, but if opt inlines this function, the inlined part is scheduled into bad code. A bug for this is welcome. Pretty soon, I’ll
2013 Oct 14
2
[LLVMdev] MI scheduler produce badly code with inline function
Hi all, I hit this problem when compiling the STREAM benchmark ( http://www.cs.virginia.edu/stream/FTP/Code/) with enable-misched. The small function is scheduled into good code, but if opt inlines this function, the inlined part is scheduled into bad code. So I rewrote it as simple code at the attached link (foo.c) and compiled it with two different methods: *method A:* *$clang -O3 foo.c -static -S
2014 Dec 09
1
[RFC PATCH v2] armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
...vtrn_f32(tv.val[0], ZERO); > + tv.val[0] = vadd_f32(tv.val[0], tv.val[1]); > + > + vst1_lane_f32(&sumi, tv.val[0], 0); Accessing tv.val[0] and tv.val[1] directly seems to send these values through the stack, e.g., f4: f3ba7085 vtrn.32 d7, d5 f8: ed0b7b0f vstr d7, [fp, #-60] fc: ed0b5b0d vstr d5, [fp, #-52] ... 114: ed1b6b09 vldr d6, [fp, #-36] 118: ed1b7b0b vldr d7, [fp, #-44] 11c: f2077d06 vadd.f32 d7, d7, d6 120: f483780f vst1.32 {d7[0]}, [r3] Can't you just use float32...
2010 Sep 21
3
[LLVMdev] Vectors in structures
...ector types are considered proper types in LLVM, why pack them inside structures? That results in a lot of boilerplate code for converting and copying the values (about 20 lines of IR) just to call a NEON instruction that, in the end, will be converted into three instructions: VLDR + {whatever} + VSTR If the load and store are normally performed by one operation (I assume it's the same on Intel and others), why bother with the structure passing instead of just using load/store for vector types? Also, the extra struct { [i8 x 8] } for memcopy seems also redundant. If you're explicitly t...
2013 Oct 16
3
[LLVMdev] MI scheduler produce badly code with inline function
...lr} movw r12, :lower16:c movw lr, :lower16:b movw r3, #9216 movt r12, :upper16:c mov r1, #0 vmov.f64 d16, #3.000000e+00 movt lr, :upper16:b movt r3, #244 .LBB0_1: add r0, r12, r1 add r2, lr, r1 *vldr d17, [r0]* add r1, r1, #32 vmul.f64 d17, d17, d16 cmp r1, r3 vstr d17, [r2] * vldr d17, [r0, #8]* vmul.f64 d17, d17, d16 * * vstr d17, [r2, #8] * vldr d17, [r0, #16]* vmul.f64 d17, d17, d16 vstr d17, [r2, #16] * vldr d17, [r0, #24]* vmul.f64 d17, d17, d16 vstr d17, [r2, #24] bne .LBB0_1 pop {lr} bx lr .Ltmp0: Using Itinerary will ge...
2014 Jul 23
2
[LLVMdev] JIT on armhf, again
On 7/23/14, 3:30 PM, Tim Northover wrote: [...] > It looks like it's a case of calling Module::setTargetTriple. As with > most JIT setup questions, though, often the best way to find out is to > get something working in lli and then look at what it does > (tools/lli/lli.cpp). Well, it's *almost* working --- hardfloat code is now being generated, and it even seems to be right
2014 Jul 24
2
[LLVMdev] JIT on armhf, again
On 7/24/14, 7:18 AM, Tim Northover wrote: [...] > Which triple are you using? And is the correct code used when you run > the same IR through "llc -mtriple=whatever"? armv7-linux-gnueabihf, as suggested; and if I use llc -mtriple then the code compiles to: vstr s0, [r0] bx lr ...which I would consider correct. (What's more interesting is *without* specifying the triple llc generates armel code. Should llc default to generating code which will actually run on a given platform? Is it possible my version of llvm has been compiled with the wrong option...
2010 Sep 21
0
[LLVMdev] Vectors in structures
...ype of arguments, but that is not yet implemented. > > That results in a lot of boilerplate code for converting and copying > the values (about 20 lines of IR) just to call a NEON instruction > that, in the end, will be converted into three instructions: > > VLDR + {whatever} + VSTR > > If the load and store are normally performed by one operation (I > assume it's the same on Intel and others), why bother with the > structure passing instead of just using load/store for vector types? As you noted, the struct wrappers produce a lot of extra code but it should...
2015 Apr 20
2
[LLVMdev] question about alignment of structures on the stack (arm 32)
Dear community, I ran into code generated by LLVM whose assembly instructions rely on 8-byte alignment for structures on the stack. The relevant part of the Objective-C code is the following: -(void)getCharacters:(unichar *)unicode {     NSRange range;     range.location = 0;     range.length = [self length];     printf("%p, %p\n", &range.location, &range.length); And
2014 Dec 07
2
[RFC PATCH v2] cover: armv7: celt_pitch_xcorr: Introduce ARM neon intrinsics
Hi, Optimizes celt_pitch_xcorr for floating point. Changes from RFCv1: - Rebased on top of commit aad281878: Fix celt_pitch_xcorr_c signature. which got rid of ugly code around CELT_PITCH_XCORR_IMPL passing of "arch" parameter. - Unified with --enable-intrinsics used by x86 - Modified algorithm to be more in-line with algorithm in celt_pitch_xcorr_arm.s Viswanath Puttagunta
2015 Apr 21
2
[LLVMdev] question about alignment of structures on the stack (arm 32)
...ix ----- And we get the following assembly code: main:     push    {r11, lr}     mov    r11, sp     sub    sp, sp, #24     mov    r0, #0     str    r0, [r11, #-4]     add    r1, sp, #8     movw    r2, :lower16:.Lmain.mStruct     movt    r2, :upper16:.Lmain.mStruct     vldr    d16, [r2]     vstr    d16, [sp, #8]     orr    r2, r1, #4     movw    r3, :lower16:.L.str     movt    r3, :upper16:.L.str     str    r0, [sp, #4]     mov    r0, r3     bl    printf     ldr    r1, [sp, #4]     str    r0, [sp]     mov    r0, r1     mov    sp, r11     pop    {r11, pc} r2 is populated with r1 plus 4 (but plu...
2011 Nov 12
2
[LLVMdev] Thumb-2 code generation error in Apple LLVM at all optimization levels
...24_11: add r1, pc ldr r1, [r1] blx _objc_msgSend str r0, [sp, #28] .loc 1 399 69 ldr r0, [sp, #28] ldr r1, [sp, #40] ldr.n r2, LCPI24_10 LPC24_10: add r2, pc ldr r2, [r2] add r1, r2 ldr r2, [r1] ldr.n r1, LCPI24_9 LPC24_9: add r1, pc ldr r1, [r1] blx _objc_msgSend vmov d16, r0, r1 vstr.64 d16, [sp, #16] .loc 1 401 2 ldr r0, [sp, #40] ldr.n r1, LCPI24_8 LPC24_8: add r1, pc ldr r1, [r1] add r0, r1 ldr r0, [r0] ldr.n r1, LCPI24_7 LPC24_7: add r1, pc ldr r1, [r1] blx _objc_msgSend .loc 1 402 2 ldr r0, [sp, #28] ldr.n r1, LCPI24_6 LPC24_6: add r1, pc ldr r1, [r1] blx...
2013 Aug 08
14
[LLVMdev] [global-isel] Proposal for a global instruction selector
...if f64 is a legal type, so is i64, v2f32, and even v64i1. On the ARM target, for example, these types would be legal: All 8-bit types via ldrb/strb to GPR. (i8, v1i8, v2i4, v4i2, v8i1) All 16-bit types via ldrh/strh to GPR. (i16, f16, v1i16, v2i8, ...) All 32-bit types via ldr/str to GPR and vldr/vstr to SPR. All 64-bit types via ldrd/strd to GPRPair and vldr/vstr to DPR. All 128-bit types via vld1/vst1 to DPair. All 192-bit types via vld1/vst1 to DTriple. All 256-bit types via vld1/vst1 to DQuad. This larger set of legal types also makes it easier to handle things like extractelement <8 x i8...
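The width-based rule in this snippet can be sketched as a small lookup table. This is a hedged illustration only, not code from the proposal or from LLVM: the names `LOAD_STORE_BY_WIDTH`, `type_width`, and `load_store_options` are hypothetical, the table copies just the widths listed in the snippet, and the parser handles only simple type names of the form `iN`, `fN`, `vKiN`, and `vKfN`.

```python
import re

# Bit width -> (load/store pair, register class), as listed in the snippet.
LOAD_STORE_BY_WIDTH = {
    8: [("ldrb/strb", "GPR")],
    16: [("ldrh/strh", "GPR")],
    32: [("ldr/str", "GPR"), ("vldr/vstr", "SPR")],
    64: [("ldrd/strd", "GPRPair"), ("vldr/vstr", "DPR")],
    128: [("vld1/vst1", "DPair")],
    192: [("vld1/vst1", "DTriple")],
    256: [("vld1/vst1", "DQuad")],
}


def type_width(name):
    """Return the total bit width of a type name such as i8, f64, or v2f32."""
    match = re.fullmatch(r"(?:v(\d+))?[if](\d+)", name)
    if match is None:
        raise ValueError(f"unsupported type name: {name!r}")
    count = int(match.group(1) or 1)   # scalar types count as one element
    return count * int(match.group(2))


def load_store_options(name):
    """Return the load/store options for a type, or [] if its width is not listed."""
    return LOAD_STORE_BY_WIDTH.get(type_width(name), [])


if __name__ == "__main__":
    for ty in ("i8", "v8i1", "f64", "v2f32", "v64i1"):
        print(ty, type_width(ty), load_store_options(ty))
```

Under this sketch, f64, v2f32, and v64i1 all resolve to the same 64-bit entry, which is the point the snippet makes about legality following bit width rather than element type.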
2013 Jun 24
1
[LLVMdev] DebugInfo: Missing non-trivially-copyable parameters in SelectionDAG
...st-isel-conversion.ll --check-prefix=THUMB -- Exit Code: 1 Command Output (stderr): -- /usr/local/google/home/blaikie/dev/llvm/src/test/CodeGen/ARM/fast-isel-conversion.ll:23:8: error: expected string not found in input ; ARM: sitofp_single_i16 ^ <stdin>:43:2: note: scanning from here vstr s0, [sp] ^ <stdin>:113:10: note: possible intended match here .globl _uitofp_single_i16 ^ -- ******************** FAIL: LLVM :: CodeGen/R600/store.ll (9 of 51) ******************** TEST 'LLVM :: CodeGen/R600/store.ll' FAILED ******************** Script: -- /usr/local/googl...