thr3ads.net - search: "lbb0

2017 Aug 18

1

[PATCH] fix alignment exceptions

...versions of clang, a version of the patch that looked at __clang_major__ and friends seems fair.

-- Ray

826% diff -c *old *new
*** pitch_sse4_1.s-old 2017-08-18 13:51:39.359084637 -0700
--- pitch_sse4_1.s-new 2017-08-18 13:51:54.595106450 -0700
***************
*** 73,80 ****
  cmpl $4, %eax
  jl .LBB0_8
  # BB#7:
! movdqa (%edx,%edi,2), %xmm2
! movdqa (%esi,%edi,2), %xmm1
  addl $4, %edi
  movdqa %xmm2, %xmm3
  pmullw %xmm1, %xmm2
--- 73,80 ----
  cmpl $4, %eax
  jl .LBB0_8
  # BB#7:
! movq (%edx,%edi,2), %xmm2 # xmm2 = mem[0],zero
! movq (%esi,%edi,2), %xmm1 # xmm1 = mem[0],zero
  addl $4,...

[PATCH] fix alignment exceptions

2017 Aug 18

2

[PATCH] fix alignment exceptions

We see the MOVQ instruction, but this patch deliberately uses it rather than MOVDQA (an aligned 128-bit load). With the trace below, we were seeing that the final invocation is not 128-bit aligned, but MOVDQA insists on alignment (the calling function was pitch_sse4_1.c:90, in the 4-way N - i >= 4 loop).

07-31 11:00:13.469 210 2540 D opus_sse1: RBE celt_inner_prod_sse4_1: x
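The alignment point above can be illustrated with a minimal C sketch. It is not the Opus source: `is_aligned16` and `dot4_i16` are hypothetical helper names introduced here. The sketch assumes the 4-way step multiplies four 16-bit samples, so a 64-bit load (`_mm_loadl_epi64`, which compiles to MOVQ) is enough and carries no 16-byte alignment requirement. A scalar fallback is used when SSE2 is unavailable.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: nonzero when p sits on a 16-byte boundary,
   which is what an aligned 128-bit load (MOVDQA) requires. */
static int is_aligned16(const void *p) {
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: inner product of four 16-bit samples.
   Safe for any 2-byte-aligned address because it never issues
   an aligned 128-bit load. */
static int32_t dot4_i16(const int16_t *x, const int16_t *y) {
#if defined(__SSE2__)
    /* 64-bit loads into the low half of the register (MOVQ);
       the upper 64 bits are zeroed. */
    __m128i vx = _mm_loadl_epi64((const __m128i *)x);
    __m128i vy = _mm_loadl_epi64((const __m128i *)y);
    /* Multiply 16-bit lanes and add adjacent pairs:
       result lanes are [x0*y0 + x1*y1, x2*y2 + x3*y3, 0, 0]. */
    __m128i prod = _mm_madd_epi16(vx, vy);
    int32_t lo = _mm_cvtsi128_si32(prod);
    int32_t hi = _mm_cvtsi128_si32(_mm_srli_si128(prod, 4));
    return lo + hi;
#else
    /* Portable fallback with the same result. */
    int32_t sum = 0;
    for (int i = 0; i < 4; i++) {
        sum += (int32_t)x[i] * (int32_t)y[i];
    }
    return sum;
#endif
}
```

With a 16-byte-aligned array `x`, the address `x + 4` is only 8-byte aligned, so an aligned 128-bit load there would fault, while `dot4_i16(x + 4, y + 4)` is still valid.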

[PATCH] fix alignment exceptions

2017 Aug 22

0

[PATCH] fix alignment exceptions

...versions of clang, a version of the patch that looked at __clang_major__ and friends seems fair.

-- Ray

826% diff -c *old *new
*** pitch_sse4_1.s-old 2017-08-18 13:51:39.359084637 -0700
--- pitch_sse4_1.s-new 2017-08-18 13:51:54.595106450 -0700
***************
*** 73,80 ****
  cmpl $4, %eax
  jl .LBB0_8
  # BB#7:
! movdqa (%edx,%edi,2), %xmm2
! movdqa (%esi,%edi,2), %xmm1
  addl $4, %edi
  movdqa %xmm2, %xmm3
  pmullw %xmm1, %xmm2
--- 73,80 ----
  cmpl $4, %eax
  jl .LBB0_8
  # BB#7:
! movq (%edx,%edi,2), %xmm2 # xmm2 = mem[0],zero
! movq (%esi,%edi,2), %xmm1 # xmm1 = mem[0],zero
  addl $4,...

[LLVMdev] Question regarding basic-block placement optimization

2011 Oct 19

0

[LLVMdev] Question regarding basic-block placement optimization

...2
.LBB0_2: # %else1
  cmpl $3, 8(%r14)
  jb .LBB0_4
.LBB0_4: # %else2
  cmpl $4, 12(%r14)
  jb .LBB0_6
.LBB0_6: # %else3
  cmpl $5, 16(%r14)
  jb .LBB0_8
.LBB0_8: # %else4
  cmpl $4, 12(%r14)
  jb .LBB0_10
.LBB0_10: # %exit
  movl %ebx, %eax
  popq %rbx
  popq %r14
  popq %rbp
  ret
.LBB0_1: # %the...

[LLVMdev] Question regarding basic-block placement optimization

2011 Oct 19

3

[LLVMdev] Question regarding basic-block placement optimization

On Tue, Oct 18, 2011 at 6:58 PM, Jakob Stoklund Olesen <stoklund at 2pi.dk> wrote:
>
> On Oct 18, 2011, at 5:22 PM, Chandler Carruth wrote:
>
>> As for why it should be an IR pass, mostly because once the selection dag
>> runs through the code, we can never recover all of the freedom we have at
>> the IR level. To start with, splicing MBBs around requires known about

search for: lbb0_8