2013 Nov 16
[LLVMdev] Limit loop vectorizer to SSE
I confirm that r194876 fixes the issue, i.e. the segfault no longer occurs.
My program still passed 16-byte-aligned pointers to the function,
which the loop vectorizer processes successfully:
LV: Vector loop of width 8 costs: 1.
LV: Selecting VF = : 8.
LV: Found a vectorizable loop (8) in func_orig.ll
LV: Unroll Factor is 1
Since the program runs fine, the CPU apparently allows a vector
load (8 floats) from a 16-byte-aligned address (as opposed to a
32-byte-aligned one). Or does the loop vectorizer in fact
handle this case in the preamble, so that vector.body issues only
32 byte ali...
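To make the 16-byte versus 32-byte distinction in the question concrete, here is a minimal sketch in C that tests a pointer against both alignments. The helper name is_aligned is hypothetical and is not part of LLVM or of the program discussed in this thread. It only inspects the address; it says nothing about which load instructions the vectorizer emits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the thread): returns 1 if p is aligned
 * to `alignment` bytes, 0 otherwise. `alignment` must be a power of two. */
static int is_aligned(const void *p, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* A power-of-two alignment divides the address exactly when the
     * low bits selected by (alignment - 1) are all zero. */
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

For a float pointer that sits 4 elements (16 bytes) past a 32-byte boundary, is_aligned(p, 16) returns 1 and is_aligned(p, 32) returns 0. On x86, an unaligned vector load instruction accepts such an address, while an aligned 32-byte vector load faults on it.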
2013 Nov 16
[LLVMdev] Limit loop vectorizer to SSE
...194876 fixes the issue, i.e. the segfault no longer occurs.
>
> My program still passed 16-byte-aligned pointers to the function,
> which the loop vectorizer processes successfully:
>
> LV: Vector loop of width 8 costs: 1.
> LV: Selecting VF = : 8.
> LV: Found a vectorizable loop (8) in func_orig.ll
> LV: Unroll Factor is 1
>
> Since the program runs fine, the CPU apparently allows a vector
> load (8 floats) from a 16-byte-aligned address (as opposed to a
> 32-byte-aligned one). Or does the loop vectorizer in fact
> handle this case in the preamble and the vecto...
2013 Nov 15
[LLVMdev] Limit loop vectorizer to SSE
A fix for this is in r194876.
Thanks for reporting this!
On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote:
> Nadav,
>
> I believe aligned accesses to unaligned pointers are precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to