thr3ads.net - search: "func

Displaying 3 results from an estimated 3 matches for "func_orig".

2013 Nov 16

[LLVMdev] Limit loop vectorizer to SSE

I confirm that r194876 fixes the issue, i.e. the segfault no longer occurs. My program still passed 16-byte aligned pointers to the function, which the loop vectorizer processes successfully:

LV: Vector loop of width 8 costs: 1.
LV: Selecting VF = : 8.
LV: Found a vectorizable loop (8) in func_orig.ll
LV: Unroll Factor is 1

Since the program runs fine, it seems the CPU is allowed to issue a vector load (8 floats) to a 16-byte aligned address (as opposed to 32-byte aligned). Or does the loop vectorizer in fact handle this case in the preamble, and the vector.body issues only 32 byte ali...
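
The 16-byte versus 32-byte distinction raised above can be made concrete with a small sketch in C. It performs no vector loads; it only checks whether an address is a multiple of a given power-of-two alignment. The helper name `is_aligned` is hypothetical and comes from neither LLVM nor the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of LLVM or the original thread):
 * returns 1 if p is a multiple of `alignment` bytes, 0 otherwise.
 * `alignment` must be a nonzero power of two. */
static int is_aligned(const void *p, size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    /* For a power of two, the low bits of the address must all be zero. */
    return ((uintptr_t)p & (alignment - 1)) == 0;
}
```

Assuming 4-byte floats, stepping 4 elements into a 32-byte aligned buffer gives an address that is 16-byte aligned but not 32-byte aligned, which is the situation described in the message: `is_aligned(buf + 4, 16)` holds while `is_aligned(buf + 4, 32)` does not.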

2013 Nov 16

[LLVMdev] Limit loop vectorizer to SSE

...194876 fixes the issue, i.e. segfault not caused.
>
> My program still passed 16 byte aligned pointers to the function
> which the loop vectorizer processes successfully:
>
> LV: Vector loop of width 8 costs: 1.
> LV: Selecting VF = : 8.
> LV: Found a vectorizable loop (8) in func_orig.ll
> LV: Unroll Factor is 1
>
> Since the program runs fine, it seems to be allowed for the CPU
> to issue a vector load (8 floats) to a 16 byte aligned address (as
> opposed to 32 byte aligned). Or does in fact the loop vectorizer
> handle this case in the preamble and the vecto...

2013 Nov 15

[LLVMdev] Limit loop vectorizer to SSE

A fix for this is in r194876. Thanks for reporting this!

On Nov 15, 2013, at 3:49 PM, Joshua Klontz <josh.klontz at gmail.com> wrote:

> Nadav,
>
> I believe aligned accesses to unaligned pointers is precisely the issue. Consider the function `add_u8S` before[1] and after[2] the loop vectorizer pass. There is no alignment assumption associated with %kernel_data prior to