search for: uglygep242243

Displaying 2 results from an estimated 2 matches for "uglygep242243".

2016 Oct 10
2
[arm, aarch64] Alignment checking in interleaved access pass
...t's right, perhaps because Halide is not a regular vectorizer, which opens up new cases. To give a bit more insight, here's a simple example of where the data is still continuous: [0 .. 32), but it needs to be split to use multiple VSTns/STns. This is what Halide generates for aarch64:
%uglygep242243 = bitcast i8* %uglygep242 to <16 x i32>*
%114 = shufflevector <16 x i32> %112, <16 x i32> %113, <4 x i32> <i32 0, i32 1, i32 2, i32 3>
%115 = shufflevector <16 x i32> %112, <16 x i32> %113, <4 x i32> <i32 8, i32 9, i32 10, i32 11>
%116 =...
2016 Oct 10
2
[arm, aarch64] Alignment checking in interleaved access pass
Hi Renato, Thank you for the answers! First, let me clarify a couple of things and give some context. The patch is looking at VSTn, rather than VLDn (stores seem to be somewhat harder to get the "right" patterns; the pass is already doing a good job for loads). The examples you gave come mostly from loop vectorization, which, as I understand it, was the reason for adding the