אלכס לופ' via llvm-dev
2019-Oct-25 08:02 UTC
[llvm-dev] Where and how to report an optimisation issue that doesn't cause a crash
<div dir='rtl'><div> <div dir="rtl"> <div dir="ltr">Could be... But the wierd thing here is if I change the array to be of size 256 and the index to be 'unsigned char', seems like there is no way to access the 'size' field throught "y->ptr[index]" (in your example) but clang still performs the re-read: <a href="https://godbolt.org/z/ahlGHY">https://godbolt.org/z/ahlGHY</a><br /><br /></div> <div dir="ltr">It looks like clang sees an alias between "int ptr[256]" and "int size"... maybe because they are both "int"? Because changing the type of size to char/short/long eliminates the re-read: <a href="https://godbolt.org/z/4kr1oo">https://godbolt.org/z/4kr1oo</a></div> <div dir="ltr"> </div> <div dir="ltr">Interesting and confusing at the same time. Can it be related to the standard definition of (non)aliasing as this answer suggests: <a href="https://stackoverflow.com/a/58550055/5218277">https://stackoverflow.com/a/58550055/5218277</a> ?</div> <div dir="ltr"> </div> <div dir="ltr">On the other hand, placing the "int size" (as you mentioned) before "int ptr[256]" also eliminates the re-reads: <a href="https://godbolt.org/z/kpB74m">https://godbolt.org/z/kpB74m</a></div> <div dir="ltr"> </div> <div dir="ltr"> </div> </div> <section class="cust_msg_end"></section> <blockquote style="margin: 0; margin-bottom: 20px; border-top: 1px solid #e0e0e0;"><br />ב אוק׳ 25, 2019 0:44, David Blaikie כתב: <blockquote style="margin: 0; margin-bottom: 20px; border-top: 1px solid #e0e0e0;"> <div dir="ltr"> <div dir="ltr">Here's a simpler version which I believe still captures the essence of the issue: <a href="https://godbolt.org/z/BUoLq1" target="_blank" rel="noopener">https://godbolt.org/z/BUoLq1</a><br /><br />Yep, looks like GCC manages this optimization and Clang does not - the runtime bound on the array is stopping clang from going further here (perhaps that's intentional - maybe to allow the array indexed store to possibly exceed the bounds of the array and assign to 'size' that comes after it - UB, but I don't know just how far non-strict aliasing goes)<br /><br />Yeah, looks like it might be something like that ^ if you change the index to be unsigned, and reorder the members so 'size' comes before the array (so indexing into the array can't alias 'size'), you do get: <a href="https://godbolt.org/z/Ax62Fv" target="_blank" rel="noopener">https://godbolt.org/z/Ax62Fv</a> which avoids reloading size after the assignment to the array.</div> <br /> <div> <div dir="ltr">On Thu, Oct 24, 2019 at 12:53 PM אלכס לופ' <<a href="mailto:alex_lop@walla.com" target="_blank" rel="noopener">alex_lop@walla.com</a>> wrote:</div> <blockquote style="margin: 0px 0px 0px 0.8ex; border-left: 1px solid #cccccc; padding-left: 1ex;"> <div dir="rtl"> <div dir="ltr">Hi David,<br /><br /></div> <div dir="ltr"> </div> <div dir="ltr">Thanks for your reply. The fact that "queue_ptr->queue[queue_ptr->wr_idx++]" could be somewhere in the object pointed by "queue_ptr" came to my mind too.</div> <div dir="ltr">I also added this in the stackoverflow question at the bottom.</div> <div dir="ltr">In such case, I assumed, that if I change the struct "queue_t" to have an array of "event_t" instead of just pointer to "event_t", like this:</div> <div dir="ltr"> </div> <div dir="ltr"> <pre><code>typedef struct { event_t queue[256]; // changed from pointer to array with max size size_t size; uint16_t num_of_items; uint8_t rd_idx; uint8_t wr_idx; } queue_t;</code><br /><br />there would be no valid way (without breaking the C standard definitions) that "queue_ptr->queue[queue_ptr->wr_idx++]" ends up somewhere inside "*queue_ptr", right?<br />However the generated assembly code still contained the same re-reads...<br />Any idea why would this happen even with the new "queue_t" struct definition?<br /><br />P.S.<br />I can easily reproduce it and provide any additional data which can be gathered during the compilation process, if it helps.<br /><br />Thanks,<br />Alex.</pre> </div> <div id="gmail-m_6399794515280957024gmail-m_-1659663878248091923gtx-trans"> </div> </div> </blockquote> </div> </div> </blockquote> </blockquote> </div></div>
David Blaikie via llvm-dev
2019-Oct-25 22:45 UTC
[llvm-dev] Where and how to report an optimisation issue that doesn't cause a crash
Baseline: No compiler produces optimal code in all cases - it's too expensive. They'll all have some tradeoffs, some more reasonable than others, some more intentional than others. I don't know where this particular situation lies (whether it's a reasonable tradeoff of compiler complexity, or an accidental one that might still be a reasonable tradeoff of compiler engineer time (other things more important than this particular optimization, etc)). On Fri, Oct 25, 2019 at 1:02 AM אלכס לופ' <alex_lop at walla.com> wrote:> Could be... But the wierd thing here is if I change the array to be of > size 256 and the index to be 'unsigned char', seems like there is no way to > access the 'size' field throught "y->ptr[index]" (in your example) but > clang still performs the re-read: https://godbolt.org/z/ahlGHY >Yup - I guess LLVM doesn't track/use the upper bound of the index here, even if it uses the lower bound. (as demonstrated by the reordering enabling the optimization)> It looks like clang sees an alias between "int ptr[256]" and "int size"... > maybe because they are both "int"? Because changing the type of size to > char/short/long eliminates the re-read: https://godbolt.org/z/4kr1oo >That's somewhat interesting - hmm, I'd assumed we didn't use strict aliasing by default, but perhaps we do. You can try compiling with -fno-strict-aliasing to disable type based alias analysis & see that the type change doesn't fix the issue then.> Interesting and confusing at the same time. Can it be related to the > standard definition of (non)aliasing as this answer suggests: > https://stackoverflow.com/a/58550055/5218277 ? >Yeah, guess so - once you change the type, type based alias analysis proves the pointer produced by the index arithmetic can't alias the non-int member. (with -fno-strict-aliasing changing the type does not remove the extra load even when the type is different)> On the other hand, placing the "int size" (as you mentioned) before "int > ptr[256]" also eliminates the re-reads: https://godbolt.org/z/kpB74m > > > > > ב אוק׳ 25, 2019 0:44, David Blaikie כתב: > > Here's a simpler version which I believe still captures the essence of the > issue: https://godbolt.org/z/BUoLq1 > > Yep, looks like GCC manages this optimization and Clang does not - the > runtime bound on the array is stopping clang from going further here > (perhaps that's intentional - maybe to allow the array indexed store to > possibly exceed the bounds of the array and assign to 'size' that comes > after it - UB, but I don't know just how far non-strict aliasing goes) > > Yeah, looks like it might be something like that ^ if you change the index > to be unsigned, and reorder the members so 'size' comes before the array > (so indexing into the array can't alias 'size'), you do get: > https://godbolt.org/z/Ax62Fv which avoids reloading size after the > assignment to the array. > > On Thu, Oct 24, 2019 at 12:53 PM אלכס לופ' <alex_lop at walla.com> > wrote: > > Hi David, > > > Thanks for your reply. The fact that > "queue_ptr->queue[queue_ptr->wr_idx++]" could be somewhere in the object > pointed by "queue_ptr" came to my mind too. > I also added this in the stackoverflow question at the bottom. > In such case, I assumed, that if I change the struct "queue_t" to have an > array of "event_t" instead of just pointer to "event_t", like this: > > > typedef struct > { > event_t queue[256]; // changed from pointer to array with max size > size_t size; > uint16_t num_of_items; > uint8_t rd_idx; > uint8_t wr_idx; > } queue_t; > > there would be no valid way (without breaking the C standard definitions) that "queue_ptr->queue[queue_ptr->wr_idx++]" ends up somewhere inside "*queue_ptr", right? > However the generated assembly code still contained the same re-reads... > Any idea why would this happen even with the new "queue_t" struct definition? > > P.S. > I can easily reproduce it and provide any additional data which can be gathered during the compilation process, if it helps. > > Thanks, > Alex. > > > >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191025/78765bdf/attachment.html>
Momchil Velikov via llvm-dev
2019-Oct-26 20:39 UTC
[llvm-dev] Where and how to report an optimisation issue that doesn't cause a crash
Using the shorter test case: struct S { int a[3]; int b; }; int f(struct S *p, int i) { if (p->b) return 42; p->a[i] = 3; return p->b; } one can see that the the TBAA metadata loses information about the array member: !4 = !{!"S", !5, i64 0, !7, i64 12} !5 = !{!"omnipotent char", !6, i64 0} The "new struct path TBAA" looks better, it seems to say "there are 12 bytes of `int`s at offset 0 in struct S" (Command line was ./bin/clang -target armv7m-eabi -O2 -S y.c -emit-llvm -Xclang -new-struct-path-tbaa) !3 = !{!4, !7, i64 12, i64 4} !4 = !{!5, i64 16, !"S", !7, i64 0, i64 12, !7, i64 12, i64 4} !5 = !{!6, i64 1, !"omnipotent char"} !6 = !{!"Simple C/C++ TBAA"} !7 = !{!5, i64 4, !"int"} !8 = !{!7, !7, i64 0, i64 4} but then, the access tag for the store to the array %arrayidx = getelementptr inbounds %struct.S, %struct.S* %p, i32 0, i32 0, i32 %i store i32 3, i32* %arrayidx, align 4, !tbaa !8 says just "it's in int" and there it still a redundant load: f: ldr r2, [r0, #12] cmp r2, #0 itt ne movne r0, #42 bxne lr movs r2, #3 str.w r2, [r0, r1, lsl #2] ldr r0, [r0, #12] bx lr So, I manually hacked the metadata too look like: !8 = !{!4, !7, i64 0, i64 4} i.e. as if we access the first element of the array. Running that through `opt -O2` and `llc` yields: f: ldr r2, [r0, #12] cmp r2, #0 iteee ne movne r0, #42 moveq r2, #3 streq.w r2, [r0, r1, lsl #2] moveq r0, #0 bx lr That seems like something that Clang can do by itself for access tags for index expressions with member arrays: state that they access the offset in the struct that corresponds to the first array element, so unknown indices would still conservatively alias between each other, but not with other struct members. Thoughts? Pitfalls? I may give it a shot. ~chill -- Compiler scrub, Arm -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20191026/30e7dd3c/attachment.html>