Adrian Moreno via llvm-dev
2021-Dec-21 16:30 UTC
[llvm-dev] broken C code only when optimized "-O2"
Hello, I need some help understanding what might be wrong with a piece of code from the openvswitch project. By ${subject} I'm not suggesting there's a problem in clang, gcc also shows the same behavior so it's likely our code is broken. I am kindly asking for help to understand/troubleshoot the problem. Summary: It seems that certain interaction between two main openvswitch data structures, when optimized ("-O2 -flto=auto") is broken. The two data structures are: hmap: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/hmap.h list: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/list.h I've reproduced the problem outside of openvswitch daemon using a short C program (attached) Code snippet: struct bond { struct hmap members; }; struct member { struct hmap_node hmap_node; int order; struct ovs_list elem; }; int main() { int ret = 0; struct member *member, *member1, *member2; struct bond *bond; struct ovs_list start = {0}; bond = malloc(sizeof *bond); memset(bond, 0, sizeof (struct bond)); hmap_init(&bond->members); member1 = malloc(sizeof *member1); member2 = malloc(sizeof *member2); memset(member1, 0, sizeof (struct member)); memset(member2, 0, sizeof (struct member)); member1->order = 3; member2->order = 2; hmap_insert(&bond->members, &member1->hmap_node, (uint32_t)(uintptr_t)member1); hmap_insert(&bond->members, &member2->hmap_node, (uint32_t)(uintptr_t)member2); ovs_list_init(&start); HMAP_FOR_EACH (member, hmap_node, &bond->members) { /* * Insert member in start (sorted) * */ struct member *pos; LIST_FOR_EACH (pos, elem, &start) { if (member->order > pos->order) { break; } } // TESTED: If I add this printf, the problem disappears //printf("Inserting member: %p\n", member); ovs_list_insert(&pos->elem, &member->elem); } /* I've inserted two members into the 'start' list. * first and last have to be either member1 or member2 * */ if ((first != member1 && first != member2) || (last != member1 && last != member2)) { printf("list is broken!\n"); } } What I know for now: * -fno-strict-aliasing does not fix it * Only happens with "-O2 -flto=auto" * If I define 'ovs_list *start' and change the code to use the pointer directly and not '&start' the problem disappears. It seems that the LIST_FOR_EACH macros prefer an lvalue rather than "&" but I don't get why. * I'm not able to reproduce without using hmap _and_ ovs_list. * If I add a compiler barrier (or a call to an external function) after the loop, the problem disappears (e.g printf), the problem disappears. * If I add -fsanitize=undefined the problem disappears! I'd really appreciate any hint or idea to try to understand this problem. Thanks in advanced. -- Adrián Moreno -------------- next part -------------- A non-text attachment was scrubbed... Name: example.c Type: text/x-csrc Size: 3876 bytes Desc: not available URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211221/65316a4a/attachment.c>
Sterling Augustine via llvm-dev
2021-Dec-21 17:31 UTC
[llvm-dev] broken C code only when optimized "-O2"
The thing to do with problems like this is to run with clang's sanitizers, which diagnose undefined behavior, memory errors, and other such issues that often show up only under optimization. A cursory look at all the casting in this example makes me think there is undefined behavior, but there very well could be another type of failure. https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html https://clang.llvm.org/docs/AddressSanitizer.html https://clang.llvm.org/docs/MemorySanitizer.html On Tue, Dec 21, 2021 at 9:30 AM Adrian Moreno via llvm-dev < llvm-dev at lists.llvm.org> wrote:> Hello, > > I need some help understanding what might be wrong with a piece of code > from the > openvswitch project. By ${subject} I'm not suggesting there's a problem in > clang, gcc also shows the same behavior so it's likely our code is broken. > I am > kindly asking for help to understand/troubleshoot the problem. > > Summary: It seems that certain interaction between two main openvswitch > data > structures, when optimized ("-O2 -flto=auto") is broken. > The two data structures are: > > hmap: > https://github.com/openvswitch/ovs/blob/master/include/openvswitch/hmap.h > list: > https://github.com/openvswitch/ovs/blob/master/include/openvswitch/list.h > > I've reproduced the problem outside of openvswitch daemon using a short C > program (attached) > > Code snippet: > > struct bond { > struct hmap members; > }; > > struct member { > struct hmap_node hmap_node; > int order; > struct ovs_list elem; > }; > > int main() { > int ret = 0; > struct member *member, *member1, *member2; > struct bond *bond; > struct ovs_list start = {0}; > > bond = malloc(sizeof *bond); > memset(bond, 0, sizeof (struct bond)); > hmap_init(&bond->members); > > member1 = malloc(sizeof *member1); > member2 = malloc(sizeof *member2); > memset(member1, 0, sizeof (struct member)); > memset(member2, 0, sizeof (struct member)); > > member1->order = 3; > member2->order = 2; > > hmap_insert(&bond->members, &member1->hmap_node, > (uint32_t)(uintptr_t)member1); > hmap_insert(&bond->members, &member2->hmap_node, > (uint32_t)(uintptr_t)member2); > > ovs_list_init(&start); > HMAP_FOR_EACH (member, hmap_node, &bond->members) { > /* > * Insert member in start (sorted) > * */ > struct member *pos; > LIST_FOR_EACH (pos, elem, &start) { > if (member->order > pos->order) { > break; > } > } > // TESTED: If I add this printf, the problem disappears > //printf("Inserting member: %p\n", member); > ovs_list_insert(&pos->elem, &member->elem); > } > > /* I've inserted two members into the 'start' list. > * first and last have to be either member1 or member2 > * */ > if ((first != member1 && first != member2) || (last != member1 && > last !> member2)) { > printf("list is broken!\n"); > } > > } > > > What I know for now: > * -fno-strict-aliasing does not fix it > * Only happens with "-O2 -flto=auto" > * If I define 'ovs_list *start' and change the code to use the pointer > directly > and not '&start' the problem disappears. It seems that the LIST_FOR_EACH > macros > prefer an lvalue rather than "&" but I don't get why. > * I'm not able to reproduce without using hmap _and_ ovs_list. > * If I add a compiler barrier (or a call to an external function) after > the > loop, the problem disappears (e.g printf), the problem disappears. > * If I add -fsanitize=undefined the problem disappears! > > I'd really appreciate any hint or idea to try to understand this problem. > > Thanks in advanced. > > -- > Adrián Moreno_______________________________________________ > LLVM Developers mailing list > llvm-dev at lists.llvm.org > https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-dev >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211221/f5d93e32/attachment.html>
Dimitry Andric via llvm-dev
2021-Dec-21 18:25 UTC
[llvm-dev] broken C code only when optimized "-O2"
On 21 Dec 2021, at 17:30, Adrian Moreno via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > I need some help understanding what might be wrong with a piece of code from the openvswitch project. By ${subject} I'm not suggesting there's a problem in clang, gcc also shows the same behavior so it's likely our code is broken. I am kindly asking for help to understand/troubleshoot the problem. > > Summary: It seems that certain interaction between two main openvswitch data structures, when optimized ("-O2 -flto=auto") is broken. > The two data structures are: > > hmap: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/hmap.h > list: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/list.h > > I've reproduced the problem outside of openvswitch daemon using a short C program (attached) > > Code snippet: > > struct bond { > struct hmap members; > }; > > struct member { > struct hmap_node hmap_node; > int order; > struct ovs_list elem; > }; > > int main() { > int ret = 0; > struct member *member, *member1, *member2; > struct bond *bond; > struct ovs_list start = {0}; > > bond = malloc(sizeof *bond); > memset(bond, 0, sizeof (struct bond)); > hmap_init(&bond->members); > > member1 = malloc(sizeof *member1); > member2 = malloc(sizeof *member2); > memset(member1, 0, sizeof (struct member)); > memset(member2, 0, sizeof (struct member)); > > member1->order = 3; > member2->order = 2; > > hmap_insert(&bond->members, &member1->hmap_node, (uint32_t)(uintptr_t)member1); > hmap_insert(&bond->members, &member2->hmap_node, (uint32_t)(uintptr_t)member2); > > ovs_list_init(&start); > HMAP_FOR_EACH (member, hmap_node, &bond->members) { > /* > * Insert member in start (sorted) > * */ > struct member *pos; > LIST_FOR_EACH (pos, elem, &start) { > if (member->order > pos->order) { > break; > } > } > // TESTED: If I add this printf, the problem disappears > //printf("Inserting member: %p\n", member); > ovs_list_insert(&pos->elem, &member->elem); > } > > /* I've inserted two members into the 'start' list. > * first and last have to be either member1 or member2 > * */ > if ((first != member1 && first != member2) || (last != member1 && last != member2)) { > printf("list is broken!\n"); > } > > } > > > What I know for now: > * -fno-strict-aliasing does not fix it > * Only happens with "-O2 -flto=auto" > * If I define 'ovs_list *start' and change the code to use the pointer directly and not '&start' the problem disappears. It seems that the LIST_FOR_EACH macros prefer an lvalue rather than "&" but I don't get why. > * I'm not able to reproduce without using hmap _and_ ovs_list. > * If I add a compiler barrier (or a call to an external function) after the loop, the problem disappears (e.g printf), the problem disappears. > * If I add -fsanitize=undefined the problem disappears!Not for me: % clang -g -fsanitize=undefined -I/Users/dim/tmp/vswitch/foo/include example.c -o example -L/Users/dim/tmp/vswitch/foo/lib -lopenvswitch % ./example start: 0x16ee6f618 example.c:78:5: runtime error: applying zero offset to null pointer SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior example.c:78:5 in example.c:78:5: runtime error: applying zero offset to null pointer SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior example.c:78:5 in first: 0x600003200270 last: 0x6000032002a0 It looks like the HMAP_FOR_EACH() macro uses null pointer arithmetic. The problem appears to be in https://github.com/openvswitch/ovs/blob/master/include/openvswitch/util.h#L94 : /* Given OBJECT of type pointer-to-structure, expands to the offset of MEMBER * within an instance of the structure. * * The GCC-specific version avoids the technicality of undefined behavior if * OBJECT is null, invalid, or not yet initialized. This makes some static * checkers (like Coverity) happier. But the non-GCC version does not actually * dereference any pointer, so it would be surprising for it to cause any * problems in practice. */ #ifdef __GNUC__ #define OBJECT_OFFSETOF(OBJECT, MEMBER) offsetof(typeof(*(OBJECT)), MEMBER) #else #define OBJECT_OFFSETOF(OBJECT, MEMBER) \ ((char *) &(OBJECT)->MEMBER - (char *) (OBJECT)) #endif The comment is incorrect here, because dereferencing a null pointer, as done in *(OBJECT), is definitely undefined behavior. -Dimitry -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 223 bytes Desc: Message signed with OpenPGP URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20211221/5ac63076/attachment.sig>
James Dutton via llvm-dev
2021-Dec-21 18:44 UTC
[llvm-dev] broken C code only when optimized "-O2"
On Tue, 21 Dec 2021 at 16:30, Adrian Moreno via llvm-dev <llvm-dev at lists.llvm.org> wrote:> > Hello, > > I need some help understanding what might be wrong with a piece of code from the > openvswitch project. By ${subject} I'm not suggesting there's a problem in > clang, gcc also shows the same behavior so it's likely our code is broken. I am > kindly asking for help to understand/troubleshoot the problem. > > Summary: It seems that certain interaction between two main openvswitch data > structures, when optimized ("-O2 -flto=auto") is broken. > The two data structures are: > > hmap: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/hmap.h > list: https://github.com/openvswitch/ovs/blob/master/include/openvswitch/list.h > > I've reproduced the problem outside of openvswitch daemon using a short C > program (attached) > > Code snippet: >... It is quite obvious that the code is wrong. Can't you spot the problem with this?: member1 = malloc(sizeof *member1); member2 = malloc(sizeof *member2); memset(member1, 0, sizeof (struct member)); memset(member2, 0, sizeof (struct member));