Alexey Perevalov
2015-Apr-09 07:58 UTC
[LLVMdev] __sync_add_and_fetch in objc block for global variable on ARM
Hi Tim

----------------------------------------
> Date: Wed, 8 Apr 2015 06:53:44 -0700
> Subject: Re: [LLVMdev] __sync_add_and_fetch in objc block for global variable on ARM
> From: t.p.northover at gmail.com
> To: alexey.perevalov at hotmail.com
> CC: llvmdev at cs.uiuc.edu
>
>> in disas I see dmb ish instruction, but I don't know is it enough.
>
> There should be 2 dmb instructions: one before the ldrex/strex loop
> and one after. But I wouldn't expect dropping one to actually cause a
> problem in the code you posted.

Yes, there are two dmb's:

    => 0x00008ed8 <+224>: dmb ish
       0x00008edc <+228>: movw r1, #10800 ; 0x2a30
       0x00008ee0 <+232>: movt r1, #1
       0x00008ee4 <+236>: str r0, [sp, #44] ; 0x2c
       0x00008ee8 <+240>: str r1, [sp, #40] ; 0x28
       0x00008eec <+244>: ldr r0, [sp, #40] ; 0x28
       0x00008ef0 <+248>: ldrexb r1, [r0]
       0x00008ef4 <+252>: ldr r2, [sp, #44] ; 0x2c
       0x00008ef8 <+256>: add r3, r1, r2
       0x00008efc <+260>: strexb r12, r3, [r0]
       0x00008f00 <+264>: cmp r12, #0
       0x00008f04 <+268>: str r1, [sp, #36] ; 0x24
       0x00008f08 <+272>: bne 0x8eec <__main_block_invoke+244>
       0x00008f0c <+276>: ldr r0, [sp, #36] ; 0x24
       0x00008f10 <+280>: add r0, r0, #1
       0x00008f14 <+284>: dmb ish
       0x00008f18 <+288>: ldr r1, [sp, #40] ; 0x28
       0x00008f1c <+292>: strb r0, [r1]
       0x00008f20 <+296>: bl 0x8aa0 <pthread_self>

> In what way is "count" corrupted, and how do you observe it? What
> assembly is actually produced for the block?

The assembly for the whole block is huge even for a minimal test case, so I have attached the source code. I used a Cocotron-derived framework, but since the canaries survived I don't think the runtime is to blame. If you undef REPRODUCE_CASE in the example, the problem does not reproduce; I think that is because doing so introduces an additional time interval.

The output of the sample is the following (when the problem reproduces):

    after -1316199408 count addr 0x12a30, value 1 canary1 77, canary2 88
    after -1324588016 count addr 0x12a30, value 2 canary1 77, canary2 88
    after -1324588016 count addr 0x12a30, value 3 canary1 77, canary2 88
    after -1324588016 count addr 0x12a30, value 33 canary1 77, canary2 88
    after -1324588016 count addr 0x12a30, value 34

>> I understand, my clang is out of date. Moving to new version could be painful )
>> Maybe somebody knows, was that bug fixed?
>
> That area's certainly improved, but I'm not aware of any bugs on that
> scale ("can't dispatch 32 threads to atomically increment a single
> variable and print it") in clang 3.3 so I think something else is
> probably going on.
>
> A self-contained, minimal example we can examine would be useful.
>
> Cheers.
>
> Tim.

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: disp_async_min.m
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20150409/fbe5d0df/attachment.ksh>
Alexey Perevalov
2015-Apr-09 08:15 UTC
[LLVMdev] __sync_add_and_fetch in objc block for global variable on ARM
Hi Tim,

Thank you for the response, and sorry, this was my fault: the name "count" in the example is too general. My Objective-C runtime uses the same name to track autorelease pool occupancy. The compiler is fine ;)
Tim Northover
2015-Apr-09 13:48 UTC
[LLVMdev] __sync_add_and_fetch in objc block for global variable on ARM
On 9 April 2015 at 01:15, Alexey Perevalov <alexey.perevalov at hotmail.com> wrote:
> count name in example is too general, my objc runtime is using it for handling auto release pool
> occupancy.

Ah! That would certainly do it. Glad you managed to sort the issue.

Tim.