On Wed, Sep 07, 2016 at 10:13:48PM +0300, Slawa Olhovchenkov wrote:
> I have a strange issue with nginx on FreeBSD 11.
> I have FreeBSD 11 installed over STABLE-10.
> nginx built for FreeBSD 10 and run without recompiling works fine.
> nginx built for FreeBSD 11 crashes inside rbtree lookups: the next node
> is totally corrupted.
>
> I see the following potential causes:
>
> 1) clang 3.8 code generation issue
> 2) system library issue
>
> Maybe I am missing something?
>
> How do I find the real cause?

I found the real cause, and it looks like a show-stopper for RELEASE.
I use nginx with AIO, and AIO from one nginx process corrupts the memory
of another nginx process. Yes, this is cross-process memory
corruption.
In the latest case, the process with pid 1060 dumped core at 15:45:14.
The corrupted memory is at 0x860697000.
I know that the memory at 0x86067f800 is good.
Dumping this region (from the core) to a file and analyzing it with hexdump, I
found the start of the corrupted region: offset 0000c8c0 from 0x86067f800.
0x86067f800+0xc8c0 = 0x86068c0c0
Beforehand I had enabled debug logging of every started AIO operation to the nginx
error log (memory address, file name, offset and size of transfer).
grep -i 86068c0c0 error.log near 15:45:14 gives the target file.
grep ce949665cbcd.hls error.log near 15:45:14 gives the following result:
2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0 2646736 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls
2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70 2999424 ce949665cbcd.hls
0x860697000-0x86068c0c0 = 0xaf40
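
The lookup and the subtraction above can be sketched in Python. This is only an illustration, not the procedure actually used here. It assumes the layout of the custom AIO_RD debug line is pid, aiocb pointer, buffer address, transfer size in hex, file offset in decimal, and file name; the names parse_aio and covering are hypothetical helpers.

```python
import re

# Assumed layout of the custom debug line, inferred from the log excerpt:
#   [notice] <pid>#0: *<conn> AIO_RD <aiocb> start <buffer> <size hex> <offset> <file>
AIO_RE = re.compile(
    r"\] (?P<pid>\d+)#\d+: \*\d+ AIO_RD [0-9A-Fa-f]+ start "
    r"(?P<buf>[0-9A-Fa-f]+) (?P<size>[0-9A-Fa-f]+) (?P<off>\d+) (?P<name>\S+)"
)

def parse_aio(line):
    """Parse one AIO_RD debug line into a dict, or return None if it does not match."""
    m = AIO_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "buf": int(m["buf"], 16),    # destination buffer address
        "size": int(m["size"], 16),  # transfer size (assumed hex)
        "offset": int(m["off"]),     # file offset (assumed decimal)
        "name": m["name"],
    }

def covering(requests, addr):
    """Return the requests whose buffer range [buf, buf + size) contains addr."""
    return [r for r in requests if r["buf"] <= addr < r["buf"] + r["size"]]

LOG = [
    "2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0 2646736 ce949665cbcd.hls",
    "2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls",
    "2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70 2999424 ce949665cbcd.hls",
]

CORRUPTED = 0x860697000  # address seen corrupted in the core of pid 1060

requests = [r for r in map(parse_aio, LOG) if r is not None]
hits = covering(requests, CORRUPTED)
for r in hits:
    # pid that issued the read, offset inside its buffer, file offset of the read
    print(r["pid"], hex(CORRUPTED - r["buf"]), r["offset"])
```

Only the request from pid 1055 covers the corrupted address, although the core belongs to pid 1060.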
From the memory dump:
0000af00 5c 81 4d 7c 0b b6 81 f2 c8 a5 df 94 08 43 c1 08 |\.M|.........C..|
0000af10 74 00 57 55 5f 15 11 b1 00 d5 29 6a 4e d2 fd fb |t.WU_.....)jN...|
0000af20 49 d1 fd 98 49 58 b7 66 c2 c9 64 67 30 05 06 c0 |I...IX.f..dg0...|
0000af30 0e b2 64 fa b7 9f 69 69 fc cd 91 82 83 ba c3 f2 |..d...ii........|
0000af40 b7 34 eb 8e 0e 88 40 60 1b a8 71 7a 12 15 26 d3 |.4....@`..qz..&.|
0000af50 7f 3e 80 e9 74 96 30 24 cb 82 88 8a ea e0 45 10 |.>..t.0$......E.|
0000af60 e5 75 b2 f7 5b 7c 83 fa 95 a9 09 80 0a 8c fd a9 |.u..[|..........|
0000af70 ef 30 f6 68 9c b2 3f ae 2e e5 21 79 78 8b 34 36 |.0.h..?...!yx.46|
0000af80 c6 55 16 a2 47 00 ca 13 9c 8e 2c 6b eb c7 4f 51 |.U..G.....,k..OQ|
0000af90 81 80 71 f3 a5 9a 5f 40 54 9c f1 f9 ba 81 b2 82 |..q..._@T.......|
From the disk file (offsets relative to file offset 2646736):
0000af00 5c 81 4d 7c 0b b6 81 f2 c8 a5 df 94 08 43 c1 08 |\.M|.........C..|
0000af10 74 00 57 55 5f 15 11 b1 00 d5 29 6a 4e d2 fd fb |t.WU_.....)jN...|
0000af20 49 d1 fd 98 49 58 b7 66 c2 c9 64 67 30 05 06 c0 |I...IX.f..dg0...|
0000af30 0e b2 64 fa b7 9f 69 69 fc cd 91 82 83 ba c3 f2 |..d...ii........|
0000af40 b7 34 eb 8e 0e 88 40 60 1b a8 71 7a 12 15 26 d3 |.4....@`..qz..&.|
0000af50 7f 3e 80 e9 74 96 30 24 cb 82 88 8a ea e0 45 10 |.>..t.0$......E.|
0000af60 e5 75 b2 f7 5b 7c 83 fa 95 a9 09 80 0a 8c fd a9 |.u..[|..........|
0000af70 ef 30 f6 68 9c b2 3f ae 2e e5 21 79 78 8b 34 36 |.0.h..?...!yx.46|
0000af80 c6 55 16 a2 47 00 ca 13 9c 8e 2c 6b eb c7 4f 51 |.U..G.....,k..OQ|
0000af90 81 80 71 f3 a5 9a 5f 40 54 9c f1 f9 ba 81 b2 82 |..q..._@T.......|
Bingo!
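
The comparison of the two hexdumps can also be done byte by byte. A minimal sketch, assuming the memory region has been saved to a file: first_mismatch is a hypothetical helper, and the demo at the bottom uses synthetic data in temporary files, not the real core or the real .hls file.

```python
import os
import tempfile

def first_mismatch(dump_path, dump_off, file_path, file_off, length):
    """Compare `length` bytes of two files starting at the given offsets.

    Returns the index of the first differing byte, or None if the ranges match.
    """
    with open(dump_path, "rb") as d, open(file_path, "rb") as f:
        d.seek(dump_off)
        f.seek(file_off)
        a = d.read(length)
        b = f.read(length)
    if len(a) != length or len(b) != length:
        raise ValueError("one of the files is shorter than the requested range")
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    return None

# Synthetic demo: the same payload is placed at offset 0xc8c0 in a fake
# "memory dump" and at offset 100 in a fake "disk file".
payload = bytes(range(256)) * 16
tmp = tempfile.mkdtemp()
dump_path = os.path.join(tmp, "region.bin")
file_path = os.path.join(tmp, "data.hls")
with open(dump_path, "wb") as d:
    d.write(b"\0" * 0xc8c0 + payload)
with open(file_path, "wb") as f:
    f.write(b"\xff" * 100 + payload)

print(first_mismatch(dump_path, 0xc8c0, file_path, 100, len(payload)))  # None
```

For the case above the call would be something like first_mismatch("region.bin", 0xc8c0, "ce949665cbcd.hls", 2646736, 0x561b0), where region.bin is a hypothetical name for the region dumped from 0x86067f800. A mismatch further into the range would not by itself disprove the finding, since process 1060 may have written to that memory after the read landed.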
The file data read via AIO by process 1055 was placed at the requested memory
address, but in the memory space of process 1060!
This is a kernel bug, and it must block the release.