On Sunday, September 18, 2016 07:22:41 PM Slawa Olhovchenkov wrote:
> On Thu, Sep 15, 2016 at 10:28:11AM -0700, John Baldwin wrote:
>
> > On Thursday, September 15, 2016 05:41:03 PM Slawa Olhovchenkov wrote:
> > > > On Wed, Sep 07, 2016 at 10:13:48PM +0300, Slawa Olhovchenkov wrote:
> > > >
> > > > I have a strange issue with nginx on FreeBSD 11.
> > > > I have FreeBSD 11 installed over STABLE-10.
> > > > nginx built for FreeBSD 10 and run without recompiling works fine.
> > > > nginx built for FreeBSD 11 crashes inside rbtree lookups: the next
> > > > node is totally corrupted.
> > > >
> > > > I see two potential causes:
> > > >
> > > > 1) a clang 3.8 code generation issue
> > > > 2) a system library issue
> > > >
> > > > Maybe I am missing something?
> > > >
> > > > How can I find the real cause?
> > >
> > > I found the real cause, and it looks like a show-stopper for the RELEASE.
> > > I use nginx with AIO, and AIO issued by one nginx process corrupts
> > > memory of another nginx process. Yes, this is cross-process memory
> > > corruption.
> > >
> > > In the last case, the process with pid 1060 dumped core at 15:45:14.
> > > The corrupted memory is at 0x860697000.
> > > I know the memory at 0x86067f800 is good.
> > > Dumping this region (from the core) to a file and analyzing it with
> > > hexdump, I found the start of the corrupted region -- offset 0000c8c0
> > > from 0x86067f800.
> > > 0x86067f800 + 0xc8c0 = 0x86068c0c0
> > >
> > > I had previously enabled debug logging of started AIO operations to the
> > > nginx error log (memory address, file name, offset and size of the
> > > transfer).
> > >
> > > grep -i 86068c0c0 error.log near 15:45:14 gives the target file.
> > > grep ce949665cbcd.hls error.log near 15:45:14 gives the following result:
> > >
> > > 2016/09/15 15:45:13 [notice] 1055#0: *11659936 AIO_RD 000000082065DB60 start 000000086068C0C0 561b0 2646736 ce949665cbcd.hls
> > > 2016/09/15 15:45:14 [notice] 1060#0: *10998125 AIO_RD 000000081F1FFB60 start 000000086FF2C0C0 6cdf0 140016832 ce949665cbcd.hls
> > > 2016/09/15 15:45:14 [notice] 1055#0: *11659936 AIO_RD 00000008216B6B60 start 000000086472B7C0 7ff70 2999424 ce949665cbcd.hls
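
For context, a minimal sketch of the kind of debug statement that would
produce lines like these (this is not the patch actually used; the helper
name, its placement in the AIO read path, and the variable names are
assumptions -- only ngx_log_error() and the format specifiers are standard
nginx):

    /* Hypothetical helper approximating the AIO_RD lines above:
     * event pointer, buffer address, size (hex), file offset, file name.
     * Assumes the usual nginx headers (ngx_config.h / ngx_core.h). */
    static void
    ngx_debug_aio_rd(ngx_event_t *ev, ngx_file_t *file, u_char *buf,
        size_t size, off_t offset)
    {
        ngx_log_error(NGX_LOG_NOTICE, file->log, 0,
                      "AIO_RD %p start %p %xz %O %V",
                      ev, buf, size, offset, &file->name);
    }
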
> >
> > Does nginx only use AIO for regular files, or does it also use it with
> > sockets?
> >
> > You can try using this patch as a diagnostic (you will need to
> > run with INVARIANTS enabled, or at least enabled for vfs_aio.c):
> >
> > Index: vfs_aio.c
> > ===================================================================
> > --- vfs_aio.c   (revision 305811)
> > +++ vfs_aio.c   (working copy)
> > @@ -787,6 +787,8 @@ aio_process_rw(struct kaiocb *job)
> >       * aio_aqueue() acquires a reference to the file that is
> >       * released in aio_free_entry().
> >       */
> > +    KASSERT(curproc->p_vmspace == job->userproc->p_vmspace,
> > +        ("%s: vmspace mismatch", __func__));
> >      if (cb->aio_lio_opcode == LIO_READ) {
> >          auio.uio_rw = UIO_READ;
> >          if (auio.uio_resid == 0)
> > @@ -1054,6 +1056,8 @@ aio_switch_vmspace(struct kaiocb *job)
> >  {
> > 
> >      vmspace_switch_aio(job->userproc->p_vmspace);
> > +    KASSERT(curproc->p_vmspace == job->userproc->p_vmspace,
> > +        ("%s: vmspace mismatch", __func__));
> >  }
> >
> > If this panics, then vmspace_switch_aio() is not working for
> > some reason.
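
Regarding the INVARIANTS requirement mentioned above: a minimal kernel
configuration sketch for getting KASSERT() checks compiled in (this enables
them kernel-wide rather than only for vfs_aio.c):

    options         INVARIANTS
    options         INVARIANT_SUPPORT
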
>
> I tried using the following DTrace script:
> ===
> #pragma D option dynvarsize=64m
>
> /* map of (vmspace, user buffer) -> pid of the process that queued the aio_read() */
> int req[struct vmspace *, void *];
> self int trace;
>
> syscall:freebsd:aio_read:entry
> {
>     this->aio = *(struct aiocb *)copyin(arg0, sizeof(struct aiocb));
>     req[curthread->td_proc->p_vmspace, this->aio.aio_buf] =
>         curthread->td_proc->p_pid;
> }
>
> fbt:kernel:aio_process_rw:entry
> {
>     self->job = args[0];
>     self->trace = 1;
> }
>
> fbt:kernel:aio_process_rw:return
> /self->trace/
> {
>     req[self->job->userproc->p_vmspace, self->job->uaiocb.aio_buf] = 0;
>     self->job = 0;
>     self->trace = 0;
> }
>
> /* fire when vnode I/O runs on behalf of an AIO job with a buffer that
>    was not registered for the vmspace it is currently running with */
> fbt:kernel:vn_io_fault:entry
> /self->trace && !req[curthread->td_proc->p_vmspace, args[1]->uio_iov[0].iov_base]/
> {
>     this->buf = args[1]->uio_iov[0].iov_base;
>     printf("%Y vn_io_fault %p:%p pid %d\n", walltimestamp,
>         curthread->td_proc->p_vmspace, this->buf,
>         req[curthread->td_proc->p_vmspace, this->buf]);
> }
> ===
> And I didn't get any messages near the nginx core dump.
> What can I check next?
> Maybe check the context/address space switch for the kernel process?
Which CPU are you using? Perhaps try disabling PCID support (I think
vm.pmap.pcid_enabled=0 from the loader prompt or loader.conf)?
(Wondering if pmap_activate() is somehow not switching.)
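
A minimal sketch of that (the tunable is set at boot; the sysctl line is only
to confirm the setting took effect after the reboot):

    # /boot/loader.conf
    vm.pmap.pcid_enabled="0"

    # or, at the loader prompt:
    # set vm.pmap.pcid_enabled=0

    # after boot, verify:
    # sysctl vm.pmap.pcid_enabled
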
--
John Baldwin