On Tue, Feb 09, 2016 at 02:03:20PM +0100, Giuseppe Lettieri wrote:
> Hi all,
>
> I have only looked into the first LOR, which has actually been there for
> a long time.
>
> It should be triggered by the following paths:
> 1) application does an ioctl(, NIOCREGIF) (or GETINFO)
> ->netmap_mem_finalize() [locks the netmap allocator]
> ->contigmalloc() [locks things related to vm]
> 2) application mmap()s the netmap fd and then accesses the area
> -> page fault [locks things related to vm]
> ->netmap_mem_ofstophys() [locks the netmap allocator]
>
> As a quick check, the LOR disappears if I replace the contigmalloc()
> with a dummy operation returning a static buffer.
>
> If this is correct, there cannot be any concurrency between the two
> paths, since the 1st one must be completed before the first mmap(). I
> also think that the vm objects locked in the two paths are not the same,
> but I don't know whether WITNESS keeps track of (some?) lock instances,
> or just lock types.
Thanks, Giuseppe.
Could you also look at the second issue?
PS: Do you need anything from me? Should I open a PR?
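
For the second issue, the quoted backtrace shows knote() being reached
from inside kqueue_register() via netmap_knrw(), netmap_poll(),
netmap_pipe_txsync() and netmap_notify(). One possible reading (an
assumption, not a confirmed diagnosis) is that the thread already holds
one "nm_kn_lock" and then tries to take a second lock of the same type.
The sketch below is a toy model in C of how a checker could flag that
case from a per-thread list of held locks. struct toy_lock, toy_acquire()
and toy_release() are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <string.h>

/* Toy model of a per-thread held-lock list (NOT the real WITNESS code). */
#define MAX_HELD 8

struct toy_lock {
    const char *type;   /* lock type name, shared by all instances */
    int id;             /* distinguishes instances of the same type */
};

static const struct toy_lock *held[MAX_HELD];
static int nheld;

/*
 * Acquire a lock on behalf of the current thread. Returns 1 if a
 * different instance of the same type is already held, 0 otherwise.
 */
int toy_acquire(const struct toy_lock *lk)
{
    int duplicate = 0;

    for (int i = 0; i < nheld; i++)
        if (held[i] != lk && strcmp(held[i]->type, lk->type) == 0)
            duplicate = 1;
    assert(nheld < MAX_HELD);
    held[nheld++] = lk;
    return duplicate;
}

/* Release the most recently acquired lock. */
void toy_release(void)
{
    assert(nheld > 0);
    nheld--;
}
```

With two toy_lock instances that share the type "nm_kn_lock", acquiring
the first returns 0 and acquiring the second while the first is still
held returns 1.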
> On 09/02/2016 13:31, Luigi Rizzo wrote:
> > I am Cc-ing Giuseppe Lettieri who has looked at the problem and may
> > have some comments to share
> >
> > cheers
> > luigi
> >
> > On Mon, Feb 8, 2016 at 9:39 AM, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> >> On Thu, Feb 04, 2016 at 10:47:34AM -0800, Adrian Chadd wrote:
> >>
> >>> .. but if it does, can you enable witness and see what it reports as
> >>> lock order violations?
> >>
> >> last STABLE:
> >>
> >> 1. first LOR (with poll, doesn't cause direct problems now):
> >>
> >> lock order reversal:
> >> 1st 0xfffff800946e6700 vm object (vm object) @ /usr/src/sys/vm/vm_fault.c:363
> >> 2nd 0xffffffff813e14d8 netmap memory allocator lock (netmap memory allocator lock) @ /usr/src/sys/dev/netmap/netmap_mem2.c:393
> >> KDB: stack backtrace:
> >> #0 0xffffffff80970320 at kdb_backtrace+0x60
> >> #1 0xffffffff809882ce at witness_checkorder+0xc7e
> >> #2 0xffffffff8091fcbc at __mtx_lock_flags+0x4c
> >> #3 0xffffffff806784f6 at netmap_mem_ofstophys+0x36
> >> #4 0xffffffff80676834 at netmap_dev_pager_fault+0x34
> >> #5 0xffffffff80b81a0f at dev_pager_getpages+0x3f
> >> #6 0xffffffff80b8cc1e at vm_fault_hold+0x86e
> >> #7 0xffffffff80b8c367 at vm_fault+0x77
> >> #8 0xffffffff80d0e2c9 at trap_pfault+0x199
> >> #9 0xffffffff80d0db47 at trap+0x527
> >> #10 0xffffffff80cf4ce2 at calltrap+0x8
> >>
> >> 2. kqueue issue (not LOR!)
> >>
> >> acquiring duplicate lock of same type: "nm_kn_lock"
> >> 1st nm_kn_lock @ /usr/src/sys/kern/kern_event.c:2003
> >> 2nd nm_kn_lock @ /usr/src/sys/kern/kern_event.c:2003
> >> KDB: stack backtrace:
> >> #0 0xffffffff80970320 at kdb_backtrace+0x60
> >> #1 0xffffffff809882ce at witness_checkorder+0xc7e
> >> #2 0xffffffff8091fcbc at __mtx_lock_flags+0x4c
> >> #3 0xffffffff808fd899 at knote+0x39
> >> #4 0xffffffff8067636b at freebsd_selwakeup+0x8b
> >> #5 0xffffffff80674eb5 at netmap_notify+0x55
> >> #6 0xffffffff8067ccb6 at netmap_pipe_txsync+0x156
> >> #7 0xffffffff80674740 at netmap_poll+0x400
> >> #8 0xffffffff80676b8e at netmap_knrw+0x6e
> >> #9 0xffffffff808fc57a at kqueue_register+0x64a
> >> #10 0xffffffff808fcdd4 at kern_kevent_fp+0x144
> >> #11 0xffffffff808fcc4f at kern_kevent+0x9f
> >> #12 0xffffffff808fcaea at sys_kevent+0x12a
> >> #13 0xffffffff80d0e914 at amd64_syscall+0x2d4
> >> #14 0xffffffff80cf4fcb at Xfast_syscall+0xfb
> >>
> >> Do you need anything?
> >>
> >>> On 4 February 2016 at 10:47, Adrian Chadd <adrian.chadd at gmail.com> wrote:
> >>>> I've no time to help with this, I'm sorry :(
> >>>>
> >>>>
> >>>> -a
> >>>>
> >>>>
> >>>> On 4 February 2016 at 05:00, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> >>>>> On Tue, Feb 02, 2016 at 11:44:47PM +0300, Slawa Olhovchenkov wrote:
> >>>>>
> >>>>>> On Thu, Oct 22, 2015 at 11:24:53AM -0700, Luigi Rizzo wrote:
> >>>>>>
> >>>>>>> On Thu, Oct 22, 2015 at 11:12 AM, Adrian Chadd <adrian.chadd at gmail.com> wrote:
> >>>>>>>> On 22 October 2015 at 09:35, Slawa Olhovchenkov <slw at zxy.spb.ru> wrote:
> >>>>>>>>> On Sun, Oct 18, 2015 at 07:45:52PM -0700, Adrian Chadd wrote:
> >>>>>>>>>
> >>>>>>>>>> Heh, file a bug with luigi; it should be defined better inside netmap itself.
> >>>>>>>>>
> >>>>>>>>> I have CC'ed luigi.
> >>>>>>>>>
> >>>>>>>>> Next question: does kevent do an RX/TX sync?
> >>>>>>>>> In my setup I need to call NIOCTXSYNC/NIOCRXSYNC manually.
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> Nope. kqueue() doesn't do the implicit sync like poll() does; it's
> >>>>>>>> just the notification path.
> >>>>>>>
> >>>>>>> actually not. When the file descriptor is registered there
> >>>>>>> is an implicit sync, and there is another one when an event
> >>>>>>> is posted for the file descriptor.
> >>>>>>>
> >>>>>>> unless there are bugs, of course.
> >>>>>>
> >>>>>> I found strange behavior:
> >>>>>>
> >>>>>> 1. open netmap and register in the main thread
> >>>>>> 2. register with kevent in a different thread
> >>>>>> 3. result: I get an event from kevent but no ring sync (head, tail and cur
> >>>>>> are all still 0).
> >>>>>>
> >>>>>> Is this normal, or is it a bug?
> >>>>>>
> >>>>>> Opening and registering netmap in the same thread as kevent resolves this.
> >>>>>
> >>>>> Also, kevent+netmap deadlocked for me:
> >>>>>
> >>>>> PID TID COMM TDNAME KSTACK
> >>>>> 1095 100207 addos - mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_timedwait_sig+0x10 _sleep+0x238 kern_nanosleep+0x10e sys_nanosleep+0x51 amd64_syscall+0x40f Xfast_syscall+0xfb
> >>>>> 1095 100208 addos worker#0 mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_wait_sig+0xf _sleep+0x27d kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x40f Xfast_syscall+0xfb
> >>>>> 1095 100209 addos worker#1 mi_switch+0xe1 turnstile_wait+0x42a __mtx_lock_sleep+0x26b knote+0x38 freebsd_selwakeup+0x8b netmap_notify+0x55 netmap_pipe_txsync+0x156 netmap_poll+0x400 netmap_knrw+0x6e kqueue_register+0x799 kern_kevent+0x158 sys_kevent+0x12a amd64_syscall+0x40f Xfast_syscall+0xfb
> >>>>> 1095 100210 addos worker#2 mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_wait_sig+0xf _sleep+0x27d kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x40f Xfast_syscall+0xfb
> >>>>> 1095 100211 addos worker#NOIP mi_switch+0xe1 sleepq_catch_signals+0xab sleepq_wait_sig+0xf _sleep+0x27d kern_kevent+0x401 sys_kevent+0x12a amd64_syscall+0x40f Xfast_syscall+0xfb
> >>>>> 1095 100212 addos balancer mi_switch+0xe1 turnstile_wait+0x42a __mtx_lock_sleep+0x26b knote+0x38 freebsd_selwakeup+0x8b netmap_notify+0x2a netmap_pipe_rxsync+0x54 netmap_poll+0x774 netmap_knrw+0x6e kern_kevent+0x5cc sys_kevent+0x12a amd64_syscall+0x40f Xfast_syscall+0xfb
> >
> >
> >
>
>
> --
> Dr. Ing. Giuseppe Lettieri
> Dipartimento di Ingegneria della Informazione
> Universita' di Pisa
> Largo Lucio Lazzarino 1, 56122 Pisa - Italy
> Ph. : (+39) 050-2217.649 (direct) .599 (switch)
> Fax : (+39) 050-2217.600
> e-mail: g.lettieri at iet.unipi.it
> _______________________________________________
> freebsd-stable at freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-stable
> To unsubscribe, send any mail to "freebsd-stable-unsubscribe at freebsd.org"