On 15/02/2016 16:13, Slawa Olhovchenkov wrote:
> On Mon, Feb 15, 2016 at 04:10:30PM +0100, Giuseppe Lettieri wrote:
>
>> Hi Slawa,
>>
>> I think WITNESS is seeing a false positive, since those two are always
>> different mutexes.
>>
>> The actual deadlock you experience should be caused by something else. I
>
> Are you sure? When the deadlock occurs, I see threads waiting on nm_kn_lock.

The deadlock I mentioned still involves nm_kn_locks, sorry if I was not
clear about that. I am just saying that we never try to take the same
lock that we are already holding.

Nonetheless, there are indeed problems in the path that WITNESS has
seen. The problem is that pipes have to notify the other end while
called by kevent. kevent holds the nm_kn_lock on the TX src ring and the
notification takes the nm_kn_lock on the RX dst ring.

>> have not been able to reproduce it locally (I have not tried that hard,
>> to be honest). I am pretty sure that there is a lock inversion - one
>> that may cause real deadlocks - when you use netmap pipes+kqueue and you
>> don't pass NETMAP_NO_TX_POLL at NIOCREGIF time. The attached patch
>> should solve this particular problem, but there may be others. Could you
>> please try it?
>
> Try it with or w/o WITNESS?

I am trying to see if the actual deadlock disappears, so disable WITNESS
if it slows down the system and masks the real deadlock. Otherwise,
leave it on.

Cheers,
Giuseppe

>
>> Cheers,
>> Giuseppe
>>
>> On 11/02/2016 14:34, Slawa Olhovchenkov wrote:
>>> On Thu, Feb 11, 2016 at 10:11:59AM +0100, Giuseppe Lettieri wrote:
>>>
>>>> On 10/02/2016 14:53, Slawa Olhovchenkov wrote:
>>>>> On Wed, Feb 10, 2016 at 02:33:20PM +0100, Giuseppe Lettieri wrote:
>>>>>
>>>>>> On 10/02/2016 12:59, Slawa Olhovchenkov wrote:
>>>>>>> Can you also look at the second issue?
>>>>>>>
>>>>>>> PS: What do you need from me? Should I open a PR?
>>>>>>
>>>>>> Could you provide some example code that triggers the issue?
>>>>>
>>>>> It is about 700 lines of code (not very clear); maybe I can describe it?
>>>>
>>>> I just need some code to trigger the problem locally. Don't worry about
>>>> the clarity and the line count, unless you cannot share the code for
>>>> other reasons.
>>>
>>> I am attaching the source.
>>> Run it as "prog if1 if2".
>>> I get `acquiring duplicate lock of same type: "nm_kn_lock"` immediately
>>> after start.
>>> The deadlock may occur immediately after start, or it may need
>>> traffic flooding.
>>>
>>
>>
>> --
>> Dr. Ing. Giuseppe Lettieri
>> Dipartimento di Ingegneria della Informazione
>> Universita' di Pisa
>> Largo Lucio Lazzarino 1, 56122 Pisa - Italy
>> Ph. : (+39) 050-2217.649 (direct) .599 (switch)
>> Fax : (+39) 050-2217.600
>> e-mail: g.lettieri at iet.unipi.it
>
>> Index: dev/netmap/netmap.c
>> ===================================================================
>> --- dev/netmap/netmap.c (revision 287671)
>> +++ dev/netmap/netmap.c (working copy)
>> @@ -2378,7 +2378,7 @@
>> * XXX should also check cur != hwcur on the tx rings.
>> * Fortunately, normal tx mode has np_txpoll set.
>> */
>> - if (priv->np_txpoll || want_tx) {
>> + if ((priv->np_txpoll && !is_kevent) || want_tx) {
>> /*
>> * The first round checks if anyone is ready, if not
>> * do a selrecord and another round to handle races.
>
--
Dr. Ing. Giuseppe Lettieri
Dipartimento di Ingegneria della Informazione
Universita' di Pisa
Largo Lucio Lazzarino 1, 56122 Pisa - Italy
Ph. : (+39) 050-2217.649 (direct) .599 (switch)
Fax : (+39) 050-2217.600
e-mail: g.lettieri at iet.unipi.it