Hi Slawa,
On 10/10/16 7:35 PM, Slawa Olhovchenkov wrote:> On Mon, Oct 10, 2016 at 05:44:21PM +0200, Julien Charbon wrote:
>>>> can check the current other usages of goto findpcb in
tcp_input(). The
>>>> rational here being:
>>>>
>>>> - Behavior before the patch: If the inp we found was deleted
then goto
>>>> findpcb.
>>>> - Behavior after the patch: If the inp we found was deleted
or dropped
>>>> then goto findpcb.
>>>>
>>>> I just prefer having the same behavior applied everywhere: If
>>>> tcp_input() loses the inp lock race and the inp was deleted or
dropped
>>>> then retry to find a new inpcb to deliver to.
>>>>
>>>> But you are right dropping the packet here will also fix the
issue.
>>>>
>>>> Then the review process becomes quite helpful because people
can argue:
>>>> Dropping here is better because "blah", or goto
findpcb is better
>>>> because "bluh", etc. And at the review end you have
a nice final patch.
>>>>
>>>> https://reviews.freebsd.org/D8211
>>>
>>> I am not sure, I am see to
>>>
>>> sys/netinet/in_pcb.h:#define INP_DROPPED 0x04000000
/*
>> protocol drop flag */
>>>
>>> and think this is a flag 'all packets must be droped'
>>
>> Hm, I believe this flag means "this inp has been dropped by the
TCP
>> stack, so don't use it anymore". Actually this flag is better
described
>> in the function that sets it:
>>
>> "(INP_DROPPED) is used by TCP to mark an inpcb as unused and avoid
>> future packet delivery or event notification when a socket remains open
>> but TCP has closed."
>>
>>
https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1320
>>
>> /*
>> * in_pcbdrop() removes an inpcb from hashed lists, releasing its
>> address and
>> * port reservation, and preventing it from being returned by inpcb
lookups.
>> *
>> * It is used by TCP to mark an inpcb as unused and avoid future packet
>> * delivery or event notification when a socket remains open but TCP
has
>> * closed. This might occur as a result of a shutdown()-initiated TCP
close
>> * or a RST on the wire, and allows the port binding to be reused while
>> still
>> * maintaining the invariant that so_pcb always points to a valid inpcb
>> until
>> * in_pcbdetach().
>> *
>> */
>> void
>> in_pcbdrop(struct inpcb *inp)
>> {
>> inp->inp_flags |= INP_DROPPED;
>> ...
>>
>> The classical example where "goto findpcb" is useful: You
receive a
>> new connection request with a TCP SYN packet and this packet is unlucky
>> and reached a inp being dropped:
>>
>> - with "goto findpcb" approach, the next lookup will most
likely find
>> the LISTEN inp and start the TCP hand-shake as usual
>> - with "drop the packet" approach, the TCP client will need
to
>> re-transmit a TCP SYN packet
>>
>> It is not because a packet was unlucky once that it deserves to be
>> dropped. :)
>
> Thanks for explaining, very helpfull.
> In this situation (TCP SYN with same 4-tuple as existing socket)
> allocate new PCB is best. But for this we must destroy current PCB. I
> am think INP_WUNLOCK(inp) don't destroy it and in_pcblookup_mbuf find
> it again (I am think in_pcblookup_mbuf find this PCB on first turn).
> I am assume for classical example in_pcbrele_wlocked(inp) free and
> destroy current PCB for possibility in_pcblookup_mbuf allocate new
> one.
Astute question: Here, the same inp cannot be find again by
in_pcblookup_mbuf(), the explanation is a bit long though:
in_pcbdrop() does two things under INP_WLOCK lock:
https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1334
1. Add INP_DROPPED flag
2. Remove the inp from the TCP hash table
And once removed from the TCP hash table, in_pcblookup_mbuf() will
return NULL when doing the same TCP hash table lookup again.
It means that under a INP_WLOCK lock, these two changes are atomic:
- Either an inp does not have INP_DROPPED flag and can be in TCP hash table
- Either an inp has INP_DROPPED flag and is not in TCP hash table
But when you don't have the INP_WLOCK lock, then you can witness the
intermediate state where a inp is still in TCP hash table while a thread
is adding the INP_DROPPED flag.
Nothing unusual here.
Then threads are competing for the INP_WLOCK lock. For the example,
let's say the thread A wants to run tcp_input()/in_pcblookup_mbuf() and
racing for this INP_WLOCK:
https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/in_pcb.c#L1964
And thread B wants to run tcp_timer_2msl()/tcp_close()/in_pcbdrop() and
racing for this INP_WLOCK:
https://github.com/freebsd/freebsd/blob/release/11.0.0/sys/netinet/tcp_timer.c#L323
That leads to two cases:
o Thread A wins the race:
Thread A will continue tcp_input() as usal and INP_DROPPED flags is
not set and inp is still in TCP hash table.
Thread B is waiting on thread A to release INP_WLOCK after finishing
tcp_input() processing, and thread B will continue
tcp_timer_2msl()/tcp_close()/in_pcbdrop() processing.
o Thread B wins the race:
Thread B runs tcp_timer_2msl()/tcp_close()/in_pcbdrop() and inp
INP_DROPPED is set and inp being removed from TCP hash table.
In parallel, thread A has found the inp in TCP hash before is was
removed, and waiting on the found inp INP_WLOCK lock.
Once thread B has released the INP_WLOCK lock, thread A gets this lock
and sees the INP_DROPPED flag and do "goto findpcb" but here because
the
inp is not more in TCP hash table and it will not be find again by
in_pcblookup_mbuf().
Hopefully I am clear enough here.
--
Julien
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 496 bytes
Desc: OpenPGP digital signature
URL:
<http://lists.freebsd.org/pipermail/freebsd-stable/attachments/20161011/e6a354f5/attachment.sig>