Stefano Garzarella
2022-Aug-05 12:42 UTC
[PATCH RFC net-next] vsock: Reschedule connect_work for O_NONBLOCK connect() requests
On Thu, Aug 04, 2022 at 04:44:47PM -0700, Peilin Ye wrote:
>Hi Stefano,
>
>On Thu, Aug 04, 2022 at 08:59:23AM +0200, Stefano Garzarella wrote:
>> The last thing I was trying to figure out before sending the patch was
>> whether to set sock->state = SS_UNCONNECTED in vsock_connect_timeout().
>>
>> I think we should do that, otherwise a subsequent to connect() with
>> O_NONBLOCK set would keep returning -EALREADY, even though the timeout has
>> expired.
>>
>> What do you think?
>
>Thanks for bringing this up, after thinking about sock->state, I have 3
>thoughts:
>
>1. I think the root cause of this memleak is, we keep @connect_work
>   pending, even after the 2nd, blocking request times out (or gets
>   interrupted) and sets sock->state back to SS_UNCONNECTED.
>
>   @connect_work is effectively no-op when sk->sk_state is
>   TCP_CLOS{E,ING} anyway, so why not we just cancel @connect_work when
>   blocking requests time out or get interrupted?  Something like:
>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index f04abf662ec6..62628af84164 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1402,6 +1402,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> 		lock_sock(sk);
>
> 		if (signal_pending(current)) {
>+			if (cancel_delayed_work(&vsk->connect_work))
>+				sock_put(sk);
>+
> 			err = sock_intr_errno(timeout);
> 			sk->sk_state = sk->sk_state == TCP_ESTABLISHED ? TCP_CLOSING : TCP_CLOSE;
> 			sock->state = SS_UNCONNECTED;
>@@ -1409,6 +1412,9 @@ static int vsock_connect(struct socket *sock, struct sockaddr *addr,
> 			vsock_remove_connected(vsk);
> 			goto out_wait;
> 		} else if (timeout == 0) {
>+			if (cancel_delayed_work(&vsk->connect_work))
>+				sock_put(sk);
>+
> 			err = -ETIMEDOUT;
> 			sk->sk_state = TCP_CLOSE;
> 			sock->state = SS_UNCONNECTED;
>
>   Then no need to worry about rescheduling @connect_work, and the state
>   machine becomes more accurate.  What do you think?  I will ask syzbot
>   to test this.

It could work, but should we also set `sk->sk_err` and call
sk_error_report() to wake up any thread waiting on poll()?

Maybe the previous version is simpler.

>
>2. About your suggestion of setting sock->state = SS_UNCONNECTED in
>   vsock_connect_timeout(), I think it makes sense.  Are you going to
>   send a net-next patch for this?

If you have time, feel free to send it.

Since it is a fix, I believe you can use the "net" tree (also for this
patch).

Remember to add the "Fixes" tag, which should be the same one.

>
>3. After a TCP_SYN_SENT sock receives VIRTIO_VSOCK_OP_RESPONSE in
>   virtio_transport_recv_connecting(), why don't we cancel
>   @connect_work?  Am I missing something?

Because when the timeout fires, vsock_connect_timeout() will just call
sock_put(), since sk->sk_state has changed by then.

Of course, we can cancel it if we want, but I think it's not worth it.
In the end, this rescheduling patch should solve all the problems.

Thanks,
Stefano
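
For readers following the reference counting in points 1 and 3, here is a minimal user-space sketch of the pattern under discussion. It is not kernel code: every `model_*` name is hypothetical, and it assumes (as I understand af_vsock.c) that scheduling the connect timeout takes a socket reference which must be dropped either by the timeout handler or by whoever successfully cancels the pending work.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins for the kernel objects in the thread. */
enum model_state { MODEL_CLOSE, MODEL_SYN_SENT, MODEL_ESTABLISHED };

struct model_sock {
	int refcnt;             /* models the socket reference count */
	enum model_state state; /* models sk->sk_state */
	bool work_pending;      /* models a queued connect_work */
	int err;                /* models sk->sk_err */
};

static void model_hold(struct model_sock *sk) { sk->refcnt++; }
static void model_put(struct model_sock *sk) { sk->refcnt--; }

/* Assumption: scheduling the timeout takes a reference for the work. */
static void model_schedule(struct model_sock *sk)
{
	model_hold(sk);
	sk->work_pending = true;
}

/* Like cancel_delayed_work(): true only if the work was still pending. */
static bool model_cancel(struct model_sock *sk)
{
	bool was_pending = sk->work_pending;

	sk->work_pending = false;
	return was_pending;
}

/* Timeout handler: acts only while connecting, always drops its reference. */
static void model_timeout(struct model_sock *sk)
{
	sk->work_pending = false;
	if (sk->state == MODEL_SYN_SENT) {
		sk->state = MODEL_CLOSE;
		sk->err = ETIMEDOUT;
	}
	model_put(sk);
}

/* Point 3: the response arrives first, the timeout fires later as a no-op. */
static int model_response_then_timeout(void)
{
	struct model_sock sk = { .refcnt = 1, .state = MODEL_SYN_SENT };

	model_schedule(&sk);
	sk.state = MODEL_ESTABLISHED; /* peer answered before the timeout */
	model_timeout(&sk);           /* only drops the work's reference */
	assert(sk.state == MODEL_ESTABLISHED && sk.err == 0);
	return sk.refcnt;
}

/* Point 1: a blocking connect gives up and cancels the pending work. */
static int model_cancel_on_give_up(void)
{
	struct model_sock sk = { .refcnt = 1, .state = MODEL_SYN_SENT };

	model_schedule(&sk);
	if (model_cancel(&sk))  /* work was pending: drop its reference */
		model_put(&sk);
	if (model_cancel(&sk))  /* second cancel finds nothing, no double put */
		model_put(&sk);
	sk.state = MODEL_CLOSE;
	return sk.refcnt;
}
```

In both scenarios the count returns to the caller's single reference. The sketch deliberately leaves out locking and the poll() wake-up question raised above.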