Stefano Garzarella
2022-Mar-10 11:00 UTC
[PATCH] vhost/vsock: reset only the h2g connections upon release
On Thu, Mar 10, 2022 at 07:41:54PM +0900, Jiyong Park wrote:>Hi Stefano, > >On Thu, Mar 10, 2022 at 5:59 PM Stefano Garzarella <sgarzare at redhat.com> wrote: >> >> Hi Jiyong, >> >> On Thu, Mar 10, 2022 at 05:18:54PM +0900, Jiyong Park wrote: >> >Filtering non-h2g connections out when determining orphaned connections. >> >Otherwise, in a nested VM configuration, destroying the nested VM (which >> >often involves the closing of /dev/vhost-vsock if there was h2g >> >connections to the nested VM) kills not only the h2g connections, but >> >also all existing g2h connections to the (outmost) host which are >> >totally unrelated. >> > >> >Tested: Executed the following steps on Cuttlefish (Android running on a >> >VM) [1]: (1) Enter into an `adb shell` session - to have a g2h >> >connection inside the VM, (2) open and then close /dev/vhost-vsock by >> >`exec 3< /dev/vhost-vsock && exec 3<&-`, (3) observe that the adb >> >session is not reset. >> > >> >[1] https://android.googlesource.com/device/google/cuttlefish/ >> > >> >Signed-off-by: Jiyong Park <jiyong at google.com> >> >--- >> > drivers/vhost/vsock.c | 4 ++++ >> > 1 file changed, 4 insertions(+) >> > >> >diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c >> >index 37f0b4274113..2f6d5d66f5ed 100644 >> >--- a/drivers/vhost/vsock.c >> >+++ b/drivers/vhost/vsock.c >> >@@ -722,6 +722,10 @@ static void vhost_vsock_reset_orphans(struct sock *sk) >> > * executing. >> > */ >> > >> >+ /* Only the h2g connections are reset */ >> >+ if (vsk->transport != &vhost_transport.transport) >> >+ return; >> >+ >> > /* If the peer is still valid, no need to reset connection */ >> > if (vhost_vsock_get(vsk->remote_addr.svm_cid)) >> > return; >> >-- >> >2.35.1.723.g4982287a31-goog >> > >> >> Thanks for your patch! >> >> Yes, I see the problem and I think I introduced it with the >> multi-transports support (ooops). >> >> So we should add this fixes tag: >> >> Fixes: c0cfa2d8a788 ("vsock: add multi-transports support") >> >> >> IIUC the problem is for all transports that should only cycle on their >> own sockets. Indeed I think there is the same problem if the g2h driver >> will be unloaded (or a reset event is received after a VM migration), it >> will close all sockets of the nested h2g. >> >> So I suggest a more generic solution, modifying >> vsock_for_each_connected_socket() like this (not tested): >> >> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c >> index 38baeb189d4e..f04abf662ec6 100644 >> --- a/net/vmw_vsock/af_vsock.c >> +++ b/net/vmw_vsock/af_vsock.c >> @@ -334,7 +334,8 @@ void vsock_remove_sock(struct vsock_sock *vsk) >> } >> EXPORT_SYMBOL_GPL(vsock_remove_sock); >> >> -void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)) >> +void vsock_for_each_connected_socket(struct vsock_transport *transport, >> + void (*fn)(struct sock *sk)) >> { >> int i; >> >> @@ -343,8 +344,12 @@ void vsock_for_each_connected_socket(void (*fn)(struct sock *sk)) >> for (i = 0; i < ARRAY_SIZE(vsock_connected_table); i++) { >> struct vsock_sock *vsk; >> list_for_each_entry(vsk, &vsock_connected_table[i], >> - connected_table) >> + connected_table) { >> + if (vsk->transport != transport) >> + continue; >> + >> fn(sk_vsock(vsk)); >> + } >> } >> >> >> And all transports that call it. >> >> Thanks, >> Stefano >> > >Thanks for the suggestion, which looks much better. It actually worked well.Thanks for trying this!> >By the way, the suggested change will alter the kernel-module interface (KMI), >which will make it difficult to land the change on older releases where we'd >like to keep the KMI stable [1]. Would it be OK if we let the supplied function >(fn) be responsible for checking the transport? I think that there, in >the future, >might be a case where one needs to cycle over all sockets for inspection or so. >I admit that this would be prone to error, though. > >Please let me know what you think. I don't have a strong preference. I will >submit a revision as you want.I see your point, and it makes sense to keep KMI on stable branches, but mainline I would like to have the proposed approach since all transports use it to cycle on their sockets. So we could do a series with 2 patches: - Patch 1 fixes the problem in all transports by checking the transport in the callback (here we insert the fixes tag so we backport it to the stable branches) - Patch 2 moves the check in vsock_for_each_connected_socket() removing it from callbacks so it is less prone to errors and we merge it only mainline What do you think? Thanks, Stefano