Jason Wang
2021-May-31 03:28 UTC
[PATCH net-next] virtio_net: set link state down when virtqueue is broken
? 2021/5/28 ??6:58, wangyunjian ??:>> -----Original Message----- >>> From: Yunjian Wang <wangyunjian at huawei.com> >>> >>> The NIC can't receive/send packets if a rx/tx virtqueue is broken. >>> However, the link state of the NIC is still normal. As a result, the >>> user cannot detect the NIC exception. >> >> Doesn't we have: >> >> ?????? /* This should not happen! */ >> ??????? if (unlikely(err)) { >> ??????????????? dev->stats.tx_fifo_errors++; >> ??????????????? if (net_ratelimit()) >> ??????????????????????? dev_warn(&dev->dev, >> ???????????????????????????????? "Unexpected TXQ (%d) queue >> failure: %d\n", >> ???????????????????????????????? qnum, err); >> ??????????????? dev->stats.tx_dropped++; >> ??????????????? dev_kfree_skb_any(skb); >> ??????????????? return NETDEV_TX_OK; >> ??????? } >> >> Which should be sufficient? > There may be other reasons for this error, e.g -ENOSPC(no free desc).This should not happen unless the device or driver is buggy. We always reserved sufficient slots: ??????? if (sq->vq->num_free < 2+MAX_SKB_FRAGS) { ??????????????? netif_stop_subqueue(dev, qnum); ...> And if rx virtqueue is broken, there is no error statistics.Feel free to add one if it's necessary. Let's leave the policy decision (link down) to userspace.> >> >>> The driver can set the link state down when the virtqueue is broken. >>> If the state is down, the user can switch over to another NIC. >> >> Note that, we probably need the watchdog for virtio-net in order to be a >> complete solution. > Yes, I can think of is that the virtqueue's broken exception is detected on watchdog. > Is there anything else that needs to be done?Basically, it's all about TX stall which watchdog tries to catch. Broken vq is only one of the possible reason. Thanks> > Thanks > >> Thanks >> >>