Stefano Garzarella
2023-May-31 07:27 UTC
[syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue
On Tue, May 30, 2023 at 6:30?PM <michael.christie at oracle.com> wrote:> > On 5/30/23 11:17 AM, Stefano Garzarella wrote: > > On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie wrote: > >> On 5/30/23 11:00 AM, Stefano Garzarella wrote: > >>> I think it is partially related to commit 6e890c5d5021 ("vhost: use > >>> vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move > >>> worker thread fields to new struct"). Maybe that commits just > >>> highlighted the issue and it was already existing. > >> > >> See my mail about the crash. Agree with your analysis about worker->vtsk > >> not being set yet. It's a bug from my commit where I should have not set > >> it so early or I should be checking for > >> > >> if (dev->worker && worker->vtsk) > >> > >> instead of > >> > >> if (dev->worker) > > > > Yes, though, in my opinion the problem may persist depending on how the > > instructions are reordered. > > Ah ok. > > > > > Should we protect dev->worker() with an RCU to be safe? > > For those multiple worker patchsets Jason had asked me about supporting > where we don't have a worker while we are swapping workers around. To do > that I had added rcu around the dev->worker. I removed it in later patchsets > because I didn't think anyone would use it. > > rcu would work for your case and for what Jason had requested.Yeah, so you already have some patches? Do you want to send it to solve this problem? Thanks, Stefano
Mike Christie
2023-May-31 15:15 UTC
[syzbot] [kvm?] [net?] [virt?] general protection fault in vhost_work_queue
On 5/31/23 2:27 AM, Stefano Garzarella wrote:> On Tue, May 30, 2023 at 6:30?PM <michael.christie at oracle.com> wrote: >> >> On 5/30/23 11:17 AM, Stefano Garzarella wrote: >>> On Tue, May 30, 2023 at 11:09:09AM -0500, Mike Christie wrote: >>>> On 5/30/23 11:00 AM, Stefano Garzarella wrote: >>>>> I think it is partially related to commit 6e890c5d5021 ("vhost: use >>>>> vhost_tasks for worker threads") and commit 1a5f8090c6de ("vhost: move >>>>> worker thread fields to new struct"). Maybe that commits just >>>>> highlighted the issue and it was already existing. >>>> >>>> See my mail about the crash. Agree with your analysis about worker->vtsk >>>> not being set yet. It's a bug from my commit where I should have not set >>>> it so early or I should be checking for >>>> >>>> if (dev->worker && worker->vtsk) >>>> >>>> instead of >>>> >>>> if (dev->worker) >>> >>> Yes, though, in my opinion the problem may persist depending on how the >>> instructions are reordered. >> >> Ah ok. >> >>> >>> Should we protect dev->worker() with an RCU to be safe? >> >> For those multiple worker patchsets Jason had asked me about supporting >> where we don't have a worker while we are swapping workers around. To do >> that I had added rcu around the dev->worker. I removed it in later patchsets >> because I didn't think anyone would use it. >> >> rcu would work for your case and for what Jason had requested. > > Yeah, so you already have some patches? > > Do you want to send it to solve this problem? >Yeah, I'll break them out and send them later today when I can retest rebased patches.