I''ve updated this patch to compile against current unstable and implemented a new protocol which I think fixes the data integrity issue present in all previous versions where URBs were not correctly stalled on error. In this version of the driver, an error in the BE causes the BE URB queue to stall and all URBs present and subsequently arriving in the BE to be unlinked and failed back to the FE. The FE catches the unlinking URBs and hangs onto them until the callbacks of any failing URBs and any URBs explicitly unlinked by the FE driver as a result have completed. After the FE error and unlink completions are quiesced, the FE retries any URBs that remain back to the BE, sending the first with a flag that clears the stall in the BE. This was fairly ugly, and the locking is a bit of a nightmare. The main problem is that the USB stack has the semantics that it guarantees that the URB queue will be stalled only until the URB completion returns which means that there is a requirement to to URB unlinking nested in the BE completion. I have seen a couple of URB errors handled without the kernel crashing but I would expect there to be some bugs in this code somewhere. Also the xenbus stuff has been stirred around a few times since I wrote it correctly and I have lost interest in proving to myself that the current hooks into xenbus are correct. The state machines in ub_xb_channel and uf_xb_channel which coordinate with xenbus were derived by trial and error and unlike the rest of the code are not intended to be correct by design. Someone has messed around with the usb stuff in the python code since the last version too, possibly to implement usb support for the HVM guests. I changed this code to use "usbport" rather than usb and hopefully it won''t conflict. This is untested. The driver used to be modular and an older version did correctly handle module load and unload in both the FE and BE including quiescing ongoing I/O. I was told to simplify the driver and remove this functionality. The current version is therefore not modular. I had some correspondence with the author of the USB over IP patch and we came to the conclusion that USB/IP does not address the stalling requirement above. We were not 100% sure whether this is a problem but I think it probably is. This driver might perhaps be a useful example of how to solve the problem for the USB/IP code. I had put this work on hold waiting for an upstream merge of the other drivers in the hope that it would force a cleanup of some of the xenbus design. Recently I discovered that I''ll be leaving the IBM Xen team in the near future and I''m not sure how much more time I can spend on this so this is an attempt to get the driver in the best possible shape for someone else to pick up. The xen core team should decide whether they really want a USB split driver at all or if they want to go with USB/IP or some common code approach. As for this driver, the basic structure of the code is OK. I think the locking is probably OK. Lots more testing is required as well as regression tests for xm-test. Some formatting issues probably remain too. It''s probably a bit too abstract for some of you---if so, you can always read the .ko files ;-) Sniff tested against f426f6e646eb. Signed-off-by: Harry Butterworth <butterwo@uk.ibm.com> _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Thanks, Harry. I know you''ve put a lot of work into this. Peace. Andrew On Tue, 2006-10-03 at 18:10 +0100, Harry Butterworth wrote:> I''ve updated this patch to compile against current unstable and > implemented a new protocol which I think fixes the data integrity issue > present in all previous versions where URBs were not correctly stalled > on error. > > In this version of the driver, an error in the BE causes the BE URB > queue to stall and all URBs present and subsequently arriving in the BE > to be unlinked and failed back to the FE. > > The FE catches the unlinking URBs and hangs onto them until the > callbacks of any failing URBs and any URBs explicitly unlinked by the FE > driver as a result have completed. > > After the FE error and unlink completions are quiesced, the FE retries > any URBs that remain back to the BE, sending the first with a flag that > clears the stall in the BE. > > This was fairly ugly, and the locking is a bit of a nightmare. The main > problem is that the USB stack has the semantics that it guarantees that > the URB queue will be stalled only until the URB completion returns > which means that there is a requirement to to URB unlinking nested in > the BE completion. > > I have seen a couple of URB errors handled without the kernel crashing > but I would expect there to be some bugs in this code somewhere. > > Also the xenbus stuff has been stirred around a few times since I wrote > it correctly and I have lost interest in proving to myself that the > current hooks into xenbus are correct. The state machines in > ub_xb_channel and uf_xb_channel which coordinate with xenbus were > derived by trial and error and unlike the rest of the code are not > intended to be correct by design. > > Someone has messed around with the usb stuff in the python code since > the last version too, possibly to implement usb support for the HVM > guests. I changed this code to use "usbport" rather than usb and > hopefully it won''t conflict. This is untested. > > The driver used to be modular and an older version did correctly handle > module load and unload in both the FE and BE including quiescing ongoing > I/O. I was told to simplify the driver and remove this functionality. > The current version is therefore not modular. > > I had some correspondence with the author of the USB over IP patch and > we came to the conclusion that USB/IP does not address the stalling > requirement above. We were not 100% sure whether this is a problem but > I think it probably is. This driver might perhaps be a useful example > of how to solve the problem for the USB/IP code. > > I had put this work on hold waiting for an upstream merge of the other > drivers in the hope that it would force a cleanup of some of the xenbus > design. Recently I discovered that I''ll be leaving the IBM Xen team in > the near future and I''m not sure how much more time I can spend on this > so this is an attempt to get the driver in the best possible shape for > someone else to pick up. > > The xen core team should decide whether they really want a USB split > driver at all or if they want to go with USB/IP or some common code > approach. > > As for this driver, the basic structure of the code is OK. I think the > locking is probably OK. Lots more testing is required as well as > regression tests for xm-test. Some formatting issues probably remain > too. It''s probably a bit too abstract for some of you---if so, you can > always read the .ko files ;-) > > Sniff tested against f426f6e646eb. > > Signed-off-by: Harry Butterworth <butterwo@uk.ibm.com> > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
I tried to get my webcam working today and the isoc URB path had a bug in it which caused all isoc URBs to fail back. This hammered the new error recovery code and exposed a couple of bugs which I''ve fixed in this latest version of the patch. I fixed the bug in the handling of isoc URBs and implemented emulation of the set_interface command in the BE. The webcam is now discovered but camorama can''t connect to it. None of the URBs seem to be failing so I''ll have to compare with the behaviour under native Linux to see what the difference is. Aside from the webcam, all of my other USB devices are now working with this version of the patch. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
I forgot... Signed-off-by: Harry Butterworth <butterwo@uk.ibm.com> The problem with the webcam turned out to be that the required driver is not in the mainline Linux tree---the webcam worked under native only because ubuntu ships the extra driver. So the patch attached to the parent is good AFAIK. It could do with being tested with a wider range of USB devices than I possess. Owing to the way it hooks into the Linux USB stack, there are likely to be a few more commands that need to be emulated in the BE to get specific devices working. The Linux USB folks were talking about allowing drivers to bind to devices rather than interfaces which would make things cleaner. On Wed, 2006-10-04 at 22:22 +0100, Harry Butterworth wrote:> I tried to get my webcam working today and the isoc URB path had a bug > in it which caused all isoc URBs to fail back. This hammered the new > error recovery code and exposed a couple of bugs which I''ve fixed in > this latest version of the patch. > > I fixed the bug in the handling of isoc URBs and implemented emulation > of the set_interface command in the BE. > > The webcam is now discovered but camorama can''t connect to it. None of > the URBs seem to be failing so I''ll have to compare with the behaviour > under native Linux to see what the difference is. > > Aside from the webcam, all of my other USB devices are now working with > this version of the patch. > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel