Summary: Progress accelerating. The current state of play is that the 2.6 USB back-end driver domain correctly receives a usbif_be_create message during configuration of the 2.6 USB front-end domain, after which configuration fails due to unfinished xend usb driver domain support.

Detailed status:

Monday was a holiday.

Tuesday: Prior to vacation, I was testing new 2.6 front-end code against old 2.4 back-end code. Changes in the USB stack between 2.4 and 2.6 meant that the sequence of USB requests had changed, and the new 2.6 front end was issuing a get descriptor request which was failing in the back-end code. Rather than spend time fixing the 2.4 back-end code to handle the request explicitly, I decided to start work on the 2.6 back-end and get it to the point where I can test new 2.6 back and front-end code together. BK pull brings in kernel 2.6.11 containing changes in the linux USB interface and the ring macros. Fixes for ring macro changes are thankfully trivial. USB interface changes are more subtle. Build and install up-to-date code on test machine. Exclude USB HCD PCI device from Dom0 --- will attempt to build back-end code in a driver domain to avoid continual reboots.

Wednesday: Give USB HCD PCI device to xen-test domain. Move back-end code back to 2.4 tree. Create skeleton 2.6 back-end code. Rebuild kernels and install in xen-test domain. uhci_hcd loads in xen-test domain and finds disgo key. usbback skeleton loads in xen-test domain and (successfully) does nothing. Port initial control interface code from 2.4. Try to configure xen-test as a usb driver domain. Docs indicate an sxp config file is required to configure driver domains. Can't find an example. Use grep on python code to find domain creation code. USB driver domain support seems to be missing from config options. I try to add it and install new xen tools on test box.
Create another test domain for the front-end since I think the back-end won't get any control protocol messages until the front-end starts up, which means I'll have to test them both at the same time.

Thursday: Investigate whether usb driver domains are supported by xend. They are not. Spend the day reading the xend usb code. Little bits for driver domains missing all over the place. Object model doesn't seem to support required multiplicity. Bug in inheritance where child class initialises grandparent instead of parent. Much redundant code: child reimplements methods of parent (workaround for bug?). Time to learn Python.

Friday: Incremental rewrite of xend usb code. Remove redundant duplication of methods of parent class of factory. Implement backend domain configuration parameter and change usb port config specification to key=value format as used by netif. Get about halfway through xend usb code rewrite.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Summary: USB 2.6 back and front-end driver modules can now be loaded and unloaded, and the new xend usb code sequences the control messages to establish the connection and tear it down as necessary. I'm going back to forward porting the remaining USB specific 2.4 code to complete the 2.6 back-end before resuming testing on the still incomplete 2.6 front-end.

Detailed status:

I spent some time trying to finish the new (usb driver domain support) xend usb code and got to the point where I thought I had most of the code in place apart from the core state machine which sequences the control messages between the front and back-end. The state machine support code was there, but I found it impossible to write the actual state machine because I didn't have a complete understanding of the control message protocol and the behaviour of the xend messaging support code.

I use a technique called 'sequence enumeration' for writing state machine code, which consists of: defining a mapping between the inputs to a system and stimuli to the state machine; defining a set of state machine responses, which may be atomic or may complete asynchronously by generating a stimulus; and enumerating all possible states starting at an initial state by considering, for each state, all of the possible stimuli that may occur in that state, what the desired state machine responses are, and whether the resulting state is a new state or is equivalent to a state previously considered. Sequence enumeration is a fairly painful process but it forces you to think about all the corner cases and, provided you are rigorous enough in defining the input->stimuli mapping and the responses, allows you to write code which is likely to be correct.
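As an illustration of the sequence enumeration technique described above, here is a minimal Python sketch. The stimuli, states and responses are hypothetical (a toy driver with one interface, not the real xend usb control protocol). The point is only the worklist: start at the initial state, apply every stimulus, and note whether each resulting state is new or has already been considered.

```python
from collections import deque

# Hypothetical stimuli for a toy driver; not the real xend usb messages.
STIMULI = ("driver_up", "connect", "disconnect", "driver_down")


def respond(state, stimulus):
    """Return (response, next_state) for one stimulus in one state.

    A state is a tuple (driver_loaded, interface_connected).
    """
    loaded, connected = state
    if stimulus == "driver_up":
        return ("ack", (True, connected)) if not loaded else ("ignore", state)
    if stimulus == "driver_down":
        # Going down implicitly drops any connection.
        return ("ack", (False, False)) if loaded else ("ignore", state)
    if stimulus == "connect":
        if loaded and not connected:
            return ("ack", (True, True))
        return ("reject", state)
    if stimulus == "disconnect":
        if connected:
            return ("ack", (loaded, False))
        # A disconnect arriving after the connection has gone is harmless.
        return ("ignore", state)
    raise ValueError(stimulus)


def enumerate_states(initial):
    """Breadth-first enumeration of every reachable state and transition."""
    table = {}                    # (state, stimulus) -> (response, next_state)
    seen = {initial}
    todo = deque([initial])
    while todo:
        state = todo.popleft()
        for stimulus in STIMULI:
            response, nxt = respond(state, stimulus)
            table[(state, stimulus)] = (response, nxt)
            if nxt not in seen:   # a new state: it must be enumerated too
                seen.add(nxt)
                todo.append(nxt)
    return seen, table


states, table = enumerate_states((False, False))
```

In this toy the worklist empties after three states. A state for which no sensible response to some stimulus can be defined is the kind of hole that stops a real enumeration from closing.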
In the case of the xend usb code I couldn't get the state machine to close because of confusion about a few aspects of the control message protocol concerning unloading of modules. It's unclear whether a driver status down message is a request for xend to sequence stuff to allow the driver to go down or an indication to xend that the driver has already sequenced stuff and is going down. Also, if the front end is supposed to issue disconnects for all interfaces before going down, then there is the possibility of an interface status disconnected message resulting from the back-end going down crossing in the post with an interface disconnect request from the front end.

It's not particularly hard to construct a protocol to solve this kind of problem from scratch (I've done it before for multi-pathing device drivers and cluster inter-node communication) but I found it very difficult trying to reverse engineer the intent of the existing protocol. Also, I didn't understand the system behaviour that would result if messages went unacknowledged as a result of a driver unloading, so I tried very hard for some time to avoid any unacknowledged messages.

I decided to defer writing the xend usb state machine code until it was actually required for some testing, when I could determine the system behaviour experimentally.

I started testing with the new xend usb code and found that I couldn't reliably start and stop domains without getting zombie domains that would force me to reboot. The problem turned out to be in controller.py: I had to change the BackendController to be a child of CtrlMsgRcvr rather than Controller and change lostChannel to call self.factory.backendControllerClosed in place of self.backend.backendClosed.

Once I could start and stop my skeletal USB driver domains reliably, I ported a chunk of the 2.4 USB back-end into the 2.6 skeleton back-end and coded the state machine to handle receipt of the control interface messages in the back-end.
I made the assumption that, when unloading the back-end module, the back-end would stop using all front-end memory before sending the back-end driver status down message and the xend usb code would stop sending the back-end control messages before acknowledging the driver status down message. On the first attempt, I had the back-end make sure to respond to all received control messages.

The back-end state machine seemed to come out OK so I went on to the front-end state machine (which lacked shutdown before) and tried to enumerate a complete solution that included shutdown. To start with, I assumed that the front-end would disconnect its interfaces before sending a driver status down message and I again tried to ensure that the front-end would respond to all received control messages. With these assumptions, I couldn't get the state machine to close because of problems with disconnects crossing in the post.

I tried again, this time assuming that the front-end would send a driver status down first as a request to xend to get the back-end to free up resources and disconnect the interfaces so the front-end could exit gracefully. The state machine for the second attempt also got too complicated.

In the end, I came to the conclusion that the interpretation of the existing protocol resulting in the simplest solution is as follows. The back-end driver status down is an indication that the back-end has finished using all front-end resources and won't necessarily respond to outstanding or subsequent control messages. The front-end driver status down is an indication that the front-end won't send any more control messages and won't necessarily respond to any outstanding or subsequent control messages, and is also a request that xend force the back-end(s) to free up all front-end resources before acknowledging. Given this interpretation, the front end state machine came out relatively easily.
After this third attempt at the front and back end state machines, I went back to the xend usb state machine in the middle and wrote the straight through path for loading the back-end, loading the front-end, unloading the front-end and then unloading the back-end.

I spent a very painful day trying to debug a problem in this sequence where messages to the back end were hanging xend. This turned out to be a whitespace issue in the xend usb code I'd written. Being a python novice, I had formatted a call to packMsg as follows...

    msg = packMsg
    ( ...lots of parameters
      on multiple lines )
    writeRequest( msg )

...which is how I'd do it in C. Unfortunately this fails silently in python (since---I'm guessing---msg is set to the function packMsg and the parameters are interpreted as a statement with no effect).

After fixing this issue the messaging sprang back into life and now, after another day of debugging and a bit more hacking on the xend usb state machine, the module unloading and loading is working well enough to allow me to get back to porting the USB specific bits of the 2.4 back-end code into the 2.6 framework.

The xend usb state machine isn't yet a complete enumeration (and the other state machines are a bit ragged after being reworked three times) but it's good enough for the intended purpose: to improve the turn-around-time for debugging the USB driver code by eliminating reboots. If I had known in advance it was going to be this painful to make it work I would have put up with the reboots, but now that it is working it ought to be very convenient.
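For readers who have not met the whitespace pitfall described above, here is a small self-contained Python sketch of it. The pack_msg function is a hypothetical stand-in, not xend's packMsg, and its signature and the payload are assumptions made only for this example.

```python
def pack_msg(msg_type, payload):
    """Hypothetical stand-in for a message-packing helper."""
    return {"type": msg_type, "payload": payload}


# Broken layout: the assignment ends at the newline, so msg_broken is bound
# to the function object itself. The parenthesised lines that follow form a
# separate tuple expression that is evaluated and then thrown away.
msg_broken = pack_msg
(
    "usbif_be_create",
    {"domid": 1},   # illustrative payload only
)

# Working layout: the opening parenthesis sits on the same line as the
# function name, so the argument list may continue over several lines.
msg_ok = pack_msg(
    "usbif_be_create",
    {"domid": 1},
)

# The broken form raises no error at this point; the mistake only shows up
# later, when something expects a packed message and gets a function.
broken_is_function = msg_broken is pack_msg
```

Python accepts the first form without complaint because a bare expression is a valid statement, which matches the guess made in the report.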
Summary (since last status):

Implemented complete enumeration of usb xend backend interface state machine.

Ported most of 2.4 back-end functionality to 2.6 back-end with the exception of support for isochronous transfers.

Resumed testing (incomplete) 2.6 front-end against 2.6 back-end. Currently the initial get descriptor request fails in the 2.6 back-end as it did against the 2.4 back-end.

Still to do:

Unknown amount of testing/handling of USB protocol requests (e.g. failing get descriptor request) in the back-end to make it work.

Finish porting remaining 2.4 code for back and front-ends for isochronous transfers etc.

Sync up with latest xend rewrite and latest unstable (my snapshot is a couple of months old now).

Create and submit patch.

Testing of different USB device types.

2.6/2.4 interoperability.

Testing of multiple back-ends/multiple front ends/multiple devices etc.

Testing on SMP guests?
Summary (since last status):

The 2.6 USB virt driver can now mount a filesystem in a 2.6 FE domain from my USB key driven by a 2.6 USB BE driver domain. Have successfully done a small amount of I/O with miscompare checking. Happy Happy Joy Joy.

Perhaps 2 weeks away from submitting a patch---want to fix items marked *** below first.

Details:

I spent some time implementing USB protocol emulation in the back-end to handle the USB control requests that were failing (starting with the get-descriptor request) but then discarded the emulation code when I discovered that the problem was due to a bug in the FE (usb_pipein( urb->pipe ) returns 0x00 or 0x80 and was being assigned to a single bit bit-field). I'm not sure why this didn't cause a problem for the 2.4 driver. The 2.4 back-end also doesn't seem to correctly map all of the FE buffer if a transfer is > 4k but doesn't start at a 4k boundary. After fixing the transfer direction bug, the control requests all sprang into life. The other bug might not be an issue since all the bulk transfers I've seen so far have been aligned.

Managed to mount the FS on the USB key but the data was being corrupted. I did some testing with a file full of random data and a recent dd implementation with support for O_DIRECT to discover that 4K I/O was OK but 8K or larger I/O would sometimes swap around 4K chunks of data within an I/O. This turned out to be because I wasn't preserving the correct URB serialisation---I had assumed that URBs didn't have order dependencies and was sometimes issuing them out of order. Once I had reviewed the URB submission path and ensured correct serialisation, the I/O started working.

I spent a couple of days fixing module unload of the BE. Previously it only worked if the uhci-hcd module was unloaded before the usbback module. Now it works either way around and I'm not aware of any constraints on module loading/unloading.
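The transfer direction bug mentioned in the details above is easy to reproduce outside the kernel. The sketch below models, in Python, what C does when a value is assigned to an unsigned bit-field: only the low bits that fit are kept. The function names are hypothetical, and normalising the value to 0 or 1 is shown as one possible fix, not necessarily the one applied in the FE.

```python
USB_DIR_IN = 0x80  # value the report says usb_pipein() yields for IN transfers


def store_in_bitfield(value, width=1):
    """Model C semantics: an unsigned bit-field keeps only the low bits."""
    return value & ((1 << width) - 1)


def direction_flag_buggy(pipe_in):
    """Store the raw 0x00/0x80 value in a one-bit field."""
    return store_in_bitfield(pipe_in)


def direction_flag_fixed(pipe_in):
    """Collapse any non-zero value to 1 before storing it."""
    return store_in_bitfield(1 if pipe_in else 0)


buggy = direction_flag_buggy(USB_DIR_IN)   # bit 7 does not fit in one bit
fixed = direction_flag_fixed(USB_DIR_IN)
```

With the raw value, an IN transfer is recorded as 0, so every request looks like an OUT transfer. A usb_pipein that returns 1 or 0 would not trigger this.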
I put in a kernel option to compile tracing (printfs for debugging) into the usb FE and BE.

Still to do:

***Implement URB unlink/timeout functionality. This part of the USB api has changed since 2.4. The 2.6 code currently asserts if an unlink is attempted and doesn't maintain any timeouts.

***Finish porting remaining 2.4 code for back and front-ends for interrupt and isochronous transfers.

Support for USB devices with multiple interfaces. The 2.4 code is broken in this area. The 2.6 code refuses to drive devices with multiple interfaces. Need to claim and release all of the interfaces of a multiple interface device in one go.

BE uses allocate_empty_lowmem_region but there doesn't seem to be an equivalent free function for module unload. Need to fix this memory leak.

Grant table support.

Review linux driver model reference counting.

Review error return codes.

***Sync up with latest xend rewrite and latest unstable (my snapshot is a few months old now).

***Implement support in vm-tools.

***Update copyright notices.

***Create and submit patch.

Testing of different USB device types (esp. blade centre CD drive).

2.6/2.4 interoperability.

Test when compiled into kernel rather than modular.

Testing of multiple back-ends/multiple front ends/multiple devices etc.

Testing on SMP guests?
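As an aside on the unaligned-buffer observation in the details above: the number of 4K pages a buffer touches depends on its starting offset as well as its length. The following sketch is plain arithmetic, not code from either back-end, and the function name is hypothetical.

```python
PAGE_SIZE = 4096


def pages_spanned(start, length):
    """Number of pages touched by the byte range [start, start + length)."""
    if length <= 0:
        return 0
    first_page = start // PAGE_SIZE
    last_page = (start + length - 1) // PAGE_SIZE
    return last_page - first_page + 1


aligned = pages_spanned(0, 8192)      # starts on a page boundary
unaligned = pages_spanned(100, 8192)  # starts 100 bytes into a page
```

An 8K transfer starting on a page boundary touches two pages, but the same transfer starting 100 bytes into a page touches three, so a mapping sized from the length alone would come up one page short.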
> The 2.6 USB virt driver can now mount a filesystem in a 2.6 FE domain
> from my USB key driven by a 2.6 USB BE driver domain. Have successfully
> done a small amount of I/O with miscompare checking.

That's fantastic news! Great work!

> I spent some time implementing USB protocol emulation in the back-end to
> handle the USB control requests that were failing (starting with the
> get-descriptor request) but then discarded the emulation code when I
> discovered that the problem was due to a bug in the FE
> (usb_pipein( urb->pipe ) returns 0x00 or 0x80 and was being assigned to
> a single bit bit-field). I'm not sure why this didn't cause a problem
> for the 2.4 driver.

The definition of usb_pipein on 2.4 is different and returns 1 or 0 for true or false, so this bug would only apply in 2.6. Some of the protocol emulation stuff might be useful if we encounter awkward devices, so it's good that you at least had a chance to look at it.

> The 2.4 back-end also doesn't seem to correctly map
> all of the FE buffer if a transfer is > 4k but doesn't start at a 4k
> boundary.

It's not clear to me why that particular case should break but I can see a couple of fixes needed in the logic for unaligned buffers, anyhow. I'll go through and tweak bits of the code later, watch this space... I might test domain suspend / resume with USB devices while I'm at it.

> ***Finish porting remaining 2.4 code for back and front-ends for
> interrupt and isochronous transfers.

Interrupts are fairly close to control / bulk transfers. Isochronous requires a little more logic but the 2.4 implementation is reasonably straightforward. You might like to consider updating the code to use a slab cache for the iso schedule buffers in the frontend driver.

> Support for USB devices with multiple interfaces. The 2.4 code is broken
> in this area. The 2.6 code refuses to drive devices with multiple
> interfaces.
> Need to claim and release all of the interfaces of a
> multiple interface device in one go.

How is it broken? I think the 2.4 backend claims all interfaces attached to a port (with the probe function being called for each interface by the USB core), although I haven't tried driving other interfaces from the frontend.

> Grant table support.

Should be relatively similar to the usage in the block device code, since we never transfer page ownership.

Cheers,
Mark