Glauber de Oliveira Costa
2006-Dec-04 21:40 UTC
[Xen-devel] [PATCH] Safely finish closing protocol when guest fails in blkfront
If a guest hits an error and aborts the connection of a block device, its online state, which is set at device-creation time, prevents the device from being properly cleaned up. A fix for this follows.

--
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
Keir Fraser
2006-Dec-05 07:39 UTC
Re: [Xen-devel] [PATCH] Safely finish closing protocol when guest fails in blkfront
On 4/12/06 9:40 pm, "Glauber de Oliveira Costa" <gcosta@redhat.com> wrote:

> If a guest finds any error and aborts the connection of a block device,
> its online state set at device create phase will stop it from being
> properly cleaned up.
>
> Follows a fix for it.

Assignment and unassignment of physical resources is really a tools issue. Tools should really be integrated with device-hotplug success/failure anyway -- for example, it is likely the initiator would like confirmation of success/failure in most cases.

-- Keir
Glauber de Oliveira Costa
2006-Dec-05 10:22 UTC
Re: [Xen-devel] [PATCH] Safely finish closing protocol when guest fails in blkfront
On Tue, Dec 05, 2006 at 07:39:11AM +0000, Keir Fraser wrote:

> On 4/12/06 9:40 pm, "Glauber de Oliveira Costa" <gcosta@redhat.com> wrote:
>
> > If a guest finds any error and aborts the connection of a block device,
> > its online state set at device create phase will stop it from being
> > properly cleaned up.
> >
> > Follows a fix for it.
>
> Assignment and unassignment of physical resources is really a tools issue.
> Tools should really be integrated with device-hotplug success/failure anyway
> -- for example, it is likely the initiator would like confirmation of
> success/failure in most cases.

Agreed. But what if, after a proper initialization, the frontend finds an error and starts the Closing protocol? What will happen is that the test

if (xenbus_dev_is_online(dev))

will cause the device not to be unregistered. At this point, the backend does not see frontend changes. (Putting the backend in Closing leads the frontend through Closing and Closed, but the backend never sees the frontend closing, so it never goes to Closed.)

Given that, what can the tools do? For now, this is what leads me to believe that arbitrary frontend-failure cases should be handled in the frontend.

--
Glauber de Oliveira Costa
Red Hat Inc.
"Free as in Freedom"
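The effect of that test can be illustrated with a small self-contained model. This is only a sketch, not the actual blkback code: `struct model_dev`, `enum model_state`, `model_dev_is_online()` and `backend_on_frontend_closed()` are hypothetical names made up for illustration, and the only behaviour taken from the thread is that an online check guards the unregister step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the states named in the thread. */
enum model_state { MODEL_CONNECTED, MODEL_CLOSING, MODEL_CLOSED };

/* Hypothetical model of a backend device; not the kernel's struct xenbus_device. */
struct model_dev {
    enum model_state backend_state;
    bool online;      /* set at device-creation time, per the thread */
    bool registered;  /* true until the backend unregisters the device */
};

/* Stand-in for the xenbus_dev_is_online(dev) test quoted above. */
static bool model_dev_is_online(const struct model_dev *dev)
{
    assert(dev != NULL);
    return dev->online;
}

/*
 * Model of the backend reacting to the frontend reaching Closed.
 * Assumption: the unregister step is skipped while the device is online.
 */
static void backend_on_frontend_closed(struct model_dev *dev)
{
    assert(dev != NULL);
    dev->backend_state = MODEL_CLOSED;
    if (model_dev_is_online(dev))
        return;               /* device stays registered: no cleanup */
    dev->registered = false;  /* cleanup happens only when offline */
}
```

In this model, a device created with `online` set stays registered after a frontend-initiated close, which is the stuck state described above; the same close with `online` cleared unregisters the device.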
Glauber de Oliveira Costa
2006-Dec-05 20:25 UTC
Re: [Xen-devel] [PATCH] Safely finish closing protocol when guest fails in blkfront
> > Assignment and unassignment of physical resources is really a tools issue.
> > Tools should really be integrated with device-hotplug success/failure anyway
> > -- for example, it is likely the initiator would like confirmation of
> > success/failure in most cases.
>
> Agree. But what if after properly initiation, frontend finds an error
> and starts Closing protocol? What will happen is that the test
>
> if (xenbus_dev_is_online(dev))
>
> will cause the device to not be unregistered. At this point, it do not
> see frontend changes. (putting backend in closing leads to frontend
> closing,closed, but backend never see frontend closing, never going to
> closed).
>
> Given that, what tools can do ? At the current point, this is what leads
> me to believe that arbitrary frontend-failure cases should be handled in the frontend.

Keir,

Let me try to clarify this. (After all, I just realised that even if this is the right path, there's a piece missing.)

Right now, I think that handling failures in the frontend code is the correct choice, because failures can happen at pretty much any time. According to the diagram at http://wiki.xensource.com/xenwiki/XenSplitDrivers, a closedown initiated by the frontend should end with the device being unregistered, and I don't think the tools will _ever_ be able to do that. The best they can do is wait to see whether the device is properly connected, but what if the error happens after that?

If this is indeed the real scenario, the missing piece would be to delete the error message, to avoid unregistering devices that should not be unregistered.

If you can guarantee that, now and in the future, errors on the frontend side will _always_ be confined to the pre-Connect steps, then my proposal is to set the online flag only after the device is connected. That would ensure the device is properly unregistered, and the tools would have a way to know whether the process was successful (online = 1). Any comments on that?
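The second proposal, setting online only once the device is connected, can be sketched in the same spirit. This is a hypothetical model for discussion only, not a patch: `struct sketch_dev` and the `sketch_*` functions are invented names, and the unregister rule is assumed to be the same online check discussed earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified states; not the real xenbus state enum. */
enum sketch_state { SKETCH_INITIALISING, SKETCH_CONNECTED, SKETCH_CLOSED };

/* Hypothetical model of a backend device. */
struct sketch_dev {
    enum sketch_state state;
    bool online;      /* proposal: 1 only after a successful connect */
    bool registered;
};

/* Device creation no longer marks the device online. */
static void sketch_create(struct sketch_dev *dev)
{
    assert(dev != NULL);
    dev->state = SKETCH_INITIALISING;
    dev->online = false;
    dev->registered = true;
}

/* Reaching Connected sets online, which tools could read as success. */
static void sketch_connect(struct sketch_dev *dev)
{
    assert(dev != NULL);
    dev->state = SKETCH_CONNECTED;
    dev->online = true;
}

/* Assumed rule: a frontend close unregisters the device only if it is offline. */
static void sketch_frontend_closed(struct sketch_dev *dev)
{
    assert(dev != NULL);
    dev->state = SKETCH_CLOSED;
    if (!dev->online)
        dev->registered = false;
}
```

Under these assumptions, a frontend failure before Connected leaves `online` at 0 and the device is unregistered. A failure after `sketch_connect()` still leaves the device registered, which is why the proposal only holds if frontend errors are confined to the pre-Connect steps.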
I assume I don't understand exactly what online is for. At first I thought it was related to save & restore, but I am currently able to save & restore with online always set to 0. Can you shed some light on it?

As soon as you answer those questions, I'll proceed with the right approach to fix this.

--
Glauber de Oliveira Costa.
"Free as in Freedom"
Add your comments to GPLv3 at: http://gplv3.fsf.org/comments/gplv3-draft-2.html