thr3ads.net - Xen devel - [Xen-devel] Linux-specific blkif.py change [Nov 2006]

John Levon

2006-Nov-02 01:42 UTC

[Xen-devel] Linux-specific blkif.py change

Changeset 11831:f5321161c649 has broken non-Linux domUs with this
change:

         devid = blkif.blkdev_name_to_number(dev)
+        if not devid:
+            raise VmError('Unable to find number for device (%s)' % (dev))
+

The immediate problem is that Solaris domU's have "0" for dev for the
first disk. So it's presumably matched on the hex re in util/blkif.py,
returning 0 and failing this incorrect check. There are other problems:

1) util/blkif.py logs to xend-debug.log if the stat() fails. This is
needlessly chatty, and indicates there's some kind of error, when there
is not.

2) util/blkif.py has a load of Linux gook for getting the device
numbers. Luckily Solaris has a completely different naming scheme, but
wouldn't this go horribly wrong if a domU just happened to use the same
name, different device number?

It's not clear to us why Linux even needs to do this?

For now I think the change needs backing out so non-Linux domU's can
work again. I'm not sure of a better fix; suggestions welcome.
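
To illustrate the concern in point 2, here is a minimal sketch of a dom0-side name-to-number translation. It is not the actual util/blkif.py code: the function name `linux_name_to_number` and the simplified patterns are assumptions made for illustration. It relies only on the usual Linux conventions (SCSI disks on major 8 with 16 minors per disk, first-channel IDE disks on major 3 with 64 minors per disk) and on a hex fallback like the one described above.

```python
import re


def linux_name_to_number(name):
    """Hypothetical dom0-side guess at a Linux device number, or None.

    Simplified for illustration: only sd[a-p] and hd[ab] are recognised,
    and the number is encoded as (major << 8) | minor.
    """
    m = re.fullmatch(r'sd([a-p])(\d{1,2})?', name)
    if m:
        disk = ord(m.group(1)) - ord('a')
        part = int(m.group(2) or 0)
        return (8 << 8) | (disk * 16 + part)   # SCSI: major 8

    m = re.fullmatch(r'hd([ab])(\d{1,2})?', name)
    if m:
        disk = ord(m.group(1)) - ord('a')
        part = int(m.group(2) or 0)
        return (3 << 8) | (disk * 64 + part)   # IDE: major 3

    # Fallback: treat the name as a bare hex number, so "0" becomes 0.
    if re.fullmatch(r'(0x)?[0-9a-fA-F]+', name):
        return int(name, 16)

    return None


if __name__ == '__main__':
    for name in ('sda', 'hda1', '0', 'c0d0s0'):
        print(name, linux_name_to_number(name))
```

Any guest that happens to call a disk "sda" gets the Linux number for it whether or not that number means anything to the guest, which is the mismatch point 2 worries about.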

regards
john

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Brendan Cully

2006-Nov-02 02:03 UTC

Re: [Xen-devel] Linux-specific blkif.py change

On 1-Nov-06, at 5:42 PM, John Levon wrote:
>
> Changeset 11831:f5321161c649 has broken non-Linux domUs with this
> change:
>
>          devid = blkif.blkdev_name_to_number(dev)
> +        if not devid:
> +            raise VmError('Unable to find number for device (%s)' % (dev))
> +
>
> The immediate problem is that Solaris domU's have "0" for dev for the
> first disk. So it's presumably matched on the hex re in util/blkif.py,
> returning 0 and failing this incorrect check. There are other
> problems:

I don't know about the other stuff, but changing the check to

if devid is None:

should solve your immediate problem, right?
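
The difference between the two checks can be shown with a small sketch. The helper names below are hypothetical; only the two conditions (`not devid` and `devid is None`) come from the thread.

```python
def rejected_by_truthiness(devid):
    """Mirrors `if not devid:` -- rejects None and also the valid number 0."""
    return not devid


def rejected_by_identity(devid):
    """Mirrors `if devid is None:` -- rejects only a failed lookup."""
    return devid is None


if __name__ == '__main__':
    # 0 is what a Solaris domU's first disk ("0") translates to.
    for devid in (0, 0x800, None):
        print(repr(devid),
              rejected_by_truthiness(devid),
              rejected_by_identity(devid))
```

With the truthiness check, a device number of 0 is indistinguishable from a lookup that returned None; the identity check separates the two.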


John Levon

2006-Nov-02 02:03 UTC

Re: [Xen-devel] Linux-specific blkif.py change

On Wed, Nov 01, 2006 at 06:03:03PM -0800, Brendan Cully wrote:
> >The immediate problem is that Solaris domU's have "0" for dev for the
> >first disk. So it's presumably matched on the hex re in util/blkif.py,
> >returning 0 and failing this incorrect check. There are other
> >problems:
>
> I don't know about the other stuff, but changing the check to
>
> if devid is None:
>
> should solve your immediate problem, right?

Yes, but only by chance.

regards
john


Ewan Mellor

2006-Nov-02 23:21 UTC

Re: [Xen-devel] Linux-specific blkif.py change

On Thu, Nov 02, 2006 at 01:42:50AM +0000, John Levon wrote:
> 
> Changeset 11831:f5321161c649 has broken non-Linux domUs with this
> change:
> 
>          devid = blkif.blkdev_name_to_number(dev)
> +        if not devid:
> +            raise VmError('Unable to find number for device (%s)' % (dev))
> +
>
> The immediate problem is that Solaris domU's have "0" for dev for the
> first disk. So it's presumably matched on the hex re in util/blkif.py,
> returning 0 and failing this incorrect check. There are other problems:
>
> 1) util/blkif.py logs to xend-debug.log if the stat() fails. This is
> needlessly chatty, and indicates there's some kind of error, when there
> is not.
>
> 2) util/blkif.py has a load of Linux gook for getting the device
> numbers. Luckily Solaris has a completely different naming scheme, but
> wouldn't this go horribly wrong if a domU just happened to use the same
> name, different device number?
>
> It's not clear to us why Linux even needs to do this?
>
> For now I think the change needs backing out so non-Linux domU's can
> work again. I'm not sure of a better fix; suggestions welcome.

I think that the correct fix would be for the tools to pass the untranslated
device name into the guest, rather than translating it to a device number
first.  I've no idea why this was done in the first place, as it's clearly
wrong.  Like you say, there's no reason for a guest's device name -> number
mapping to be the same as dom0's.

Unfortunately, this is part of the guaranteed interface to guests now, so we
need to reproduce this behaviour for old guests, but there's nothing stopping
us fixing this for new guests.  If we fixed the tools to write the device name
as well as the (Linux) device number then new guests could use the name rather
than the number and do the lookup themselves.

In this scheme, the check above would go -- the failure to look up the device
would be handled simply by writing the name and not the number, and hoping
that it's not an old Linux guest.  The change was intended to improve the
error message that you receive in this case, so at the least, the failure
ought to be logged (unless you can come up with some way to detect old Linux
guests, and only complain in that case).
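
The scheme described above could look roughly like the following sketch. Everything here is an assumption for illustration rather than existing xend code: the function `build_vbd_config`, the logger name, and the key names `dev` and `virtual-device`.

```python
import logging

log = logging.getLogger('xend.sketch')  # hypothetical logger name


def build_vbd_config(dev, name_to_number):
    """Return the entries the tools might write for one block device.

    `name_to_number` is any callable that returns a Linux device number,
    or None when the name cannot be translated. Key names are illustrative.
    """
    # Always pass the untranslated name so new guests can do their own lookup.
    config = {'dev': dev}

    devid = name_to_number(dev)
    if devid is None:
        # No exception: log the failure and write the name only.
        log.warning('No Linux device number for %r; writing name only', dev)
    else:
        # Kept for old guests that rely on the translated number.
        config['virtual-device'] = devid
    return config


if __name__ == '__main__':
    print(build_vbd_config('0', lambda name: 0))
    print(build_vbd_config('c0d0s0', lambda name: None))
```

Note that a translated number of 0 is still written, since only None is treated as a failed lookup.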

Would you like to put together a patch along these lines?

Thanks,

Ewan.


John Levon

2006-Nov-02 23:38 UTC

Re: [Xen-devel] Linux-specific blkif.py change

On Thu, Nov 02, 2006 at 11:21:17PM +0000, Ewan Mellor wrote:
> > 1) util/blkif.py logs to xend-debug.log if the stat() fails. This is
> > needlessly chatty, and indicates there's some kind of error, when there
> > is not.
> >
> > 2) util/blkif.py has a load of Linux gook for getting the device
> > numbers. Luckily Solaris has a completely different naming scheme, but
> > wouldn't this go horribly wrong if a domU just happened to use the same
> > name, different device number?
> >
> > It's not clear to us why Linux even needs to do this?
>
> I think that the correct fix would be for the tools to pass the untranslated
> device name into the guest, rather than translating it to a device number

Sounds sensible to me.

> that it's not an old Linux guest.  The change was intended to improve the
> error message that you receive in this case, so at the least, the failure
> ought to be logged (unless you can come up with some way to detect old Linux
> guests, and only complain in that case).

Is there some other way to indicate the failure later? We'd like
xend-debug.log to be essentially silent during normal operation for a
non-debug xend...

> Would you like to put together a patch along these lines?

I can do a patch for xend, but I'm not familiar enough to update the
Linux side of things.

I see that the 'is None' hack has been committed along with removing the
message in blkif.py, so that solves the immediate issue for us.

thanks,
john


Ewan Mellor

2006-Nov-03 00:03 UTC

Re: [Xen-devel] Linux-specific blkif.py change

On Thu, Nov 02, 2006 at 11:38:04PM +0000, John Levon wrote:
> On Thu, Nov 02, 2006 at 11:21:17PM +0000, Ewan Mellor wrote:
> 
> > > 1) util/blkif.py logs to xend-debug.log if the stat() fails. This is
> > > needlessly chatty, and indicates there's some kind of error, when there
> > > is not.
> > >
> > > 2) util/blkif.py has a load of Linux gook for getting the device
> > > numbers. Luckily Solaris has a completely different naming scheme, but
> > > wouldn't this go horribly wrong if a domU just happened to use the same
> > > name, different device number?
> > >
> > > It's not clear to us why Linux even needs to do this?
> >
> > I think that the correct fix would be for the tools to pass the untranslated
> > device name into the guest, rather than translating it to a device number
>
> Sounds sensible to me.
>
> > that it's not an old Linux guest.  The change was intended to improve the
> > error message that you receive in this case, so at the least, the failure
> > ought to be logged (unless you can come up with some way to detect old Linux
> > guests, and only complain in that case).
>
> Is there some other way to indicate the failure later? We'd like
> xend-debug.log to be essentially silent during normal operation for a
> non-debug xend...

I meant log it to xend.log (the log infrastructure), as opposed to
xend-debug.log (Xend's stderr) if you were making that distinction.

Certainly we could indicate the failure later, though it's a little
complicated.  Of course, the error can only be detected by the guest, so
you'll have to make blkfront or the equivalent in Solaris write an error code
to the store, and then pick that up again from Xend.  This has been done to a
certain extent already, but the problem is that, by this point, xm has
returned success, so though the error has been flagged, no-one gets to see it,
other than diagnostic tools, and it's not long before the device teardown
occurs and the error message is deleted then anyway.

We would need to grab the error code in Xend, before the device teardown, and
then because there's no client waiting at this point, the only thing we could
do is log it anyway.  Alternatively, we could extend the "wait-for-devices"
functionality in the xm create path to wait for an indication of successful
device set-up (at the moment, we only wait for successful hotplugging in
dom0).  In that case, you would actually have a client to send the error
message to.  Which would be nice.
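
A rough sketch of what such an extended wait might look like follows. It does not use any real xend or xenstore API: `read_key` is a hypothetical callable standing in for a store read, and the key names `error` and `state` and the value `connected` are assumptions.

```python
import time


def wait_for_device(read_key, timeout=5.0, interval=0.1,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll until the guest reports successful set-up or an error.

    read_key(key) is assumed to return the stored value, or None if absent.
    Raises RuntimeError with the guest's message, or TimeoutError.
    """
    deadline = clock() + timeout
    while True:
        error = read_key('error')
        if error is not None:
            # A client is still waiting here, so it can be shown the message.
            raise RuntimeError('device set-up failed: %s' % error)
        if read_key('state') == 'connected':
            return
        if clock() >= deadline:
            raise TimeoutError('device did not report set-up in time')
        sleep(interval)


if __name__ == '__main__':
    # A plain dict stands in for the store in this example.
    store = {'state': 'connected'}
    wait_for_device(store.get)
    print('device ready')
```

Because the caller blocks until the guest has reported either way, the error reaches whoever ran the create command instead of only ending up in a log.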

> > Would you like to put together a patch along these lines?
>
> I can do a patch for xend, but I'm not familiar enough to update the
> Linux side of things.

That's fine -- if you can make it work for Solaris without breaking the
existing functionality, we can move Linux over to the new scheme at a later
date.

Cheers,

Ewan.


John Levon

2006-Nov-03 00:43 UTC

Re: [Xen-devel] Linux-specific blkif.py change

On Fri, Nov 03, 2006 at 12:03:04AM +0000, Ewan Mellor wrote:
> I meant log it to xend.log (the log infrastructure), as opposed to
> xend-debug.log (Xend's stderr) if you were making that distinction.

Well, I suppose that's a bit better.

> Certainly we could indicate the failure later, though it's a little
> complicated.  Of course, the error can only be detected by the guest, so
> you'll have to make blkfront or the equivalent in Solaris write an error code
> to the store, and then pick that up again from Xend.  This has been done to a
> certain extent already, but the problem is that, by this point, xm has
> returned success, so though the error has been flagged, no-one gets to see it,
> other than diagnostic tools, and it's not long before the device teardown
> occurs and the error message is deleted then anyway.
>
> We would need to grab the error code in Xend, before the device teardown, and
> then because there's no client waiting at this point, the only thing we could
> do is log it anyway.  Alternatively, we could extend the "wait-for-devices"
> functionality in the xm create path to wait for an indication of successful
> device set-up (at the moment, we only wait for successful hotplugging in
> dom0).  In that case, you would actually have a client to send the error
> message to.  Which would be nice.

Presumably, one day, these sorts of errors will be forwardable to
something watching what's going on via xen-api. At least, that would be
nice.

BTW, it'd be great if one of you could do a quick write-up on the
current code in xen-unstable? There's a heck of a lot of changes just
gone in, and it'd be nice to know what state things are supposed to be
in: what works, what doesn't, what needs improving.

regards
john
