thr3ads.net - Xen devel - [Xen-devel] Timeout connecting to device [Aug 2005]

Arun Sharma

2005-Aug-22 20:41 UTC

[Xen-devel] Timeout connecting to device

Is there any userspace configuration I'm missing?

	-Arun

xen_blk: Initialising virtual block device driver
XENBUS xs_read_watch: 0
xen_blk: Timeout connecting to device!
Netdev frontend (TX) is using grant tables.
Netdev frontend (RX) is using grant tables.
xen_net: Initialising virtual ethernet driver.
CBNET: Registered protocol family 2
IP: routing cache hash table of 256 buckets, 4Kbytes
TCP established hash table entries: 4096 (order: 4, 65536 bytes)
TCP bind hash table entries: 4096 (order: 3, 49152 bytes)
TCP: Hash tables configured (established 4096 bind 4096)
NET: Registered protocol family 1
NET: Registered protocol family 17
Root-NFS: No NFS server available, giving up.


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel

Arun Sharma

2005-Aug-22 20:42 UTC

[Xen-devel] Re: Timeout connecting to device

Arun Sharma wrote:
>
> Is there any userspace configuration I'm missing?
>

This was a xenlinux domain not a VMX domain, BTW.

	-Arun


Arun Sharma

2005-Aug-24 01:36 UTC

[Xen-devel] Re: Timeout connecting to device

I didn't see any responses to this query. I also see:

http://bugzilla.xensource.com/bugzilla/show_bug.cgi?id=176

which is essentially the same issue. Do VBDs work for anyone on the list?

It sounds like

 > XENBUS xs_read_watch: 0

is an indication that the right entries were made in xenstore (I can
see them under /var/lib/xenstored/store/domain), but xenbus is not
able to read events?

# find . -name *end
./a9b503a7-2494-409f-9e06-2bd1f9283953/device/vbd/768/backend
./45194ebb-4029-4131-9013-d17cd3dbc828/device/vbd/768/backend
./53ffa08b-5f90-45a4-9671-8d6fff7f8509/backend
./53ffa08b-5f90-45a4-9671-8d6fff7f8509/backend/vbd/a9b503a7-2494-409f-9e06-2bd1f9283953/768/frontend
./53ffa08b-5f90-45a4-9671-8d6fff7f8509/backend/vbd/45194ebb-4029-4131-9013-d17cd3dbc828/768/frontend

Couple of questions:

- Is there a way to debug xenstored? It doesn't seem to be logging
  much.
- Could we add some debug flags elsewhere as well (xenbus with debug=1?)
  to make debugging problems of this nature easier?

A golden "working" log file would also be useful so we can compare our
logs against it.

	-Arun

Arun Sharma wrote:
>
> Is there any userspace configuration I'm missing?
> 
>     -Arun
> 
> xen_blk: Initialising virtual block device driver
> XENBUS xs_read_watch: 0
> xen_blk: Timeout connecting to device!
> Netdev frontend (TX) is using grant tables.
> Netdev frontend (RX) is using grant tables.
> xen_net: Initialising virtual ethernet driver.
> CBNET: Registered protocol family 2
> IP: routing cache hash table of 256 buckets, 4Kbytes
> TCP established hash table entries: 4096 (order: 4, 65536 bytes)
> TCP bind hash table entries: 4096 (order: 3, 49152 bytes)
> TCP: Hash tables configured (established 4096 bind 4096)
> NET: Registered protocol family 1
> NET: Registered protocol family 17
> Root-NFS: No NFS server available, giving up.
> 
> 

Rusty Russell

2005-Aug-24 04:43 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Tue, 2005-08-23 at 18:36 -0700, Arun Sharma wrote:
>  > XENBUS xs_read_watch: 0

This is harmless, in fact expected.  It's a debugging message which
should go away...

> # find . -name *end
> ./a9b503a7-2494-409f-9e06-2bd1f9283953/device/vbd/768/backend
> ./45194ebb-4029-4131-9013-d17cd3dbc828/device/vbd/768/backend
> ./53ffa08b-5f90-45a4-9671-8d6fff7f8509/backend
> ./53ffa08b-5f90-45a4-9671-8d6fff7f8509/backend/vbd/a9b503a7-2494-409f-9e06-2bd1f9283953/768/frontend
> ./53ffa08b-5f90-45a4-9671-8d6fff7f8509/backend/vbd/45194ebb-4029-4131-9013-d17cd3dbc828/768/frontend
>
> Couple of questions:
>
> - is there a way to debug xenstored? It doesn't seem to be logging much.

Yes, add --trace-file=/tmp/trace to the invocation of xenstored, and
you'll see all the conversations that the store has.

Grovelling around the store by looking in /var/lib/xenstored/store is
pretty simple, as you have found, but I know there were a few efforts to
have a nicer browser...

> - Could we add some debug flags elsewhere as well (xenbus with debug=1?)
> to make debugging problems of this nature easier?

The trace file is actually better in my experience.  Also look for
"error" nodes in the store (although Christian removed some of those
paths in the merge).
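
For anyone following along, here is a minimal sketch of the "look for
error nodes" suggestion above. It assumes the store is kept as an ordinary
directory tree under /var/lib/xenstored/store, as in this thread, and that
an error node is simply an entry named "error". The function name
find_error_nodes is made up for illustration and is not part of any Xen
tool.

```python
import os

def find_error_nodes(store_root, node_name="error"):
    """Return sorted paths (relative to store_root) of entries named node_name.

    Assumption: the store is an ordinary directory tree on disk, and an
    "error" node shows up as a file or directory with that exact name.
    """
    hits = []
    for dirpath, dirnames, filenames in os.walk(store_root):
        for name in dirnames + filenames:
            if name == node_name:
                full = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full, store_root))
    return sorted(hits)

if __name__ == "__main__":
    # Path taken from this thread; adjust for your installation.
    for path in find_error_nodes("/var/lib/xenstored/store"):
        print(path)
```

If the store directory does not exist, the walk simply finds nothing and
the function returns an empty list.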

Cheers,
Rusty.
-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman


Christian Limpach

2005-Aug-24 08:25 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On 8/24/05, Rusty Russell <rusty@rustcorp.com.au> wrote:
> > - is there a way to debug xenstored? It doesn't seem to be logging much.
>
> Yes, add --trace-file=/tmp/trace to the invocation of xenstored, and
> you'll see all the conversations that the store has.

You can set XENSTORED_TRACE in /usr/sbin/xend's environment, causing it
to add the --trace-file option to the invocation of xenstored.  You'll
need to do this in your boot script because after restarting
xenstored, the backends won't notice any new changes anymore.

> > - Could we add some debug flags elsewhere as well (xenbus with debug=1?)
> > to make debugging problems of this nature easier?
>
> The trace file is actually better in my experience.  Also look for
> "error" nodes in the store (although Christian removed some of those
> paths in the merge).

I only removed the ones which were not fatal because I was under the
impression initially that adding an error node does indicate a final
error after which it would be up to a control tool to sort out the
failure or report it.

    christian

Arun Sharma

2005-Aug-25 00:31 UTC

Re: [Xen-devel] Re: Timeout connecting to device

Rusty Russell wrote:
> The trace file is actually better in my experience.  Also look for
> "error" nodes in the store (although Christian removed some of those
> paths in the merge).

Thanks, this works for me now:

losetup -o 16384 /dev/loop1 /var/images/min-el3-i386.img
disk = [ 'phy:loop1,hda1,w' ]

But this doesn't work:

disk = [ 'file:/var/images/min-el3-i386.img,hda,w' ]
xen_blk: Initialising virtual block device driver
Registering block device major 3
  hda:    <hang>

	-Arun



Rusty Russell

2005-Aug-25 03:18 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Wed, 2005-08-24 at 09:25 +0100, Christian Limpach wrote:
> On 8/24/05, Rusty Russell <rusty@rustcorp.com.au> wrote:
> > The trace file is actually better in my experience.  Also look for
> > "error" nodes in the store (although Christian removed some of those
> > paths in the merge).
>
> I only removed the ones which were not fatal because I was under the
> impression initially that adding an error node does indicate a final
> error after which it would be up to a control tool to sort out the
> failure or report it.

Well, I expect there to be an error node at some point in the normal
case: you look in the backend and something isn't there yet, you want to
indicate that, because it may never change (in the normal case, it will
appear soon and we will delete the error node).

The error node idea might be overly simplistic: its non-existence
doesn't tell you the device is ok (it might have just been created).
Makes it harder to synchronously create a device.

Will think more on this...
Rusty.
-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman


Christian Limpach

2005-Aug-25 11:01 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Thu, Aug 25, 2005 at 01:18:56PM +1000, Rusty Russell wrote:
> On Wed, 2005-08-24 at 09:25 +0100, Christian Limpach wrote:
> > On 8/24/05, Rusty Russell <rusty@rustcorp.com.au> wrote:
> > > The trace file is actually better in my experience.  Also look for
> > > "error" nodes in the store (although Christian removed some of those
> > > paths in the merge).
> >
> > I only removed the ones which were not fatal because I was under the
> > impression initially that adding an error node does indicate a final
> > error after which it would be up to a control tool to sort out the
> > failure or report it.
>
> Well, I expect there to be an error node at some point in the normal
> case: you look in the backend and something isn't there yet, you want to
> indicate that, because it may never change (in the normal case, it will
> appear soon and we will delete the error node).
>
> The error node idea might be overly simplistic: its non-existence
> doesn't tell you the device is ok (it might have just been created).
> Makes it harder to synchronously create a device.

I think we need a setup node which indicates that a driver is stuck
because it's waiting for a change in the store but that it expects
this change to happen once the other party makes progress.  This would
get set when the other party's directory doesn't exist yet or there
are still nodes missing.

The error node should indicate a final failure, where intervention
by the control tool will be required.  This would get set when you
read a value and can't parse it (not a number, not a mac address, ...)
or when the information you've read is incorrect (ring reference can't
be mapped, event channel can't connect, ...).
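
A rough sketch of the setup/error split described above, to make the two
cases concrete. It is only an illustration: the store is modelled as a
plain dict, and the name classify_read and the returned state strings are
assumptions of this sketch, not anything in the xenbus code.

```python
def classify_read(store, path, parse=int):
    """Classify one read from the other party's directory.

    store is a plain dict standing in for the store (an assumption of
    this sketch). Returns a (state, value) pair.
    """
    if path not in store:
        # Node not written yet: expected to resolve once the peer progresses.
        return ("setup", None)
    try:
        return ("ok", parse(store[path]))
    except ValueError:
        # Value present but unparsable: the control tool has to step in.
        return ("error", None)
```

A missing node maps to "setup" and an unparsable value maps to "error".
The second kind of final failure mentioned above (a value that parses but
cannot be used, such as a ring reference that can't be mapped) is not
modelled here.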

I don't like the idea of a status node since eventually it would get
out of sync, while the setup/error nodes indicate a state the driver
is in and won't get out of without it doing something actively.

    christian


Rusty Russell

2005-Aug-26 00:27 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Thu, 2005-08-25 at 12:01 +0100, Christian Limpach wrote:
> On Thu, Aug 25, 2005 at 01:18:56PM +1000, Rusty Russell wrote:
> > The error node idea might be overly simplistic: its non-existence
> > doesn't tell you the device is ok (it might have just been created).
> > Makes it harder to synchronously create a device.
>
> I think we need a setup node which indicates that a driver is stuck
> because it's waiting for a change in the store but that it expects
> this change to happen once the other party makes progress.  This would
> get set when the other party's directory doesn't exist yet or there
> are still nodes missing.
> The error node should indicate a final failure, where intervention
> by the control tool will be required.  This would get set when you
> read a value and can't parse it (not a number, not a mac address, ...)
> or when the information you've read is incorrect (ring reference can't
> be mapped, event channel can't connect, ...).

A slight variation on this would be to have the tools create the
"setting-up" node, and the driver delete it when it's happy.

I want to leave the error node on every error, so that way you can tell
if it can't read the backend because of permission problem or something,
even if during setup.

Synchronous startup is now fairly easy: wait for the setting-up node to
be deleted.  Monitoring status during startup is also fairly easy, by
watching the error node.  And the absence of both means we're happy and
live.
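
To spell out the combinations, here is a small sketch of how a tool might
read the two nodes under this proposal. The function name, the status
strings, and the idea of passing in a collection of node names are all
assumptions for illustration. The case of an error node without a
setting-up node is not spelled out above, so the sketch just reports it
as "error".

```python
def device_status(nodes):
    """Summarise a device from the names of the nodes in its directory.

    nodes: any collection of node names (assumed representation).
    """
    setting_up = "setting-up" in nodes
    error = "error" in nodes
    if not setting_up and not error:
        # Absence of both: the device is happy and live.
        return "live"
    if setting_up and error:
        return "setting up, error reported"
    if setting_up:
        return "setting up"
    return "error"
```

A tool wanting synchronous startup would poll or watch until the result is
no longer one of the "setting up" states.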

Thoughts?
Rusty.
-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman


Christian Limpach

2005-Aug-26 10:37 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Fri, Aug 26, 2005 at 10:27:06AM +1000, Rusty Russell wrote:
> A slight variation on this would be to have the tools create the
> "setting-up" node, and the driver delete it when it's happy.
>
> I want to leave the error node on every error, so that way you can tell
> if it can't read the backend because of permission problem or something,
> even if during setup.

How do you tell if an error is final though?  Removing the setting-up
node sounds kind of misleading since the device is not done setting-up.
I think we need to indicate somehow whether an error is final or
transient, but I don't think we should require the format of the
error message to somehow indicate this.

    christian


Rusty Russell

2005-Aug-29 00:17 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Fri, 2005-08-26 at 11:37 +0100, Christian Limpach wrote:
> On Fri, Aug 26, 2005 at 10:27:06AM +1000, Rusty Russell wrote:
> > A slight variation on this would be to have the tools create the
> > "setting-up" node, and the driver delete it when it's happy.
> >
> > I want to leave the error node on every error, so that way you can tell
> > if it can't read the backend because of permission problem or something,
> > even if during setup.
>
> How do you tell if an error is final though?  Removing the setting-up
> node sounds kind of misleading since the device is not done setting-up.
> I think we need to indicate somehow whether an error is final or
> transient, but I don't think we should require the format of the
> error message to somehow indicate this.

I don't think the frontend can, in general, tell the difference between
a final error and a transient one.  The problem may simply be that the
backend never creates a node we need.  If we're completely giving up, we
could remove the setting-up node, but I'm not sure it's worth it?

Rusty.
-- 
A bad analogy is like a leaky screwdriver -- Richard Braakman


Christian Limpach

2005-Aug-29 00:23 UTC

Re: [Xen-devel] Re: Timeout connecting to device

On Mon, Aug 29, 2005 at 10:17:18AM +1000, Rusty Russell wrote:
> On Fri, 2005-08-26 at 11:37 +0100, Christian Limpach wrote:
> > On Fri, Aug 26, 2005 at 10:27:06AM +1000, Rusty Russell wrote:
> > > A slight variation on this would be to have the tools create the
> > > "setting-up" node, and the driver delete it when it's happy.
> > >
> > > I want to leave the error node on every error, so that way you can tell
> > > if it can't read the backend because of permission problem or something,
> > > even if during setup.
> >
> > How do you tell if an error is final though?  Removing the setting-up
> > node sounds kind of misleading since the device is not done setting-up.
> > I think we need to indicate somehow whether an error is final or
> > transient, but I don't think we should require the format of the
> > error message to somehow indicate this.
>
> I don't think the frontend can, in general, tell the difference between
> a final error and a transient one.  The problem may simply be that the
> backend never creates a node we need.  If we're completely giving up, we
> could remove the setting-up node, but I'm not sure it's worth it?

There are errors which are final, like a node which holds data which
doesn't make sense (like when scanf "%d" fails).  Yes, there are some
where you can't tell if they are transient or final...  I guess the fatal
ones I think about are not very likely if the frontend and backend are
matched and not broken...

    christian

