thr3ads.net - Libguestfs - Re: [Libguestfs] Few libnbd questions/concerns [Jun 2019]

If this information is useful, please help other people find it:
Share via:

Eric Blake

2019-Jun-25 14:25 UTC

Re: [Libguestfs] Few libnbd questions/concerns

On 6/25/19 8:14 AM, Richard W.M. Jones wrote:> On Tue, Jun 25, 2019 at 02:58:34PM +0200, Martin Kletzander wrote:
>> Here are few things I found out when using libnbd that might be
perfectly fine
>> or maybe just an oversight, but I wanted to point them out.  It's
nothing major.
>>
>> When running a program with `nbdkit -U - --run ...`, the $nbd parameter
gets
>> expanded to nbd:unix:/path/to/socket.  When this string is passed to
>> nbd_connect_uri(), it does not return an error (even though it is not a
valid
>> URL), but what's more it treats it as "nbd://localhost",
which might be a completely different server (that actually happened to me and
it was kind of confusing).
> 
> There's a bit of magic in nbdkit about $nbd.  It also predates our
> standardization effort for NBD URLs which is going on upstream.
> 
> That needs to be fixed, but in the meantime just use $unixsocket
> instead (see nbdkit-captive(1) man page for the full details).
In the meantime, I don't mind working on the quick fix for rejecting an
invalid URI.
> 
>> When nbd_block_status() callback returns -1, the nbd_block_status()
function
>> reports:
>>
>>  command failed: Protocol error (errno = 71)
>>
>> which is a bit confusing to me.  I could be nicer to have that report
that it
>> really was the callback that caused this.
> 
> Not sure about this one - Eric?
Right now, a failed callback will cause the overall nbd_block_status()
to fail with whatever errno the callback left behind, with EPROTO as the
default. I'm guessing you didn't set an errno when you returned -1?  Is
there a better default we should use than EPROTO?  Is there really a way
to diagnose without an errno value that the callback failed but didn't
set errno?

There's also the related question - my patches to add
nbd_pread_structured are set up to call the callback for as many chunks
as the server sends, even if an earlier invocation of the callback
fails, but to still preserve the errno of the first failed callback
(again defaulting to EPROTO). Should nbd_block_status also call the
callback for all chunks from the server, rather than stopping the use of
further callbacks once the first callback fails, if only for consistency
between the two?
>> One last thing is that I could not find any definition for the flags
>> of "base:allocation" metadata (NBD_STATE_HOLE and
NBD_STATE_ZERO).
>> It might just be things that are still missing due to an early stage
>> of libnbd, or it might not belong there or it might be me misusing
>> something.
> 
> Have a look at the NBD protocol doc:
> 
> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
We could add convenience #defines into libnbd.h if needed, which would
work for the base:allocation context, but it would not help any other
context (such as qemu:dirty-bitmap:FOO) which has other definitions for
the returned bits.  But the fact that you have to call
nbd_add_meta_context(h, "base:allocation") is also awkward; if we add
convenience macros for the results, we probably also want to support
nbd_add_meta_context(h, LIBNBD_CONTEXT_BASE_ALLOCATION).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Martin Kletzander

2019-Jun-25 15:55 UTC

head link

Re: [Libguestfs] Few libnbd questions/concerns

On Tue, Jun 25, 2019 at 09:25:51AM -0500, Eric Blake
wrote:>On 6/25/19 8:14 AM, Richard W.M. Jones wrote:
>> On Tue, Jun 25, 2019 at 02:58:34PM +0200, Martin Kletzander wrote:
>>> Here are few things I found out when using libnbd that might be
perfectly fine
>>> or maybe just an oversight, but I wanted to point them out. 
It's nothing major.
>>>
>>> When running a program with `nbdkit -U - --run ...`, the $nbd
parameter gets
>>> expanded to nbd:unix:/path/to/socket.  When this string is passed
to
>>> nbd_connect_uri(), it does not return an error (even though it is
not a valid
>>> URL), but what's more it treats it as
"nbd://localhost", which might be a completely different server (that
actually happened to me and it was kind of confusing).
>>
>> There's a bit of magic in nbdkit about $nbd.  It also predates our
>> standardization effort for NBD URLs which is going on upstream.
>>
>> That needs to be fixed, but in the meantime just use $unixsocket
>> instead (see nbdkit-captive(1) man page for the full details).
>
I forgot to say I solved all the issues mentioned here.  I was just worried that
there was no error and it lead me to a different NBD server.
>In the meantime, I don't mind working on the quick fix for rejecting an
>invalid URI.
>
>>
>>> When nbd_block_status() callback returns -1, the nbd_block_status()
function
>>> reports:
>>>
>>>  command failed: Protocol error (errno = 71)
>>>
>>> which is a bit confusing to me.  I could be nicer to have that
report that it
>>> really was the callback that caused this.
>>
>> Not sure about this one - Eric?
>
>Right now, a failed callback will cause the overall nbd_block_status()
>to fail with whatever errno the callback left behind, with EPROTO as the
>default. I'm guessing you didn't set an errno when you returned -1? 
Is
>there a better default we should use than EPROTO?  Is there really a way
>to diagnose without an errno value that the callback failed but didn't
>set errno?
>
I haven't looked how the error messages are stored, but if is says that the
callback returned negative value in nbd_get_error(), then I don't need
anything
else.
>There's also the related question - my patches to add
>nbd_pread_structured are set up to call the callback for as many chunks
>as the server sends, even if an earlier invocation of the callback
>fails, but to still preserve the errno of the first failed callback
>(again defaulting to EPROTO). Should nbd_block_status also call the
>callback for all chunks from the server, rather than stopping the use of
>further callbacks once the first callback fails, if only for consistency
>between the two?
>
I can't decide.  I think there should be a way to say you don't want the
callback to be called again.  I can imagine a callback which relies on that
(failed to allocate data for an internal structure and now the integrity is
broken or something).
>>> One last thing is that I could not find any definition for the
flags
>>> of "base:allocation" metadata (NBD_STATE_HOLE and
NBD_STATE_ZERO).
>>> It might just be things that are still missing due to an early
stage
>>> of libnbd, or it might not belong there or it might be me misusing
>>> something.
>>
>> Have a look at the NBD protocol doc:
>>
>> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
I know, I took the values from there and #define'd them in my code.  But it
would be nicer if this was defined in the provided header file.

This leads me to another value exposed only in an error message, the maximum
request sizes for each call.  And that leads me to nbd_pread() saying the
maximum request size is 2^32-1, but if I use that I see (in nbdkit logs) it
requests with count=2^32 (probably an alignment) and then hits an assertion
inside the nbdkit code:

  nbdkit: protocol.c:528: extents_to_block_descriptors: Assertion `e.length
<= length' failed.
>
>We could add convenience #defines into libnbd.h if needed, which would
>work for the base:allocation context, but it would not help any other
>context (such as qemu:dirty-bitmap:FOO) which has other definitions for
>the returned bits.  But the fact that you have to call
>nbd_add_meta_context(h, "base:allocation") is also awkward; if we
add
>convenience macros for the results, we probably also want to support
>nbd_add_meta_context(h, LIBNBD_CONTEXT_BASE_ALLOCATION).
>
That is what I was thinking about.  Standardized metadata contexts could be
"supported" in a way that the values are #defined in libnbd.h.
>--
>Eric Blake, Principal Software Engineer
>Red Hat, Inc.           +1-919-301-3226
>Virtualization:  qemu.org | libvirt.org
>


>_______________________________________________
>Libguestfs mailing list
>Libguestfs@redhat.com
>https://www.redhat.com/mailman/listinfo/libguestfs

Eric Blake

2019-Jun-27 02:33 UTC

head link

Re: [Libguestfs] Few libnbd questions/concerns

On 6/25/19 9:25 AM, Eric Blake wrote:
>>> When running a program with `nbdkit -U - --run ...`, the $nbd
parameter gets
>>> expanded to nbd:unix:/path/to/socket.  When this string is passed
to
>>> nbd_connect_uri(), it does not return an error (even though it is
not a valid
>>> URL), but what's more it treats it as
"nbd://localhost", which might be a completely different server (that
actually happened to me and it was kind of confusing).
>>
>> There's a bit of magic in nbdkit about $nbd.  It also predates our
>> standardization effort for NBD URLs which is going on upstream.
>>
>> That needs to be fixed, but in the meantime just use $unixsocket
>> instead (see nbdkit-captive(1) man page for the full details).
> 
> In the meantime, I don't mind working on the quick fix for rejecting an
> invalid URI.
Done.
> 
>>
>>> When nbd_block_status() callback returns -1, the nbd_block_status()
function
>>> reports:
>>>
>>>  command failed: Protocol error (errno = 71)
>>>
>>> which is a bit confusing to me.  I could be nicer to have that
report that it
>>> really was the callback that caused this.
>>
>> Not sure about this one - Eric?
> 
> Right now, a failed callback will cause the overall nbd_block_status()
> to fail with whatever errno the callback left behind, with EPROTO as the
> default. I'm guessing you didn't set an errno when you returned -1?
Is
> there a better default we should use than EPROTO?  Is there really a way
> to diagnose without an errno value that the callback failed but didn't
> set errno?
> 
My pending patch to take advantage of Mutable(Int) in the callback,
makes this a bit nicer, as you can then control the error status by
assigning into the parameter rather than by what got left in errno.
> There's also the related question - my patches to add
> nbd_pread_structured are set up to call the callback for as many chunks
> as the server sends, even if an earlier invocation of the callback
> fails, but to still preserve the errno of the first failed callback
> (again defaulting to EPROTO). Should nbd_block_status also call the
> callback for all chunks from the server, rather than stopping the use of
> further callbacks once the first callback fails, if only for consistency
> between the two?
So now I'm leaning towards making that change, particularly since I'm
also working on adding a nbd_aio_FOO_notify() for each command that
takes a callback that also takes an error pointer.
> 
>>> One last thing is that I could not find any definition for the
flags
>>> of "base:allocation" metadata (NBD_STATE_HOLE and
NBD_STATE_ZERO).
>>> It might just be things that are still missing due to an early
stage
>>> of libnbd, or it might not belong there or it might be me misusing
>>> something.
>>
>> Have a look at the NBD protocol doc:
>>
>> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
> 
> We could add convenience #defines into libnbd.h if needed, which would
> work for the base:allocation context, but it would not help any other
> context (such as qemu:dirty-bitmap:FOO) which has other definitions for
> the returned bits.  But the fact that you have to call
> nbd_add_meta_context(h, "base:allocation") is also awkward; if we
add
> convenience macros for the results, we probably also want to support
> nbd_add_meta_context(h, LIBNBD_CONTEXT_BASE_ALLOCATION).
Still needs to be done.

Another thing I just noticed: nbd_poll returns 0 both for timeout and
for when it notified the state machine about POLLIN or POLLOUT making
progress. We probably want to switch it to return 0 on timeout and 1
when it made progress, so that a user can detect timeout (all our
internal callers pass -1 for the timeout parameter, and thus can't time
out, but external callers may care).

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Richard W.M. Jones

2019-Jun-27 07:29 UTC

head link

Re: [Libguestfs] Few libnbd questions/concerns

On Wed, Jun 26, 2019 at 09:33:04PM -0500, Eric Blake
wrote:> Another thing I just noticed: nbd_poll returns 0 both for timeout and
> for when it notified the state machine about POLLIN or POLLOUT making
> progress. We probably want to switch it to return 0 on timeout and 1
> when it made progress, so that a user can detect timeout (all our
> internal callers pass -1 for the timeout parameter, and thus can't time
> out, but external callers may care).
Agreed.

And another thing you said on IRC:
> 04:26 < eblake> urgh - libnbd is prone to die from SIGPIPE, but as a
library, I don't know if we can ignore it
> 04:26 < eblake> should we be using pthread_sigmask and blocking
SIGPIPE around our reads, and restoring it to the calling application's
default the rest of the
>                 time?
> 04:28 < eblake> ooh - maybe we can rely on MSG_NOSIGNAL :)
> 04:28 < eblake> that's a lot cleaner, even if it is
Linux-specific
MSG_NOSIGNAL is also available on FreeBSD and seems to make a lot of
sense here, given that as you say changing the signal handler isn't a
good idea from a library.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

Martin Kletzander

2019-Jun-27 10:11 UTC

head link

Re: [Libguestfs] Few libnbd questions/concerns

On Wed, Jun 26, 2019 at 09:33:04PM -0500, Eric Blake
wrote:>On 6/25/19 9:25 AM, Eric Blake wrote:
>
>>>> When running a program with `nbdkit -U - --run ...`, the $nbd
parameter gets
>>>> expanded to nbd:unix:/path/to/socket.  When this string is
passed to
>>>> nbd_connect_uri(), it does not return an error (even though it
is not a valid
>>>> URL), but what's more it treats it as
"nbd://localhost", which might be a completely different server (that
actually happened to me and it was kind of confusing).
>>>
>>> There's a bit of magic in nbdkit about $nbd.  It also predates
our
>>> standardization effort for NBD URLs which is going on upstream.
>>>
>>> That needs to be fixed, but in the meantime just use $unixsocket
>>> instead (see nbdkit-captive(1) man page for the full details).
>>
>> In the meantime, I don't mind working on the quick fix for
rejecting an
>> invalid URI.
>
>Done.
>
>>
>>>
>>>> When nbd_block_status() callback returns -1, the
nbd_block_status() function
>>>> reports:
>>>>
>>>>  command failed: Protocol error (errno = 71)
>>>>
>>>> which is a bit confusing to me.  I could be nicer to have that
report that it
>>>> really was the callback that caused this.
>>>
>>> Not sure about this one - Eric?
>>
>> Right now, a failed callback will cause the overall nbd_block_status()
>> to fail with whatever errno the callback left behind, with EPROTO as
the
>> default. I'm guessing you didn't set an errno when you returned
-1?  Is
>> there a better default we should use than EPROTO?  Is there really a
way
>> to diagnose without an errno value that the callback failed but
didn't
>> set errno?
>>
>
>My pending patch to take advantage of Mutable(Int) in the callback,
>makes this a bit nicer, as you can then control the error status by
>assigning into the parameter rather than by what got left in errno.
>
>> There's also the related question - my patches to add
>> nbd_pread_structured are set up to call the callback for as many chunks
>> as the server sends, even if an earlier invocation of the callback
>> fails, but to still preserve the errno of the first failed callback
>> (again defaulting to EPROTO). Should nbd_block_status also call the
>> callback for all chunks from the server, rather than stopping the use
of
>> further callbacks once the first callback fails, if only for
consistency
>> between the two?
>
>So now I'm leaning towards making that change, particularly since
I'm
>also working on adding a nbd_aio_FOO_notify() for each command that
>takes a callback that also takes an error pointer.
>
>>
>>>> One last thing is that I could not find any definition for the
flags
>>>> of "base:allocation" metadata (NBD_STATE_HOLE and
NBD_STATE_ZERO).
>>>> It might just be things that are still missing due to an early
stage
>>>> of libnbd, or it might not belong there or it might be me
misusing
>>>> something.
>>>
>>> Have a look at the NBD protocol doc:
>>>
>>> https://github.com/NetworkBlockDevice/nbd/blob/master/doc/proto.md
>>
>> We could add convenience #defines into libnbd.h if needed, which would
>> work for the base:allocation context, but it would not help any other
>> context (such as qemu:dirty-bitmap:FOO) which has other definitions for
>> the returned bits.  But the fact that you have to call
>> nbd_add_meta_context(h, "base:allocation") is also awkward;
if we add
>> convenience macros for the results, we probably also want to support
>> nbd_add_meta_context(h, LIBNBD_CONTEXT_BASE_ALLOCATION).
>
>Still needs to be done.
>
Oh, it it's just this, then I can add that, at least this is not too deep
inside
the machinery that I'd have to go through everything to understand it
properly.
I sent a simple patch that adds it, we can discuss that there.
>Another thing I just noticed: nbd_poll returns 0 both for timeout and
>for when it notified the state machine about POLLIN or POLLOUT making
>progress. We probably want to switch it to return 0 on timeout and 1
>when it made progress, so that a user can detect timeout (all our
>internal callers pass -1 for the timeout parameter, and thus can't time
>out, but external callers may care).
>
I haven't gotten to using async APIs yet.  But resembling poll(2) seems
reasonable.
>-- 
>Eric Blake, Principal Software Engineer
>Red Hat, Inc.           +1-919-301-3226
>Virtualization:  qemu.org | libvirt.org
>

Apparently Analagous Threads

Search for more maybe matching threads

Libguestfs - Jun 2019 - Re: Few libnbd questions/concerns

Re: [Libguestfs] Few libnbd questions/concerns

Re: [Libguestfs] Few libnbd questions/concerns

Re: [Libguestfs] Few libnbd questions/concerns

Re: [Libguestfs] Few libnbd questions/concerns

Re: [Libguestfs] Few libnbd questions/concerns

Apparently Analagous Threads