Eric Blake
2019-Aug-12 18:53 UTC
Re: [Libguestfs] [PATCH libnbd 1/7] api: Add semi-private function for freeing persistent data.
On 8/12/19 1:13 PM, Richard W.M. Jones wrote:> On Mon, Aug 12, 2019 at 11:29:10AM -0500, Eric Blake wrote: >> On 8/12/19 11:08 AM, Richard W.M. Jones wrote: >>> + typedef void (*nbd_free_callback) (void *ptr, void *user_data); >>> + int nbd_add_free_callback (struct nbd_handle *h, >>> + void *ptr, >>> + nbd_free_callback cb, >>> + void *user_data); >> >> Do we want to insist on a user_data argument? Libvirt, for example, >> states that free callbacks are passed only the pointer to be freed (as >> you can already pack whatever you need into that pointer alone, rather >> than needing yet another void *user_data); if we do not take a user_data >> here, then... > > Indeed, and that was my original implementation, and it would > be usable in the simple case (free). > > However it doesn't work for OCaml buffers (nor for Python). In OCaml > we need to free the GC root, but that is separate from the buffer > pointer. > > The generated code for an OCaml persistent buffers eventually looks > like this: > > /* The function may save a reference to the Buffer, so we > * must treat it as a possible GC root. > */ > value *buf_user_data; > buf_user_data = malloc (sizeof (value)); > if (buf_user_data == NULL) caml_raise_out_of_memory (); > *buf_user_data = bufv; > caml_register_generational_global_root (buf_user_data); > struct nbd_buffer *buf_buf = NBD_buffer_val (bufv); > const void *buf = buf_buf->data; > size_t count = buf_buf->len; > if (nbd_add_free_callback (h, (void *)buf, > free_root, buf_user_data) == -1) > caml_raise_out_of_memory (); > > Notice that the free_root function needs the pointer to the GC root, > but you can't get to that from the buffer pointer.But still, can't we create a single C struct that tracks both the buffer and the GC root, and then pass the address of that C struct as the one pointer to the free function? In other words, I'm trying to get to the point of something like: struct closure_wrap { /* track anything for buffer GC root and refcounting */ value *buf; /* track anything for the OCaml callback, if one was present */ value *callback; }; struct closure_wrap *wrap = malloc (sizeof *wrap); wrap->buf = malloc (sizeof (*wrap->buf)); caml_register_generational_global_root (wrap->buf); struct nbd_buffer *buf_buf = NBD_buffer_val (bufv); const void *buf = buf_buf->data; size_t count = buf_buf->len; if (in nbd_aio_pread) { wrap->callback = NULL; res = nbd_aio_pread_callback(h, buf, count, offset, NULL, free_cb, wrap, flags); } else /* in nbd_aio_pread_callback */ { wrap->callback = ...; res = nbd_aio_pread_callback(h, buf, count, offset, completion_cb, free_cb, wrap, flags); } where completion_cb(wrap) calls the OCaml callback (and doesn't get used if there is no OCaml callback), and where free_cb(wrap) gets called at the end of the lifetime for buf, wrap->callback (if any), AND wrap itself, so that we have: void free_cb(void *ptr) { struct closure_wrap *wrap = ptr; caml_remove_generational_global_root (wrap->buf); if (wrap->callback) { unregister the OCaml callback... } free (wrap); }> > A similar problem will exist in Python, but I've not got around to > writing that code yet. > >>> +C<ptr> is a pointer to the object (ie. the buffer or >>> +callback C<user_data>). C<cb (ptr)> is called when libnbd >> >> As written, your patch uses C<cb (ptr, user_data)>, so either this text >> is wrong, or you really don't need user_data. > > Yes it should be cb (ptr, user_data). I'll tighten it up because > there are now two user_datas.But again, we may be able to get by with just one pointer. (I'm still hoping the idea of a three-tuple in C: main callback, free callback, and user data - is sufficient to give language bindings the flexibility they need to pack up a C struct containing everything needing cleanup at the end of the lifetime, whether or not there is also a language callback in play)> >>> + new_callbacks = realloc (h->free_callbacks, >>> + sizeof (struct free_callback) * new_alloc); >> >> Should we start relying on reallocarray() to guarantee no multiplication >> overflow? (Not yet in POSIX, but it has been proposed: >> http://austingroupbugs.net/view.php?id=1218) > > I guess - didn't know about it. Seems like it's not available in > glibc yet?'man realloc' on Fedora 29 lists reallocarray under _GNU_SOURCE, but fails to say which glibc introduced it, and 'man reallocarray' is missing (bug in the man pages).>>> + >>> + /* Need to keep the list sorted by pointer so we can use bsearch. >>> + * Insert the new entry before the i'th entry. >>> + * >>> + * Note the same pointer can be added multiple times (eg. if a >>> + * buffer is shared between commands), and that is not a bug. Which >>> + * free function is called is undefined, but they should all be >>> + * called eventually. >> >> or in other words, if the free function is actually a reference counter >> that only does something interesting when the count reaches 0, then >> things still work. > > Yes this is another reason why this API is trickier to use than it > seems. But this is really necessary if you share a buffer between > several AIO calls as I discovered during implementation.Normally, a user doesn't share the same buffer on the same handle - but we do test that for our speed tests (where we don't care about buffer contents so much as how many API calls can we fire off, so having cross-contamination between API calls into the same buffer in those tests is okay). But you are also right that it is going to be common to read into a buffer from one handle, then write that buffer out to another handle - so our actions on capturing the buffer during aio_pread, and then releasing it when retiring that command, really are centered on ref-counting whatever the language handed us, and only free()ing the actual C struct we use to tie everything together under a single pointer.> >>> + */ >>> + for (i = 0; i < h->nr_free_callbacks; ++i) { >>> + if (ptr <= h->free_callbacks[i].ptr) >> >> Comparing 2 pointers that are not allocated as part of the same array is >> undefined in C. To be safe, you're better off writing this as: >> >> if ((intptr_t)ptr <= (intptr_t)h->free_callbacks[i].ptr) >> >> or even storing intptr_t rather than void* in your list of pointers >> needing callbacks. > > OK.And if I'm right about my idea of having each Closure become a 3-tuple in C, where we can use a common free_cb(ptr) at the time the command is retired for both the language nbd.aio_pread and for nbd.aio_pread_callback, then pointer comparisons is moot because we don't need to track a list of pointers, but rather just a free_cb alongside each user_data.> >>> + break; >>> + } >> >> This is a linear search O(n) for where to insert. Why not bsearch() for >> O(log n), particularly since you document your intent to use bsearch() >> for later lookups? > > Yes indeed, good idea! > >>> + memmove (&h->free_callbacks[i+1], &h->free_callbacks[i], >>> + (h->nr_free_callbacks - i) * sizeof (struct free_callback)); >> >> Hmm - the use of memmove() is O(n); we'd need a different data structure >> (for example red-black tree or hash table) if we wanted list >> manipulations to not be the chokepoint. So, as long as you are already >> stuck with an O(n) list manipulation, using O(n) lookup is not making >> the problem any worse. > > h->nr_free_callbacks is actually reasonably small in real code > (usually <= 20). It's somewhat related to the number of commands in > flight multiplied by a small constant.And my argument is that we are not in control of that number - a user can write a Python script that deliberately calls nbd.aio_pread 1000 times before ever calling nbd.poll to finally advance the state machine.>>> +static int >>> +compare_free_callbacks (const void *v1, const void *v2) >>> +{ >>> + const void *ptr = v1; >>> + const struct free_callback *cb = v2; >>> + >>> + if (ptr < cb->ptr) return -1; >>> + else if (ptr > cb->ptr) return 1; >> >> Again, undefined in C; comparing intptr_t is defined, but direct >> comparison of unrelated pointers could cause a compiler to mis-optimize >> (even if it happens to currently work in practice). > > OK. We could store the pointers in the struct as intptr_t ?Yes, if we have to compare them at all. But I'm hoping we don't have to.>>> + free_cb = bsearch (ptr, h->free_callbacks, h->nr_free_callbacks, >>> + sizeof (struct free_callback), >>> + compare_free_callbacks); >> >> Here, you've got O(log n) lookup. >> >>> + if (free_cb) { >>> + assert (ptr == free_cb->ptr); >>> + >>> + free_cb->cb (ptr, free_cb->user_data); >>> + >>> + /* Remove it from the free list. */ >>> + memmove (free_cb, free_cb+1, >>> + sizeof (struct free_callback) * >>> + (&h->free_callbacks[h->nr_free_callbacks] - (free_cb+1))); >> >> but then slow it down with O(n) list manipulation. > > Right, but only when we actually call the callback. > > For C callers we will basically never use this. h->nr_free_callbacks > will be 0 and we will do a frequent bsearch on a zero-sized array. > It's only expected to be used from other programming languages.Okay, so the effect to C is minimal, but the effect to other languages may be noticeable (since there we will be registering a callback for every aio call).> >> I'm still not sold on whether this is any better than our existing >> completion callbacks coupled with VALID|FREE (where at least we didn't >> suffer from O(n) lookups); or whether we want to instead explore taking >> an explicit free function pointer as part of a C closure (that is, each >> Closure maps to a three-tuple of C arguments for main function, free >> function, and user_data). But as you said in the cover letter, having >> the whole series to review will let us choose between our options, so >> I'll see what the rest of the series is able to accomplish with this. > > I think we shouldn't worry about whether the implementation is > efficient since we can always improve that based on feedback from > perf. It's really about whether this is a better API for freeing or not.Getting rid of the VALID|FREE seems worthwhile, and _since_ that is already an API/ABI change, we can _also_ make the API/ABI change from Closure in C being a 2-tuple (fn, ptr) to being a 3-tuple (fn, free_fn, ptr). The tricky part that remains is how to convince the generator that the code for Python nbd.aio_pread AND for nbd.aio_pread_callback both need to use the C nbd_aio_pread_callback, for the sake of the completion's free_cb() to do the necessary refcount cleanup on the buf, at which point we no longer need a separate registration of free callbacks. We then have the design question of whether to make an OClosure type, where C has two functions nbd_aio_pread and nbd_aio_pread_callback for convenience, but where other languages have only a single nbd.aio_pread where the callback parameter is optional (the Closure type for pread_structured chunk and for block_status extent will still be mandatory; it is only the completion callback that is currently causing us twice the API because we are treating it as pseudo-optional). Or maybe we just require C clients of nbd_aio_pread to always provide parameters for callbacks, but document that the completion callback and free callback pointers may be NULL. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3226 Virtualization: qemu.org | libvirt.org
Richard W.M. Jones
2019-Aug-12 22:00 UTC
Re: [Libguestfs] [PATCH libnbd 1/7] api: Add semi-private function for freeing persistent data.
On Mon, Aug 12, 2019 at 01:53:56PM -0500, Eric Blake wrote:> We then have the design question of whether to make an OClosure type, > where C has two functions nbd_aio_pread and nbd_aio_pread_callback for > convenience, but where other languages have only a single nbd.aio_pread > where the callback parameter is optional (the Closure type for > pread_structured chunk and for block_status extent will still be > mandatory; it is only the completion callback that is currently causing > us twice the API because we are treating it as pseudo-optional). Or > maybe we just require C clients of nbd_aio_pread to always provide > parameters for callbacks, but document that the completion callback and > free callback pointers may be NULL.I'll try to see what OClosure (and then getting rid of the _callback variants, adding to the non-callback variants OClosure) will look like tomorrow. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com libguestfs lets you edit virtual machines. Supports shell scripting, bindings from many languages. http://libguestfs.org
Richard W.M. Jones
2019-Aug-13 07:33 UTC
Re: [Libguestfs] [PATCH libnbd 1/7] api: Add semi-private function for freeing persistent data.
On Mon, Aug 12, 2019 at 11:00:01PM +0100, Richard W.M. Jones wrote:> On Mon, Aug 12, 2019 at 01:53:56PM -0500, Eric Blake wrote: > > We then have the design question of whether to make an OClosure type, > > where C has two functions nbd_aio_pread and nbd_aio_pread_callback for > > convenience, but where other languages have only a single nbd.aio_pread > > where the callback parameter is optional (the Closure type for > > pread_structured chunk and for block_status extent will still be > > mandatory; it is only the completion callback that is currently causing > > us twice the API because we are treating it as pseudo-optional). Or > > maybe we just require C clients of nbd_aio_pread to always provide > > parameters for callbacks, but document that the completion callback and > > free callback pointers may be NULL. > > I'll try to see what OClosure (and then getting rid of the _callback > variants, adding to the non-callback variants OClosure) will look like > tomorrow.I should say that while OClosure has the possibility of solving the freeing problem for buffers and closures attached to AIO commands, and may also be a good idea to get rid of the *_callback variants, it doesn't solve the problem in general. In particular there's no way to free the debug function closure, which means it would be leaked in language bindings. Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Reasonably Related Threads
- Re: [PATCH libnbd 1/7] api: Add semi-private function for freeing persistent data.
- Re: [PATCH libnbd 1/7] api: Add semi-private function for freeing persistent data.
- Re: [PATCH libnbd] api: Rename nbd_aio_*_callback to nbd_aio_*.
- [PATCH libnbd] api: Rename nbd_aio_*_callback to nbd_aio_*.
- [PATCH libnbd proposal] api: Add semi-private function for freeing persistent data.