Richard W.M. Jones
2022-Jun-09 14:27 UTC
[Libguestfs] [libnbd PATCH v3 2/5] python: Alter lock for persistent buffer
On Thu, Jun 09, 2022 at 08:34:44AM -0500, Eric Blake wrote:> As long as we were wrapping a C buffer and not leaking any Python > objects, we had to lock the nbd.Buffer object, to ensure that the > PyCapsule was not released prematurely, as there was nothing else to > hold on to. But now that the previous commit (4f15d0e9) swapped to > wrapping a Python bytearray object, we are better off locking the > PyMemoryView created from that object, for two reasons. First, if we > are going to allow aio_{pread,pwrite} to take an arbitrary buffer-like > object instead of requiring an nbd.Buffer object, then that's all we > have available to lock. Second, hanging on to a PyMemoryView provides > some nice guarantees that the underlying python buffer-like object > won't be reallocated behind our backs. The following snippets from > nbdsh use bad style (you should never access a ._* member outside of > its class), but demonstrate the same issue we would be faced with once > nbd.Buffer starts sharing rather than copying in the public > .{to,from}_bytearray: > > Pre-patch: > nbd> b = nbd.Buffer(10) > nbd> c = h.aio_pread(b, 0) > nbd> b._o.pop() > 0 > nbd> b.size() > 9 > nbd> h.poll(-1) > > Oops - we changed the size of the underlying bytearray prior to > receiving the server's reply. If that caused python to reallocate the > buffer underlying the bytearray, we've just clobbered random memory. > > Post-patch: > nbd> b = nbd.Buffer(10) > nbd> c = h.aio_pread(b, 0) > nbd> b._o.pop() > Traceback (most recent call last): > File "/usr/lib64/python3.10/code.py", line 90, in runcode > exec(code, self.locals) > File "<console>", line 1, in <module> > BufferError: Existing exports of data: object cannot be re-sizedWow what a horrible error message! However it comes from Python, not us, so there's not a lot we can do about it.> nbd> h.poll(-1) > nbd> h.aio_command_completed(c) > True > nbd> b._o.pop() > 0 > nbd> b.size() > 9 > > There - now we are sure that as long as our read is pending, the > underlying bytearray cannot be reallocated; but as soon as it > completes, we have regained full use of the bytearray. > --- > generator/Python.ml | 15 +++++---------- > 1 file changed, 5 insertions(+), 10 deletions(-) > > diff --git a/generator/Python.ml b/generator/Python.ml > index 6980034..20d7e78 100644 > --- a/generator/Python.ml > +++ b/generator/Python.ml > @@ -424,16 +424,12 @@ let > pr " %s_view = nbd_internal_py_get_aio_view (%s, true);\n" n n; > pr " if (!%s_view) goto out;\n" n; > pr " py_%s = PyMemoryView_GET_BUFFER (%s_view);\n" n n; > - pr " /* Increment refcount since buffer may be saved by libnbd. */\n"; > - pr " Py_INCREF (%s);\n" n; > - pr " completion_user_data->buf = %s;\n" n > + pr " completion_user_data->view = %s_view;\n" n > | BytesPersistOut (n, _) -> > pr " %s_view = nbd_internal_py_get_aio_view (%s, false);\n" n n; > pr " if (!%s_view) goto out;\n" n; > pr " py_%s = PyMemoryView_GET_BUFFER (%s_view);\n" n n; > - pr " /* Increment refcount since buffer may be saved by libnbd. */\n"; > - pr " Py_INCREF (%s);\n" n; > - pr " completion_user_data->buf = %s;\n" n > + pr " completion_user_data->view = %s_view;\n" n > | Closure { cbname } -> > pr " %s.user_data = %s_user_data = alloc_user_data ();\n" cbname cbname; > pr " if (%s_user_data == NULL) goto out;\n" cbname; > @@ -538,8 +534,7 @@ let > pr " if (%s.obj)\n" n; > pr " PyBuffer_Release (&%s);\n" n > | BytesOut (n, _) -> pr " Py_XDECREF (%s);\n" n > - | BytesPersistIn (n, _) | BytesPersistOut (n, _) -> > - pr " Py_XDECREF (%s_view);\n" n > + | BytesPersistIn (n, _) | BytesPersistOut (n, _) -> () > | Closure { cbname } -> > pr " free_user_data (%s_user_data);\n" cbname > | Enum _ -> () > @@ -588,7 +583,7 @@ let > pr " */\n"; > pr "struct user_data {\n"; > pr " PyObject *fn; /* Optional pointer to Python function. */\n"; > - pr " PyObject *buf; /* Optional pointer to persistent buffer. */\n"; > + pr " PyObject *view; /* Optional PyMemoryView of persistent buffer. */\n"; > pr "};\n"; > pr "\n"; > pr "static struct user_data *\n"; > @@ -609,7 +604,7 @@ let > pr "\n"; > pr " if (data) {\n"; > pr " Py_XDECREF (data->fn);\n"; > - pr " Py_XDECREF (data->buf);\n"; > + pr " Py_XDECREF (data->view);\n"; > pr " free (data);\n"; > pr " }\n"; > pr "}\n";I gather it works, but I don't understand why. What increments the ref count on data->view? Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com Fedora Windows cross-compiler. Compile Windows programs, test, and build Windows installers. Over 100 libraries supported. http://fedoraproject.org/wiki/MinGW
Eric Blake
2022-Jun-09 16:24 UTC
[Libguestfs] [libnbd PATCH v3 2/5] python: Alter lock for persistent buffer
On Thu, Jun 09, 2022 at 03:27:36PM +0100, Richard W.M. Jones wrote:> > Post-patch: > > nbd> b = nbd.Buffer(10) > > nbd> c = h.aio_pread(b, 0) > > nbd> b._o.pop() > > Traceback (most recent call last): > > File "/usr/lib64/python3.10/code.py", line 90, in runcode > > exec(code, self.locals) > > File "<console>", line 1, in <module> > > BufferError: Existing exports of data: object cannot be re-sized > > Wow what a horrible error message! However it comes from Python, not > us, so there's not a lot we can do about it.Yeah, it doesn't tell you how to find which object is still hanging on to the buffer export. And during development, I had several times where I didn't call enough Py_DECREF(), which was interesting to track down where the stale references were being kept when all I had was this message to go on.> > > nbd> h.poll(-1) > > nbd> h.aio_command_completed(c) > > True > > nbd> b._o.pop() > > 0 > > nbd> b.size() > > 9But I'm pretty happy about the results - Python's ability to magically lock a bytearray from being resized while a memoryview is active, so that we can then peer into its underlying C memory without copying, then restore full functionality when we're done with the underlying memory, is pretty cool. As to your question about ref-counting:> > +++ b/generator/Python.ml > > @@ -424,16 +424,12 @@ let > > pr " %s_view = nbd_internal_py_get_aio_view (%s, true);\n" n n; > > pr " if (!%s_view) goto out;\n" n; > > pr " py_%s = PyMemoryView_GET_BUFFER (%s_view);\n" n n; > > - pr " /* Increment refcount since buffer may be saved by libnbd. */\n"; > > - pr " Py_INCREF (%s);\n" n; > > - pr " completion_user_data->buf = %s;\n" n > > + pr " completion_user_data->view = %s_view;\n" nPre-patch and post-patch: buf_view provides a new object (reference count incremented to at least 1), as a result of calling nbd_internal_py_get_aio_view. Pre-patch did not want that reference count left around, so it cleans up that reference in the same function at [1]; rather, pre-patch wanted to save a different object (buf) and had to both Py_INCREF() it as well as assigning buf into completion_user_data for later cleanup [2]. Post-patch WANTS to use the reference count on view as our long-term lock, so it no longer needs in-function cleanup at [1], and no longer has to mess with the reference counts on buf.> > | BytesPersistOut (n, _) -> > > pr " %s_view = nbd_internal_py_get_aio_view (%s, false);\n" n n; > > pr " if (!%s_view) goto out;\n" n; > > pr " py_%s = PyMemoryView_GET_BUFFER (%s_view);\n" n n; > > - pr " /* Increment refcount since buffer may be saved by libnbd. */\n"; > > - pr " Py_INCREF (%s);\n" n; > > - pr " completion_user_data->buf = %s;\n" n > > + pr " completion_user_data->view = %s_view;\n" n > > | Closure { cbname } -> > > pr " %s.user_data = %s_user_data = alloc_user_data ();\n" cbname cbname; > > pr " if (%s_user_data == NULL) goto out;\n" cbname; > > @@ -538,8 +534,7 @@ let > > pr " if (%s.obj)\n" n; > > pr " PyBuffer_Release (&%s);\n" n > > | BytesOut (n, _) -> pr " Py_XDECREF (%s);\n" n > > - | BytesPersistIn (n, _) | BytesPersistOut (n, _) -> > > - pr " Py_XDECREF (%s_view);\n" n > > + | BytesPersistIn (n, _) | BytesPersistOut (n, _) -> ()[1] mentioned above, where we cleaned up buf_view pre-patch but not post-patch.> > | Closure { cbname } -> > > pr " free_user_data (%s_user_data);\n" cbname > > | Enum _ -> () > > @@ -588,7 +583,7 @@ let > > pr " */\n"; > > pr "struct user_data {\n"; > > pr " PyObject *fn; /* Optional pointer to Python function. */\n"; > > - pr " PyObject *buf; /* Optional pointer to persistent buffer. */\n"; > > + pr " PyObject *view; /* Optional PyMemoryView of persistent buffer. */\n"; > > pr "};\n"; > > pr "\n"; > > pr "static struct user_data *\n"; > > @@ -609,7 +604,7 @@ let > > pr "\n"; > > pr " if (data) {\n"; > > pr " Py_XDECREF (data->fn);\n"; > > - pr " Py_XDECREF (data->buf);\n"; > > + pr " Py_XDECREF (data->view);\n";[2], where we cleaned up buf pre-patch, and buf_view post-patch.> > pr " free (data);\n"; > > pr " }\n"; > > pr "}\n"; > > I gather it works, but I don't understand why. What increments the > ref count on data->view?I hope that answered your question. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org