Richard W.M. Jones
2023-Jun-08 12:20 UTC
[Libguestfs] [libnbd PATCH v3 03/22] protocol: Add definitions for extended headers
On Thu, Jun 08, 2023 at 01:48:41PM +0200, Laszlo Ersek wrote:> On 6/7/23 12:00, Richard W.M. Jones wrote: > > On Tue, May 30, 2023 at 05:48:25PM +0200, Laszlo Ersek wrote: > >> BTW I'm foreseeing a problem: if the extended block descriptor can > >> provide an unsigned 64-bit length, we're going to have trouble exposing > >> that in OCaml, because OCaml only has signed 64-bit integers. So that's > >> going to reproduce the same issue, only for OCaml callers of the *new* API. > >> > >> I can see Eric's series includes patches like "ocaml: Add example for > >> 64-bit extents" -- I've not looked at those yet; for now I'm just > >> wondering what tricks we might need in the bindings generator. The > >> method seen in the "middle patch" above won't work; we don't have a > >> native OCaml "i128" type for example that we could use as an escape > >> hatch, for representing C's uint64_t. > > > > I think that's OK because disk sizes are already limited to > > 2^63 - 1 by the kernel (and for qemu even less than that). > > The OCaml bindings return a (signed) int64 for NBD.get_size. > > Under patch#7 yesterday, I made a proposal for "armoring" at least one > instance / direction of the uint64_t <-> int64 conversion. It raised an > interesting problem: raising OCaml exceptions in such C functions that > are *not* directly called by the OCaml runtime. Comments would be much > appreciated in that subthread!I can't seem to find that thread (but also gmail-- split the messages randomly over the 3 different mailing lists because Google don't understand how email works). Do you have a link? But the answer is you can raise an exception anywhere since it's really just a longjmp. The only issue is if you need stack-allocated variables to be freed, which for OCaml values is handled by the CAMLlocal* macros and for C variables you need to be careful about and deal with yourself.> (On a tangent: I've also noticed we use CAMLparam0() & friends in some > of our functions that are *not* directly called by the OCaml runtime. > They certainly run on the OCaml runtime's stack, but there's at least > one intervening stack frame where the C-language function is provided by > us. Now I know we must use CAMLparam0() in our *outermost* such > function, but what about the further functions (inner C-language > functions) that our outermost function calls in turn? I think the inner > functions are at liberty not to use CAMLparam0() -- otherwise, our > functions couldn't even call normal C library functions!)These macros just set up a linked list of frames. You don't need to use them in every function, only ones which are using OCaml values. The macros are fairly easy to understand by reading them: https://github.com/ocaml/ocaml/blob/864f772e5338dcf6be2093d5cc3ed6f7fbce16b7/runtime/caml/memory.h#L270 When the GC runs it walks up the linked list of the current thread to find roots. The only tricky thing about it is making sure that at any point where the GC could run, each slot contains a valid entry and not some intermediate or uninitialized value, since this is precise (not conservative) garbage collection. This mechanism is only used by C code. In OCaml code there's a bitmap generated for each function showing which stack slots contain values (versus ints, return addresses, other stuff). Rich. -- Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones Read my programming and virtualization blog: http://rwmj.wordpress.com nbdkit - Flexible, fast NBD server with plugins https://gitlab.com/nbdkit/nbdkit
Laszlo Ersek
2023-Jun-08 12:38 UTC
[Libguestfs] [libnbd PATCH v3 03/22] protocol: Add definitions for extended headers
On 6/8/23 14:20, Richard W.M. Jones wrote:> On Thu, Jun 08, 2023 at 01:48:41PM +0200, Laszlo Ersek wrote: >> On 6/7/23 12:00, Richard W.M. Jones wrote: >>> On Tue, May 30, 2023 at 05:48:25PM +0200, Laszlo Ersek wrote: >>>> BTW I'm foreseeing a problem: if the extended block descriptor can >>>> provide an unsigned 64-bit length, we're going to have trouble exposing >>>> that in OCaml, because OCaml only has signed 64-bit integers. So that's >>>> going to reproduce the same issue, only for OCaml callers of the *new* API. >>>> >>>> I can see Eric's series includes patches like "ocaml: Add example for >>>> 64-bit extents" -- I've not looked at those yet; for now I'm just >>>> wondering what tricks we might need in the bindings generator. The >>>> method seen in the "middle patch" above won't work; we don't have a >>>> native OCaml "i128" type for example that we could use as an escape >>>> hatch, for representing C's uint64_t. >>> >>> I think that's OK because disk sizes are already limited to >>> 2^63 - 1 by the kernel (and for qemu even less than that). >>> The OCaml bindings return a (signed) int64 for NBD.get_size. >> >> Under patch#7 yesterday, I made a proposal for "armoring" at least one >> instance / direction of the uint64_t <-> int64 conversion. It raised an >> interesting problem: raising OCaml exceptions in such C functions that >> are *not* directly called by the OCaml runtime. Comments would be much >> appreciated in that subthread! > > I can't seem to find that thread (but also gmail-- split the messages > randomly over the 3 different mailing lists because Google don't > understand how email works). Do you have a link?It starts here (link to patch#7): http://mid.mail-archive.com/20230525130108.757242-8-eblake at redhat.com (The primary libguestfs archive works too, but mailman splits the archives on month boundaries, and so the patch is still in the May bucket, but my reply is in the June one: https://listman.redhat.com/archives/libguestfs/2023-May/031613.html https://listman.redhat.com/archives/libguestfs/2023-June/031736.html ) The particular topic is at the very end of my message.> But the answer is you can raise an exception anywhere since it's > really just a longjmp. The only issue is if you need stack-allocated > variables to be freed, which for OCaml values is handled by the > CAMLlocal* macros and for C variables you need to be careful about and > deal with yourself.Understood! Thanks! So for example, in extent64_wrapper_locked(), if we exited the called nbd_internal_ocaml_alloc_extent64_array() function with caml_failwith(), the caml_copy_string() and caml_copy_int64() allocations, stored earlier into "CAMLlocal"s "metacontextv" and "offsetv", would not be leaked?> >> (On a tangent: I've also noticed we use CAMLparam0() & friends in some >> of our functions that are *not* directly called by the OCaml runtime. >> They certainly run on the OCaml runtime's stack, but there's at least >> one intervening stack frame where the C-language function is provided by >> us. Now I know we must use CAMLparam0() in our *outermost* such >> function, but what about the further functions (inner C-language >> functions) that our outermost function calls in turn? I think the inner >> functions are at liberty not to use CAMLparam0() -- otherwise, our >> functions couldn't even call normal C library functions!) > > These macros just set up a linked list of frames. You don't need to > use them in every function, only ones which are using OCaml values.Ah, understood.> The macros are fairly easy to understand by reading them: > > https://github.com/ocaml/ocaml/blob/864f772e5338dcf6be2093d5cc3ed6f7fbce16b7/runtime/caml/memory.h#L270 > > When the GC runs it walks up the linked list of the current thread to > find roots. The only tricky thing about it is making sure that at any > point where the GC could run, each slot contains a valid entry and not > some intermediate or uninitialized value, since this is precise (not > conservative) garbage collection.Right, <https://v2.ocaml.org/manual/intfc.html> too contains a related warning: Rule 5? After a structured block (a block with tag less than No_scan_tag) is allocated with the low-level functions, all fields of this block must be filled with well-formed values before the next allocation operation. [...] Thankfully it also says, "You can ignore those rules if you stick to the simplified allocation function caml_alloc".> > This mechanism is only used by C code. In OCaml code there's a bitmap > generated for each function showing which stack slots contain values > (versus ints, return addresses, other stuff).I'm slightly interested in learning the OCaml runtime details, but at the same time I feel like not knowing them (and relying only on the docs) might allow me to write more "portable" code... Laszlo