Abhay Raj Singh
2021-Aug-23 15:56 UTC
[Libguestfs] nbdcpy: from scratch nbdcopy using io_uring
I had an idea for optimizing my current approach; it's good in some
ways, but it could be faster with some breaking changes to the protocol.

Currently, we read (from the socket connected to the source) one request
at a time. The simple flow looks like `read_header(io_uring) ---- success
---> recv(data) --- success ---> send(data) & queue another read header`,
but it's not as efficient as it could be; at best it's a hack.

Another approach I am thinking about is a large buffer into which we can
read all of the socket's data and process packets from that buffer as
the I/O is handled. This minimizes the number of read requests to the
kernel, as we do 1 read for multiple NBD packets.

Further optimization requires changing the NBD protocol a bit.

Current protocol:
1. Memory representation of a response (20-byte header + data)
2. Memory representation of a request (28-byte header + data)

HHHHH_DDDDDDDDD...
HHHHHHH_DDDDDDDDD...

H and D represent 4 bytes each; _ represents 0 bytes (just a separator).

With the large-buffer approach, we read data into a large buffer, then
copy the NBD packet's data to a new buffer, strap a new header onto it
and send it. This copying is what we wanted to avoid in the first place.

If the response header were 28 bytes, or the first 8 bytes of data were
useless, we could have just overwritten the header part and sent the data
directly from the large buffer, therefore avoiding the copy.

What are your thoughts?

Thanks and regards,
Abhay
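
To make the zero-copy idea above concrete, here is a minimal C sketch of
the in-place header rewrite, assuming the hypothetical protocol change
where a reply header is 28 bytes, the same size as a request header
(today's NBD does not work this way). The request layout and the
NBD_CMD_WRITE constant match the current protocol; the helper names, the
liburing wiring, and the equal-sized reply header are assumptions made
for illustration only.

    /* Sketch only: assumes a hypothetical reply header that is 28 bytes,
     * i.e. the same size as an NBD request header, so the reply header
     * sitting in the large buffer can be overwritten by a write request
     * and the payload sent without copying it. */

    #include <stdint.h>
    #include <string.h>
    #include <arpa/inet.h>   /* htonl, htons */
    #include <endian.h>      /* htobe64 (Linux-specific) */
    #include <liburing.h>

    /* 28-byte NBD request header; this layout matches the current protocol. */
    struct nbd_request {
        uint32_t magic;      /* NBD_REQUEST_MAGIC, 0x25609513 */
        uint16_t flags;
        uint16_t type;       /* NBD_CMD_WRITE == 1 */
        uint64_t cookie;     /* opaque, echoed back by the server */
        uint64_t offset;
        uint32_t count;
    } __attribute__((packed));

    /* One complete reply sits at buf[0 .. 28+len).  Overwrite its
     * (hypothetical 28-byte) header with a write request for the
     * destination, so header and payload go out as one contiguous send. */
    static void reply_to_write_request(uint8_t *buf, uint64_t cookie,
                                       uint64_t offset, uint32_t len)
    {
        struct nbd_request req = {
            .magic  = htonl(0x25609513),
            .flags  = 0,
            .type   = htons(1),          /* NBD_CMD_WRITE */
            .cookie = cookie,            /* cookies are opaque; no byte swap */
            .offset = htobe64(offset),
            .count  = htonl(len),
        };
        memcpy(buf, &req, sizeof req);   /* clobber the old reply header */
    }

    /* Queue the rewritten header plus payload as a single send on the ring;
     * io_uring_submit() happens elsewhere in the event loop. */
    static int queue_send(struct io_uring *ring, int dest_fd,
                          const uint8_t *buf, uint32_t payload_len)
    {
        struct io_uring_sqe *sqe = io_uring_get_sqe(ring);
        if (!sqe)
            return -1;                   /* submission queue full */
        io_uring_prep_send(sqe, dest_fd, buf,
                           sizeof(struct nbd_request) + payload_len, 0);
        return 0;
    }

Under the current protocol the bytes preceding the payload are only 16
or 20, which is exactly the gap the message above points out.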
[adding the NBD list into cc]

On Mon, Aug 23, 2021 at 09:26:34PM +0530, Abhay Raj Singh wrote:
> I had an idea for optimizing my current approach; it's good in some
> ways, but it could be faster with some breaking changes to the protocol.
>
> Currently, we read (from the socket connected to the source) one request
> at a time. The simple flow looks like `read_header(io_uring) ---- success
> ---> recv(data) --- success ---> send(data) & queue another read header`,
> but it's not as efficient as it could be; at best it's a hack.
>
> Another approach I am thinking about is a large buffer into which we can
> read all of the socket's data and process packets from that buffer as
> the I/O is handled. This minimizes the number of read requests to the
> kernel, as we do 1 read for multiple NBD packets.
>
> Further optimization requires changing the NBD protocol a bit.
>
> Current protocol:
> 1. Memory representation of a response (20-byte header + data)
> 2. Memory representation of a request (28-byte header + data)
>
> HHHHH_DDDDDDDDD...
> HHHHHHH_DDDDDDDDD...
>
> H and D represent 4 bytes each; _ represents 0 bytes (just a separator).

You are correct that requests are currently a 28-byte header plus any
payload (where a payload currently only occurs in NBD_CMD_WRITE).  But
responses come in two different lengths: simple responses are 16 bytes +
payload (payload only for NBD_CMD_READ, and only if structured replies
were not negotiated), while structured responses are 20 bytes + payload
(and while NBD_CMD_READ and NBD_CMD_BLOCK_STATUS require structured
replies, a compliant server can still send simple replies to other
commands).  So it's even trickier than you represent here: reading a
reply as a fixed 20-byte header is not always going to do the right
thing.

> With the large-buffer approach, we read data into a large buffer, then
> copy the NBD packet's data to a new buffer, strap a new header onto it
> and send it. This copying is what we wanted to avoid in the first place.
>
> If the response header were 28 bytes, or the first 8 bytes of data were
> useless, we could have just overwritten the header part and sent the data
> directly from the large buffer, therefore avoiding the copy.
>
> What are your thoughts?

There are already discussions about what it would take to extend the
NBD protocol to support 64-bit requests (not that we'd want to go beyond
the current server restrictions of a 32M or 64M maximum for NBD_CMD_READ
and NBD_CMD_WRITE, but rather so that we can permit quick image zeroing
via a 64-bit NBD_CMD_WRITE_ZEROES).  Your observation that equally sized
request and response headers would allow more efficient handling is
worth considering as part of such a protocol extension.  Of necessity,
it would have to be done via an NBD_OPT_* option requested by the client
during negotiation and responded to affirmatively by the server, before
both sides then use the new-size packets in both directions after
NBD_OPT_GO (and a client would still have to be prepared to fall back to
the unequal-sized headers if the server doesn't understand the option).
For that matter, is there a benefit to cache-line-optimized sizing,
where all headers are exactly 32 bytes (both requests and responses, and
both simple and structured replies)?  I'm thinking
NBD_OPT_FIXED_SIZE_HEADER might be a sane name for such an option.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
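
For reference, a C sketch of the three wire headers being contrasted
above, with sizes as described in the NBD spec; the struct and field
names are illustrative only, and real code must still convert every
multi-byte field from network byte order.

    #include <stdint.h>

    struct nbd_request {             /* 28 bytes, client -> server */
        uint32_t magic;              /* 0x25609513 */
        uint16_t flags;
        uint16_t type;               /* NBD_CMD_* */
        uint64_t cookie;
        uint64_t offset;
        uint32_t count;
    } __attribute__((packed));

    struct nbd_simple_reply {        /* 16 bytes, server -> client */
        uint32_t magic;              /* 0x67446698 */
        uint32_t error;
        uint64_t cookie;
        /* data follows only for NBD_CMD_READ without structured replies */
    } __attribute__((packed));

    struct nbd_structured_reply {    /* 20 bytes per chunk, server -> client */
        uint32_t magic;              /* 0x668e33ef */
        uint16_t flags;
        uint16_t type;               /* e.g. NBD_REPLY_TYPE_OFFSET_DATA */
        uint64_t cookie;
        uint32_t length;             /* bytes of chunk payload that follow */
    } __attribute__((packed));

Since the simple-reply and structured-reply magics differ, a receive
loop has to look at the leading 4-byte magic before it knows whether 16
or 20 header bytes are on the wire, which is the trickiness mentioned in
the reply above.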