Eric Blake
2022-May-31 19:23 UTC
[Libguestfs] [libnbd PATCH 1/2] api: Tighter checking of structured read replies
On Tue, May 31, 2022 at 08:40:57PM +0300, Nir Soffer wrote:> > > > @@ -364,6 +363,8 @@ STATE_MACHINE { > > > > SET_NEXT_STATE (%.DEAD); > > > > return 0; > > > > } > > > > + if (cmd->data_seen <= cmd->count) > > > > + cmd->data_seen += length; > > > > > > This does not feel right. if you received more data, it should be counted, > > > and if this causes data_seen to be bigger than cmd->count, isn't this a fatal > > > error? > > > > cmd->count is at most 64M; it represents how much we asked the server > > to provide. length was just validated (in the elided statement > > between these two hunks) to be <= cmd->count (otherwise, the server is > > known-buggy for replying with more than we asked, and we've already > > moved to %.DEAD state). cmd->data_seen is a cumulative count of all > > bytes seen in prior chunks, plus the current chunk. If we have not > > yet passed cmd->count, then this chunk counts towards the cumulative > > limit (and there is no overflow, since 64M*2 < UINT32_MAX). If we > > have already passed cmd->count (in a previous chunk), then we don't > > further increase cmd->count, but we already know that we will fail the > > overall read later. In other words, we can stop incrementing > > cmd->data_seen as soon as we know it exceeds cmd->count, and by > > stopping early, we still detect server bugs without overflowing > > uint32_t. > > But we can have this case: > > 1. ask for 32m > 2. server sends 16m (data_seen increase to 16m) > 3. server sends 16m (data_seen increase to 32m) > 4. server sends 1m (data_seen does not increase)Yes it does. 32m <= cmd->count is true, so we bump data_seen to 33m. Then, later on when retiring the command, we note that 33m != 32m and fail the read with EIO (if it has not already failed for other reasons).> 5. entire request succeeds > > Shouldn't we fail if server sends unexpected data? > > If we detected that all data was received, and we get > unexpected data, why not fail immediately? > > cmd->data_seen += length > if (cmd->data_seen > cmd->count) > switch to dead state?Switching immediately to a dead state is also possible, but it's nice to try and keep the connection alive as long as we can with a nice diagnosis of a failed CMD_READ but still allow further commands, rather than an abrupt disconnect that takes out all other use of the server. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org
Nir Soffer
2022-May-31 19:32 UTC
[Libguestfs] [libnbd PATCH 1/2] api: Tighter checking of structured read replies
On Tue, May 31, 2022 at 10:23 PM Eric Blake <eblake at redhat.com> wrote:> > On Tue, May 31, 2022 at 08:40:57PM +0300, Nir Soffer wrote: > > > > > @@ -364,6 +363,8 @@ STATE_MACHINE { > > > > > SET_NEXT_STATE (%.DEAD); > > > > > return 0; > > > > > } > > > > > + if (cmd->data_seen <= cmd->count) > > > > > + cmd->data_seen += length; > > > > > > > > This does not feel right. if you received more data, it should be counted, > > > > and if this causes data_seen to be bigger than cmd->count, isn't this a fatal > > > > error? > > > > > > cmd->count is at most 64M; it represents how much we asked the server > > > to provide. length was just validated (in the elided statement > > > between these two hunks) to be <= cmd->count (otherwise, the server is > > > known-buggy for replying with more than we asked, and we've already > > > moved to %.DEAD state). cmd->data_seen is a cumulative count of all > > > bytes seen in prior chunks, plus the current chunk. If we have not > > > yet passed cmd->count, then this chunk counts towards the cumulative > > > limit (and there is no overflow, since 64M*2 < UINT32_MAX). If we > > > have already passed cmd->count (in a previous chunk), then we don't > > > further increase cmd->count, but we already know that we will fail the > > > overall read later. In other words, we can stop incrementing > > > cmd->data_seen as soon as we know it exceeds cmd->count, and by > > > stopping early, we still detect server bugs without overflowing > > > uint32_t. > > > > But we can have this case: > > > > 1. ask for 32m > > 2. server sends 16m (data_seen increase to 16m) > > 3. server sends 16m (data_seen increase to 32m) > > 4. server sends 1m (data_seen does not increase) > > Yes it does. 32m <= cmd->count is true, so we bump data_seen to 33m.Right! I missed this.> Then, later on when retiring the command, we note that 33m != 32m and > fail the read with EIO (if it has not already failed for other > reasons). > > > 5. entire request succeeds > > > > Shouldn't we fail if server sends unexpected data? > > > > If we detected that all data was received, and we get > > unexpected data, why not fail immediately? > > > > cmd->data_seen += length > > if (cmd->data_seen > cmd->count) > > switch to dead state? > > Switching immediately to a dead state is also possible, but it's nice > to try and keep the connection alive as long as we can with a nice > diagnosis of a failed CMD_READ but still allow further commands, > rather than an abrupt disconnect that takes out all other use of the > server.I agree, this is better. Reviewed-by: Nir Soffer <nsoffer at redhat.com>
Eric Blake
2022-Jun-01 13:57 UTC
[Libguestfs] [libnbd PATCH 1/2] api: Tighter checking of structured read replies
On Tue, May 31, 2022 at 10:32:23PM +0300, Nir Soffer wrote:> > > But we can have this case: > > > > > > 1. ask for 32m > > > 2. server sends 16m (data_seen increase to 16m) > > > 3. server sends 16m (data_seen increase to 32m) > > > 4. server sends 1m (data_seen does not increase) > > > > Yes it does. 32m <= cmd->count is true, so we bump data_seen to 33m. > > Right! I missed this. > > > Then, later on when retiring the command, we note that 33m != 32m and > > fail the read with EIO (if it has not already failed for other > > reasons). > > > > > 5. entire request succeeds > > > > > > Shouldn't we fail if server sends unexpected data? > > > > > > If we detected that all data was received, and we get > > > unexpected data, why not fail immediately? > > > > > > cmd->data_seen += length > > > if (cmd->data_seen > cmd->count) > > > switch to dead state? > > > > Switching immediately to a dead state is also possible, but it's nice > > to try and keep the connection alive as long as we can with a nice > > diagnosis of a failed CMD_READ but still allow further commands, > > rather than an abrupt disconnect that takes out all other use of the > > server. > > I agree, this is better. > > Reviewed-by: Nir Soffer <nsoffer at redhat.com>I've pushed this one as edd8f5c; the rest of the series will be tweaked and posted as v2. -- Eric Blake, Principal Software Engineer Red Hat, Inc. +1-919-301-3266 Virtualization: qemu.org | libvirt.org