Liu, Changpeng
2018-Jun-07 23:07 UTC
[PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support
> -----Original Message----- > From: Stefan Hajnoczi [mailto:stefanha at redhat.com] > Sent: Thursday, June 7, 2018 9:10 PM > To: Liu, Changpeng <changpeng.liu at intel.com> > Cc: virtualization at lists.linux-foundation.org; cavery at redhat.com; > jasowang at redhat.com; pbonzini at redhat.com; Wang, Wei W > <wei.w.wang at intel.com> > Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands > support > > On Wed, Jun 06, 2018 at 12:19:00PM +0800, Changpeng Liu wrote: > > Existing virtio-blk protocol doesn't have DISCARD/WRITE ZEROES commands > > support, this will impact the performance when using SSD backend over > > file systems. > > > > Commit 88c85538 "virtio-blk: add discard and write zeroes features to > > specification"(see https://github.com/oasis-tcs/virtio-spec) extended > > existing virtio-blk protocol, adding extra DISCARD and WRITE ZEROES > > commands support. > > > > While here, using 16 bytes descriptor to describe one segment of DISCARD > > or WRITE ZEROES commands, each command may contain one or more > decriptors. > > > > The following data structure shows the definition of one descriptor: > > > > struct virtio_blk_discard_write_zeroes { > > le64 sector; > > le32 num_sectors; > > le32 unmap; > > }; > > > > Field 'sector' means the start sector for DISCARD and WRITE ZEROES, > > filed 'num_sectors' means the number of sectors for DISCARD and WRITE > > ZEROES, if both DISCARD and WRITE ZEROES are supported, field 'unmap' > > maybe used for WRITE ZEROES command with DISCARD enabled. > > > > We also extended the virtio-blk configuration space to let backend > > device put DISCARD and WRITE ZEROES configuration parameters. > > > > struct virtio_blk_config { > > [...] > > > > le32 max_discard_sectors; > > le32 max_discard_seg; > > le32 discard_sector_alignment; > > le32 max_write_zeroes_sectors; > > le32 max_write_zeroes_seg; > > u8 write_zeroes_may_unmap; > > } > > > > New feature bit [VIRTIO_BLK_F_DISCARD (13)]: Device can support discard > > command, maximum discard sectors size in field 'max_discard_sectors' and > > maximum discard segment number in field 'max_discard_seg'. > > > > New feature [VIRTIO_BLK_F_WRITE_ZEROES (14)]: Device can support write > > zeroes command, maximum write zeroes sectors size in field > > 'max_write_zeroes_sectors' and maximum write zeroes segment number in > > field 'max_write_zeroes_seg'. > > > > The parameters in the configuration space of the device field > > 'max_discard_sectors' and field 'discard_sector_alignment' are expressed in > > 512-byte units if the VIRTIO_BLK_F_DISCARD feature bit is negotiated. The > > field 'max_write_zeroes_sectors' is expressed in 512-byte units if the > > VIRTIO_BLK_F_WRITE_ZEROES feature bit is negotiated. > > > > Signed-off-by: Changpeng Liu <changpeng.liu at intel.com> > > --- > > CHANGELOG: > > v6: don't set T_OUT bit to discard and write zeroes commands. > > I don't see this in the patch...Yeah, do noting with DISCARD/WRITE ZEROES means no need to OR BLK_T_OUT again.> > > @@ -225,6 +260,7 @@ static blk_status_t virtio_queue_rq(struct > blk_mq_hw_ctx *hctx, > > int qid = hctx->queue_num; > > int err; > > bool notify = false; > > + bool unmap = false; > > u32 type; > > > > BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems); > > @@ -237,6 +273,13 @@ static blk_status_t virtio_queue_rq(struct > blk_mq_hw_ctx *hctx, > > case REQ_OP_FLUSH: > > type = VIRTIO_BLK_T_FLUSH; > > break; > > + case REQ_OP_DISCARD: > > + type = VIRTIO_BLK_T_DISCARD; > > + break; > > + case REQ_OP_WRITE_ZEROES: > > + type = VIRTIO_BLK_T_WRITE_ZEROES; > > + unmap = !(req->cmd_flags & REQ_NOUNMAP); > > + break; > > case REQ_OP_SCSI_IN: > > case REQ_OP_SCSI_OUT: > > type = VIRTIO_BLK_T_SCSI_CMD; > > @@ -256,6 +299,12 @@ static blk_status_t virtio_queue_rq(struct > blk_mq_hw_ctx *hctx, > > > > blk_mq_start_request(req); > > > > + if (type == VIRTIO_BLK_T_DISCARD || type => VIRTIO_BLK_T_WRITE_ZEROES) { > > + err = virtblk_setup_discard_write_zeroes(req, unmap); > > + if (err) > > + return BLK_STS_RESOURCE; > > + } > > + > > num = blk_rq_map_sg(hctx->queue, req, vbr->sg); > > if (num) { > > if (rq_data_dir(req) == WRITE) > > ...since we still do blk_rq_map_sg() here and num should be != 0.No, while here, we should keep the original logic for READ/WRITE commands.
Stefan Hajnoczi
2018-Jun-08 10:20 UTC
[PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support
On Thu, Jun 07, 2018 at 11:07:06PM +0000, Liu, Changpeng wrote:> > > > -----Original Message----- > > From: Stefan Hajnoczi [mailto:stefanha at redhat.com] > > Sent: Thursday, June 7, 2018 9:10 PM > > To: Liu, Changpeng <changpeng.liu at intel.com> > > Cc: virtualization at lists.linux-foundation.org; cavery at redhat.com; > > jasowang at redhat.com; pbonzini at redhat.com; Wang, Wei W > > <wei.w.wang at intel.com> > > Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands > > support > > > > On Wed, Jun 06, 2018 at 12:19:00PM +0800, Changpeng Liu wrote: > > > Existing virtio-blk protocol doesn't have DISCARD/WRITE ZEROES commands > > > support, this will impact the performance when using SSD backend over > > > file systems. > > > > > > Commit 88c85538 "virtio-blk: add discard and write zeroes features to > > > specification"(see https://github.com/oasis-tcs/virtio-spec) extended > > > existing virtio-blk protocol, adding extra DISCARD and WRITE ZEROES > > > commands support. > > > > > > While here, using 16 bytes descriptor to describe one segment of DISCARD > > > or WRITE ZEROES commands, each command may contain one or more > > decriptors. > > > > > > The following data structure shows the definition of one descriptor: > > > > > > struct virtio_blk_discard_write_zeroes { > > > le64 sector; > > > le32 num_sectors; > > > le32 unmap; > > > }; > > > > > > Field 'sector' means the start sector for DISCARD and WRITE ZEROES, > > > filed 'num_sectors' means the number of sectors for DISCARD and WRITE > > > ZEROES, if both DISCARD and WRITE ZEROES are supported, field 'unmap' > > > maybe used for WRITE ZEROES command with DISCARD enabled. > > > > > > We also extended the virtio-blk configuration space to let backend > > > device put DISCARD and WRITE ZEROES configuration parameters. > > > > > > struct virtio_blk_config { > > > [...] > > > > > > le32 max_discard_sectors; > > > le32 max_discard_seg; > > > le32 discard_sector_alignment; > > > le32 max_write_zeroes_sectors; > > > le32 max_write_zeroes_seg; > > > u8 write_zeroes_may_unmap; > > > } > > > > > > New feature bit [VIRTIO_BLK_F_DISCARD (13)]: Device can support discard > > > command, maximum discard sectors size in field 'max_discard_sectors' and > > > maximum discard segment number in field 'max_discard_seg'. > > > > > > New feature [VIRTIO_BLK_F_WRITE_ZEROES (14)]: Device can support write > > > zeroes command, maximum write zeroes sectors size in field > > > 'max_write_zeroes_sectors' and maximum write zeroes segment number in > > > field 'max_write_zeroes_seg'. > > > > > > The parameters in the configuration space of the device field > > > 'max_discard_sectors' and field 'discard_sector_alignment' are expressed in > > > 512-byte units if the VIRTIO_BLK_F_DISCARD feature bit is negotiated. The > > > field 'max_write_zeroes_sectors' is expressed in 512-byte units if the > > > VIRTIO_BLK_F_WRITE_ZEROES feature bit is negotiated. > > > > > > Signed-off-by: Changpeng Liu <changpeng.liu at intel.com> > > > --- > > > CHANGELOG: > > > v6: don't set T_OUT bit to discard and write zeroes commands. > > > > I don't see this in the patch... > Yeah, do noting with DISCARD/WRITE ZEROES means no need to OR BLK_T_OUT again. > > > > > @@ -225,6 +260,7 @@ static blk_status_t virtio_queue_rq(struct > > blk_mq_hw_ctx *hctx, > > > int qid = hctx->queue_num; > > > int err; > > > bool notify = false; > > > + bool unmap = false; > > > u32 type; > > > > > > BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems); > > > @@ -237,6 +273,13 @@ static blk_status_t virtio_queue_rq(struct > > blk_mq_hw_ctx *hctx, > > > case REQ_OP_FLUSH: > > > type = VIRTIO_BLK_T_FLUSH; > > > break; > > > + case REQ_OP_DISCARD: > > > + type = VIRTIO_BLK_T_DISCARD; > > > + break; > > > + case REQ_OP_WRITE_ZEROES: > > > + type = VIRTIO_BLK_T_WRITE_ZEROES; > > > + unmap = !(req->cmd_flags & REQ_NOUNMAP); > > > + break; > > > case REQ_OP_SCSI_IN: > > > case REQ_OP_SCSI_OUT: > > > type = VIRTIO_BLK_T_SCSI_CMD; > > > @@ -256,6 +299,12 @@ static blk_status_t virtio_queue_rq(struct > > blk_mq_hw_ctx *hctx, > > > > > > blk_mq_start_request(req); > > > > > > + if (type == VIRTIO_BLK_T_DISCARD || type => > VIRTIO_BLK_T_WRITE_ZEROES) { > > > + err = virtblk_setup_discard_write_zeroes(req, unmap); > > > + if (err) > > > + return BLK_STS_RESOURCE; > > > + } > > > + > > > num = blk_rq_map_sg(hctx->queue, req, vbr->sg); > > > if (num) { > > > if (rq_data_dir(req) == WRITE) > > > > ...since we still do blk_rq_map_sg() here and num should be != 0. > No, while here, we should keep the original logic for READ/WRITE commands.My question is: why does the changelog say "don't set T_OUT" but the code *will* set it because blk_rq_map_sg() returns != 0 and rq_data_dir(req) == WRITE? Maybe I'm misreading the code, but it looks to me like this patch does the opposite of what the changelog says. Stefan -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 455 bytes Desc: not available URL: <http://lists.linuxfoundation.org/pipermail/virtualization/attachments/20180608/4aae894d/attachment.sig>
Liu, Changpeng
2018-Jun-11 03:37 UTC
[PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands support
> -----Original Message----- > From: Stefan Hajnoczi [mailto:stefanha at redhat.com] > Sent: Friday, June 8, 2018 6:20 PM > To: Liu, Changpeng <changpeng.liu at intel.com> > Cc: virtualization at lists.linux-foundation.org; cavery at redhat.com; > jasowang at redhat.com; pbonzini at redhat.com; Wang, Wei W > <wei.w.wang at intel.com> > Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands > support > > On Thu, Jun 07, 2018 at 11:07:06PM +0000, Liu, Changpeng wrote: > > > > > > > -----Original Message----- > > > From: Stefan Hajnoczi [mailto:stefanha at redhat.com] > > > Sent: Thursday, June 7, 2018 9:10 PM > > > To: Liu, Changpeng <changpeng.liu at intel.com> > > > Cc: virtualization at lists.linux-foundation.org; cavery at redhat.com; > > > jasowang at redhat.com; pbonzini at redhat.com; Wang, Wei W > > > <wei.w.wang at intel.com> > > > Subject: Re: [PATCH v6] virtio_blk: add DISCARD and WRIET ZEROES commands > > > support > > > > > > On Wed, Jun 06, 2018 at 12:19:00PM +0800, Changpeng Liu wrote: > > > > Existing virtio-blk protocol doesn't have DISCARD/WRITE ZEROES commands > > > > support, this will impact the performance when using SSD backend over > > > > file systems. > > > > > > > > Commit 88c85538 "virtio-blk: add discard and write zeroes features to > > > > specification"(see https://github.com/oasis-tcs/virtio-spec) extended > > > > existing virtio-blk protocol, adding extra DISCARD and WRITE ZEROES > > > > commands support. > > > > > > > > While here, using 16 bytes descriptor to describe one segment of DISCARD > > > > or WRITE ZEROES commands, each command may contain one or more > > > decriptors. > > > > > > > > The following data structure shows the definition of one descriptor: > > > > > > > > struct virtio_blk_discard_write_zeroes { > > > > le64 sector; > > > > le32 num_sectors; > > > > le32 unmap; > > > > }; > > > > > > > > Field 'sector' means the start sector for DISCARD and WRITE ZEROES, > > > > filed 'num_sectors' means the number of sectors for DISCARD and WRITE > > > > ZEROES, if both DISCARD and WRITE ZEROES are supported, field 'unmap' > > > > maybe used for WRITE ZEROES command with DISCARD enabled. > > > > > > > > We also extended the virtio-blk configuration space to let backend > > > > device put DISCARD and WRITE ZEROES configuration parameters. > > > > > > > > struct virtio_blk_config { > > > > [...] > > > > > > > > le32 max_discard_sectors; > > > > le32 max_discard_seg; > > > > le32 discard_sector_alignment; > > > > le32 max_write_zeroes_sectors; > > > > le32 max_write_zeroes_seg; > > > > u8 write_zeroes_may_unmap; > > > > } > > > > > > > > New feature bit [VIRTIO_BLK_F_DISCARD (13)]: Device can support discard > > > > command, maximum discard sectors size in field 'max_discard_sectors' and > > > > maximum discard segment number in field 'max_discard_seg'. > > > > > > > > New feature [VIRTIO_BLK_F_WRITE_ZEROES (14)]: Device can support write > > > > zeroes command, maximum write zeroes sectors size in field > > > > 'max_write_zeroes_sectors' and maximum write zeroes segment number in > > > > field 'max_write_zeroes_seg'. > > > > > > > > The parameters in the configuration space of the device field > > > > 'max_discard_sectors' and field 'discard_sector_alignment' are expressed in > > > > 512-byte units if the VIRTIO_BLK_F_DISCARD feature bit is negotiated. The > > > > field 'max_write_zeroes_sectors' is expressed in 512-byte units if the > > > > VIRTIO_BLK_F_WRITE_ZEROES feature bit is negotiated. > > > > > > > > Signed-off-by: Changpeng Liu <changpeng.liu at intel.com> > > > > --- > > > > CHANGELOG: > > > > v6: don't set T_OUT bit to discard and write zeroes commands. > > > > > > I don't see this in the patch... > > Yeah, do noting with DISCARD/WRITE ZEROES means no need to OR BLK_T_OUT > again. > > > > > > > @@ -225,6 +260,7 @@ static blk_status_t virtio_queue_rq(struct > > > blk_mq_hw_ctx *hctx, > > > > int qid = hctx->queue_num; > > > > int err; > > > > bool notify = false; > > > > + bool unmap = false; > > > > u32 type; > > > > > > > > BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems); > > > > @@ -237,6 +273,13 @@ static blk_status_t virtio_queue_rq(struct > > > blk_mq_hw_ctx *hctx, > > > > case REQ_OP_FLUSH: > > > > type = VIRTIO_BLK_T_FLUSH; > > > > break; > > > > + case REQ_OP_DISCARD: > > > > + type = VIRTIO_BLK_T_DISCARD; > > > > + break; > > > > + case REQ_OP_WRITE_ZEROES: > > > > + type = VIRTIO_BLK_T_WRITE_ZEROES; > > > > + unmap = !(req->cmd_flags & REQ_NOUNMAP); > > > > + break; > > > > case REQ_OP_SCSI_IN: > > > > case REQ_OP_SCSI_OUT: > > > > type = VIRTIO_BLK_T_SCSI_CMD; > > > > @@ -256,6 +299,12 @@ static blk_status_t virtio_queue_rq(struct > > > blk_mq_hw_ctx *hctx, > > > > > > > > blk_mq_start_request(req); > > > > > > > > + if (type == VIRTIO_BLK_T_DISCARD || type => > > VIRTIO_BLK_T_WRITE_ZEROES) { > > > > + err = virtblk_setup_discard_write_zeroes(req, unmap); > > > > + if (err) > > > > + return BLK_STS_RESOURCE; > > > > + } > > > > + > > > > num = blk_rq_map_sg(hctx->queue, req, vbr->sg); > > > > if (num) { > > > > if (rq_data_dir(req) == WRITE) > > > > > > ...since we still do blk_rq_map_sg() here and num should be != 0. > > No, while here, we should keep the original logic for READ/WRITE commands. > > My question is: why does the changelog say "don't set T_OUT" but the > code *will* set it because blk_rq_map_sg() returns != 0 and > rq_data_dir(req) == WRITE?Since the last bit of DISCARD/WRITE ZEROES commands are already 1, so I said we don't need to set T_OUT bit to DISCARD/WRITE ZEROES commands again. But the original logic for WRITE, T_OUT is still needed, so just keep the original code here is fine.> > Maybe I'm misreading the code, but it looks to me like this patch > does the opposite of what the changelog says. > > Stefan