Konrad Rzeszutek Wilk
2011-Oct-10 15:28 UTC
[Xen-devel] [PATCH] Xen block fixes, secure discard, and barrier support for 3.2 (v1)
[PATCH 1/3] xen/blkback: Support ''feature-barrier'' aka old-style The BARRIER support is done by: 1) setting the drain value to one and waiting for the ''complete waitqueue'' to be complete. We also double-check the kthread status in case the interface is being disconnected. Waiting means stalling processing of the queue. 2) By latching on the refcnt value which is incremented every time a ''struct request'' is used for a particular interface we can figure out outstanding I/Os. The refcnt is decremented when all the bio''s that were generated for the ''struct request'' have been processed. When we detect that all outstanding ''struct request'' and their bio''s have been completed we notify the ''complete waitqueue'' which halted processing of the ring. 3) When the ''complete waitqueue'' is signaled, it submits a WRITE_FLUSH operation. That should take care of emulating the drain behavior properly. When I ran with SLES11 guests using the ext3/reiserfs using the ''fio tiobench-example;fio fsx'' they ran fine. Also killing the guest during runtime and then restarting showed no corruption. If there are some better tests to check for proper operation of this - please advise. [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when Obvious bugfix in the ''feature-discard'' patchset. [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure We also provide the REQ_SECURE support to allow the user to now issue: ''blkdev_issue_discard(.., secure)'' flag. [NOTE: My SSD died on my - so I can''t test this yet - going through the process of RMA-ing it, so please consider the 3/3 patch more as an RFC as it has not been tested]. Please review. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-10 15:28 UTC
[Xen-devel] [PATCH 1/3] xen/blkback: Support ''feature-barrier'' aka old-style BARRIER requests.
We emulate the barrier requests by draining the outstanding bio''s and then sending the WRITE_FLUSH command. To drain the I/Os we use the refcnt that is used during disconnect to wait for all the I/Os before disconnecting from the frontend. We latch on its value and if it reaches either the threshold for disconnect or when there are no more outstanding I/Os, then we have drained all I/Os. Suggested-by: Christopher Hellwig <hch@infradead.org> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/block/xen-blkback/blkback.c | 37 +++++++++++++++++++++++++++++++++- drivers/block/xen-blkback/common.h | 5 ++++ drivers/block/xen-blkback/xenbus.c | 18 +++++++++++++++++ 3 files changed, 58 insertions(+), 2 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index e0dab61..184b133 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -452,6 +452,23 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) make_response(blkif, req->id, req->operation, status); } +static void xen_blk_drain_io(struct xen_blkif *blkif) +{ + atomic_set(&blkif->drain, 1); + do { + wait_for_completion_interruptible_timeout( + &blkif->drain_complete, HZ); + + if (!atomic_read(&blkif->drain)) + break; + /* The initial value is one, and one refcnt taken at the + * start of the xen_blkif_schedule thread. */ + if (atomic_read(&blkif->refcnt) <= 2) + break; + } while (!kthread_should_stop()); + atomic_set(&blkif->drain, 0); +} + /* * Completion callback on the bio''s. Called as bh->b_end_io() */ @@ -464,6 +481,11 @@ static void __end_block_io_op(struct pending_req *pending_req, int error) pr_debug(DRV_PFX "flush diskcache op failed, not supported\n"); xen_blkbk_flush_diskcache(XBT_NIL, pending_req->blkif->be, 0); pending_req->status = BLKIF_RSP_EOPNOTSUPP; + } else if ((pending_req->operation == BLKIF_OP_WRITE_BARRIER) && + (error == -EOPNOTSUPP)) { + pr_debug(DRV_PFX "write barrier op failed, not supported\n"); + xen_blkbk_barrier(XBT_NIL, pending_req->blkif->be, 0); + pending_req->status = BLKIF_RSP_EOPNOTSUPP; } else if (error) { pr_debug(DRV_PFX "Buffer not up-to-date at end of operation," " error=%d\n", error); @@ -481,6 +503,10 @@ static void __end_block_io_op(struct pending_req *pending_req, int error) pending_req->operation, pending_req->status); xen_blkif_put(pending_req->blkif); free_req(pending_req); + if (atomic_read(&pending_req->blkif->refcnt) <= 2) { + if (atomic_read(&pending_req->blkif->drain)) + complete(&pending_req->blkif->drain_complete); + } } } @@ -574,7 +600,6 @@ do_block_io_op(struct xen_blkif *blkif) return more_to_do; } - /* * Transmutation of the ''struct blkif_request'' to a proper ''struct bio'' * and call the ''submit_bio'' to pass it to the underlying storage. @@ -591,6 +616,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, int i, nbio = 0; int operation; struct blk_plug plug; + bool drain = false; switch (req->operation) { case BLKIF_OP_READ: @@ -601,6 +627,8 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, blkif->st_wr_req++; operation = WRITE_ODIRECT; break; + case BLKIF_OP_WRITE_BARRIER: + drain = true; case BLKIF_OP_FLUSH_DISKCACHE: blkif->st_f_req++; operation = WRITE_FLUSH; @@ -609,7 +637,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, blkif->st_ds_req++; operation = REQ_DISCARD; break; - case BLKIF_OP_WRITE_BARRIER: default: operation = 0; /* make gcc happy */ goto fail_response; @@ -668,6 +695,12 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, } } + /* Wait on all outstanding I/O''s and once that has been completed + * issue the WRITE_FLUSH. + */ + if (drain) + xen_blk_drain_io(pending_req->blkif); + /* * If we have failed at this point, we need to undo the M2P override, * set gnttab_set_unmap_op on all of the grant references and perform diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index 1b1bc44..e638457 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -181,6 +181,9 @@ struct xen_blkif { atomic_t refcnt; wait_queue_head_t wq; + /* for barrier (drain) requests */ + struct completion drain_complete; + atomic_t drain; /* One thread per one blkif. */ struct task_struct *xenblkd; unsigned int waiting_reqs; @@ -229,6 +232,8 @@ int xen_blkif_schedule(void *arg); int xen_blkbk_flush_diskcache(struct xenbus_transaction xbt, struct backend_info *be, int state); +int xen_blkbk_barrier(struct xenbus_transaction xbt, + struct backend_info *be, int state); struct xenbus_device *xen_blkbk_xenbus(struct backend_info *be); static inline void blkif_get_x86_32_req(struct blkif_request *dst, diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index 1c44b32..a6d4303 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -114,6 +114,8 @@ static struct xen_blkif *xen_blkif_alloc(domid_t domid) spin_lock_init(&blkif->blk_ring_lock); atomic_set(&blkif->refcnt, 1); init_waitqueue_head(&blkif->wq); + init_completion(&blkif->drain_complete); + atomic_set(&blkif->drain, 0); blkif->st_print = jiffies; init_waitqueue_head(&blkif->waiting_to_free); @@ -474,6 +476,19 @@ kfree: out: return err; } +int xen_blkbk_barrier(struct xenbus_transaction xbt, + struct backend_info *be, int state) +{ + struct xenbus_device *dev = be->dev; + int err; + + err = xenbus_printf(xbt, dev->nodename, "feature-barrier", + "%d", state); + if (err) + xenbus_dev_fatal(dev, err, "writing feature-barrier"); + + return err; +} /* * Entry point to this code when a new device is created. Allocate the basic @@ -708,6 +723,9 @@ again: err = xen_blkbk_discard(xbt, be); + /* If we can''t advertise it is OK. */ + err = xen_blkbk_barrier(xbt, be, be->blkif->vbd.flush_support); + err = xenbus_printf(xbt, dev->nodename, "sectors", "%llu", (unsigned long long)vbd_sz(&be->blkif->vbd)); if (err) { -- 1.7.5.4 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-10 15:28 UTC
[Xen-devel] [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
The ''operation'' parameters are the ones provided to the bio layer while the req->operation are the ones passed in between the backend and frontend. We used the wrong ''operation'' value to squash the call to map pages when processing the discard operation resulting in mapping the pages unnecessarily. CC: Li Dongyang <lidongyang@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/block/xen-blkback/blkback.c | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 184b133..3da9a40 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, * the hypercall to unmap the grants - that is all done in * xen_blkbk_unmap. */ - if (operation != BLKIF_OP_DISCARD && + if (operation != REQ_DISCARD && xen_blkbk_map(req, pending_req, seg)) goto fail_flush; -- 1.7.5.4 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-10 15:28 UTC
[Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
Part of the blkdev_issue_discard(xx) operation is that it can also issue a secure discard operation that will permanantly remove the sectors in question. We advertise that we can support that via the ''discard-secure'' attribute and on the request, if the ''secure'' bit is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. CC: Li Dongyang <lidongyang@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/block/xen-blkback/blkback.c | 4 +++- drivers/block/xen-blkback/common.h | 5 +++++ drivers/block/xen-blkback/xenbus.c | 12 ++++++++++++ drivers/block/xen-blkfront.c | 19 +++++++++++++++++-- include/xen/interface/io/blkif.h | 5 +++++ 5 files changed, 42 insertions(+), 3 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 3da9a40..0bd7143 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -427,7 +427,9 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) err = blkdev_issue_discard(bdev, req->u.discard.sector_number, req->u.discard.nr_sectors, - GFP_KERNEL, 0); + GFP_KERNEL, + blkif->vbd.discard_secure ? + req->u.discard.secure : 0); else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { /* punch a hole in the backing file */ struct loop_device *lo = bdev->bd_disk->private_data; diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index e638457..ed64ba8 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -72,6 +72,7 @@ struct blkif_x86_32_request_rw { struct blkif_x86_32_request_discard { blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ uint64_t nr_sectors; + uint8_t secure:1; }; struct blkif_x86_32_request { @@ -101,6 +102,7 @@ struct blkif_x86_64_request_rw { struct blkif_x86_64_request_discard { blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ uint64_t nr_sectors; + uint8_t secure:1; }; struct blkif_x86_64_request { @@ -157,6 +159,7 @@ struct xen_vbd { /* Cached size parameter. */ sector_t size; bool flush_support; + bool discard_secure; }; struct backend_info; @@ -259,6 +262,7 @@ static inline void blkif_get_x86_32_req(struct blkif_request *dst, case BLKIF_OP_DISCARD: dst->u.discard.sector_number = src->u.discard.sector_number; dst->u.discard.nr_sectors = src->u.discard.nr_sectors; + dst->u.discard.secure = src->u.discard.secure; break; default: break; @@ -288,6 +292,7 @@ static inline void blkif_get_x86_64_req(struct blkif_request *dst, case BLKIF_OP_DISCARD: dst->u.discard.sector_number = src->u.discard.sector_number; dst->u.discard.nr_sectors = src->u.discard.nr_sectors; + dst->u.discard.secure = src->u.discard.secure; break; default: break; diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index a6d4303..0c0ce39 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -378,6 +378,9 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle, if (q && q->flush_flags) vbd->flush_support = true; + if (q && blk_queue_secdiscard(q)) + vbd->discard_secure = true; + DPRINTK("Successful creation of handle=%04x (dom=%u)\n", handle, blkif->domid); return 0; @@ -460,6 +463,15 @@ int xen_blkbk_discard(struct xenbus_transaction xbt, struct backend_info *be) state = 1; blkif->blk_backend_type = BLKIF_BACKEND_PHY; } + /* Optional. */ + err = xenbus_printf(xbt, dev->nodename, + "discard-secure", "%d", + blkif->vbd.discard_secure); + if (err) { + xenbus_dev_fatal(dev, err, + "writting discard-secure"); + goto kfree; + } } } else { err = PTR_ERR(type); diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index e9d301c..bc39b5e 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -98,7 +98,8 @@ struct blkfront_info unsigned long shadow_free; unsigned int feature_flush; unsigned int flush_op; - unsigned int feature_discard; + unsigned int feature_discard:1; + unsigned int feature_secdiscard:1; unsigned int discard_granularity; unsigned int discard_alignment; int is_ready; @@ -305,11 +306,13 @@ static int blkif_queue_request(struct request *req) ring_req->operation = info->flush_op; } - if (unlikely(req->cmd_flags & REQ_DISCARD)) { + if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE))) { /* id, sector_number and handle are set above. */ ring_req->operation = BLKIF_OP_DISCARD; ring_req->nr_segments = 0; ring_req->u.discard.nr_sectors = blk_rq_sectors(req); + if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard) + ring_req->u.discard.secure = 1; } else { ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); @@ -424,6 +427,8 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) blk_queue_max_discard_sectors(rq, get_capacity(gd)); rq->limits.discard_granularity = info->discard_granularity; rq->limits.discard_alignment = info->discard_alignment; + if (info->feature_secdiscard) + queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, rq); } /* Hard sector size and max sectors impersonate the equiv. hardware. */ @@ -749,7 +754,9 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) info->gd->disk_name); error = -EOPNOTSUPP; info->feature_discard = 0; + info->feature_secdiscard = 0; queue_flag_clear(QUEUE_FLAG_DISCARD, rq); + queue_flag_clear(QUEUE_FLAG_SECDISCARD, rq); } __blk_end_request_all(req, error); break; @@ -1135,11 +1142,13 @@ static void blkfront_setup_discard(struct blkfront_info *info) char *type; unsigned int discard_granularity; unsigned int discard_alignment; + unsigned int discard_secure; type = xenbus_read(XBT_NIL, info->xbdev->otherend, "type", NULL); if (IS_ERR(type)) return; + info->feature_secdiscard = 0; if (strncmp(type, "phy", 3) == 0) { err = xenbus_gather(XBT_NIL, info->xbdev->otherend, "discard-granularity", "%u", &discard_granularity, @@ -1150,6 +1159,12 @@ static void blkfront_setup_discard(struct blkfront_info *info) info->discard_granularity = discard_granularity; info->discard_alignment = discard_alignment; } + err = xenbus_gather(XBT_NIL, info->xbdev->otherend, + "discard-secure", "%d", &discard_secure, + NULL); + if (!err) + info->feature_secdiscard = discard_secure; + } else if (strncmp(type, "file", 4) == 0) info->feature_discard = 1; diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h index 9324488..04f60b0 100644 --- a/include/xen/interface/io/blkif.h +++ b/include/xen/interface/io/blkif.h @@ -84,6 +84,10 @@ typedef uint64_t blkif_sector_t; * e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc * http://www.seagate.com/staticfiles/support/disc/manuals/ * Interface%20manuals/100293068c.pdf + * We also provide three extra XenBus options to the discard operation: + * ''discard-granularity'' - Max amount of sectors that can be discarded. + * ''discard-alignment'' - 4K, 128K, etc aligment on sectors to erased. + * ''discard-secure'' - whether the discard can also securely erase data. */ #define BLKIF_OP_DISCARD 5 @@ -107,6 +111,7 @@ struct blkif_request_rw { struct blkif_request_discard { blkif_sector_t sector_number; uint64_t nr_sectors; + uint8_t secure:1; }; struct blkif_request { -- 1.7.5.4 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-10 16:13 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Mon, 2011-10-10 at 16:28 +0100, Konrad Rzeszutek Wilk wrote:> diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h > index 9324488..04f60b0 100644 > --- a/include/xen/interface/io/blkif.h > +++ b/include/xen/interface/io/blkif.h > @@ -84,6 +84,10 @@ typedef uint64_t blkif_sector_t; > * e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc > * http://www.seagate.com/staticfiles/support/disc/manuals/ > * Interface%20manuals/100293068c.pdf > + * We also provide three extra XenBus options to the discard operation: > + * ''discard-granularity'' - Max amount of sectors that can be discarded. > + * ''discard-alignment'' - 4K, 128K, etc aligment on sectors to erased. > + * ''discard-secure'' - whether the discard can also securely erase data. > */ > #define BLKIF_OP_DISCARD 5 > > @@ -107,6 +111,7 @@ struct blkif_request_rw { > struct blkif_request_discard { > blkif_sector_t sector_number; > uint64_t nr_sectors; > + uint8_t secure:1; > }; > > struct blkif_request {Which tree/branch is this? I don''t see BLKIF_OP_DISCARD in mainline or your linux-next branch. Since this changes an inter-guest ABI we may need to consider backwards compatibility (I suspect this interface is new enough that no one has actually implemented it in anger and we can get away with changing it). In any case it should also be posted against the canonical inter-guest interface definition in the xen tree for review with that in mind. I think an explicit flag variable is likely to be less trouble WRT maintaining compatibility in the future than a bit-field. Also I think you may as well align the struct size to something larger than a byte, either 4 or 8 bytes would make sense. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-10 16:42 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Mon, Oct 10, 2011 at 05:13:07PM +0100, Ian Campbell wrote:> On Mon, 2011-10-10 at 16:28 +0100, Konrad Rzeszutek Wilk wrote: > > diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h > > index 9324488..04f60b0 100644 > > --- a/include/xen/interface/io/blkif.h > > +++ b/include/xen/interface/io/blkif.h > > @@ -84,6 +84,10 @@ typedef uint64_t blkif_sector_t; > > * e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc > > * http://www.seagate.com/staticfiles/support/disc/manuals/ > > * Interface%20manuals/100293068c.pdf > > + * We also provide three extra XenBus options to the discard operation: > > + * ''discard-granularity'' - Max amount of sectors that can be discarded. > > + * ''discard-alignment'' - 4K, 128K, etc aligment on sectors to erased. > > + * ''discard-secure'' - whether the discard can also securely erase data. > > */ > > #define BLKIF_OP_DISCARD 5 > > > > @@ -107,6 +111,7 @@ struct blkif_request_rw { > > struct blkif_request_discard { > > blkif_sector_t sector_number; > > uint64_t nr_sectors; > > + uint8_t secure:1; > > }; > > > > struct blkif_request { > > Which tree/branch is this? I don''t see BLKIF_OP_DISCARD in mainline or > your linux-next branch.Uh, that is not good. I must have forgotten to merge it in - that is the #stable/for-jens-3.2 branch. Let me do that right now.> > Since this changes an inter-guest ABI we may need to consider backwards > compatibility (I suspect this interface is new enough that no one has > actually implemented it in anger and we can get away with changing it).<nods>> In any case it should also be posted against the canonical inter-guest > interface definition in the xen tree for review with that in mind.Yes! But I was thinking to first let this one rattle a bit and see what folks thought about it before submitting the xen-devel.> > I think an explicit flag variable is likely to be less trouble WRT > maintaining compatibility in the future than a bit-field. Also I think > you may as well align the struct size to something larger than a byte, > either 4 or 8 bytes would make sense.Ok. Will change it and make it an uint64_t secure_flag variable. Later on if there are any "other" flags we can chop it down. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-10 17:53 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
> > I think an explicit flag variable is likely to be less trouble WRT > > maintaining compatibility in the future than a bit-field. Also I think > > you may as well align the struct size to something larger than a byte, > > either 4 or 8 bytes would make sense. > > Ok. Will change it and make it an uint64_t secure_flag > variable. Later on if there are any "other" flags we can chop it down.New patch (it also looks like the patch I sent to xen-devel to update the blkif.h was never merged) - so let me send right now. BTW, it seems that the #pragma pack(push, 4) is used in the drivers/block/xen-blkback/common.h to compact the structures already so I don''t think we need the aligment.>From f74cc58b77c2a1c1a93c0324a50b9d1b9a0cb159 Mon Sep 17 00:00:00 2001From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Mon, 10 Oct 2011 10:58:40 -0400 Subject: [PATCH] xen/blk[front|back]: Enhance discard support with secure erasing support. Part of the blkdev_issue_discard(xx) operation is that it can also issue a secure discard operation that will permanantly remove the sectors in question. We advertise that we can support that via the ''discard-secure'' attribute and on the request, if the ''secure'' bit is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. CC: Li Dongyang <lidongyang@novell.com> [v1: Used ''flag'' instead of ''secure:1'' bit] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> --- drivers/block/xen-blkback/blkback.c | 9 ++++++--- drivers/block/xen-blkback/common.h | 5 +++++ drivers/block/xen-blkback/xenbus.c | 12 ++++++++++++ drivers/block/xen-blkfront.c | 19 +++++++++++++++++-- include/xen/interface/io/blkif.h | 6 ++++++ 5 files changed, 46 insertions(+), 5 deletions(-) diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index ca23dff..5ccb648 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -422,13 +422,16 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) int status = BLKIF_RSP_OKAY; struct block_device *bdev = blkif->vbd.bdev; - if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) + if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) { + unsigned long secure = (blkif->vbd.discard_secure && + (req->u.discard.flag & BLKIF_OP_DISCARD_FLAG_SECURE)) ? + BLKDEV_DISCARD_SECURE : 0; /* just forward the discard request */ err = blkdev_issue_discard(bdev, req->u.discard.sector_number, req->u.discard.nr_sectors, - GFP_KERNEL, 0); - else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { + GFP_KERNEL, secure); + } else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { /* punch a hole in the backing file */ struct loop_device *lo = bdev->bd_disk->private_data; struct file *file = lo->lo_backing_file; diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index e638457..43b72a7 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -72,6 +72,7 @@ struct blkif_x86_32_request_rw { struct blkif_x86_32_request_discard { blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ uint64_t nr_sectors; + uint32_t flag; }; struct blkif_x86_32_request { @@ -101,6 +102,7 @@ struct blkif_x86_64_request_rw { struct blkif_x86_64_request_discard { blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ uint64_t nr_sectors; + uint32_t flag; }; struct blkif_x86_64_request { @@ -157,6 +159,7 @@ struct xen_vbd { /* Cached size parameter. */ sector_t size; bool flush_support; + bool discard_secure; }; struct backend_info; @@ -259,6 +262,7 @@ static inline void blkif_get_x86_32_req(struct blkif_request *dst, case BLKIF_OP_DISCARD: dst->u.discard.sector_number = src->u.discard.sector_number; dst->u.discard.nr_sectors = src->u.discard.nr_sectors; + dst->u.discard.flag = src->u.discard.flag; break; default: break; @@ -288,6 +292,7 @@ static inline void blkif_get_x86_64_req(struct blkif_request *dst, case BLKIF_OP_DISCARD: dst->u.discard.sector_number = src->u.discard.sector_number; dst->u.discard.nr_sectors = src->u.discard.nr_sectors; + dst->u.discard.flag = src->u.discard.flag; break; default: break; diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index a6d4303..0c0ce39 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -378,6 +378,9 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle, if (q && q->flush_flags) vbd->flush_support = true; + if (q && blk_queue_secdiscard(q)) + vbd->discard_secure = true; + DPRINTK("Successful creation of handle=%04x (dom=%u)\n", handle, blkif->domid); return 0; @@ -460,6 +463,15 @@ int xen_blkbk_discard(struct xenbus_transaction xbt, struct backend_info *be) state = 1; blkif->blk_backend_type = BLKIF_BACKEND_PHY; } + /* Optional. */ + err = xenbus_printf(xbt, dev->nodename, + "discard-secure", "%d", + blkif->vbd.discard_secure); + if (err) { + xenbus_dev_fatal(dev, err, + "writting discard-secure"); + goto kfree; + } } } else { err = PTR_ERR(type); diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 7b2ec59..807b7b6 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -98,7 +98,8 @@ struct blkfront_info unsigned long shadow_free; unsigned int feature_flush; unsigned int flush_op; - unsigned int feature_discard; + unsigned int feature_discard:1; + unsigned int feature_secdiscard:1; unsigned int discard_granularity; unsigned int discard_alignment; int is_ready; @@ -305,11 +306,13 @@ static int blkif_queue_request(struct request *req) ring_req->operation = info->flush_op; } - if (unlikely(req->cmd_flags & REQ_DISCARD)) { + if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE))) { /* id, sector_number and handle are set above. */ ring_req->operation = BLKIF_OP_DISCARD; ring_req->nr_segments = 0; ring_req->u.discard.nr_sectors = blk_rq_sectors(req); + if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard) + ring_req->u.discard.flag = BLKIF_OP_DISCARD_FLAG_SECURE; } else { ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); @@ -424,6 +427,8 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) blk_queue_max_discard_sectors(rq, get_capacity(gd)); rq->limits.discard_granularity = info->discard_granularity; rq->limits.discard_alignment = info->discard_alignment; + if (info->feature_secdiscard) + queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, rq); } /* Hard sector size and max sectors impersonate the equiv. hardware. */ @@ -749,7 +754,9 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) info->gd->disk_name); error = -EOPNOTSUPP; info->feature_discard = 0; + info->feature_secdiscard = 0; queue_flag_clear(QUEUE_FLAG_DISCARD, rq); + queue_flag_clear(QUEUE_FLAG_SECDISCARD, rq); } __blk_end_request_all(req, error); break; @@ -1135,11 +1142,13 @@ static void blkfront_setup_discard(struct blkfront_info *info) char *type; unsigned int discard_granularity; unsigned int discard_alignment; + unsigned int discard_secure; type = xenbus_read(XBT_NIL, info->xbdev->otherend, "type", NULL); if (IS_ERR(type)) return; + info->feature_secdiscard = 0; if (strncmp(type, "phy", 3) == 0) { err = xenbus_gather(XBT_NIL, info->xbdev->otherend, "discard-granularity", "%u", &discard_granularity, @@ -1150,6 +1159,12 @@ static void blkfront_setup_discard(struct blkfront_info *info) info->discard_granularity = discard_granularity; info->discard_alignment = discard_alignment; } + err = xenbus_gather(XBT_NIL, info->xbdev->otherend, + "discard-secure", "%d", &discard_secure, + NULL); + if (!err) + info->feature_secdiscard = discard_secure; + } else if (strncmp(type, "file", 4) == 0) info->feature_discard = 1; diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h index 9324488..13d040e 100644 --- a/include/xen/interface/io/blkif.h +++ b/include/xen/interface/io/blkif.h @@ -84,6 +84,10 @@ typedef uint64_t blkif_sector_t; * e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc * http://www.seagate.com/staticfiles/support/disc/manuals/ * Interface%20manuals/100293068c.pdf + * We also provide three extra XenBus options to the discard operation: + * ''discard-granularity'' - Max amount of sectors that can be discarded. + * ''discard-alignment'' - 4K, 128K, etc aligment on sectors to erased. + * ''discard-secure'' - whether the discard can also securely erase data. */ #define BLKIF_OP_DISCARD 5 @@ -107,6 +111,8 @@ struct blkif_request_rw { struct blkif_request_discard { blkif_sector_t sector_number; uint64_t nr_sectors; +#define BLKIF_OP_DISCARD_FLAG_SECURE (1<<1) /* ignored if discard-secure=0 */ + uint32_t flag; }; struct blkif_request { -- 1.7.5.4 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-10 19:19 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
Where is your tree at the moment? On Mon, 2011-10-10 at 18:53 +0100, Konrad Rzeszutek Wilk wrote:> > > I think an explicit flag variable is likely to be less trouble WRT > > > maintaining compatibility in the future than a bit-field. Also I think > > > you may as well align the struct size to something larger than a byte, > > > either 4 or 8 bytes would make sense. > > > > Ok. Will change it and make it an uint64_t secure_flag > > variable. Later on if there are any "other" flags we can chop it down. > > New patch (it also looks like the patch I sent to xen-devel to update > the blkif.h was never merged) - so let me send right now. > > BTW, it seems that the #pragma pack(push, 4) is used in the > drivers/block/xen-blkback/common.h to compact the structures already so > I don''t think we need the aligment.Only around the x86_32 ABI definition, working around the fact that the original 32 bit interface was not 64 bit clean :-(. I think to ensure that the new DISCARD structure is sensibly aligned you probably do want it to be a multiple of 64 bits in size (32 of flags and 32 of pad would do it).> diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h > index 9324488..13d040e 100644 > --- a/include/xen/interface/io/blkif.h > +++ b/include/xen/interface/io/blkif.h > @@ -84,6 +84,10 @@ typedef uint64_t blkif_sector_t; > * e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc > * http://www.seagate.com/staticfiles/support/disc/manuals/ > * Interface%20manuals/100293068c.pdf > + * We also provide three extra XenBus options to the discard operation:Who is "We" here? The frontend, backend or both? Do they both need to agree on anything?> + * ''discard-granularity'' - Max amount of sectors that can be discarded.... in a single request?> + * ''discard-alignment'' - 4K, 128K, etc aligment on sectors to erased.alignment What size are the sectors which "discard-granularity" is measured in? Is it "discard-alignment"-byte sectors or in base 512-byte sectors?> + * ''discard-secure'' - whether the discard can also securely erase data. > */ > #define BLKIF_OP_DISCARD 5 > > @@ -107,6 +111,8 @@ struct blkif_request_rw { > struct blkif_request_discard { > blkif_sector_t sector_number; > uint64_t nr_sectors; > +#define BLKIF_OP_DISCARD_FLAG_SECURE (1<<1) /* ignored if discard-secure=0 */"1<<0" is unused.> + uint32_t flag; > }; > > struct blkif_request {_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-10 19:20 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Mon, 2011-10-10 at 17:42 +0100, Konrad Rzeszutek Wilk wrote:> On Mon, Oct 10, 2011 at 05:13:07PM +0100, Ian Campbell wrote:> > In any case it should also be posted against the canonical inter-guest > > interface definition in the xen tree for review with that in mind. > > Yes! But I was thinking to first let this one rattle a bit and see what > folks thought about it before submitting the xen-devel.It''s a good idea to get ABI changes out for review before the implementation rattles around so much that changing it becomes hard. Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-10 19:57 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Mon, Oct 10, 2011 at 08:20:02PM +0100, Ian Campbell wrote:> On Mon, 2011-10-10 at 17:42 +0100, Konrad Rzeszutek Wilk wrote: > > On Mon, Oct 10, 2011 at 05:13:07PM +0100, Ian Campbell wrote: > > > > In any case it should also be posted against the canonical inter-guest > > > interface definition in the xen tree for review with that in mind. > > > > Yes! But I was thinking to first let this one rattle a bit and see what > > folks thought about it before submitting the xen-devel. > > It''s a good idea to get ABI changes out for review before the > implementation rattles around so much that changing it becomes hard.OK, lets drop this until we get that straigthen out. I''ve posted http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00642.html the changes to Xen ABI. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Oct-11 07:25 UTC
[Xen-devel] Re: [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > The ''operation'' parameters are the ones provided to the bio layer while > the req->operation are the ones passed in between the backend and > frontend. We used the wrong ''operation'' value to squash the > call to map pages when processing the discard operation resulting > in mapping the pages unnecessarily. > > CC: Li Dongyang <lidongyang@novell.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > drivers/block/xen-blkback/blkback.c | 2 +- > 1 files changed, 1 insertions(+), 1 deletions(-) > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index 184b133..3da9a40 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > * the hypercall to unmap the grants - that is all done in > * xen_blkbk_unmap. > */ > - if (operation != BLKIF_OP_DISCARD && > + if (operation != REQ_DISCARD &&Why is that check necessary in the first place? xen_blkbk_map() doesn''t do any harm when req->nr_segments is zero (as could also be the case on WRITE_FLUSH ones). Jan> xen_blkbk_map(req, pending_req, seg)) > goto fail_flush; >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Li Dongyang
2011-Oct-11 07:33 UTC
[Xen-devel] Re: [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
On Tue, Oct 11, 2011 at 3:25 PM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: >> The ''operation'' parameters are the ones provided to the bio layer while >> the req->operation are the ones passed in between the backend and >> frontend. We used the wrong ''operation'' value to squash the >> call to map pages when processing the discard operation resulting >> in mapping the pages unnecessarily. >> >> CC: Li Dongyang <lidongyang@novell.com> >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> --- >> drivers/block/xen-blkback/blkback.c | 2 +- >> 1 files changed, 1 insertions(+), 1 deletions(-) >> >> diff --git a/drivers/block/xen-blkback/blkback.c >> b/drivers/block/xen-blkback/blkback.c >> index 184b133..3da9a40 100644 >> --- a/drivers/block/xen-blkback/blkback.c >> +++ b/drivers/block/xen-blkback/blkback.c >> @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, >> * the hypercall to unmap the grants - that is all done in >> * xen_blkbk_unmap. >> */ >> - if (operation != BLKIF_OP_DISCARD && >> + if (operation != REQ_DISCARD && > > Why is that check necessary in the first place? xen_blkbk_map() doesn''t > do any harm when req->nr_segments is zero (as could also be the case > on WRITE_FLUSH ones). >Ah, you are right, we could remove this check then, Thanks> Jan > >> xen_blkbk_map(req, pending_req, seg)) >> goto fail_flush; >> > > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Oct-11 07:36 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
>>> On 10.10.11 at 21:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > On Mon, Oct 10, 2011 at 08:20:02PM +0100, Ian Campbell wrote: >> On Mon, 2011-10-10 at 17:42 +0100, Konrad Rzeszutek Wilk wrote: >> > On Mon, Oct 10, 2011 at 05:13:07PM +0100, Ian Campbell wrote: >> >> > > In any case it should also be posted against the canonical inter-guest >> > > interface definition in the xen tree for review with that in mind. >> > >> > Yes! But I was thinking to first let this one rattle a bit and see what >> > folks thought about it before submitting the xen-devel. >> >> It''s a good idea to get ABI changes out for review before the >> implementation rattles around so much that changing it becomes hard. > > OK, lets drop this until we get that straigthen out. I''ve posted > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00642.html the > changes to > Xen ABI.Yeah, but that didn''t get adjusted after IanC''s comments (structure alignment, BLKIF_OP_DISCARD_FLAG_SECURE value). Further I wonder why you don''t use the "reserved" field instead of extending the structure at the end. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-11 08:09 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Tue, 2011-10-11 at 08:36 +0100, Jan Beulich wrote:> >>> On 10.10.11 at 21:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > On Mon, Oct 10, 2011 at 08:20:02PM +0100, Ian Campbell wrote: > >> On Mon, 2011-10-10 at 17:42 +0100, Konrad Rzeszutek Wilk wrote: > >> > On Mon, Oct 10, 2011 at 05:13:07PM +0100, Ian Campbell wrote: > >> > >> > > In any case it should also be posted against the canonical inter-guest > >> > > interface definition in the xen tree for review with that in mind. > >> > > >> > Yes! But I was thinking to first let this one rattle a bit and see what > >> > folks thought about it before submitting the xen-devel. > >> > >> It''s a good idea to get ABI changes out for review before the > >> implementation rattles around so much that changing it becomes hard. > > > > OK, lets drop this until we get that straigthen out. I''ve posted > > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00642.html the > > changes to > > Xen ABI. > > Yeah, but that didn''t get adjusted after IanC''s comments (structure > alignment, BLKIF_OP_DISCARD_FLAG_SECURE value). > > Further I wonder why you don''t use the "reserved" field instead of > extending the structure at the end.I hadn''t noticed that field, yes using that would be preferable since it retains ABI compatibility.> > Jan >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-11 15:51 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Tue, Oct 11, 2011 at 08:36:33AM +0100, Jan Beulich wrote:> >>> On 10.10.11 at 21:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > On Mon, Oct 10, 2011 at 08:20:02PM +0100, Ian Campbell wrote: > >> On Mon, 2011-10-10 at 17:42 +0100, Konrad Rzeszutek Wilk wrote: > >> > On Mon, Oct 10, 2011 at 05:13:07PM +0100, Ian Campbell wrote: > >> > >> > > In any case it should also be posted against the canonical inter-guest > >> > > interface definition in the xen tree for review with that in mind. > >> > > >> > Yes! But I was thinking to first let this one rattle a bit and see what > >> > folks thought about it before submitting the xen-devel. > >> > >> It''s a good idea to get ABI changes out for review before the > >> implementation rattles around so much that changing it becomes hard. > > > > OK, lets drop this until we get that straigthen out. I''ve posted > > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00642.html the > > changes to > > Xen ABI. > > Yeah, but that didn''t get adjusted after IanC''s comments (structure > alignment, BLKIF_OP_DISCARD_FLAG_SECURE value).My later response to it should include it: http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00652.html> > Further I wonder why you don''t use the "reserved" field instead of > extending the structure at the end.<blinks> I completly missed it. That would definitly work as well. Let me redo it with that in mind. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-11 18:39 UTC
Re: [Xen-devel] Re: [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
On Tue, Oct 11, 2011 at 03:33:11PM +0800, Li Dongyang wrote:> On Tue, Oct 11, 2011 at 3:25 PM, Jan Beulich <JBeulich@suse.com> wrote: > >>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > >> The ''operation'' parameters are the ones provided to the bio layer while > >> the req->operation are the ones passed in between the backend and > >> frontend. We used the wrong ''operation'' value to squash the > >> call to map pages when processing the discard operation resulting > >> in mapping the pages unnecessarily. > >> > >> CC: Li Dongyang <lidongyang@novell.com> > >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > >> --- > >> drivers/block/xen-blkback/blkback.c | 2 +- > >> 1 files changed, 1 insertions(+), 1 deletions(-) > >> > >> diff --git a/drivers/block/xen-blkback/blkback.c > >> b/drivers/block/xen-blkback/blkback.c > >> index 184b133..3da9a40 100644 > >> --- a/drivers/block/xen-blkback/blkback.c > >> +++ b/drivers/block/xen-blkback/blkback.c > >> @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > >> * the hypercall to unmap the grants - that is all done in > >> * xen_blkbk_unmap. > >> */ > >> - if (operation != BLKIF_OP_DISCARD && > >> + if (operation != REQ_DISCARD && > > > > Why is that check necessary in the first place? xen_blkbk_map() doesn''t > > do any harm when req->nr_segments is zero (as could also be the case > > on WRITE_FLUSH ones). > > > Ah, you are right, we could remove this check then, ThanksExcept that req->nr_segments for blkif__request_discard is actually the reserved field. See: struct blkif_request { uint8_t operation; /* BLKIF_OP_??? */ uint8_t nr_segments; /* number of segments */ blkif_vdev_t handle; /* only for read/write requests */ .. snip.. and: struct blkif_request_discard { uint8_t operation; /* BLKIF_OP_DISCARD */ /* ignored if ''discard-secure=0'' */ #define BLKIF_OP_DISCARD_FLAG_SECURE (1<<0) uint8_t flag; /* BLKIF_OP_DISCARD_FLAG_SECURE or 0 */ blkif_vdev_t handle; /* same as for read/write requests */ which will throw off the logic for nr_segments all wrong since for some discard operations it would read the nr_segments as 1. So we do need some logic in there to work with this. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-11 20:50 UTC
Re: [Xen-devel] Re: [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
On Tue, Oct 11, 2011 at 02:39:09PM -0400, Konrad Rzeszutek Wilk wrote:> On Tue, Oct 11, 2011 at 03:33:11PM +0800, Li Dongyang wrote: > > On Tue, Oct 11, 2011 at 3:25 PM, Jan Beulich <JBeulich@suse.com> wrote: > > >>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > >> The ''operation'' parameters are the ones provided to the bio layer while > > >> the req->operation are the ones passed in between the backend and > > >> frontend. We used the wrong ''operation'' value to squash the > > >> call to map pages when processing the discard operation resulting > > >> in mapping the pages unnecessarily. > > >> > > >> CC: Li Dongyang <lidongyang@novell.com> > > >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > >> --- > > >> drivers/block/xen-blkback/blkback.c | 2 +- > > >> 1 files changed, 1 insertions(+), 1 deletions(-) > > >> > > >> diff --git a/drivers/block/xen-blkback/blkback.c > > >> b/drivers/block/xen-blkback/blkback.c > > >> index 184b133..3da9a40 100644 > > >> --- a/drivers/block/xen-blkback/blkback.c > > >> +++ b/drivers/block/xen-blkback/blkback.c > > >> @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > > >> * the hypercall to unmap the grants - that is all done in > > >> * xen_blkbk_unmap. > > >> */ > > >> - if (operation != BLKIF_OP_DISCARD && > > >> + if (operation != REQ_DISCARD && > > > > > > Why is that check necessary in the first place? xen_blkbk_map() doesn''t > > > do any harm when req->nr_segments is zero (as could also be the case > > > on WRITE_FLUSH ones). > > > > > Ah, you are right, we could remove this check then, Thanks > > Except that req->nr_segments for blkif__request_discard is actually > the reserved field. > > See: > > struct blkif_request { > uint8_t operation; /* BLKIF_OP_??? */ > uint8_t nr_segments; /* number of segments */ > blkif_vdev_t handle; /* only for read/write requests */ > .. snip.. > > and: > struct blkif_request_discard { > uint8_t operation; /* BLKIF_OP_DISCARD */ > /* ignored if ''discard-secure=0'' */ > #define BLKIF_OP_DISCARD_FLAG_SECURE (1<<0) > uint8_t flag; /* BLKIF_OP_DISCARD_FLAG_SECURE or 0 */ > blkif_vdev_t handle; /* same as for read/write requests */ > > which will throw off the logic for nr_segments all wrong since for some > discard operations it would read the nr_segments as 1. > > So we do need some logic in there to work with this.So a patch like this (and there is another on top that moves the setting of nseg) should do it. commit 12679b29b2f828454f833e17e9090ed576c63afc Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Mon Oct 10 00:47:49 2011 -0400 xen/blkback: Fix the inhibition to map pages when discarding sector ranges. The ''operation'' parameters are the ones provided to the bio layer while the req->operation are the ones passed in between the backend and frontend. We used the wrong ''operation'' value to squash the call to map pages when processing the discard operation resulting in an hypercall that did nothing. Lets guard against going in the mapping function by checking for the amount of segments. CC: Li Dongyang <lidongyang@novell.com> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index c15c559..94e659d 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -707,8 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, * the hypercall to unmap the grants - that is all done in * xen_blkbk_unmap. */ - if (operation != BLKIF_OP_DISCARD && - xen_blkbk_map(req, pending_req, seg)) + if (nseg && xen_blkbk_map(req, pending_req, seg)) goto fail_flush; /* _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-11 20:57 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
> My later response to it should include it: > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00652.html > > > > > Further I wonder why you don''t use the "reserved" field instead of > > extending the structure at the end. > > <blinks> I completly missed it. That would definitly work as well. > > Let me redo it with that in mind.I''ve posted the Xen hypervisor ABI one that thread above. The implementation of that looks as follow: commit ae33f998d66c5982af533bda25c2b6c4f863789f Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Date: Mon Oct 10 10:58:40 2011 -0400 xen/blk[front|back]: Enhance discard support with secure erasing support. Part of the blkdev_issue_discard(xx) operation is that it can also issue a secure discard operation that will permanantly remove the sectors in question. We advertise that we can support that via the ''discard-secure'' attribute and on the request, if the ''secure'' bit is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. CC: Li Dongyang <lidongyang@novell.com> [v1: Used ''flag'' instead of ''secure:1'' bit] [v2: Use ''reserved ''uint8_t'' as a flag] Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c index 94e659d..4f33c13 100644 --- a/drivers/block/xen-blkback/blkback.c +++ b/drivers/block/xen-blkback/blkback.c @@ -362,7 +362,7 @@ static int xen_blkbk_map(struct blkif_request *req, { struct gnttab_map_grant_ref map[BLKIF_MAX_SEGMENTS_PER_REQUEST]; int i; - int nseg = req->nr_segments; + int nseg = req->u1.nr_segments; int ret = 0; /* @@ -422,13 +422,16 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) int status = BLKIF_RSP_OKAY; struct block_device *bdev = blkif->vbd.bdev; - if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) + if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) { + unsigned long secure = (blkif->vbd.discard_secure && + (req->u1.flag & BLKIF_OP_DISCARD_FLAG_SECURE)) ? + BLKDEV_DISCARD_SECURE : 0; /* just forward the discard request */ err = blkdev_issue_discard(bdev, req->u.discard.sector_number, req->u.discard.nr_sectors, - GFP_KERNEL, 0); - else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { + GFP_KERNEL, secure); + } else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { /* punch a hole in the backing file */ struct loop_device *lo = bdev->bd_disk->private_data; struct file *file = lo->lo_backing_file; @@ -618,6 +621,9 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, struct blk_plug plug; bool drain = false; + /* Check that the number of segments is sane. */ + nseg = req->u1.nr_segments; + switch (req->operation) { case BLKIF_OP_READ: blkif->st_rd_req++; @@ -636,6 +642,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, case BLKIF_OP_DISCARD: blkif->st_ds_req++; operation = REQ_DISCARD; + nseg = 0; /* The nr_segments and flag share the same space. */ break; default: operation = 0; /* make gcc happy */ @@ -643,8 +650,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, break; } - /* Check that the number of segments is sane. */ - nseg = req->nr_segments; if (unlikely(nseg == 0 && operation != WRITE_FLUSH && operation != REQ_DISCARD) || unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) { diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h index e638457..c6a5462 100644 --- a/drivers/block/xen-blkback/common.h +++ b/drivers/block/xen-blkback/common.h @@ -76,7 +76,10 @@ struct blkif_x86_32_request_discard { struct blkif_x86_32_request { uint8_t operation; /* BLKIF_OP_??? */ - uint8_t nr_segments; /* number of segments */ + union { + uint8_t nr_segments; /* number of segments */ + uint8_t flag; /* flag for blkif_x86_32_request_discard*/ + } u1; blkif_vdev_t handle; /* only for read/write requests */ uint64_t id; /* private guest value, echoed in resp */ union { @@ -105,7 +108,10 @@ struct blkif_x86_64_request_discard { struct blkif_x86_64_request { uint8_t operation; /* BLKIF_OP_??? */ - uint8_t nr_segments; /* number of segments */ + union { + uint8_t nr_segments; /* number of segments */ + uint8_t flag; /* for blkif_x86_64_request_discard */ + } u1; blkif_vdev_t handle; /* only for read/write requests */ uint64_t __attribute__((__aligned__(8))) id; union { @@ -157,6 +163,7 @@ struct xen_vbd { /* Cached size parameter. */ sector_t size; bool flush_support; + bool discard_secure; }; struct backend_info; @@ -241,7 +248,7 @@ static inline void blkif_get_x86_32_req(struct blkif_request *dst, { int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST; dst->operation = src->operation; - dst->nr_segments = src->nr_segments; + dst->u1.nr_segments = src->u1.nr_segments; dst->handle = src->handle; dst->id = src->id; switch (src->operation) { @@ -251,14 +258,16 @@ static inline void blkif_get_x86_32_req(struct blkif_request *dst, case BLKIF_OP_FLUSH_DISKCACHE: dst->u.rw.sector_number = src->u.rw.sector_number; barrier(); - if (n > dst->nr_segments) - n = dst->nr_segments; + if (n > dst->u1.nr_segments) + n = dst->u1.nr_segments; for (i = 0; i < n; i++) dst->u.rw.seg[i] = src->u.rw.seg[i]; break; case BLKIF_OP_DISCARD: dst->u.discard.sector_number = src->u.discard.sector_number; dst->u.discard.nr_sectors = src->u.discard.nr_sectors; + /* We should be doing "dst->u1.flag = src->u1.flag;" but the + * copying of u1.nr_segments does it for us already. */ break; default: break; @@ -270,7 +279,7 @@ static inline void blkif_get_x86_64_req(struct blkif_request *dst, { int i, n = BLKIF_MAX_SEGMENTS_PER_REQUEST; dst->operation = src->operation; - dst->nr_segments = src->nr_segments; + dst->u1.nr_segments = src->u1.nr_segments; dst->handle = src->handle; dst->id = src->id; switch (src->operation) { @@ -280,14 +289,16 @@ static inline void blkif_get_x86_64_req(struct blkif_request *dst, case BLKIF_OP_FLUSH_DISKCACHE: dst->u.rw.sector_number = src->u.rw.sector_number; barrier(); - if (n > dst->nr_segments) - n = dst->nr_segments; + if (n > dst->u1.nr_segments) + n = dst->u1.nr_segments; for (i = 0; i < n; i++) dst->u.rw.seg[i] = src->u.rw.seg[i]; break; case BLKIF_OP_DISCARD: dst->u.discard.sector_number = src->u.discard.sector_number; dst->u.discard.nr_sectors = src->u.discard.nr_sectors; + /* We should be doing "dst->u1.flag = src->u1.flag;" but the + * copying of u1.nr_segments does it for us already. */ break; default: break; diff --git a/drivers/block/xen-blkback/xenbus.c b/drivers/block/xen-blkback/xenbus.c index a6d4303..0c0ce39 100644 --- a/drivers/block/xen-blkback/xenbus.c +++ b/drivers/block/xen-blkback/xenbus.c @@ -378,6 +378,9 @@ static int xen_vbd_create(struct xen_blkif *blkif, blkif_vdev_t handle, if (q && q->flush_flags) vbd->flush_support = true; + if (q && blk_queue_secdiscard(q)) + vbd->discard_secure = true; + DPRINTK("Successful creation of handle=%04x (dom=%u)\n", handle, blkif->domid); return 0; @@ -460,6 +463,15 @@ int xen_blkbk_discard(struct xenbus_transaction xbt, struct backend_info *be) state = 1; blkif->blk_backend_type = BLKIF_BACKEND_PHY; } + /* Optional. */ + err = xenbus_printf(xbt, dev->nodename, + "discard-secure", "%d", + blkif->vbd.discard_secure); + if (err) { + xenbus_dev_fatal(dev, err, + "writting discard-secure"); + goto kfree; + } } } else { err = PTR_ERR(type); diff --git a/drivers/block/xen-blkfront.c b/drivers/block/xen-blkfront.c index 7b2ec59..f7e6ca5 100644 --- a/drivers/block/xen-blkfront.c +++ b/drivers/block/xen-blkfront.c @@ -98,7 +98,8 @@ struct blkfront_info unsigned long shadow_free; unsigned int feature_flush; unsigned int flush_op; - unsigned int feature_discard; + unsigned int feature_discard:1; + unsigned int feature_secdiscard:1; unsigned int discard_granularity; unsigned int discard_alignment; int is_ready; @@ -305,16 +306,19 @@ static int blkif_queue_request(struct request *req) ring_req->operation = info->flush_op; } - if (unlikely(req->cmd_flags & REQ_DISCARD)) { + if (unlikely(req->cmd_flags & (REQ_DISCARD | REQ_SECURE))) { /* id, sector_number and handle are set above. */ ring_req->operation = BLKIF_OP_DISCARD; - ring_req->nr_segments = 0; + ring_req->u1.flag = 0; ring_req->u.discard.nr_sectors = blk_rq_sectors(req); + if ((req->cmd_flags & REQ_SECURE) && info->feature_secdiscard) + ring_req->u1.flag = BLKIF_OP_DISCARD_FLAG_SECURE; } else { - ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); - BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); + ring_req->u1.nr_segments = blk_rq_map_sg(req->q, req, info->sg); + BUG_ON(ring_req->u1.nr_segments > + BLKIF_MAX_SEGMENTS_PER_REQUEST); - for_each_sg(info->sg, sg, ring_req->nr_segments, i) { + for_each_sg(info->sg, sg, ring_req->u1.nr_segments, i) { buffer_mfn = pfn_to_mfn(page_to_pfn(sg_page(sg))); fsect = sg->offset >> 9; lsect = fsect + (sg->length >> 9) - 1; @@ -424,6 +428,8 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) blk_queue_max_discard_sectors(rq, get_capacity(gd)); rq->limits.discard_granularity = info->discard_granularity; rq->limits.discard_alignment = info->discard_alignment; + if (info->feature_secdiscard) + queue_flag_set_unlocked(QUEUE_FLAG_SECDISCARD, rq); } /* Hard sector size and max sectors impersonate the equiv. hardware. */ @@ -705,7 +711,7 @@ static void blkif_free(struct blkfront_info *info, int suspend) static void blkif_completion(struct blk_shadow *s) { int i; - for (i = 0; i < s->req.nr_segments; i++) + for (i = 0; i < s->req.u1.nr_segments; i++) gnttab_end_foreign_access(s->req.u.rw.seg[i].gref, 0, 0UL); } @@ -749,7 +755,9 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) info->gd->disk_name); error = -EOPNOTSUPP; info->feature_discard = 0; + info->feature_secdiscard = 0; queue_flag_clear(QUEUE_FLAG_DISCARD, rq); + queue_flag_clear(QUEUE_FLAG_SECDISCARD, rq); } __blk_end_request_all(req, error); break; @@ -763,7 +771,7 @@ static irqreturn_t blkif_interrupt(int irq, void *dev_id) error = -EOPNOTSUPP; } if (unlikely(bret->status == BLKIF_RSP_ERROR && - info->shadow[id].req.nr_segments == 0)) { + info->shadow[id].req.u1.nr_segments == 0)) { printk(KERN_WARNING "blkfront: %s: empty write %s op failed\n", info->flush_op == BLKIF_OP_WRITE_BARRIER ? "barrier" : "flush disk cache", @@ -1038,7 +1046,7 @@ static int blkif_recover(struct blkfront_info *info) memcpy(&info->shadow[req->id], ©[i], sizeof(copy[i])); /* Rewrite any grant references invalidated by susp/resume. */ - for (j = 0; j < req->nr_segments; j++) + for (j = 0; j < req->u1.nr_segments; j++) gnttab_grant_foreign_access_ref( req->u.rw.seg[j].gref, info->xbdev->otherend_id, @@ -1135,11 +1143,13 @@ static void blkfront_setup_discard(struct blkfront_info *info) char *type; unsigned int discard_granularity; unsigned int discard_alignment; + unsigned int discard_secure; type = xenbus_read(XBT_NIL, info->xbdev->otherend, "type", NULL); if (IS_ERR(type)) return; + info->feature_secdiscard = 0; if (strncmp(type, "phy", 3) == 0) { err = xenbus_gather(XBT_NIL, info->xbdev->otherend, "discard-granularity", "%u", &discard_granularity, @@ -1150,6 +1160,12 @@ static void blkfront_setup_discard(struct blkfront_info *info) info->discard_granularity = discard_granularity; info->discard_alignment = discard_alignment; } + err = xenbus_gather(XBT_NIL, info->xbdev->otherend, + "discard-secure", "%d", &discard_secure, + NULL); + if (!err) + info->feature_secdiscard = discard_secure; + } else if (strncmp(type, "file", 4) == 0) info->feature_discard = 1; diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h index 9324488..4a01e71 100644 --- a/include/xen/interface/io/blkif.h +++ b/include/xen/interface/io/blkif.h @@ -84,6 +84,20 @@ typedef uint64_t blkif_sector_t; * e07154r6-Data_Set_Management_Proposal_for_ATA-ACS2.doc * http://www.seagate.com/staticfiles/support/disc/manuals/ * Interface%20manuals/100293068c.pdf + * The backend can optionally provide three extra XenBus attributes to + * further optimize the discard functionality: + * ''discard-aligment'' - Devices that support discard functionality may + * internally allocate space in units that are bigger than the exported + * logical block size. The discard-alignment parameter indicates how many bytes + * the beginning of the partition is offset from the internal allocation unit''s + * natural alignment. + * ''discard-granularity'' - Devices that support discard functionality may + * internally allocate space using units that are bigger than the logical block + * size. The discard-granularity parameter indicates the size of the internal + * allocation unit in bytes if reported by the device. Otherwise the + * discard-granularity will be set to match the device''s physical block size. + * ''discard-secure'' - All copies of the discarded sectors (potentially created + * by garbage collection) must also be erased. */ #define BLKIF_OP_DISCARD 5 @@ -111,7 +125,11 @@ struct blkif_request_discard { struct blkif_request { uint8_t operation; /* BLKIF_OP_??? */ - uint8_t nr_segments; /* number of segments */ + union { + uint8_t nr_segments; /* number of segments */ + uint8_t flag; /* flag for blkif_request_discard */ +#define BLKIF_OP_DISCARD_FLAG_SECURE (1<<0) /* ignored if discard-secure=0 */ + } u1; blkif_vdev_t handle; /* only for read/write requests */ uint64_t id; /* private guest value, echoed in resp */ union { _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Oct-12 10:18 UTC
Re: [Xen-devel] Re: [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
>>> On 11.10.11 at 22:50, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > On Tue, Oct 11, 2011 at 02:39:09PM -0400, Konrad Rzeszutek Wilk wrote: >> On Tue, Oct 11, 2011 at 03:33:11PM +0800, Li Dongyang wrote: >> > On Tue, Oct 11, 2011 at 3:25 PM, Jan Beulich <JBeulich@suse.com> wrote: >> > >>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: >> > >> The ''operation'' parameters are the ones provided to the bio layer while >> > >> the req->operation are the ones passed in between the backend and >> > >> frontend. We used the wrong ''operation'' value to squash the >> > >> call to map pages when processing the discard operation resulting >> > >> in mapping the pages unnecessarily. >> > >> >> > >> CC: Li Dongyang <lidongyang@novell.com> >> > >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> > >> --- >> > >> drivers/block/xen-blkback/blkback.c | 2 +- >> > >> 1 files changed, 1 insertions(+), 1 deletions(-) >> > >> >> > >> diff --git a/drivers/block/xen-blkback/blkback.c >> > >> b/drivers/block/xen-blkback/blkback.c >> > >> index 184b133..3da9a40 100644 >> > >> --- a/drivers/block/xen-blkback/blkback.c >> > >> +++ b/drivers/block/xen-blkback/blkback.c >> > >> @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif > *blkif, >> > >> * the hypercall to unmap the grants - that is all done in >> > >> * xen_blkbk_unmap. >> > >> */ >> > >> - if (operation != BLKIF_OP_DISCARD && >> > >> + if (operation != REQ_DISCARD && >> > > >> > > Why is that check necessary in the first place? xen_blkbk_map() doesn''t >> > > do any harm when req->nr_segments is zero (as could also be the case >> > > on WRITE_FLUSH ones). >> > > >> > Ah, you are right, we could remove this check then, Thanks >> >> Except that req->nr_segments for blkif__request_discard is actually >> the reserved field. >> >> See: >> >> struct blkif_request { >> uint8_t operation; /* BLKIF_OP_??? */ >> uint8_t nr_segments; /* number of segments */ >> blkif_vdev_t handle; /* only for read/write requests */ >> .. snip.. >> >> and: >> struct blkif_request_discard { >> uint8_t operation; /* BLKIF_OP_DISCARD */ >> /* ignored if ''discard-secure=0'' */ >> #define BLKIF_OP_DISCARD_FLAG_SECURE (1<<0) >> uint8_t flag; /* BLKIF_OP_DISCARD_FLAG_SECURE or 0 */ >> blkif_vdev_t handle; /* same as for read/write requests */ >> >> which will throw off the logic for nr_segments all wrong since for some >> discard operations it would read the nr_segments as 1. >> >> So we do need some logic in there to work with this. > > > So a patch like this (and there is another on top that moves the setting > of nseg) should do it.With that later patch, you should then probably also check that none of the so far unassigned bits in u1.flag are being set. Jan> commit 12679b29b2f828454f833e17e9090ed576c63afc > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Date: Mon Oct 10 00:47:49 2011 -0400 > > xen/blkback: Fix the inhibition to map pages when discarding sector > ranges. > > The ''operation'' parameters are the ones provided to the bio layer while > the req->operation are the ones passed in between the backend and > frontend. We used the wrong ''operation'' value to squash the > call to map pages when processing the discard operation resulting > in an hypercall that did nothing. Lets guard against going in the > mapping function by checking for the amount of segments. > > CC: Li Dongyang <lidongyang@novell.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index c15c559..94e659d 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -707,8 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > * the hypercall to unmap the grants - that is all done in > * xen_blkbk_unmap. > */ > - if (operation != BLKIF_OP_DISCARD && > - xen_blkbk_map(req, pending_req, seg)) > + if (nseg && xen_blkbk_map(req, pending_req, seg)) > goto fail_flush; > > /*_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-12 10:30 UTC
Re: [Xen-devel] Re: [PATCH 2/3] xen/blkback: Fix the inhibition to map pages when discarding sector ranges.
On Tue, 2011-10-11 at 21:50 +0100, Konrad Rzeszutek Wilk wrote:> On Tue, Oct 11, 2011 at 02:39:09PM -0400, Konrad Rzeszutek Wilk wrote: > > On Tue, Oct 11, 2011 at 03:33:11PM +0800, Li Dongyang wrote: > > > On Tue, Oct 11, 2011 at 3:25 PM, Jan Beulich <JBeulich@suse.com> wrote: > > > >>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > > >> The ''operation'' parameters are the ones provided to the bio layer while > > > >> the req->operation are the ones passed in between the backend and > > > >> frontend. We used the wrong ''operation'' value to squash the > > > >> call to map pages when processing the discard operation resulting > > > >> in mapping the pages unnecessarily. > > > >> > > > >> CC: Li Dongyang <lidongyang@novell.com> > > > >> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > >> --- > > > >> drivers/block/xen-blkback/blkback.c | 2 +- > > > >> 1 files changed, 1 insertions(+), 1 deletions(-) > > > >> > > > >> diff --git a/drivers/block/xen-blkback/blkback.c > > > >> b/drivers/block/xen-blkback/blkback.c > > > >> index 184b133..3da9a40 100644 > > > >> --- a/drivers/block/xen-blkback/blkback.c > > > >> +++ b/drivers/block/xen-blkback/blkback.c > > > >> @@ -707,7 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > > > >> * the hypercall to unmap the grants - that is all done in > > > >> * xen_blkbk_unmap. > > > >> */ > > > >> - if (operation != BLKIF_OP_DISCARD && > > > >> + if (operation != REQ_DISCARD && > > > > > > > > Why is that check necessary in the first place? xen_blkbk_map() doesn''t > > > > do any harm when req->nr_segments is zero (as could also be the case > > > > on WRITE_FLUSH ones). > > > > > > > Ah, you are right, we could remove this check then, Thanks > > > > Except that req->nr_segments for blkif__request_discard is actually > > the reserved field.Either this field is nr_segments for op==DISCARD or it is not. If it is then it should be named nr_segments and treated as such. If it is not then it is a bug for anything to look at those bits in memory and treat them as nr_segments. There was a patch (from Own Smith) at one point to use a union in the blkif request data type, would doing that help to make it clear which fields overlap and which do not?> > See: > > > > struct blkif_request { > > uint8_t operation; /* BLKIF_OP_??? */ > > uint8_t nr_segments; /* number of segments */ > > blkif_vdev_t handle; /* only for read/write requests */ > > .. snip.. > > > > and: > > struct blkif_request_discard { > > uint8_t operation; /* BLKIF_OP_DISCARD */ > > /* ignored if ''discard-secure=0'' */ > > #define BLKIF_OP_DISCARD_FLAG_SECURE (1<<0) > > uint8_t flag; /* BLKIF_OP_DISCARD_FLAG_SECURE or 0 */ > > blkif_vdev_t handle; /* same as for read/write requests */ > > > > which will throw off the logic for nr_segments all wrong since for some > > discard operations it would read the nr_segments as 1.> > > > So we do need some logic in there to work with this. > > > So a patch like this (and there is another on top that moves the setting > of nseg) should do it. > > > commit 12679b29b2f828454f833e17e9090ed576c63afc > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Date: Mon Oct 10 00:47:49 2011 -0400 > > xen/blkback: Fix the inhibition to map pages when discarding sector ranges. > > The ''operation'' parameters are the ones provided to the bio layer while > the req->operation are the ones passed in between the backend and > frontend. We used the wrong ''operation'' value to squash the > call to map pages when processing the discard operation resulting > in an hypercall that did nothing. Lets guard against going in the > mapping function by checking for the amount of segments. > > CC: Li Dongyang <lidongyang@novell.com> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c > index c15c559..94e659d 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -707,8 +707,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > * the hypercall to unmap the grants - that is all done in > * xen_blkbk_unmap. > */ > - if (operation != BLKIF_OP_DISCARD && > - xen_blkbk_map(req, pending_req, seg)) > + if (nseg && xen_blkbk_map(req, pending_req, seg))nseg == reg->nr_segments, so as above either referencing nseg when operation == BLKIF_OP_DISCARD is a bug or the field is badly named. Ian.> goto fail_flush; > > /* > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-12 10:54 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Tue, 2011-10-11 at 21:57 +0100, Konrad Rzeszutek Wilk wrote:> > My later response to it should include it: > > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00652.html > > > > > > > > Further I wonder why you don''t use the "reserved" field instead of > > > extending the structure at the end. > > > > <blinks> I completly missed it. That would definitly work as well. > > > > Let me redo it with that in mind. > > I''ve posted the Xen hypervisor ABI one that thread above. The implementation > of that looks as follow:Ian.> > commit ae33f998d66c5982af533bda25c2b6c4f863789f > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Date: Mon Oct 10 10:58:40 2011 -0400 > > xen/blk[front|back]: Enhance discard support with secure erasing support. > > Part of the blkdev_issue_discard(xx) operation is that it can also > issue a secure discard operation that will permanantly remove the > sectors in question. We advertise that we can support that via the > ''discard-secure'' attribute and on the request, if the ''secure'' bit > is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. > > CC: Li Dongyang <lidongyang@novell.com> > [v1: Used ''flag'' instead of ''secure:1'' bit] > [v2: Use ''reserved ''uint8_t'' as a flag] > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c > index 94e659d..4f33c13 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -362,7 +362,7 @@ static int xen_blkbk_map(struct blkif_request *req, > { > struct gnttab_map_grant_ref map[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > int i; > - int nseg = req->nr_segments; > + int nseg = req->u1.nr_segments; > int ret = 0; > > /* > @@ -422,13 +422,16 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) > int status = BLKIF_RSP_OKAY; > struct block_device *bdev = blkif->vbd.bdev; > > - if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) > + if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) { > + unsigned long secure = (blkif->vbd.discard_secure && > + (req->u1.flag & BLKIF_OP_DISCARD_FLAG_SECURE)) ? > + BLKDEV_DISCARD_SECURE : 0; > /* just forward the discard request */ > err = blkdev_issue_discard(bdev, > req->u.discard.sector_number, > req->u.discard.nr_sectors, > - GFP_KERNEL, 0); > - else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { > + GFP_KERNEL, secure); > + } else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { > /* punch a hole in the backing file */ > struct loop_device *lo = bdev->bd_disk->private_data; > struct file *file = lo->lo_backing_file; > @@ -618,6 +621,9 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > struct blk_plug plug; > bool drain = false; > > + /* Check that the number of segments is sane. */ > + nseg = req->u1.nr_segments;This field is invalid (at least with these semantics) if req->operation == BLKIF_OP_DISCARD so reading it here and clearing it later when you decide it is invalid is just confusing. Why not read it inside the switch iff it is valid?> + > switch (req->operation) { > case BLKIF_OP_READ: > blkif->st_rd_req++; > @@ -636,6 +642,7 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > case BLKIF_OP_DISCARD: > blkif->st_ds_req++; > operation = REQ_DISCARD; > + nseg = 0; /* The nr_segments and flag share the same space. */ > break; > default: > operation = 0; /* make gcc happy */ > @@ -643,8 +650,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > break; > } > > - /* Check that the number of segments is sane. */ > - nseg = req->nr_segments; > if (unlikely(nseg == 0 && operation != WRITE_FLUSH && > operation != REQ_DISCARD) || > unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) { > diff --git a/drivers/block/xen-blkback/common.h b/drivers/block/xen-blkback/common.h > index e638457..c6a5462 100644 > --- a/drivers/block/xen-blkback/common.h > +++ b/drivers/block/xen-blkback/common.h > @@ -76,7 +76,10 @@ struct blkif_x86_32_request_discard { > > struct blkif_x86_32_request { > uint8_t operation; /* BLKIF_OP_??? */ > - uint8_t nr_segments; /* number of segments */ > + union { > + uint8_t nr_segments; /* number of segments */ > + uint8_t flag; /* flag for blkif_x86_32_request_discard*/ > + } u1;this is a bit of a mess, it sucks that common fields and per-operation ones are all mixed up like this, but union { struct { nr_segments; } rw; struct { flags; } discard; } u would at least make it more obvious what was what in the code dereferencing these fields. Having seen the confusion which arises from reusing this reserved field I''m not convinced that it is worthwhile. If we don''t do that then we can have a more sane layout which makes it more explicit what is what: struct blkif_request { uint8_t operation; /* BLKIF_OP_??? */ uint8_t nr_segments; /* number of segments */ blkif_vdev_t handle; /* only for read/write requests */ uint64_t id; /* private guest value, echoed in resp */ union { struct { blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; } rw; struct { whatever; flags; } discard; struct { somethign else } anotherop; } u; }; handle isn''t really only r/w, is it? If it is then I think we should just repeat the id fields within the union and pad so the offset is correct: struct blkif_request { uint8_t operation; /* BLKIF_OP_??? */ union { struct { uint8_t nr_segments; /* number of segments */ blkif_vdev_t handle; uint64_t id; /* private guest value, echoed in resp */ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; } rw; struct { uint8_t flags; blkif_vdev_t __pad; /* was "handle: only for read/write requests */ uint64_t id; /* private guest value, echoed in resp */ } discard; struct { somethign else; padding id; } anotherop; } u; }; it''s lame but it avoid relying on "only for op xx, yy" type comments to get things right. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-12 15:45 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Wed, Oct 12, 2011 at 11:54:02AM +0100, Ian Campbell wrote:> On Tue, 2011-10-11 at 21:57 +0100, Konrad Rzeszutek Wilk wrote: > > > My later response to it should include it: > > > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00652.html > > > > > > > > > > > Further I wonder why you don''t use the "reserved" field instead of > > > > extending the structure at the end. > > > > > > <blinks> I completly missed it. That would definitly work as well. > > > > > > Let me redo it with that in mind. > > > > I''ve posted the Xen hypervisor ABI one that thread above. The implementation > > of that looks as follow: > > Ian. > > > > > commit ae33f998d66c5982af533bda25c2b6c4f863789f > > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > Date: Mon Oct 10 10:58:40 2011 -0400 > > > > xen/blk[front|back]: Enhance discard support with secure erasing support. > > > > Part of the blkdev_issue_discard(xx) operation is that it can also > > issue a secure discard operation that will permanantly remove the > > sectors in question. We advertise that we can support that via the > > ''discard-secure'' attribute and on the request, if the ''secure'' bit > > is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. > > > > CC: Li Dongyang <lidongyang@novell.com> > > [v1: Used ''flag'' instead of ''secure:1'' bit] > > [v2: Use ''reserved ''uint8_t'' as a flag] > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c > > index 94e659d..4f33c13 100644 > > --- a/drivers/block/xen-blkback/blkback.c > > +++ b/drivers/block/xen-blkback/blkback.c > > @@ -362,7 +362,7 @@ static int xen_blkbk_map(struct blkif_request *req, > > { > > struct gnttab_map_grant_ref map[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > > int i; > > - int nseg = req->nr_segments; > > + int nseg = req->u1.nr_segments; > > int ret = 0; > > > > /* > > @@ -422,13 +422,16 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) > > int status = BLKIF_RSP_OKAY; > > struct block_device *bdev = blkif->vbd.bdev; > > > > - if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) > > + if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) { > > + unsigned long secure = (blkif->vbd.discard_secure && > > + (req->u1.flag & BLKIF_OP_DISCARD_FLAG_SECURE)) ? > > + BLKDEV_DISCARD_SECURE : 0; > > /* just forward the discard request */ > > err = blkdev_issue_discard(bdev, > > req->u.discard.sector_number, > > req->u.discard.nr_sectors, > > - GFP_KERNEL, 0); > > - else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { > > + GFP_KERNEL, secure); > > + } else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { > > /* punch a hole in the backing file */ > > struct loop_device *lo = bdev->bd_disk->private_data; > > struct file *file = lo->lo_backing_file; > > @@ -618,6 +621,9 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > > struct blk_plug plug; > > bool drain = false; > > > > + /* Check that the number of segments is sane. */ > > + nseg = req->u1.nr_segments; > > This field is invalid (at least with these semantics) if req->operation > == BLKIF_OP_DISCARD so reading it here and clearing it later when you > decide it is invalid is just confusing. Why not read it inside the > switch iff it is valid?The problem was that ''nseg'' would be read after the switch, so it would contain the flag value. Which would throw off a lot of the loops which would try to enumerate "(for (i = 0; i < nseg;...)". Hence moving it to the top would make it valid for all the operations except the BLKIF_OP_DISCARD. And BLKIF_OP_DISCARD would sensibly set it nseg to zero so that we would not trip on those ''for (i = 0''). But I think you idea of making it an if statement would do, like:> > @@ -643,8 +650,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > > break; > > } > >if (operation != REQ_DISCARD) /* Check that the number of segments is sane. */ nseg = req->nr_segments; else nseg = 0;> > if (unlikely(nseg == 0 && operation != WRITE_FLUSH && > > operation != REQ_DISCARD) ||And I guess we can also skip the REQ_DISCARD test here.> > unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) {.. snip..> handle isn''t really only r/w, is it? If it is then I think we should > just repeat the id fields within the union and pad so the offset is > correct: > > struct blkif_request { > uint8_t operation; /* BLKIF_OP_??? */ > union { > struct { > uint8_t nr_segments; /* number of segments */ > blkif_vdev_t handle; > uint64_t id; /* private guest value, echoed in resp */ > blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ > struct blkif_request_segment seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > } rw; > struct { > uint8_t flags; > blkif_vdev_t __pad; /* was "handle: only for read/write requests */ > uint64_t id; /* private guest value, echoed in resp */blkif_sector_t sectore_number; uint64_t nr_sectors;> } discard;I like that. So much easier to comprehend. Let me spin up a patch for that. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Oct-12 16:14 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Wed, 2011-10-12 at 16:45 +0100, Konrad Rzeszutek Wilk wrote:> On Wed, Oct 12, 2011 at 11:54:02AM +0100, Ian Campbell wrote: > > On Tue, 2011-10-11 at 21:57 +0100, Konrad Rzeszutek Wilk wrote: > > > > My later response to it should include it: > > > > http://lists.xensource.com/archives/html/xen-devel/2011-10/msg00652.html > > > > > > > > > > > > > > Further I wonder why you don''t use the "reserved" field instead of > > > > > extending the structure at the end. > > > > > > > > <blinks> I completly missed it. That would definitly work as well. > > > > > > > > Let me redo it with that in mind. > > > > > > I''ve posted the Xen hypervisor ABI one that thread above. The implementation > > > of that looks as follow: > > > > Ian. > > > > > > > > commit ae33f998d66c5982af533bda25c2b6c4f863789f > > > Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > Date: Mon Oct 10 10:58:40 2011 -0400 > > > > > > xen/blk[front|back]: Enhance discard support with secure erasing support. > > > > > > Part of the blkdev_issue_discard(xx) operation is that it can also > > > issue a secure discard operation that will permanantly remove the > > > sectors in question. We advertise that we can support that via the > > > ''discard-secure'' attribute and on the request, if the ''secure'' bit > > > is set, we will attempt to pass in REQ_DISCARD | REQ_SECURE. > > > > > > CC: Li Dongyang <lidongyang@novell.com> > > > [v1: Used ''flag'' instead of ''secure:1'' bit] > > > [v2: Use ''reserved ''uint8_t'' as a flag] > > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > > > > > diff --git a/drivers/block/xen-blkback/blkback.c b/drivers/block/xen-blkback/blkback.c > > > index 94e659d..4f33c13 100644 > > > --- a/drivers/block/xen-blkback/blkback.c > > > +++ b/drivers/block/xen-blkback/blkback.c > > > @@ -362,7 +362,7 @@ static int xen_blkbk_map(struct blkif_request *req, > > > { > > > struct gnttab_map_grant_ref map[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > > > int i; > > > - int nseg = req->nr_segments; > > > + int nseg = req->u1.nr_segments; > > > int ret = 0; > > > > > > /* > > > @@ -422,13 +422,16 @@ static void xen_blk_discard(struct xen_blkif *blkif, struct blkif_request *req) > > > int status = BLKIF_RSP_OKAY; > > > struct block_device *bdev = blkif->vbd.bdev; > > > > > > - if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) > > > + if (blkif->blk_backend_type == BLKIF_BACKEND_PHY) { > > > + unsigned long secure = (blkif->vbd.discard_secure && > > > + (req->u1.flag & BLKIF_OP_DISCARD_FLAG_SECURE)) ? > > > + BLKDEV_DISCARD_SECURE : 0; > > > /* just forward the discard request */ > > > err = blkdev_issue_discard(bdev, > > > req->u.discard.sector_number, > > > req->u.discard.nr_sectors, > > > - GFP_KERNEL, 0); > > > - else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { > > > + GFP_KERNEL, secure); > > > + } else if (blkif->blk_backend_type == BLKIF_BACKEND_FILE) { > > > /* punch a hole in the backing file */ > > > struct loop_device *lo = bdev->bd_disk->private_data; > > > struct file *file = lo->lo_backing_file; > > > @@ -618,6 +621,9 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > > > struct blk_plug plug; > > > bool drain = false; > > > > > > + /* Check that the number of segments is sane. */ > > > + nseg = req->u1.nr_segments; > > > > This field is invalid (at least with these semantics) if req->operation > > == BLKIF_OP_DISCARD so reading it here and clearing it later when you > > decide it is invalid is just confusing. Why not read it inside the > > switch iff it is valid? > > The problem was that ''nseg'' would be read after the switch, so it would > contain the flag value. Which would throw off a lot of the loops which > would try to enumerate "(for (i = 0; i < nseg;...)". > > > Hence moving it to the top would make it valid for all the operations > except the BLKIF_OP_DISCARD. And BLKIF_OP_DISCARD would sensibly set it > nseg to zero so that we would not trip on those ''for (i = 0'').I think this is the crux of the problem: setting nsegs to an invalid value just so you can clear it again later when you check the op is completely backwards and confusing to any reader of the code.> But I think you idea of making it an if statement would do, like: > > > > > @@ -643,8 +650,6 @@ static int dispatch_rw_block_io(struct xen_blkif *blkif, > > > break; > > > } > > > > if (operation != REQ_DISCARD) > /* Check that the number of segments is sane. */ > nseg = req->nr_segments; > else > nseg = 0;Right above this hunk is a switch statement over the req->operation. The value of req->operation precisely defines the semantics/validity or otherwise of the req->nr_segments field and whether or not it contains the nr of segments or (due to the aliasing) something else. Why not set nsegs inside that switch statement (and explicitly zero it in the other cases) so that this obvious connection is retained?> > > if (unlikely(nseg == 0 && operation != WRITE_FLUSH && > > > operation != REQ_DISCARD) || > > And I guess we can also skip the REQ_DISCARD test here.I don''t think so, if nseg == 0 and operation == REQ_DISCARD that is fine, right? The fact that there is all this "operation != xx && operation != yy" conditional stuff suggests this would also be cleaner if it was also pulled up into the existing switch over req->operation. IOW overall you end up with: switch (req->operation) { case BLKIF_OP_READ: blkif->st_rd_req++; operation = READ; nseg = req->nr_segments; if (unlikely(nseg == 0 || nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) { pr_debug(DRV_PFX "Bad number of segments in read request (%d)\n", nseg); /* Haven''t submitted any bio''s yet. */ goto fail_response; } break; case BLKIF_OP_WRITE: blkif->st_wr_req++; operation = WRITE_ODIRECT; nseg = req->nr_segments; if (unlikely(nseg == 0 || nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) { pr_debug(DRV_PFX "Bad number of segments in write request (%d)\n", nseg); /* Haven''t submitted any bio''s yet. */ goto fail_response; } break; case BLKIF_OP_FLUSH_DISKCACHE: blkif->st_f_req++; operation = WRITE_FLUSH; nseg = req->nr_segments; /* nseg == 0 is ok here */ if (unlikely(nseg > BLKIF_MAX_SEGMENTS_PER_REQUEST)) { pr_debug(DRV_PFX "Bad number of segments in write/flush-cache request (%d)\n", nseg); /* Haven''t submitted any bio''s yet. */ goto fail_response; } break; case BLKIF_OP_DISCARD: blkif->st_some_stat++; operation = REQ_DISCARD; nseg = 0; /* No data associated with a discard */ break; case BLKIF_OP_WRITE_BARRIER: default: operation = 0; /* make gcc happy */ goto fail_response; break; } (I think I''m right that BLKIF_OP_FLUSH_DISKCACHE can have associated data or not) However do discard and r/w really have so much in common that handling them all in dispatch_rw_block_io() and relying on nsegs == 0 when the operation is discard makes sense? Would it be clearer if the caller (__do_block_io_op) had this switch over req->operation and called dispatch_rw_block_io(req, WRITE_FLUSH, nsegs), dispatch_discard(req) etc as appropriate? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-12 17:21 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
> > if (operation != REQ_DISCARD) > > /* Check that the number of segments is sane. */ > > nseg = req->nr_segments; > > else > > nseg = 0; > > Right above this hunk is a switch statement over the req->operation. The > value of req->operation precisely defines the semantics/validity or > otherwise of the req->nr_segments field and whether or not it contains > the nr of segments or (due to the aliasing) something else. Why not set > nsegs inside that switch statement (and explicitly zero it in the other > cases) so that this obvious connection is retained?Sure.> > > > > if (unlikely(nseg == 0 && operation != WRITE_FLUSH && > > > > operation != REQ_DISCARD) || > > > > And I guess we can also skip the REQ_DISCARD test here. > > I don''t think so, if nseg == 0 and operation == REQ_DISCARD that is > fine, right? The fact that there is all this "operation != xx &&<nods> ..snip..> (I think I''m right that BLKIF_OP_FLUSH_DISKCACHE can have associated > data or not)You are right.> > However do discard and r/w really have so much in common that handling > them all in dispatch_rw_block_io() and relying on nsegs == 0 when the > operation is discard makes sense? > > Would it be clearer if the caller (__do_block_io_op) had this switch > over req->operation and called dispatch_rw_block_io(req, WRITE_FLUSH, > nsegs), dispatch_discard(req) etc as appropriate?Potentially. It would cut down on this functions bloated size so that is a definite plus. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Oct-17 11:36 UTC
[Xen-devel] Re: [PATCH 1/3] xen/blkback: Support ''feature-barrier'' aka old-style BARRIER requests.
>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > @@ -481,6 +503,10 @@ static void __end_block_io_op(struct pending_req > *pending_req, int error) > pending_req->operation, pending_req->status); > xen_blkif_put(pending_req->blkif); > free_req(pending_req); > + if (atomic_read(&pending_req->blkif->refcnt) <= 2) { > + if (atomic_read(&pending_req->blkif->drain)) > + complete(&pending_req->blkif->drain_complete); > + }Shouldn''t this be done *before* the call the free_req()? Or alternatively a local copy of pending_req->blkif be obtained before freeing pending_req (which could be used in a couple more places in this function)? Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Oct-17 12:27 UTC
[Xen-devel] Re: [PATCH 1/3] xen/blkback: Support ''feature-barrier'' aka old-style BARRIER requests.
>>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > We emulate the barrier requests by draining the outstanding bio''s > and then sending the WRITE_FLUSH command. To drain the I/Os > we use the refcnt that is used during disconnect to wait for all > the I/Os before disconnecting from the frontend. We latch on its > value and if it reaches either the threshold for disconnect or when > there are no more outstanding I/Os, then we have drained all I/Os. > > Suggested-by: Christopher Hellwig <hch@infradead.org> > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > --- > drivers/block/xen-blkback/blkback.c | 37 +++++++++++++++++++++++++++++++++- > drivers/block/xen-blkback/common.h | 5 ++++ > drivers/block/xen-blkback/xenbus.c | 18 +++++++++++++++++ > 3 files changed, 58 insertions(+), 2 deletions(-) > > diff --git a/drivers/block/xen-blkback/blkback.c > b/drivers/block/xen-blkback/blkback.c > index e0dab61..184b133 100644 > --- a/drivers/block/xen-blkback/blkback.c > +++ b/drivers/block/xen-blkback/blkback.c > @@ -452,6 +452,23 @@ static void xen_blk_discard(struct xen_blkif *blkif, > struct blkif_request *req) > make_response(blkif, req->id, req->operation, status); > } > > +static void xen_blk_drain_io(struct xen_blkif *blkif) > +{ > + atomic_set(&blkif->drain, 1); > + do { > + wait_for_completion_interruptible_timeout( > + &blkif->drain_complete, HZ); > + > + if (!atomic_read(&blkif->drain)) > + break; > + /* The initial value is one, and one refcnt taken at the > + * start of the xen_blkif_schedule thread. */ > + if (atomic_read(&blkif->refcnt) <= 2) > + break;Shouldn''t this test be done the very first thing in the loop? It looks racy the way it''s placed now, and it would incur a 1 sec stall if this was the only request currently being processed (as no completion of ane earlier request could signal completion). Jan> + } while (!kthread_should_stop()); > + atomic_set(&blkif->drain, 0); > +}_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Oct-17 13:23 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
>>> On 11.10.11 at 22:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > --- a/drivers/block/xen-blkfront.c > +++ b/drivers/block/xen-blkfront.c >... > @@ -705,7 +711,7 @@ static void blkif_free(struct blkfront_info *info, int suspend) > static void blkif_completion(struct blk_shadow *s) > { > int i;This function gets called for all types of requests, and hence must filter discard ones now that what would be nr_segments can be non-zero, e.g. if (s->req.operation == BLKIF_OP_DISCARD) return; Jan> - for (i = 0; i < s->req.nr_segments; i++) > + for (i = 0; i < s->req.u1.nr_segments; i++) > gnttab_end_foreign_access(s->req.u.rw.seg[i].gref, 0, 0UL); > } >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-17 16:50 UTC
[Xen-devel] Re: [PATCH 1/3] xen/blkback: Support ''feature-barrier'' aka old-style BARRIER requests.
On Mon, Oct 17, 2011 at 12:36:18PM +0100, Jan Beulich wrote:> >>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > @@ -481,6 +503,10 @@ static void __end_block_io_op(struct pending_req > > *pending_req, int error) > > pending_req->operation, pending_req->status); > > xen_blkif_put(pending_req->blkif); > > free_req(pending_req); > > + if (atomic_read(&pending_req->blkif->refcnt) <= 2) { > > + if (atomic_read(&pending_req->blkif->drain)) > > + complete(&pending_req->blkif->drain_complete); > > + } > > Shouldn''t this be done *before* the call the free_req()? OrYes, otherwise we could referencing somebody''s else blkif->refcnt. Thanks for spotting that.> alternatively a local copy of pending_req->blkif be obtained before > freeing pending_req (which could be used in a couple more places > in this function)? > > Jan_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-17 16:52 UTC
[Xen-devel] Re: [PATCH 1/3] xen/blkback: Support ''feature-barrier'' aka old-style BARRIER requests.
On Mon, Oct 17, 2011 at 01:27:05PM +0100, Jan Beulich wrote:> >>> On 10.10.11 at 17:28, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > We emulate the barrier requests by draining the outstanding bio''s > > and then sending the WRITE_FLUSH command. To drain the I/Os > > we use the refcnt that is used during disconnect to wait for all > > the I/Os before disconnecting from the frontend. We latch on its > > value and if it reaches either the threshold for disconnect or when > > there are no more outstanding I/Os, then we have drained all I/Os. > > > > Suggested-by: Christopher Hellwig <hch@infradead.org> > > Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > > --- > > drivers/block/xen-blkback/blkback.c | 37 +++++++++++++++++++++++++++++++++- > > drivers/block/xen-blkback/common.h | 5 ++++ > > drivers/block/xen-blkback/xenbus.c | 18 +++++++++++++++++ > > 3 files changed, 58 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/block/xen-blkback/blkback.c > > b/drivers/block/xen-blkback/blkback.c > > index e0dab61..184b133 100644 > > --- a/drivers/block/xen-blkback/blkback.c > > +++ b/drivers/block/xen-blkback/blkback.c > > @@ -452,6 +452,23 @@ static void xen_blk_discard(struct xen_blkif *blkif, > > struct blkif_request *req) > > make_response(blkif, req->id, req->operation, status); > > } > > > > +static void xen_blk_drain_io(struct xen_blkif *blkif) > > +{ > > + atomic_set(&blkif->drain, 1); > > + do { > > + wait_for_completion_interruptible_timeout( > > + &blkif->drain_complete, HZ); > > + > > + if (!atomic_read(&blkif->drain)) > > + break; > > + /* The initial value is one, and one refcnt taken at the > > + * start of the xen_blkif_schedule thread. */ > > + if (atomic_read(&blkif->refcnt) <= 2) > > + break; > > Shouldn''t this test be done the very first thing in the loop? It looks > racy the way it''s placed now, and it would incur a 1 sec stall if this > was the only request currently being processed (as no completion > of ane earlier request could signal completion).Sure does. An earlier version of this (not posted), had this check right before going in the loop as all of the requests might have been already processed. I accidently dropped that logic as I moved this whole code into a function. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Li Dongyang
2011-Oct-20 03:17 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
I think we also should mark the vbd has discard_secure if we are usingthe file backend,as if we punch a hole in the image, those blocks are freed tofilesystem and hardly to getthem back, or maybe write zeros to the range before we punch the hole is better? On Mon, Oct 17, 2011 at 9:23 PM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 11.10.11 at 22:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: >> --- a/drivers/block/xen-blkfront.c >> +++ b/drivers/block/xen-blkfront.c >>... >> @@ -705,7 +711,7 @@ static void blkif_free(struct blkfront_info *info, int suspend) >> static void blkif_completion(struct blk_shadow *s) >> { >> int i; > > This function gets called for all types of requests, and hence must filter > discard ones now that what would be nr_segments can be non-zero, > e.g. > > if (s->req.operation == BLKIF_OP_DISCARD) > return; > > Jan > >> - for (i = 0; i < s->req.nr_segments; i++) >> + for (i = 0; i < s->req.u1.nr_segments; i++) >> gnttab_end_foreign_access(s->req.u.rw.seg[i].gref, 0, 0UL); >> } >> > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-20 03:46 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Thu, Oct 20, 2011 at 11:17:14AM +0800, Li Dongyang wrote:> I think we also should mark the vbd has discard_secure if we are > usingthe file backend,as if we punch a hole in the image, those blocks > are freed tofilesystem and hardly to getthem back, or maybe write > zeros to the range before we punch the hole is better?You would have to write zeros to that range (or perhaps random values) to emulate the secure delete. If you have a patch for that I would be interested in seeing it. Hmm, which reminds me - I should repost this patch series.> On Mon, Oct 17, 2011 at 9:23 PM, Jan Beulich <JBeulich@suse.com> wrote: > >>>> On 11.10.11 at 22:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > >> --- a/drivers/block/xen-blkfront.c > >> +++ b/drivers/block/xen-blkfront.c > >>... > >> @@ -705,7 +711,7 @@ static void blkif_free(struct blkfront_info *info, int suspend) > >> static void blkif_completion(struct blk_shadow *s) > >> { > >> int i; > > > > This function gets called for all types of requests, and hence must filter > > discard ones now that what would be nr_segments can be non-zero, > > e.g. > > > > if (s->req.operation == BLKIF_OP_DISCARD) > > return; > > > > Jan > > > >> - for (i = 0; i < s->req.nr_segments; i++) > >> + for (i = 0; i < s->req.u1.nr_segments; i++) > >> gnttab_end_foreign_access(s->req.u.rw.seg[i].gref, 0, 0UL); > >> } > >> > > > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Konrad Rzeszutek Wilk
2011-Oct-21 00:06 UTC
Re: [Xen-devel] [PATCH 3/3] xen/blk[front|back]: Enhance discard support with secure erasing support.
On Mon, Oct 17, 2011 at 02:23:01PM +0100, Jan Beulich wrote:> >>> On 11.10.11 at 22:57, Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> wrote: > > --- a/drivers/block/xen-blkfront.c > > +++ b/drivers/block/xen-blkfront.c > >... > > @@ -705,7 +711,7 @@ static void blkif_free(struct blkfront_info *info, int suspend) > > static void blkif_completion(struct blk_shadow *s) > > { > > int i; > > This function gets called for all types of requests, and hence must filter > discard ones now that what would be nr_segments can be non-zero,Oooh, nice catch.> e.g. > > if (s->req.operation == BLKIF_OP_DISCARD) > return; > > Jan > > > - for (i = 0; i < s->req.nr_segments; i++) > > + for (i = 0; i < s->req.u1.nr_segments; i++) > > gnttab_end_foreign_access(s->req.u.rw.seg[i].gref, 0, 0UL); > > } > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel