Li Dongyang
2011-Aug-09 07:11 UTC
[Xen-devel] [PATCH 1/2] teach blkfront driver handle discard request
The blkfront driver now will read feature-discard from xenstore, and set up the request queue, also forward the discard request to backend driver. Signed-off-by: Li Dongyang <lidongyang@novell.com> --- drivers/xen/blkfront/blkfront.c | 48 +++++++++++++++++++++++++++++++++++--- drivers/xen/blkfront/block.h | 3 ++ drivers/xen/blkfront/vbd.c | 8 ++++++ 3 files changed, 55 insertions(+), 4 deletions(-) diff --git a/drivers/xen/blkfront/blkfront.c b/drivers/xen/blkfront/blkfront.c index 9410fae..8a32e27 100644 --- a/drivers/xen/blkfront/blkfront.c +++ b/drivers/xen/blkfront/blkfront.c @@ -328,7 +328,9 @@ static void connect(struct blkfront_info *info) unsigned long long sectors; unsigned long sector_size; unsigned int binfo; - int err, barrier; + unsigned int discard_granularity; + unsigned int discard_alignment; + int err, barrier, discard; switch (info->connected) { case BLKIF_STATE_CONNECTED: @@ -377,6 +379,21 @@ static void connect(struct blkfront_info *info) else info->feature_flush = 0; + err = xenbus_scanf(XBT_NIL, info->xbdev->otherend, + "feature-discard", "%d", &discard); + if (err > 0 && discard) { + info->feature_discard = 1; + err = xenbus_gather(XBT_NIL, info->xbdev->otherend, + "discard_granularity", "%u", &discard_granularity, + "discard_alignment", "%u", &discard_alignment, + NULL); + if (!err) { + info->discard_granularity = discard_granularity; + info->discard_alignment = discard_alignment; + } + } else + info->feature_discard = 0; + err = xlvbd_add(sectors, info->vdevice, binfo, sector_size, info); if (err) { xenbus_dev_fatal(info->xbdev, err, "xlvbd_add at %s", @@ -704,9 +721,18 @@ static int blkif_queue_request(struct request *req) if (req->cmd_type == REQ_TYPE_BLOCK_PC) ring_req->operation = BLKIF_OP_PACKET; - ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); - BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); - for_each_sg(info->sg, sg, ring_req->nr_segments, i) { + if (req->cmd_flags & REQ_DISCARD) { + /* id sector_number and handle are initialized above. */ + blkif_request_trim_t *trim_req + (blkif_request_trim_t *)ring_req; + trim_req->operation = BLKIF_OP_TRIM; + trim_req->reserved = 0; + trim_req->nr_sectors = blk_rq_sectors(req); + } else { + ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); + BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); + + for_each_sg(info->sg, sg, ring_req->nr_segments, i) { buffer_mfn = page_to_phys(sg_page(sg)) >> PAGE_SHIFT; fsect = sg->offset >> 9; lsect = fsect + (sg->length >> 9) - 1; @@ -726,6 +752,7 @@ static int blkif_queue_request(struct request *req) .gref = ref, .first_sect = fsect, .last_sect = lsect }; + } } info->ring.req_prod_pvt++; @@ -821,6 +848,19 @@ static irqreturn_t blkif_int(int irq, void *dev_id) ret = bret->status == BLKIF_RSP_OKAY ? 0 : -EIO; switch (bret->operation) { + case BLKIF_OP_TRIM: + if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { + struct request_queue *rq = info->rq; + pr_warning("blkfront: %s: trim op failed\n", + info->gd->disk_name); + ret = -EOPNOTSUPP; + info->feature_discard = 0; + spin_lock_irq(rq->queue_lock); + queue_flag_clear(QUEUE_FLAG_DISCARD, rq); + spin_unlock_irq(rq->queue_lock); + } + __blk_end_request_all(req, ret); + break; case BLKIF_OP_WRITE_BARRIER: if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { pr_warning("blkfront: %s:" diff --git a/drivers/xen/blkfront/block.h b/drivers/xen/blkfront/block.h index 8db27a3..25926f3 100644 --- a/drivers/xen/blkfront/block.h +++ b/drivers/xen/blkfront/block.h @@ -112,6 +112,9 @@ struct blkfront_info struct blk_shadow shadow[BLK_RING_SIZE]; unsigned long shadow_free; int feature_flush; + int feature_discard; + unsigned int discard_granularity; + unsigned int discard_alignment; int is_ready; /** diff --git a/drivers/xen/blkfront/vbd.c b/drivers/xen/blkfront/vbd.c index 2a98439..c0170cf 100644 --- a/drivers/xen/blkfront/vbd.c +++ b/drivers/xen/blkfront/vbd.c @@ -300,6 +300,7 @@ static int xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) { struct request_queue *rq; + struct blkfront_info *info = gd->private_data; rq = blk_init_queue(do_blkif_request, &blkif_io_lock); if (rq == NULL) @@ -307,6 +308,13 @@ xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,29) queue_flag_set_unlocked(QUEUE_FLAG_VIRT, rq); + + if (info->feature_discard) { + queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, rq); + blk_queue_max_discard_sectors(rq, get_capacity(gd)); + rq->limits.discard_granularity = info->discard_granularity; + rq->limits.discard_alignment = info->discard_alignment; + } #endif /* Hard sector size and max sectors impersonate the equiv. hardware. */ -- 1.7.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Li Dongyang
2011-Aug-09 07:11 UTC
[Xen-devel] [PATCH 2/2] make blkback driver handle trim request
now blkback driver can handle the trim request from guest, we will forward the request to phy if it''s really a device has trim, or we''ll punch a hole on the image file. Signed-off-by: Li Dongyang <lidongyang@novell.com> --- drivers/xen/blkback/blkback.c | 77 ++++++++++++++++++++++++++++++++++++++ drivers/xen/blkback/common.h | 2 +- drivers/xen/blkback/xenbus.c | 57 ++++++++++++++++++++++++++++ include/xen/interface/io/blkif.h | 2 + 4 files changed, 137 insertions(+), 1 deletions(-) diff --git a/drivers/xen/blkback/blkback.c b/drivers/xen/blkback/blkback.c index bb588e2..11aa09d 100644 --- a/drivers/xen/blkback/blkback.c +++ b/drivers/xen/blkback/blkback.c @@ -40,6 +40,9 @@ #include <linux/freezer.h> #include <linux/list.h> #include <linux/delay.h> +#include <linux/loop.h> +#include <linux/falloc.h> +#include <linux/fs.h> #include <xen/balloon.h> #include <xen/evtchn.h> #include <xen/gnttab.h> @@ -112,6 +115,9 @@ static int do_block_io_op(blkif_t *blkif); static void dispatch_rw_block_io(blkif_t *blkif, blkif_request_t *req, pending_req_t *pending_req); +static void dispatch_trim(blkif_t *blkif, + blkif_request_trim_t *req, + pending_req_t *pending_req); static void make_response(blkif_t *blkif, u64 id, unsigned short op, int st); @@ -369,6 +375,9 @@ static int do_block_io_op(blkif_t *blkif) blkif->st_wr_req++; dispatch_rw_block_io(blkif, &req, pending_req); break; + case BLKIF_OP_TRIM: + dispatch_trim(blkif, (blkif_request_trim_t *)&req, pending_req); + break; case BLKIF_OP_PACKET: DPRINTK("error: block operation BLKIF_OP_PACKET not implemented\n"); blkif->st_pk_req++; @@ -395,6 +404,74 @@ static int do_block_io_op(blkif_t *blkif) return more_to_do; } +static void dispatch_trim(blkif_t *blkif, + blkif_request_trim_t *req, + pending_req_t *pending_req) +{ + struct phys_req preq; + struct block_device *bdev = blkif->vbd.bdev; + int status = BLKIF_RSP_OKAY; + int err = 0; + + preq.dev = req->handle; + preq.sector_number = req->sector_number; + preq.nr_sects = req->nr_sectors; + + blkif_get(blkif); + if (blkif->vbd.type & VDISK_PHY_BACKEND) { + /* backend is a phy device, just forward the trim request */ + if (vbd_translate(&preq, blkif, WRITE) != 0) { + DPRINTK("access denied: %s of [%llu,%llu] on dev=%04x\n", + "discard", + preq.sector_number, + preq.sector_number + preq.nr_sects, preq.dev); + status = BLKIF_RSP_ERROR; + goto fail_response; + } + err = blkdev_issue_discard(bdev, + req->sector_number, + req->nr_sectors, + GFP_KERNEL, 0); + } else if (blkif->vbd.type & VDISK_FILE_BACKEND) { + /* backend is a file, punch a hole in the file */ + struct loop_device *lo = bdev->bd_disk->private_data; + struct file *file = lo->lo_backing_file; + struct inode *inode = file->f_path.dentry->d_inode; + if (vbd_translate(&preq, blkif, WRITE) != 0) { + DPRINTK("access denied: %s of [%llu,%llu] on dev=%04x\n", + "discard", + preq.sector_number, + preq.sector_number + preq.nr_sects, preq.dev); + status = BLKIF_RSP_ERROR; + goto fail_response; + } + if (inode->i_op->fallocate) { + err = inode->i_op->fallocate(inode, + FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, + req->sector_number << 9, + req->nr_sectors << 9); + } else + err = -EOPNOTSUPP; + } else { + status = BLKIF_RSP_EOPNOTSUPP; + goto fail_response; + } + + if (err == -EOPNOTSUPP) { + DPRINTK("blkback: discard op failed, not supported\n"); + status = BLKIF_RSP_EOPNOTSUPP; + } else if (err) + status = BLKIF_RSP_ERROR; + +fail_response: + make_response(blkif, req->id, req->operation, status); + blkif_put(blkif); + free_req(pending_req); + if (status) + msleep(1); /* back off a bit */ + return; +} + static void dispatch_rw_block_io(blkif_t *blkif, blkif_request_t *req, pending_req_t *pending_req) diff --git a/drivers/xen/blkback/common.h b/drivers/xen/blkback/common.h index 7ac10df..8423fd9 100644 --- a/drivers/xen/blkback/common.h +++ b/drivers/xen/blkback/common.h @@ -126,7 +126,7 @@ unsigned long vbd_secsize(struct vbd *vbd); struct phys_req { unsigned short dev; - unsigned short nr_sects; + blkif_sector_t nr_sects; struct block_device *bdev; blkif_sector_t sector_number; }; diff --git a/drivers/xen/blkback/xenbus.c b/drivers/xen/blkback/xenbus.c index 639146c..d379fc8 100644 --- a/drivers/xen/blkback/xenbus.c +++ b/drivers/xen/blkback/xenbus.c @@ -219,6 +219,59 @@ int blkback_barrier(struct xenbus_transaction xbt, return err; } +int blkback_discard(struct xenbus_transaction xbt, struct backend_info *be) +{ + struct xenbus_device *dev = be->dev; + struct vbd *vbd = &be->blkif->vbd; + char *type; + int err; + int state = 0; + + type = xenbus_read(XBT_NIL, dev->nodename, "type", NULL); + if (!IS_ERR(type)) { + if (strcmp(type, "file") == 0) + state = 1; + vbd->type |= VDISK_FILE_BACKEND; + if (strcmp(type, "phy") == 0) { + struct block_device *bdev = be->blkif->vbd.bdev; + struct request_queue *q = bdev_get_queue(bdev); + if (blk_queue_discard(q)) { + err = xenbus_printf(xbt, dev->nodename, + "discard_granularity", "%u", + q->limits.discard_granularity); + if (err) { + xenbus_dev_fatal(dev, err, + "writing discard_granularity"); + goto kfree; + } + err = xenbus_printf(xbt, dev->nodename, + "discard_alignment", "%u", + q->limits.discard_alignment); + if (err) { + xenbus_dev_fatal(dev, err, + "writing discard_alignment"); + goto kfree; + } + state = 1; + vbd->type |= VDISK_PHY_BACKEND; + } + } + } else { + err = PTR_ERR(type); + xenbus_dev_fatal(dev, err, "reading type"); + goto out; + } + + err = xenbus_printf(xbt, dev->nodename, "feature-discard", + "%d", state); + if (err) + xenbus_dev_fatal(dev, err, "writing feature-discard"); +kfree: + kfree(type); +out: + return err; +} + /** * Entry point to this code when a new device is created. Allocate the basic * structures, and watch the store waiting for the hotplug scripts to tell us @@ -444,6 +497,10 @@ again: if (err) goto abort; + err = blkback_discard(xbt, be); + if (err) + goto abort; + err = xenbus_printf(xbt, dev->nodename, "sectors", "%llu", vbd_size(&be->blkif->vbd)); if (err) { diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h index e239060..aa483cc 100644 --- a/include/xen/interface/io/blkif.h +++ b/include/xen/interface/io/blkif.h @@ -172,5 +172,7 @@ DEFINE_RING_TYPES(blkif, struct blkif_request, struct blkif_response); #define VDISK_CDROM 0x1 #define VDISK_REMOVABLE 0x2 #define VDISK_READONLY 0x4 +#define VDISK_FILE_BACKEND 0x8 +#define VDISK_PHY_BACKEND 0x10 #endif /* __XEN_PUBLIC_IO_BLKIF_H__ */ -- 1.7.6 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Aug-10 07:27 UTC
Re: [Xen-devel] [PATCH 1/2] teach blkfront driver handle discard request
>>> On 09.08.11 at 09:11, Li Dongyang <lidongyang@novell.com> wrote: > The blkfront driver now will read feature-discard from xenstore,As pointed out privately before, the xenstore node to use is actually documented in the interface header: "feature-trim". And with that, the other nodes you use are likely misnamed too. Owen, it seems you had an implementation of this functionality too - what happened to that?> and set up the request queue, also forward the discard request to > backend driver. > > Signed-off-by: Li Dongyang <lidongyang@novell.com> > --- > drivers/xen/blkfront/blkfront.c | 48 +++++++++++++++++++++++++++++++++++--- > drivers/xen/blkfront/block.h | 3 ++ > drivers/xen/blkfront/vbd.c | 8 ++++++Looks like this is against the 2.6.18 tree, which is fine by me, but which (if you want Keir to apply this at some point) ought to be pointed out in the subject.> 3 files changed, 55 insertions(+), 4 deletions(-) > > diff --git a/drivers/xen/blkfront/blkfront.c b/drivers/xen/blkfront/blkfront.c > index 9410fae..8a32e27 100644 > --- a/drivers/xen/blkfront/blkfront.c > +++ b/drivers/xen/blkfront/blkfront.c > @@ -328,7 +328,9 @@ static void connect(struct blkfront_info *info) > unsigned long long sectors; > unsigned long sector_size; > unsigned int binfo; > - int err, barrier; > + unsigned int discard_granularity; > + unsigned int discard_alignment; > + int err, barrier, discard; > > switch (info->connected) { > case BLKIF_STATE_CONNECTED: > @@ -377,6 +379,21 @@ static void connect(struct blkfront_info *info) > else > info->feature_flush = 0; > > + err = xenbus_scanf(XBT_NIL, info->xbdev->otherend, > + "feature-discard", "%d", &discard); > + if (err > 0 && discard) { > + info->feature_discard = 1; > + err = xenbus_gather(XBT_NIL, info->xbdev->otherend, > + "discard_granularity", "%u", &discard_granularity, > + "discard_alignment", "%u", &discard_alignment, > + NULL); > + if (!err) { > + info->discard_granularity = discard_granularity; > + info->discard_alignment = discard_alignment; > + } > + } else > + info->feature_discard = 0; > + > err = xlvbd_add(sectors, info->vdevice, binfo, sector_size, info); > if (err) { > xenbus_dev_fatal(info->xbdev, err, "xlvbd_add at %s", > @@ -704,9 +721,18 @@ static int blkif_queue_request(struct request *req) > if (req->cmd_type == REQ_TYPE_BLOCK_PC) > ring_req->operation = BLKIF_OP_PACKET; > > - ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); > - BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); > - for_each_sg(info->sg, sg, ring_req->nr_segments, i) { > + if (req->cmd_flags & REQ_DISCARD) {Under the assumption that without other knowledge the compiler would prefer the "if()" path over the "else" one, please invert the condition and switch around the bodies. That will at once make it more obvious that the removed lines above simply get their indentation changed.> + /* id sector_number and handle are initialized above. */ > + blkif_request_trim_t *trim_req > + (blkif_request_trim_t *)ring_req; > + trim_req->operation = BLKIF_OP_TRIM; > + trim_req->reserved = 0; > + trim_req->nr_sectors = blk_rq_sectors(req); > + } else { > + ring_req->nr_segments = blk_rq_map_sg(req->q, req, info->sg); > + BUG_ON(ring_req->nr_segments > BLKIF_MAX_SEGMENTS_PER_REQUEST); > + > + for_each_sg(info->sg, sg, ring_req->nr_segments, i) { > buffer_mfn = page_to_phys(sg_page(sg)) >> PAGE_SHIFT; > fsect = sg->offset >> 9; > lsect = fsect + (sg->length >> 9) - 1; > @@ -726,6 +752,7 @@ static int blkif_queue_request(struct request *req) > .gref = ref, > .first_sect = fsect, > .last_sect = lsect }; > + } > } > > info->ring.req_prod_pvt++; > @@ -821,6 +848,19 @@ static irqreturn_t blkif_int(int irq, void *dev_id) > > ret = bret->status == BLKIF_RSP_OKAY ? 0 : -EIO; > switch (bret->operation) { > + case BLKIF_OP_TRIM: > + if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { > + struct request_queue *rq = info->rq; > + pr_warning("blkfront: %s: trim op failed\n", > + info->gd->disk_name);pr_warning() doesn''t exist in 2.6.18, so now I''m really confused on which tree you would expect these to get applied. Posting patches against our SLE11 tree is relatively pointless, except maybe when tagged as RFC just to get comments.> + ret = -EOPNOTSUPP; > + info->feature_discard = 0; > + spin_lock_irq(rq->queue_lock); > + queue_flag_clear(QUEUE_FLAG_DISCARD, rq); > + spin_unlock_irq(rq->queue_lock); > + } > + __blk_end_request_all(req, ret); > + break; > case BLKIF_OP_WRITE_BARRIER: > if (unlikely(bret->status == BLKIF_RSP_EOPNOTSUPP)) { > pr_warning("blkfront: %s:" > diff --git a/drivers/xen/blkfront/block.h b/drivers/xen/blkfront/block.h > index 8db27a3..25926f3 100644 > --- a/drivers/xen/blkfront/block.h > +++ b/drivers/xen/blkfront/block.h > @@ -112,6 +112,9 @@ struct blkfront_info > struct blk_shadow shadow[BLK_RING_SIZE]; > unsigned long shadow_free; > int feature_flush; > + int feature_discard; > + unsigned int discard_granularity; > + unsigned int discard_alignment;Is it certain that 32 bits will now and forever suffice for representing these parameters? Jan> int is_ready; > > /** > diff --git a/drivers/xen/blkfront/vbd.c b/drivers/xen/blkfront/vbd.c > index 2a98439..c0170cf 100644 > --- a/drivers/xen/blkfront/vbd.c > +++ b/drivers/xen/blkfront/vbd.c > @@ -300,6 +300,7 @@ static int > xlvbd_init_blk_queue(struct gendisk *gd, u16 sector_size) > { > struct request_queue *rq; > + struct blkfront_info *info = gd->private_data; > > rq = blk_init_queue(do_blkif_request, &blkif_io_lock); > if (rq == NULL) > @@ -307,6 +308,13 @@ xlvbd_init_blk_queue(struct gendisk *gd, u16 > sector_size) > > #if LINUX_VERSION_CODE >= KERNEL_VERSION(2,6,29) > queue_flag_set_unlocked(QUEUE_FLAG_VIRT, rq); > + > + if (info->feature_discard) { > + queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, rq); > + blk_queue_max_discard_sectors(rq, get_capacity(gd)); > + rq->limits.discard_granularity = info->discard_granularity; > + rq->limits.discard_alignment = info->discard_alignment; > + } > #endif > > /* Hard sector size and max sectors impersonate the equiv. hardware. */_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Aug-10 07:33 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
>>> On 09.08.11 at 09:11, Li Dongyang <lidongyang@novell.com> wrote: > now blkback driver can handle the trim request from guest, we will > forward the request to phy if it''s really a device has trim, or we''ll > punch a hole on the image file.Same general comments as for the frontend patch, plus ...> Signed-off-by: Li Dongyang <lidongyang@novell.com> > --- > drivers/xen/blkback/blkback.c | 77 > ++++++++++++++++++++++++++++++++++++++ > drivers/xen/blkback/common.h | 2 +- > drivers/xen/blkback/xenbus.c | 57 ++++++++++++++++++++++++++++ > include/xen/interface/io/blkif.h | 2 + > 4 files changed, 137 insertions(+), 1 deletions(-) > > diff --git a/drivers/xen/blkback/blkback.c b/drivers/xen/blkback/blkback.c > index bb588e2..11aa09d 100644 > --- a/drivers/xen/blkback/blkback.c > +++ b/drivers/xen/blkback/blkback.c > @@ -40,6 +40,9 @@ > #include <linux/freezer.h> > #include <linux/list.h> > #include <linux/delay.h> > +#include <linux/loop.h> > +#include <linux/falloc.h> > +#include <linux/fs.h> > #include <xen/balloon.h> > #include <xen/evtchn.h> > #include <xen/gnttab.h> > @@ -112,6 +115,9 @@ static int do_block_io_op(blkif_t *blkif); > static void dispatch_rw_block_io(blkif_t *blkif, > blkif_request_t *req, > pending_req_t *pending_req); > +static void dispatch_trim(blkif_t *blkif, > + blkif_request_trim_t *req, > + pending_req_t *pending_req); > static void make_response(blkif_t *blkif, u64 id, > unsigned short op, int st); > > @@ -369,6 +375,9 @@ static int do_block_io_op(blkif_t *blkif) > blkif->st_wr_req++; > dispatch_rw_block_io(blkif, &req, pending_req); > break; > + case BLKIF_OP_TRIM: > + dispatch_trim(blkif, (blkif_request_trim_t *)&req, pending_req); > + break; > case BLKIF_OP_PACKET: > DPRINTK("error: block operation BLKIF_OP_PACKET not implemented\n"); > blkif->st_pk_req++; > @@ -395,6 +404,74 @@ static int do_block_io_op(blkif_t *blkif) > return more_to_do; > } > > +static void dispatch_trim(blkif_t *blkif, > + blkif_request_trim_t *req, > + pending_req_t *pending_req) > +{ > + struct phys_req preq; > + struct block_device *bdev = blkif->vbd.bdev; > + int status = BLKIF_RSP_OKAY; > + int err = 0; > + > + preq.dev = req->handle; > + preq.sector_number = req->sector_number; > + preq.nr_sects = req->nr_sectors; > + > + blkif_get(blkif); > + if (blkif->vbd.type & VDISK_PHY_BACKEND) { > + /* backend is a phy device, just forward the trim request */ > + if (vbd_translate(&preq, blkif, WRITE) != 0) { > + DPRINTK("access denied: %s of [%llu,%llu] on dev=%04x\n", > + "discard", > + preq.sector_number, > + preq.sector_number + preq.nr_sects, preq.dev); > + status = BLKIF_RSP_ERROR; > + goto fail_response; > + } > + err = blkdev_issue_discard(bdev, > + req->sector_number, > + req->nr_sectors, > + GFP_KERNEL, 0); > + } else if (blkif->vbd.type & VDISK_FILE_BACKEND) { > + /* backend is a file, punch a hole in the file */ > + struct loop_device *lo = bdev->bd_disk->private_data; > + struct file *file = lo->lo_backing_file; > + struct inode *inode = file->f_path.dentry->d_inode; > + if (vbd_translate(&preq, blkif, WRITE) != 0) { > + DPRINTK("access denied: %s of [%llu,%llu] on dev=%04x\n", > + "discard", > + preq.sector_number, > + preq.sector_number + preq.nr_sects, preq.dev); > + status = BLKIF_RSP_ERROR; > + goto fail_response; > + } > + if (inode->i_op->fallocate) { > + err = inode->i_op->fallocate(inode, > + FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, > + req->sector_number << 9, > + req->nr_sectors << 9); > + } else > + err = -EOPNOTSUPP; > + } else { > + status = BLKIF_RSP_EOPNOTSUPP; > + goto fail_response; > + } > + > + if (err == -EOPNOTSUPP) { > + DPRINTK("blkback: discard op failed, not supported\n"); > + status = BLKIF_RSP_EOPNOTSUPP; > + } else if (err) > + status = BLKIF_RSP_ERROR; > + > +fail_response: > + make_response(blkif, req->id, req->operation, status); > + blkif_put(blkif); > + free_req(pending_req); > + if (status) > + msleep(1); /* back off a bit */ > + return; > +} > + > static void dispatch_rw_block_io(blkif_t *blkif, > blkif_request_t *req, > pending_req_t *pending_req) > diff --git a/drivers/xen/blkback/common.h b/drivers/xen/blkback/common.h > index 7ac10df..8423fd9 100644 > --- a/drivers/xen/blkback/common.h > +++ b/drivers/xen/blkback/common.h > @@ -126,7 +126,7 @@ unsigned long vbd_secsize(struct vbd *vbd); > > struct phys_req { > unsigned short dev; > - unsigned short nr_sects; > + blkif_sector_t nr_sects; > struct block_device *bdev; > blkif_sector_t sector_number; > }; > diff --git a/drivers/xen/blkback/xenbus.c b/drivers/xen/blkback/xenbus.c > index 639146c..d379fc8 100644 > --- a/drivers/xen/blkback/xenbus.c > +++ b/drivers/xen/blkback/xenbus.c > @@ -219,6 +219,59 @@ int blkback_barrier(struct xenbus_transaction xbt, > return err; > } > > +int blkback_discard(struct xenbus_transaction xbt, struct backend_info *be)static?> +{ > + struct xenbus_device *dev = be->dev; > + struct vbd *vbd = &be->blkif->vbd; > + char *type; > + int err; > + int state = 0; > + > + type = xenbus_read(XBT_NIL, dev->nodename, "type", NULL); > + if (!IS_ERR(type)) { > + if (strcmp(type, "file") == 0) > + state = 1; > + vbd->type |= VDISK_FILE_BACKEND; > + if (strcmp(type, "phy") == 0) { > + struct block_device *bdev = be->blkif->vbd.bdev; > + struct request_queue *q = bdev_get_queue(bdev); > + if (blk_queue_discard(q)) { > + err = xenbus_printf(xbt, dev->nodename, > + "discard_granularity", "%u", > + q->limits.discard_granularity); > + if (err) { > + xenbus_dev_fatal(dev, err, > + "writing discard_granularity"); > + goto kfree; > + } > + err = xenbus_printf(xbt, dev->nodename, > + "discard_alignment", "%u", > + q->limits.discard_alignment); > + if (err) { > + xenbus_dev_fatal(dev, err, > + "writing discard_alignment"); > + goto kfree; > + } > + state = 1; > + vbd->type |= VDISK_PHY_BACKEND; > + } > + } > + } else { > + err = PTR_ERR(type); > + xenbus_dev_fatal(dev, err, "reading type"); > + goto out; > + } > + > + err = xenbus_printf(xbt, dev->nodename, "feature-discard", > + "%d", state); > + if (err) > + xenbus_dev_fatal(dev, err, "writing feature-discard"); > +kfree: > + kfree(type); > +out: > + return err; > +} > + > /** > * Entry point to this code when a new device is created. Allocate the > basic > * structures, and watch the store waiting for the hotplug scripts to tell > us > @@ -444,6 +497,10 @@ again: > if (err) > goto abort; > > + err = blkback_discard(xbt, be); > + if (err) > + goto abort; > + > err = xenbus_printf(xbt, dev->nodename, "sectors", "%llu", > vbd_size(&be->blkif->vbd)); > if (err) { > diff --git a/include/xen/interface/io/blkif.h > b/include/xen/interface/io/blkif.h > index e239060..aa483cc 100644 > --- a/include/xen/interface/io/blkif.h > +++ b/include/xen/interface/io/blkif.h > @@ -172,5 +172,7 @@ DEFINE_RING_TYPES(blkif, struct blkif_request, struct > blkif_response); > #define VDISK_CDROM 0x1 > #define VDISK_REMOVABLE 0x2 > #define VDISK_READONLY 0x4 > +#define VDISK_FILE_BACKEND 0x8 > +#define VDISK_PHY_BACKEND 0x10These two constants are being used only internally to the driver, so they shouldn''t become part of the interface (and you then also shouldn''t merge them into a variable that is used for interface defined flags). Jan> > #endif /* __XEN_PUBLIC_IO_BLKIF_H__ */_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Pasi Kärkkäinen
2011-Aug-10 11:40 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
On Tue, Aug 09, 2011 at 03:11:35PM +0800, Li Dongyang wrote:> now blkback driver can handle the trim request from guest, we will > forward the request to phy if it''s really a device has trim, or we''ll > punch a hole on the image file. >Hmm.. the 1/2 patch has ''discard'' in the subject, while this has ''trim''.. is that intentional? I guess ''discard'' is the generic name, while protocol specific names are TRIM for ATA, and SCSI UNMAP for SAS ? Also: Shouldn''t this be against upstream Linux 3.x these days, aswell, now when both blkback and blkfront are upstream? -- Pasi _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Ian Campbell
2011-Aug-10 13:45 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
On Wed, 2011-08-10 at 12:40 +0100, Pasi Kärkkäinen wrote:> Also: Shouldn''t this be against upstream Linux 3.x these days, aswell, > now when both blkback and blkfront are upstream?Yes, please. Ideally we would insist that patches to those classic-Xen trees which are still somewhat maintained be sent to upstream first where applicable (i.e. only accept "backports" or classic-Xen specific bug fixes). Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Aug-10 13:58 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
>>> On 10.08.11 at 15:45, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Wed, 2011-08-10 at 12:40 +0100, Pasi Kärkkäinen wrote: >> Also: Shouldn't this be against upstream Linux 3.x these days, aswell, >> now when both blkback and blkfront are upstream? > > Yes, please. > > Ideally we would insist that patches to those classic-Xen trees which > are still somewhat maintained be sent to upstream first where applicable > (i.e. only accept "backports" or classic-Xen specific bug fixes).Ideally yes. But that's not generally feasible, at least not always. For instance, I'm glad if I can keep on top of all the things needed for our kernels and hypervisors, and I would at best find time to compile test code for pv-ops. But with only that I certainly shouldn't really submit anything... Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Keir Fraser
2011-Aug-10 14:37 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
On 10/08/2011 14:58, "Jan Beulich" <JBeulich@novell.com> wrote:>>>> On 10.08.11 at 15:45, Ian Campbell <Ian.Campbell@citrix.com> wrote: >> On Wed, 2011-08-10 at 12:40 +0100, Pasi Kärkkäinen wrote: >>> Also: Shouldn''t this be against upstream Linux 3.x these days, aswell, >>> now when both blkback and blkfront are upstream? >> >> Yes, please. >> >> Ideally we would insist that patches to those classic-Xen trees which >> are still somewhat maintained be sent to upstream first where applicable >> (i.e. only accept "backports" or classic-Xen specific bug fixes). > > Ideally yes. But that''s not generally feasible, at least not always. For > instance, I''m glad if I can keep on top of all the things needed for our > kernels and hypervisors, and I would at best find time to compile test > code for pv-ops. But with only that I certainly shouldn''t really submit > anything...I suspect that by now you are the only direct consumers of 2.6.18-xen. Is there really any benefit to keeping the public tree now? Only you commit to it; I expect only you directly inherit from it (others might indirectly, I accept). I really don''t think we should be tempting anyone else to actually *use* it as is. Hence my conclusion we could just delete the damn thing. -- Keir> Jan > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Aug-10 14:49 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
>>> On 10.08.11 at 16:37, Keir Fraser <keir.xen@gmail.com> wrote: > On 10/08/2011 14:58, "Jan Beulich" <JBeulich@novell.com> wrote: > >>>>> On 10.08.11 at 15:45, Ian Campbell <Ian.Campbell@citrix.com> wrote: >>> On Wed, 2011-08-10 at 12:40 +0100, Pasi Kärkkäinen wrote: >>>> Also: Shouldn't this be against upstream Linux 3.x these days, aswell, >>>> now when both blkback and blkfront are upstream? >>> >>> Yes, please. >>> >>> Ideally we would insist that patches to those classic-Xen trees which >>> are still somewhat maintained be sent to upstream first where applicable >>> (i.e. only accept "backports" or classic-Xen specific bug fixes). >> >> Ideally yes. But that's not generally feasible, at least not always. For >> instance, I'm glad if I can keep on top of all the things needed for our >> kernels and hypervisors, and I would at best find time to compile test >> code for pv-ops. But with only that I certainly shouldn't really submit >> anything... > > I suspect that by now you are the only direct consumers of 2.6.18-xen. Is > there really any benefit to keeping the public tree now? Only you commit to > it; I expect only you directly inherit from it (others might indirectly, I > accept). I really don't think we should be tempting anyone else to actually > *use* it as is. Hence my conclusion we could just delete the damn thing.It's convenient to me, as this way I don't have to deal with a rapidly growing set of individual patches. But if the tree went dead (I wouldn't really want to see it deleted, so one can still use it for archaeological purposes), we would certainly survive. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
Jan Beulich
2011-Aug-10 14:50 UTC
Re: [Xen-devel] [PATCH 2/2] make blkback driver handle trim request
>>> On 10.08.11 at 16:37, Keir Fraser <keir.xen@gmail.com> wrote: > On 10/08/2011 14:58, "Jan Beulich" <JBeulich@novell.com> wrote: > >>>>> On 10.08.11 at 15:45, Ian Campbell <Ian.Campbell@citrix.com> wrote: >>> On Wed, 2011-08-10 at 12:40 +0100, Pasi Kärkkäinen wrote: >>>> Also: Shouldn't this be against upstream Linux 3.x these days, aswell, >>>> now when both blkback and blkfront are upstream? >>> >>> Yes, please. >>> >>> Ideally we would insist that patches to those classic-Xen trees which >>> are still somewhat maintained be sent to upstream first where applicable >>> (i.e. only accept "backports" or classic-Xen specific bug fixes). >> >> Ideally yes. But that's not generally feasible, at least not always. For >> instance, I'm glad if I can keep on top of all the things needed for our >> kernels and hypervisors, and I would at best find time to compile test >> code for pv-ops. But with only that I certainly shouldn't really submit >> anything... > > I suspect that by now you are the only direct consumers of 2.6.18-xen. Is > there really any benefit to keeping the public tree now? Only you commit to > it; I expect only you directly inherit from it (others might indirectly, I > accept). I really don't think we should be tempting anyone else to actually > *use* it as is. Hence my conclusion we could just delete the damn thing.Oh, and btw., in the recent history there are a couple of RedHat commits to the tree too, so for their older RHEL(s) they might still care a little. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
On Wed, Aug 10, 2011 at 03:49:25PM +0100, Jan Beulich wrote:> >>> On 10.08.11 at 16:37, Keir Fraser <keir.xen@gmail.com> wrote: > > On 10/08/2011 14:58, "Jan Beulich" <JBeulich@novell.com> wrote: > > > >>>>> On 10.08.11 at 15:45, Ian Campbell <Ian.Campbell@citrix.com> wrote: > >>> On Wed, 2011-08-10 at 12:40 +0100, Pasi Kärkkäinen wrote: > >>>> Also: Shouldn''t this be against upstream Linux 3.x these days, aswell, > >>>> now when both blkback and blkfront are upstream? > >>> > >>> Yes, please. > >>> > >>> Ideally we would insist that patches to those classic-Xen trees which > >>> are still somewhat maintained be sent to upstream first where applicable > >>> (i.e. only accept "backports" or classic-Xen specific bug fixes). > >> > >> Ideally yes. But that''s not generally feasible, at least not always. For > >> instance, I''m glad if I can keep on top of all the things needed for our > >> kernels and hypervisors, and I would at best find time to compile test > >> code for pv-ops. But with only that I certainly shouldn''t really submit > >> anything... > > > > I suspect that by now you are the only direct consumers of 2.6.18-xen. Is > > there really any benefit to keeping the public tree now? Only you commit to > > it; I expect only you directly inherit from it (others might indirectly, I > > accept). I really don''t think we should be tempting anyone else to actually > > *use* it as is. Hence my conclusion we could just delete the damn thing. > > It''s convenient to me, as this way I don''t have to deal with a rapidly > growing set of individual patches. But if the tree went dead (I wouldn''t > really want to see it deleted, so one can still use it for archaeological > purposes), we would certainly survive. >How about retiring the linux-2.6.18-xen.hg and create a new linux 2.6.32 based (git?) tree for the SLES11 and XCP/XenServer xenlinux based classic kernels? Those two kernel trees are probably still maintained for quite some time, so wouldn''t that be the easiest way? (until everyone is using pvops dom0) Does that make any sense? -- Pasi _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel