Stefano Stabellini
2012-Apr-16 17:05 UTC
[PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- hw/xen_disk.c | 8 ++++---- 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/hw/xen_disk.c b/hw/xen_disk.c index 015d2af..3e4a47b 100644 --- a/hw/xen_disk.c +++ b/hw/xen_disk.c @@ -183,12 +183,12 @@ static int ioreq_parse(struct ioreq *ioreq) case BLKIF_OP_READ: ioreq->prot = PROT_WRITE; /* to memory */ break; - case BLKIF_OP_WRITE_BARRIER: + case BLKIF_OP_FLUSH_DISKCACHE: if (!ioreq->req.nr_segments) { ioreq->presync = 1; return 0; } - ioreq->presync = ioreq->postsync = 1; + ioreq->postsync = 1; /* fall through */ case BLKIF_OP_WRITE: ioreq->prot = PROT_READ; /* from memory */ @@ -354,7 +354,7 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) qemu_aio_complete, ioreq); break; case BLKIF_OP_WRITE: - case BLKIF_OP_WRITE_BARRIER: + case BLKIF_OP_FLUSH_DISKCACHE: if (!ioreq->req.nr_segments) { break; } @@ -627,7 +627,7 @@ static int blk_init(struct XenDevice *xendev) blkdev->file_size, blkdev->file_size >> 20); /* fill info */ - xenstore_write_be_int(&blkdev->xendev, "feature-barrier", 1); + xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1); xenstore_write_be_int(&blkdev->xendev, "info", info); xenstore_write_be_int(&blkdev->xendev, "sector-size", blkdev->file_blk); xenstore_write_be_int(&blkdev->xendev, "sectors", -- 1.7.2.5
Konrad Rzeszutek Wilk
2012-Apr-16 17:10 UTC
Re: [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Mon, Apr 16, 2012 at 06:05:57PM +0100, Stefano Stabellini wrote:> Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>> --- > hw/xen_disk.c | 8 ++++---- > 1 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/hw/xen_disk.c b/hw/xen_disk.c > index 015d2af..3e4a47b 100644 > --- a/hw/xen_disk.c > +++ b/hw/xen_disk.c > @@ -183,12 +183,12 @@ static int ioreq_parse(struct ioreq *ioreq) > case BLKIF_OP_READ: > ioreq->prot = PROT_WRITE; /* to memory */ > break; > - case BLKIF_OP_WRITE_BARRIER: > + case BLKIF_OP_FLUSH_DISKCACHE: > if (!ioreq->req.nr_segments) { > ioreq->presync = 1; > return 0; > } > - ioreq->presync = ioreq->postsync = 1; > + ioreq->postsync = 1; > /* fall through */ > case BLKIF_OP_WRITE: > ioreq->prot = PROT_READ; /* from memory */ > @@ -354,7 +354,7 @@ static int ioreq_runio_qemu_aio(struct ioreq *ioreq) > qemu_aio_complete, ioreq); > break; > case BLKIF_OP_WRITE: > - case BLKIF_OP_WRITE_BARRIER: > + case BLKIF_OP_FLUSH_DISKCACHE: > if (!ioreq->req.nr_segments) { > break; > } > @@ -627,7 +627,7 @@ static int blk_init(struct XenDevice *xendev) > blkdev->file_size, blkdev->file_size >> 20); > > /* fill info */ > - xenstore_write_be_int(&blkdev->xendev, "feature-barrier", 1); > + xenstore_write_be_int(&blkdev->xendev, "feature-flush-cache", 1); > xenstore_write_be_int(&blkdev->xendev, "info", info); > xenstore_write_be_int(&blkdev->xendev, "sector-size", blkdev->file_blk); > xenstore_write_be_int(&blkdev->xendev, "sectors", > -- > 1.7.2.5
Christoph Hellwig
2012-Apr-25 08:45 UTC
Re: [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
> - case BLKIF_OP_WRITE_BARRIER: > + case BLKIF_OP_FLUSH_DISKCACHE: > if (!ioreq->req.nr_segments) { > ioreq->presync = 1; > return 0; > } > - ioreq->presync = ioreq->postsync = 1; > + ioreq->postsync = 1; > /* fall through */It might be worth documenting the semantics of BLKIF_OP_FLUSH_DISKCACHE in a comment here. I haven''t found any spec for the xen_disk protocol, but from looking at the Linux frontend it seems like the semantics of REQ_FLUSH and REQ_FUA in the Linux block driver are overloaded into BLKIF_OP_FLUSH_DISKCACHE, which is fairly confusing given that REQ_FLUSH already overload functionality. Even worse REQ_FLUSH with a payload implies a preflush, while REQ_FUA implies a post flush, and it seems like Xen has no way to distinguish the two, making thing like log writes very inefficient. Independent of that the implementation should really use a state machine around bdrv_aio_flush instead of doing guest-sychronous bdrv_flush calls.
Ian Campbell
2012-Apr-25 09:02 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, 2012-04-25 at 09:45 +0100, Christoph Hellwig wrote:> > - case BLKIF_OP_WRITE_BARRIER: > > + case BLKIF_OP_FLUSH_DISKCACHE: > > if (!ioreq->req.nr_segments) { > > ioreq->presync = 1; > > return 0; > > } > > - ioreq->presync = ioreq->postsync = 1; > > + ioreq->postsync = 1; > > /* fall through */ > > It might be worth documenting the semantics of BLKIF_OP_FLUSH_DISKCACHE > in a comment here. I haven''t found any spec for the xen_disk protocol,The blkif spec was recently much improved, you can find it at http://xenbits.xen.org/docs/unstable/hypercall/include,public,io,blkif.h.html TBH I''m not sure it actually answers your questions wrt BLKIF_OP_FLUSH_DISKCACHE, if not please let us know and we can see about improving it. Ian
Christoph Hellwig
2012-Apr-25 10:20 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, Apr 25, 2012 at 10:02:45AM +0100, Ian Campbell wrote:> The blkif spec was recently much improved, you can find it at > http://xenbits.xen.org/docs/unstable/hypercall/include,public,io,blkif.h.html > > TBH I''m not sure it actually answers your questions wrt > BLKIF_OP_FLUSH_DISKCACHE, if not please let us know and we can see about > improving it.That description in there is overly simple, and does not match any of the implementations known to me on either end. Talking about those: the mainline Linux blkback backend also implements different semantics from what mainline Linux blkfront seems to expect, as well as different from qemu. Looking at these three alone I can''t see how Xen ever managed to get data to disk reliably if using the paravirt interface. with the implementations in qemu and the Linux kernel frontend and backends, which> > Ian >---end quoted text---
Ian Campbell
2012-Apr-25 11:17 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, 2012-04-25 at 12:21 +0100, Stefano Stabellini wrote:> On Wed, 25 Apr 2012, Christoph Hellwig wrote: > > On Wed, Apr 25, 2012 at 10:02:45AM +0100, Ian Campbell wrote: > > > The blkif spec was recently much improved, you can find it at > > > http://xenbits.xen.org/docs/unstable/hypercall/include,public,io,blkif.h.html > > > > > > TBH I''m not sure it actually answers your questions wrt > > > BLKIF_OP_FLUSH_DISKCACHE, if not please let us know and we can see about > > > improving it. > > > > That description in there is overly simple, and does not match any of the > > implementations known to me on either end. > > That is true, in fact I couldn''t figure out what I had to implement just > reading the comment. So I went through the blkback code and tried to > understand what I had to do, but I got it wrong. > > Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE > is supposed to have the same semantics as REQ_FLUSH, that implies a > preflush if nr_segments > 0, not a postflush like I did. > > Konrad, can you please confirm this?... and then provide a patch to blkif.h. Thanks, Ian.
Stefano Stabellini
2012-Apr-25 11:21 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, 25 Apr 2012, Christoph Hellwig wrote:> On Wed, Apr 25, 2012 at 10:02:45AM +0100, Ian Campbell wrote: > > The blkif spec was recently much improved, you can find it at > > http://xenbits.xen.org/docs/unstable/hypercall/include,public,io,blkif.h.html > > > > TBH I''m not sure it actually answers your questions wrt > > BLKIF_OP_FLUSH_DISKCACHE, if not please let us know and we can see about > > improving it. > > That description in there is overly simple, and does not match any of the > implementations known to me on either end.That is true, in fact I couldn''t figure out what I had to implement just reading the comment. So I went through the blkback code and tried to understand what I had to do, but I got it wrong. Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE is supposed to have the same semantics as REQ_FLUSH, that implies a preflush if nr_segments > 0, not a postflush like I did. Konrad, can you please confirm this?
Christoph Hellwig
2012-Apr-25 11:23 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, Apr 25, 2012 at 12:21:53PM +0100, Stefano Stabellini wrote:> That is true, in fact I couldn''t figure out what I had to implement just > reading the comment. So I went through the blkback code and tried to > understand what I had to do, but I got it wrong. > > Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE > is supposed to have the same semantics as REQ_FLUSH, that implies a > preflush if nr_segments > 0, not a postflush like I did.It''s worse - blkfront translates both a REQ_FLUSH or a REQ_FUA into BLKIF_OP_FLUSH_DISKCACHE. REQ_FLUSH either is a pre flush or a pure flush without a data transfer, and REQ_FUA is a post flush. So to get the proper semantics you''ll have to do both, _and_ sequence it so that no operation starts before the previous one finished.
Konrad Rzeszutek Wilk
2012-Apr-26 15:41 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, Apr 25, 2012 at 01:23:35PM +0200, Christoph Hellwig wrote:> On Wed, Apr 25, 2012 at 12:21:53PM +0100, Stefano Stabellini wrote: > > That is true, in fact I couldn''t figure out what I had to implement just > > reading the comment. So I went through the blkback code and tried to > > understand what I had to do, but I got it wrong. > > > > Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE > > is supposed to have the same semantics as REQ_FLUSH, that implies a > > preflush if nr_segments > 0, not a postflush like I did. > > It''s worse - blkfront translates both a REQ_FLUSH or a REQ_FUA > into BLKIF_OP_FLUSH_DISKCACHE.I think that is what remained of the BARRIER request.> > REQ_FLUSH either is a pre flush or a pure flush without a data transfer, > and REQ_FUA is a post flush. So to get the proper semantics you''ll have > to do both, _and_ sequence it so that no operation starts before the > previous one finished.If I were to emulate the SCSI SYNC command which one would it be? I think REQ_FLUSH? In which I would think that the blkfront needs to get rid of the REQ_FUA part?
Stefano Stabellini
2012-May-09 12:42 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Thu, 26 Apr 2012, Konrad Rzeszutek Wilk wrote:> On Wed, Apr 25, 2012 at 01:23:35PM +0200, Christoph Hellwig wrote: > > On Wed, Apr 25, 2012 at 12:21:53PM +0100, Stefano Stabellini wrote: > > > That is true, in fact I couldn''t figure out what I had to implement just > > > reading the comment. So I went through the blkback code and tried to > > > understand what I had to do, but I got it wrong. > > > > > > Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE > > > is supposed to have the same semantics as REQ_FLUSH, that implies a > > > preflush if nr_segments > 0, not a postflush like I did. > > > > It''s worse - blkfront translates both a REQ_FLUSH or a REQ_FUA > > into BLKIF_OP_FLUSH_DISKCACHE. > > I think that is what remained of the BARRIER request. > > > > REQ_FLUSH either is a pre flush or a pure flush without a data transfer, > > and REQ_FUA is a post flush. So to get the proper semantics you''ll have > > to do both, _and_ sequence it so that no operation starts before the > > previous one finished. > > If I were to emulate the SCSI SYNC command which one would it be? > > I think REQ_FLUSH? In which I would think that the blkfront needs to > get rid of the REQ_FUA part? >ping?
Konrad Rzeszutek Wilk
2012-Dec-19 18:46 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, May 09, 2012 at 01:42:41PM +0100, Stefano Stabellini wrote:> On Thu, 26 Apr 2012, Konrad Rzeszutek Wilk wrote: > > On Wed, Apr 25, 2012 at 01:23:35PM +0200, Christoph Hellwig wrote: > > > On Wed, Apr 25, 2012 at 12:21:53PM +0100, Stefano Stabellini wrote: > > > > That is true, in fact I couldn''t figure out what I had to implement just > > > > reading the comment. So I went through the blkback code and tried to > > > > understand what I had to do, but I got it wrong. > > > > > > > > Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE > > > > is supposed to have the same semantics as REQ_FLUSH, that implies a > > > > preflush if nr_segments > 0, not a postflush like I did. > > > > > > It''s worse - blkfront translates both a REQ_FLUSH or a REQ_FUA > > > into BLKIF_OP_FLUSH_DISKCACHE. > > > > I think that is what remained of the BARRIER request. > > > > > > REQ_FLUSH either is a pre flush or a pure flush without a data transfer, > > > and REQ_FUA is a post flush. So to get the proper semantics you''ll have > > > to do both, _and_ sequence it so that no operation starts before the > > > previous one finished. > > > > If I were to emulate the SCSI SYNC command which one would it be? > > > > I think REQ_FLUSH? In which I would think that the blkfront needs to > > get rid of the REQ_FUA part? > > > > ping?And just shy of 7 months later I answer :-) I think you are right. Getting rid of REQ_FUA looks like the right way. Oh, and blkfront already does that! 1290 err = xenbus_gather(XBT_NIL, info->xbdev->otherend, 1291 "feature-flush-cache", "%d", &flush, 1292 NULL); 1293 1294 if (!err && flush) { 1295 info->feature_flush = REQ_FLUSH; 1296 info->flush_op = BLKIF_OP_FLUSH_DISKCACHE; 1297 } 1298 So what I am missing?> > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Stefano Stabellini
2013-Jan-09 18:44 UTC
Re: [Xen-devel] [PATCH] xen_disk: implement BLKIF_OP_FLUSH_DISKCACHE, remove BLKIF_OP_WRITE_BARRIER
On Wed, 19 Dec 2012, Konrad Rzeszutek Wilk wrote:> On Wed, May 09, 2012 at 01:42:41PM +0100, Stefano Stabellini wrote: > > On Thu, 26 Apr 2012, Konrad Rzeszutek Wilk wrote: > > > On Wed, Apr 25, 2012 at 01:23:35PM +0200, Christoph Hellwig wrote: > > > > On Wed, Apr 25, 2012 at 12:21:53PM +0100, Stefano Stabellini wrote: > > > > > That is true, in fact I couldn''t figure out what I had to implement just > > > > > reading the comment. So I went through the blkback code and tried to > > > > > understand what I had to do, but I got it wrong. > > > > > > > > > > Reading the code again it seems to me that BLKIF_OP_FLUSH_DISKCACHE > > > > > is supposed to have the same semantics as REQ_FLUSH, that implies a > > > > > preflush if nr_segments > 0, not a postflush like I did. > > > > > > > > It''s worse - blkfront translates both a REQ_FLUSH or a REQ_FUA > > > > into BLKIF_OP_FLUSH_DISKCACHE. > > > > > > I think that is what remained of the BARRIER request. > > > > > > > > REQ_FLUSH either is a pre flush or a pure flush without a data transfer, > > > > and REQ_FUA is a post flush. So to get the proper semantics you''ll have > > > > to do both, _and_ sequence it so that no operation starts before the > > > > previous one finished. > > > > > > If I were to emulate the SCSI SYNC command which one would it be? > > > > > > I think REQ_FLUSH? In which I would think that the blkfront needs to > > > get rid of the REQ_FUA part? > > > > > > > ping? > > And just shy of 7 months later I answer :-) > > I think you are right. Getting rid of REQ_FUA looks like the > right way. Oh, and blkfront already does that! > > 1290 err = xenbus_gather(XBT_NIL, info->xbdev->otherend, > 1291 "feature-flush-cache", "%d", &flush, > 1292 NULL); > 1293 > 1294 if (!err && flush) { > 1295 info->feature_flush = REQ_FLUSH; > 1296 info->flush_op = BLKIF_OP_FLUSH_DISKCACHE; > 1297 } > 1298 > > So what I am missing?Nothing, thanks. I have updated and resent the patch, fixing the implementation of BLKIF_OP_FLUSH_DISKCACHE.