Roger Pau Monne
2013-Dec-03 10:57 UTC
[PATCH RFC] xen-block: correctly define structures in public headers
Using __packed__ on the public interface is not correct, this structures should be compiled using the native ABI, and __packed__ should only be used in the backend counterpart of those structures (which needs to handle different ABIs). This was even worse in the ARM case, where the Linux kernel was incorrectly using the X86_32 protocol ABI. This patch fixes it, but also breaks compatibility, so an ARM DomU kernel compiled with this patch will fail to communicate with PV disk devices unless the Dom0 also has this patch. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> Reported-by: Julien Grall <julien.grall@linaro.org> Cc: Julien Grall <julien.grall@linaro.org> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: David Vrabel <david.vrabel@citrix.com> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> --- include/xen/interface/io/blkif.h | 28 +++++++--------------------- 1 files changed, 7 insertions(+), 21 deletions(-) diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h index 65e1209..002ea22 100644 --- a/include/xen/interface/io/blkif.h +++ b/include/xen/interface/io/blkif.h @@ -141,14 +141,11 @@ struct blkif_request_segment_aligned { /* @last_sect: last sector in frame to transfer (inclusive). */ uint8_t first_sect, last_sect; uint16_t _pad; /* padding to make it 8 bytes, so it's cache-aligned */ -} __attribute__((__packed__)); +}; struct blkif_request_rw { uint8_t nr_segments; /* number of segments */ blkif_vdev_t handle; /* only for read/write requests */ -#ifdef CONFIG_X86_64 - uint32_t _pad1; /* offsetof(blkif_request,u.rw.id) == 8 */ -#endif uint64_t id; /* private guest value, echoed in resp */ blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ struct blkif_request_segment { @@ -157,47 +154,36 @@ struct blkif_request_rw { /* @last_sect: last sector in frame to transfer (inclusive). */ uint8_t first_sect, last_sect; } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; -} __attribute__((__packed__)); +}; struct blkif_request_discard { uint8_t flag; /* BLKIF_DISCARD_SECURE or zero. */ #define BLKIF_DISCARD_SECURE (1<<0) /* ignored if discard-secure=0 */ blkif_vdev_t _pad1; /* only for read/write requests */ -#ifdef CONFIG_X86_64 - uint32_t _pad2; /* offsetof(blkif_req..,u.discard.id)==8*/ -#endif uint64_t id; /* private guest value, echoed in resp */ blkif_sector_t sector_number; uint64_t nr_sectors; uint8_t _pad3; -} __attribute__((__packed__)); +}; struct blkif_request_other { uint8_t _pad1; blkif_vdev_t _pad2; /* only for read/write requests */ -#ifdef CONFIG_X86_64 - uint32_t _pad3; /* offsetof(blkif_req..,u.other.id)==8*/ -#endif uint64_t id; /* private guest value, echoed in resp */ -} __attribute__((__packed__)); +}; struct blkif_request_indirect { uint8_t indirect_op; uint16_t nr_segments; -#ifdef CONFIG_X86_64 - uint32_t _pad1; /* offsetof(blkif_...,u.indirect.id) == 8 */ -#endif uint64_t id; blkif_sector_t sector_number; blkif_vdev_t handle; uint16_t _pad2; grant_ref_t indirect_grefs[BLKIF_MAX_INDIRECT_PAGES_PER_REQUEST]; -#ifdef CONFIG_X86_64 - uint32_t _pad3; /* make it 64 byte aligned */ -#else +#ifdef CONFIG_X86_32 uint64_t _pad3; /* make it 64 byte aligned */ #endif -} __attribute__((__packed__)); +}; struct blkif_request { uint8_t operation; /* BLKIF_OP_??? */ @@ -207,7 +193,7 @@ struct blkif_request { struct blkif_request_other other; struct blkif_request_indirect indirect; } u; -} __attribute__((__packed__)); +}; struct blkif_response { uint64_t id; /* copied from request */ -- 1.7.7.5 (Apple Git-26) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
David Vrabel
2013-Dec-03 11:01 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 03/12/13 10:57, Roger Pau Monne wrote:> Using __packed__ on the public interface is not correct, this > structures should be compiled using the native ABI, and __packed__ > should only be used in the backend counterpart of those structures > (which needs to handle different ABIs). > > This was even worse in the ARM case, where the Linux kernel was > incorrectly using the X86_32 protocol ABI. This patch fixes it, but > also breaks compatibility, so an ARM DomU kernel compiled with > this patch will fail to communicate with PV disk devices unless the > Dom0 also has this patch.This ABI change needs to be justified. Why do you think it is acceptable to break existing Linux guests? Because I don''t think it is. David
Ian Campbell
2013-Dec-03 11:05 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 11:57 +0100, Roger Pau Monne wrote:> Using __packed__ on the public interface is not correct, this > structures should be compiled using the native ABI, and __packed__ > should only be used in the backend counterpart of those structures > (which needs to handle different ABIs). > > This was even worse in the ARM case, where the Linux kernel was > incorrectly using the X86_32 protocol ABI. This patch fixes it, but > also breaks compatibility, so an ARM DomU kernel compiled with > this patch will fail to communicate with PV disk devices unless the > Dom0 also has this patch.This is acceptable IMHO, the ARM ABI is clearly defined and previous kernels were simply buggy. The fact that front and backend were equivalently buggy and so it happened to work is not an excuse.> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> > Reported-by: Julien Grall <julien.grall@linaro.org> > Cc: Julien Grall <julien.grall@linaro.org> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> > Cc: David Vrabel <david.vrabel@citrix.com> > Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> > Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> > --- > include/xen/interface/io/blkif.h | 28 +++++++--------------------- > 1 files changed, 7 insertions(+), 21 deletions(-) > > diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h > index 65e1209..002ea22 100644 > --- a/include/xen/interface/io/blkif.h > +++ b/include/xen/interface/io/blkif.h > @@ -141,14 +141,11 @@ struct blkif_request_segment_aligned { > /* @last_sect: last sector in frame to transfer (inclusive). */ > uint8_t first_sect, last_sect; > uint16_t _pad; /* padding to make it 8 bytes, so it's cache-aligned */ > -} __attribute__((__packed__)); > +}; > > struct blkif_request_rw { > uint8_t nr_segments; /* number of segments */ > blkif_vdev_t handle; /* only for read/write requests */ > -#ifdef CONFIG_X86_64 > - uint32_t _pad1; /* offsetof(blkif_request,u.rw.id) == 8 */ > -#endifThese padding fields would still serve a purpose even after removing the packing, which is to document/clarify where there are holes for various architectures. They could either be retained or perhaps replaced by a comment? Ian. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Paul Durrant
2013-Dec-03 11:07 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
> -----Original Message----- > From: xen-devel-bounces@lists.xen.org [mailto:xen-devel- > bounces@lists.xen.org] On Behalf Of David Vrabel > Sent: 03 December 2013 11:01 > To: Roger Pau Monne > Cc: Stefano Stabellini; Julien Grall; linux-kernel@vger.kernel.org; xen- > devel@lists.xenproject.org; Boris Ostrovsky > Subject: Re: [Xen-devel] [PATCH RFC] xen-block: correctly define structures > in public headers > > On 03/12/13 10:57, Roger Pau Monne wrote: > > Using __packed__ on the public interface is not correct, this > > structures should be compiled using the native ABI, and __packed__ > > should only be used in the backend counterpart of those structures > > (which needs to handle different ABIs). > > > > This was even worse in the ARM case, where the Linux kernel was > > incorrectly using the X86_32 protocol ABI. This patch fixes it, but > > also breaks compatibility, so an ARM DomU kernel compiled with > > this patch will fail to communicate with PV disk devices unless the > > Dom0 also has this patch. > > This ABI change needs to be justified. Why do you think it is > acceptable to break existing Linux guests? Because I don''t think it is. >Even if they are lying about their ABI? Paul> David > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
Ian Campbell
2013-Dec-03 11:08 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 11:01 +0000, David Vrabel wrote:> On 03/12/13 10:57, Roger Pau Monne wrote: > > Using __packed__ on the public interface is not correct, this > > structures should be compiled using the native ABI, and __packed__ > > should only be used in the backend counterpart of those structures > > (which needs to handle different ABIs). > > > > This was even worse in the ARM case, where the Linux kernel was > > incorrectly using the X86_32 protocol ABI. This patch fixes it, but > > also breaks compatibility, so an ARM DomU kernel compiled with > > this patch will fail to communicate with PV disk devices unless the > > Dom0 also has this patch. > > This ABI change needs to be justified. Why do you think it is > acceptable to break existing Linux guests? Because I don''t think it is.As I explained in my reply those guests are buggy. Ian.
Roger Pau Monné
2013-Dec-03 11:09 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 03/12/13 12:01, David Vrabel wrote:> On 03/12/13 10:57, Roger Pau Monne wrote: >> Using __packed__ on the public interface is not correct, this >> structures should be compiled using the native ABI, and __packed__ >> should only be used in the backend counterpart of those structures >> (which needs to handle different ABIs). >> >> This was even worse in the ARM case, where the Linux kernel was >> incorrectly using the X86_32 protocol ABI. This patch fixes it, but >> also breaks compatibility, so an ARM DomU kernel compiled with >> this patch will fail to communicate with PV disk devices unless the >> Dom0 also has this patch. > > This ABI change needs to be justified. Why do you think it is > acceptable to break existing Linux guests? Because I don''t think it is.ARM Linux guests are not using the ABI found in Xen public headers, they are using the X86_32 ABI, because of the __packed__ and #ifdef CONFIG_X86_64 (which made ARM kernels use the X86_32 ABI). If someone picks a pristine version of the structures found in blkif.h from upstream Xen and compile them on ARM they will see that it differs from what the Linux kernel currently generates, which is not acceptable. IMHO keeping it like this is a big mistake, this prevents compatibility with ARM guests that implement the correct ABI.
Jan Beulich
2013-Dec-03 11:11 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
>>> On 03.12.13 at 12:05, Ian Campbell <Ian.Campbell@citrix.com> wrote: > On Tue, 2013-12-03 at 11:57 +0100, Roger Pau Monne wrote: >> Using __packed__ on the public interface is not correct, this >> structures should be compiled using the native ABI, and __packed__ >> should only be used in the backend counterpart of those structures >> (which needs to handle different ABIs). >> >> This was even worse in the ARM case, where the Linux kernel was >> incorrectly using the X86_32 protocol ABI. This patch fixes it, but >> also breaks compatibility, so an ARM DomU kernel compiled with >> this patch will fail to communicate with PV disk devices unless the >> Dom0 also has this patch. > > This is acceptable IMHO, the ARM ABI is clearly defined and previous > kernels were simply buggy. The fact that front and backend were > equivalently buggy and so it happened to work is not an excuse.But afaics the change is not just to the ARM form of the ABI, but to x86 (32- and 64-bit) too. And that clearly must not be altered. Jan
Roger Pau Monné
2013-Dec-03 11:14 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 03/12/13 12:05, Ian Campbell wrote:> On Tue, 2013-12-03 at 11:57 +0100, Roger Pau Monne wrote: >> Using __packed__ on the public interface is not correct, this >> structures should be compiled using the native ABI, and __packed__ >> should only be used in the backend counterpart of those structures >> (which needs to handle different ABIs). >> >> This was even worse in the ARM case, where the Linux kernel was >> incorrectly using the X86_32 protocol ABI. This patch fixes it, but >> also breaks compatibility, so an ARM DomU kernel compiled with >> this patch will fail to communicate with PV disk devices unless the >> Dom0 also has this patch. > > This is acceptable IMHO, the ARM ABI is clearly defined and previous > kernels were simply buggy. The fact that front and backend were > equivalently buggy and so it happened to work is not an excuse. > >> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> >> Reported-by: Julien Grall <julien.grall@linaro.org> >> Cc: Julien Grall <julien.grall@linaro.org> >> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> >> Cc: David Vrabel <david.vrabel@citrix.com> >> Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com> >> Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com> >> --- >> include/xen/interface/io/blkif.h | 28 +++++++--------------------- >> 1 files changed, 7 insertions(+), 21 deletions(-) >> >> diff --git a/include/xen/interface/io/blkif.h b/include/xen/interface/io/blkif.h >> index 65e1209..002ea22 100644 >> --- a/include/xen/interface/io/blkif.h >> +++ b/include/xen/interface/io/blkif.h >> @@ -141,14 +141,11 @@ struct blkif_request_segment_aligned { >> /* @last_sect: last sector in frame to transfer (inclusive). */ >> uint8_t first_sect, last_sect; >> uint16_t _pad; /* padding to make it 8 bytes, so it's cache-aligned */ >> -} __attribute__((__packed__)); >> +}; >> >> struct blkif_request_rw { >> uint8_t nr_segments; /* number of segments */ >> blkif_vdev_t handle; /* only for read/write requests */ >> -#ifdef CONFIG_X86_64 >> - uint32_t _pad1; /* offsetof(blkif_request,u.rw.id) == 8 */ >> -#endif > > These padding fields would still serve a purpose even after removing the > packing, which is to document/clarify where there are holes for various > architectures. They could either be retained or perhaps replaced by a > comment?Those paddings are already present in drivers/block/xen-blkback/common.h for each of the different ABIs, which I think is enough, but if I had to, I would rather replace them with comments. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Jan Beulich
2013-Dec-03 11:14 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
>>> On 03.12.13 at 11:57, Roger Pau Monne <roger.pau@citrix.com> wrote: > struct blkif_request_rw { > uint8_t nr_segments; /* number of segments */ > blkif_vdev_t handle; /* only for read/write requests */ > -#ifdef CONFIG_X86_64 > - uint32_t _pad1; /* offsetof(blkif_request,u.rw.id) == 8 */ > -#endif > uint64_t id; /* private guest value, echoed in resp */ > blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ > struct blkif_request_segment { > @@ -157,47 +154,36 @@ struct blkif_request_rw { > /* @last_sect: last sector in frame to transfer (inclusive). */ > uint8_t first_sect, last_sect; > } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; > -} __attribute__((__packed__)); > +};Removing the packed attribute here and below is not possible as long as the defined structures get used in struct blkif_request as the second field after a uint8_t one, and as long as the individual fields here aren''t natively aligned. Jan
Ian Campbell
2013-Dec-03 11:15 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 11:11 +0000, Jan Beulich wrote:> >>> On 03.12.13 at 12:05, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Tue, 2013-12-03 at 11:57 +0100, Roger Pau Monne wrote: > >> Using __packed__ on the public interface is not correct, this > >> structures should be compiled using the native ABI, and __packed__ > >> should only be used in the backend counterpart of those structures > >> (which needs to handle different ABIs). > >> > >> This was even worse in the ARM case, where the Linux kernel was > >> incorrectly using the X86_32 protocol ABI. This patch fixes it, but > >> also breaks compatibility, so an ARM DomU kernel compiled with > >> this patch will fail to communicate with PV disk devices unless the > >> Dom0 also has this patch. > > > > This is acceptable IMHO, the ARM ABI is clearly defined and previous > > kernels were simply buggy. The fact that front and backend were > > equivalently buggy and so it happened to work is not an excuse. > > But afaics the change is not just to the ARM form of the ABI, > but to x86 (32- and 64-bit) too. And that clearly must not be > altered.I understood there to be no (intentional) change to the x86 ABIs here. Obviously that would be a bug in the patch. Ian.
Roger Pau Monné
2013-Dec-03 11:18 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 03/12/13 12:14, Jan Beulich wrote:>>>> On 03.12.13 at 11:57, Roger Pau Monne <roger.pau@citrix.com> wrote: >> struct blkif_request_rw { >> uint8_t nr_segments; /* number of segments */ >> blkif_vdev_t handle; /* only for read/write requests */ >> -#ifdef CONFIG_X86_64 >> - uint32_t _pad1; /* offsetof(blkif_request,u.rw.id) == 8 */ >> -#endif >> uint64_t id; /* private guest value, echoed in resp */ >> blkif_sector_t sector_number;/* start sector idx on disk (r/w only) */ >> struct blkif_request_segment { >> @@ -157,47 +154,36 @@ struct blkif_request_rw { >> /* @last_sect: last sector in frame to transfer (inclusive). */ >> uint8_t first_sect, last_sect; >> } seg[BLKIF_MAX_SEGMENTS_PER_REQUEST]; >> -} __attribute__((__packed__)); >> +}; > > Removing the packed attribute here and below is not possible > as long as the defined structures get used in struct blkif_request > as the second field after a uint8_t one, and as long as the > individual fields here aren''t natively aligned.Sorry, the patch is incorrect, let me resend that.
David Vrabel
2013-Dec-03 11:59 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 03/12/13 11:08, Ian Campbell wrote:> On Tue, 2013-12-03 at 11:01 +0000, David Vrabel wrote: >> On 03/12/13 10:57, Roger Pau Monne wrote: >>> Using __packed__ on the public interface is not correct, this >>> structures should be compiled using the native ABI, and __packed__ >>> should only be used in the backend counterpart of those structures >>> (which needs to handle different ABIs). >>> >>> This was even worse in the ARM case, where the Linux kernel was >>> incorrectly using the X86_32 protocol ABI. This patch fixes it, but >>> also breaks compatibility, so an ARM DomU kernel compiled with >>> this patch will fail to communicate with PV disk devices unless the >>> Dom0 also has this patch. >> >> This ABI change needs to be justified. Why do you think it is >> acceptable to break existing Linux guests? Because I don''t think it is. > > As I explained in my reply those guests are buggy.The kernel has a strong policy on not changing ABIs, even to fix bugs. I don''t think a bug fix alone is sufficient justification for ABI breakage. I think this change will cause real problems. e.g., if someone tries to bisect a different guest problem across this change. The commit message doesn''t really give enough details on the problem so please correct me if I''m misunderstanding. 1. The ARM ABI for blkif was specified as uniform between 32-bit and 64-bit and is equivalent to the x86_64 ABI. 2. ARM 32-bit back and frontend implementation did /not/ use this defined ABI, but instead used the x86_32 ABI. What did 32-bit ARM frontends report as their ABI? x86_32? or native? 3. ARM 64-bit back and frontend implementation did use the specified ABI, but the backend is not compatible with 32-bit ARM guests. What did 64-bit ARM frontends report as their ABI? x86_64? or native? 4. Support for 64-bit ARM guests is not upstream in Linux yet (so I don''t mind if 64-bit guests are broken). I think this should be resolved in a backward compatible way. 1. Introduce a new blkif ABI that is uniform across all architectures and 32-/64-bit. i.e., everything naturally aligned with explicit padding fields as necessary. 2. Backend exports a ''feature-abi-v2'' xenstore key if it supports this new ABI. 3. Frontends uses the ABI and reports it, iff feature-abi-v2 is present. Otherwise it must use the existing ABI. ARM 64-bit guests can require this v2 ABI. 4. Backend may need an #if ARM assume x86_32 ABI if abi-v2 is not reported by the frontend. David
Ian Campbell
2013-Dec-03 13:41 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 11:59 +0000, David Vrabel wrote:> On 03/12/13 11:08, Ian Campbell wrote: > > On Tue, 2013-12-03 at 11:01 +0000, David Vrabel wrote: > >> On 03/12/13 10:57, Roger Pau Monne wrote: > >>> Using __packed__ on the public interface is not correct, this > >>> structures should be compiled using the native ABI, and __packed__ > >>> should only be used in the backend counterpart of those structures > >>> (which needs to handle different ABIs). > >>> > >>> This was even worse in the ARM case, where the Linux kernel was > >>> incorrectly using the X86_32 protocol ABI. This patch fixes it, but > >>> also breaks compatibility, so an ARM DomU kernel compiled with > >>> this patch will fail to communicate with PV disk devices unless the > >>> Dom0 also has this patch. > >> > >> This ABI change needs to be justified. Why do you think it is > >> acceptable to break existing Linux guests? Because I don''t think it is. > > > > As I explained in my reply those guests are buggy. > > The kernel has a strong policy on not changing ABIs, even to fix bugs. > I don''t think a bug fix alone is sufficient justification for ABI breakage.This is not the kernels ABI to fix in stone. The ABI is defined by the upstream Xen project and Linux has failed to follow the specified ABI correctly.> 1. The ARM ABI for blkif was specified as uniform between 32-bit and > 64-bit and is equivalent to the x86_64 ABI.The ARM ABI is specified here: http://xenbits.xen.org/docs/unstable/hypercall/arm/include,public,arch-arm.h.html#incontents_arm_abi> 2. ARM 32-bit back and frontend implementation did /not/ use this > defined ABI, but instead used the x86_32 ABI. What did 32-bit ARM > frontends report as their ABI? x86_32? or native?They reported themselves as ARM (XEN_IO_PROTO_ABI_ARM == "arm") but they unintentionally implemented something else which was akin to the x86_32 API. They didn''t actually "implement the x86_32 API", it''s perhaps confusing to say that they did.> 3. ARM 64-bit back and frontend implementation did use the specified > ABI, but the backend is not compatible with 32-bit ARM guests. What did > 64-bit ARM frontends report as their ABI? x86_64? or native?They report themselves as ARM (XEN_IO_PROTO_ABI_ARM == "arm").> 4. Support for 64-bit ARM guests is not upstream in Linux yet (so I > don''t mind if 64-bit guests are broken).Yes, it is upstream.> I think this should be resolved in a backward compatible way.No. Linux is in error here and should be fixed. We have done this before (e.g. with event channels being the wrong bit size). There is no deployed legacy of guests to support here. At this stage we have not even formally committed to keeping the hypercall ABI stable. We will likely do so with the Xen 4.4 release. Ian.
David Vrabel
2013-Dec-03 15:11 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 03/12/13 13:41, Ian Campbell wrote:> On Tue, 2013-12-03 at 11:59 +0000, David Vrabel wrote: >> On 03/12/13 11:08, Ian Campbell wrote: >>> On Tue, 2013-12-03 at 11:01 +0000, David Vrabel wrote: >>>> On 03/12/13 10:57, Roger Pau Monne wrote: >>>>> Using __packed__ on the public interface is not correct, this >>>>> structures should be compiled using the native ABI, and __packed__ >>>>> should only be used in the backend counterpart of those structures >>>>> (which needs to handle different ABIs). >>>>> >>>>> This was even worse in the ARM case, where the Linux kernel was >>>>> incorrectly using the X86_32 protocol ABI. This patch fixes it, but >>>>> also breaks compatibility, so an ARM DomU kernel compiled with >>>>> this patch will fail to communicate with PV disk devices unless the >>>>> Dom0 also has this patch. >>>> >>>> This ABI change needs to be justified. Why do you think it is >>>> acceptable to break existing Linux guests? Because I don''t think it is. >>> >>> As I explained in my reply those guests are buggy. >> >> The kernel has a strong policy on not changing ABIs, even to fix bugs. >> I don''t think a bug fix alone is sufficient justification for ABI breakage. > > This is not the kernels ABI to fix in stone. The ABI is defined by the > upstream Xen project and Linux has failed to follow the specified ABI > correctly.I disagree. The working back and front end implementations have created a new, different ABI.> There is no deployed legacy of guests to support here.I not sure I agree. It does seem to be widely used and I''m not sure we have sufficient visibility on how Xen on Arm is being used to know if this change will cause other people problems. If Konrad and Boris agree that breaking the kernel''s ABI in this way is acceptable in this specific case, I''ll defer to them. David
Ian Campbell
2013-Dec-03 15:17 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 15:11 +0000, David Vrabel wrote:> On 03/12/13 13:41, Ian Campbell wrote: > > On Tue, 2013-12-03 at 11:59 +0000, David Vrabel wrote: > >> On 03/12/13 11:08, Ian Campbell wrote: > >>> On Tue, 2013-12-03 at 11:01 +0000, David Vrabel wrote: > >>>> On 03/12/13 10:57, Roger Pau Monne wrote: > >>>>> Using __packed__ on the public interface is not correct, this > >>>>> structures should be compiled using the native ABI, and __packed__ > >>>>> should only be used in the backend counterpart of those structures > >>>>> (which needs to handle different ABIs). > >>>>> > >>>>> This was even worse in the ARM case, where the Linux kernel was > >>>>> incorrectly using the X86_32 protocol ABI. This patch fixes it, but > >>>>> also breaks compatibility, so an ARM DomU kernel compiled with > >>>>> this patch will fail to communicate with PV disk devices unless the > >>>>> Dom0 also has this patch. > >>>> > >>>> This ABI change needs to be justified. Why do you think it is > >>>> acceptable to break existing Linux guests? Because I don''t think it is. > >>> > >>> As I explained in my reply those guests are buggy. > >> > >> The kernel has a strong policy on not changing ABIs, even to fix bugs. > >> I don''t think a bug fix alone is sufficient justification for ABI breakage. > > > > This is not the kernels ABI to fix in stone. The ABI is defined by the > > upstream Xen project and Linux has failed to follow the specified ABI > > correctly. > > I disagree. The working back and front end implementations have created > a new, different ABI.Which is completely incompatible with every other OS. This issue was discovered trying to run a NetBSD guest on a Linux dom0. I will not legitimize this broken ABI because then all these other OSes would need to implement all of this compatibility cruft in order to interoperate with Linux.> > There is no deployed legacy of guests to support here. > > I not sure I agree. It does seem to be widely used and I''m not sure we > have sufficient visibility on how Xen on Arm is being used to know if > this change will cause other people problems.It''s being widely developed on and used for PoC etc. It is not widely deployed.> If Konrad and Boris agree that breaking the kernel''s ABI in this way is > acceptable in this specific case, I''ll defer to them.My opinion as Xen on ARM hypervisor maintainer is that this is the right thing to do in this case. Ian.
One Thousand Gnomes
2013-Dec-03 15:51 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
> > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > acceptable in this specific case, I''ll defer to them. > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > thing to do in this case.Sounds to me like the difference between "product" and "research toy". You don''t break back compatibility in a product when you can avoid it. You may wish the publically humiliate those responsible (Linus seems to) but at the end of the day it''s done. Your boolean choice is a false one anyway - you can do at least three different things - Implement and tell people to use the new API, break everyone''s PoC and deployed systems, prevent old kernels running on newer Xen and generally make users lose confidence in it - Keep the erroneous API and live with the uglies - Keep the erroneous API working but implement a new clean API (and possibly make misuse produce a one per boot whine about fixing your kernel) The Linux approach has tended to be the last one most of the time, coupled with Linus having a rant 8) Alan
Ian Campbell
2013-Dec-03 16:06 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 15:51 +0000, One Thousand Gnomes wrote:> > > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > > acceptable in this specific case, I''ll defer to them. > > > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > > thing to do in this case. > > Sounds to me like the difference between "product" and "research toy". > You don''t break back compatibility in a product when you can avoid it. > You may wish the publically humiliate those responsible (Linus seems to) > but at the end of the day it''s done.We''ve been quite explicit about the fact that the ABI is not set in stone yet. The only Xen release which included ARM support had it explicitly marked as a tech preview and we have changed the ABI multiple times during the development process as we worked through the kinks. I expect that with the Xen 4.4 release this will change, and at the point we will have to live with the ABI we''ve got, including compatibility.> Your boolean choice is a false one anyway - you can do at least three > different thingsThe right thing to do here is to fix the implementation and move on. Ian.
Konrad Rzeszutek Wilk
2013-Dec-03 20:11 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
> > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > acceptable in this specific case, I''ll defer to them. > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > thing to do in this case.Heh. If somebody can guarantee me that (by testing the right variants and mentioning this in the git commit) that this does not break x86, then I am fine. And by ''break x86'' I mean that this combination works: 32-bit domU on 64-bit dom0 64-bit domU on 32-bit dom0 And perhaps also the obvious: 64-bit domU on 64-bit dom0 32-bit domU on 32-bit dom0 Since the xen-blkback has its own version of the structs there is no need to change change newer and older version of it. As long as that works I am OK sticking it in. I think from the ARM perspective it is still in ''experimental'' phase so anything goes to make it work under ARM.
Ian Campbell
2013-Dec-04 09:28 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 2013-12-03 at 15:11 -0500, Konrad Rzeszutek Wilk wrote:> > > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > > acceptable in this specific case, I''ll defer to them. > > > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > > thing to do in this case. > > Heh. If somebody can guarantee me that (by testing the right variants and > mentioning this in the git commit) that this does not break x86, then > I am fine. > > And by ''break x86'' I mean that this combination works: > 32-bit domU on 64-bit dom0 > 64-bit domU on 32-bit dom0 > > And perhaps also the obvious: > 64-bit domU on 64-bit dom0 > 32-bit domU on 32-bit dom0One way to test this is with gdb on a vmlinux for each arch with CONFIG_DEBUG_INFO=y. For each MEMBER of each interesting STRUCT: (gdb) print &((struct STRUCT *)0)->MEMBER (this is effectively an open coded offsetof) This could probably even be semi automated by producing a script to feed to gdb which run through all of the options and diffing the result. If I could have the moon on a stick I would have a tool such as this running against the canonical Xen headers, to catch breakage as it is introduced upstream and a tool which could run against an arbitrary ELF binary to validate it against the upstream results. tools/include/xen-foreign/mkchecker.py goes some way towards that but isn''t really extensible to the extent we would need/want. While I''m asking for unicorns a gcc __attribute__((warn_on_holes)) which could be applied to a struct to enforce the need for explicit padding would probably be incredibly useful for this of thing.> Since the xen-blkback has its own version of the structs there is no > need to change change newer and older version of it.Someone should check that these are producing the right interface on ARM though!> As long as that works I am OK sticking it in.Thanks.> I think from the ARM perspective it is still in ''experimental'' phase > so anything goes to make it work under ARM.Ian.
Roger Pau Monné
2013-Dec-04 09:43 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 04/12/13 10:28, Ian Campbell wrote:> On Tue, 2013-12-03 at 15:11 -0500, Konrad Rzeszutek Wilk wrote: >>>> If Konrad and Boris agree that breaking the kernel''s ABI in this way is >>>> acceptable in this specific case, I''ll defer to them. >>> >>> My opinion as Xen on ARM hypervisor maintainer is that this is the right >>> thing to do in this case. >> >> Heh. If somebody can guarantee me that (by testing the right variants and >> mentioning this in the git commit) that this does not break x86, then >> I am fine. >> >> And by ''break x86'' I mean that this combination works: >> 32-bit domU on 64-bit dom0 >> 64-bit domU on 32-bit dom0 >> >> And perhaps also the obvious: >> 64-bit domU on 64-bit dom0 >> 32-bit domU on 32-bit dom0 > > One way to test this is with gdb on a vmlinux for each arch with > CONFIG_DEBUG_INFO=y. For each MEMBER of each interesting STRUCT: > (gdb) print &((struct STRUCT *)0)->MEMBER > (this is effectively an open coded offsetof) > > This could probably even be semi automated by producing a script to feed > to gdb which run through all of the options and diffing the result. > > If I could have the moon on a stick I would have a tool such as this > running against the canonical Xen headers, to catch breakage as it is > introduced upstream and a tool which could run against an arbitrary ELF > binary to validate it against the upstream results. > tools/include/xen-foreign/mkchecker.py goes some way towards that but > isn''t really extensible to the extent we would need/want. > > While I''m asking for unicorns a gcc __attribute__((warn_on_holes)) which > could be applied to a struct to enforce the need for explicit padding > would probably be incredibly useful for this of thing.Right now I would be happy to add something like: #if !defined(CONFIG_X86_32) && !defined(CONFIG_X86_64) && !defined(CONFIG_ARM... #error This architecture is not supported by the Xen PV block ABI #endif To the Linux copy of blkif.h>> Since the xen-blkback has its own version of the structs there is no >> need to change change newer and older version of it. > > Someone should check that these are producing the right interface on ARM > though!AFAICT blkback on ARM is not using the structures defined in blkback/common.h, since the protocol is "native", it''s using the same structures defined in the public header (just as blkfront). There''s no translation needed and blkback just does a memcpy from the ring to the native struct.
Stefano Stabellini
2013-Dec-04 11:03 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 3 Dec 2013, Konrad Rzeszutek Wilk wrote:> > > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > > acceptable in this specific case, I''ll defer to them. > > > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > > thing to do in this case. > > Heh. If somebody can guarantee me that (by testing the right variants and > mentioning this in the git commit) that this does not break x86, then > I am fine. > > And by ''break x86'' I mean that this combination works: > 32-bit domU on 64-bit dom0 > 64-bit domU on 32-bit dom0 > > And perhaps also the obvious: > 64-bit domU on 64-bit dom0 > 32-bit domU on 32-bit dom0 > > Since the xen-blkback has its own version of the structs there is no > need to change change newer and older version of it. > > As long as that works I am OK sticking it in. > > I think from the ARM perspective it is still in ''experimental'' phase > so anything goes to make it work under ARM.To be honest I am unhappy about this, but I don''t want to clutter even more a code path already plagued by an ifdef infestation. Even if the ARM port is experimental, I would prefer to retain compatibility if it was possible to do so with a couple of lines fix. Otherwise I would rather break ABI compatibility than introducing another half a dozen ifdefs.
Stefano Stabellini
2013-Dec-04 11:33 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Wed, 4 Dec 2013, Stefano Stabellini wrote:> On Tue, 3 Dec 2013, Konrad Rzeszutek Wilk wrote: > > > > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > > > acceptable in this specific case, I''ll defer to them. > > > > > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > > > thing to do in this case. > > > > Heh. If somebody can guarantee me that (by testing the right variants and > > mentioning this in the git commit) that this does not break x86, then > > I am fine. > > > > And by ''break x86'' I mean that this combination works: > > 32-bit domU on 64-bit dom0 > > 64-bit domU on 32-bit dom0 > > > > And perhaps also the obvious: > > 64-bit domU on 64-bit dom0 > > 32-bit domU on 32-bit dom0 > > > > Since the xen-blkback has its own version of the structs there is no > > need to change change newer and older version of it. > > > > As long as that works I am OK sticking it in. > > > > I think from the ARM perspective it is still in ''experimental'' phase > > so anything goes to make it work under ARM. > > To be honest I am unhappy about this, but I don''t want to clutter even > more a code path already plagued by an ifdef infestation. > > Even if the ARM port is experimental, I would prefer to retain > compatibility if it was possible to do so with a couple of lines fix. > Otherwise I would rather break ABI compatibility than introducing > another half a dozen ifdefs.Let me rephrase this as it can be misinterpreted as a NACK for this patch while it is not. I would like not to break existing ARM guests. However we do need to fix this and I can''t see a decent way to do so retaining compatibility with the broken interface that we are currently implementing. Therefore I am OK with the patch.
Ian Campbell
2013-Dec-04 12:10 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Wed, 2013-12-04 at 09:28 +0000, Ian Campbell wrote:> This could probably even be semi automated by producing a script to feed > to gdb which run through all of the options and diffing the result. > > If I could have the moon on a stick I would have a tool such as this > running against the canonical Xen headers, to catch breakage as it is > introduced upstream and a tool which could run against an arbitrary ELF > binary to validate it against the upstream results. > tools/include/xen-foreign/mkchecker.py goes some way towards that but > isn''t really extensible to the extent we would need/want.Perhaps the gdb python extensions are what we need: http://stackoverflow.com/questions/9788679/how-to-get-the-relative-adress-of-a-field-in-a-structure-dump-c or perhaps http://turtle.ee.ncku.edu.tw/docs/perl/manual/utils/c2ph.html
Stefano Stabellini
2013-Dec-11 16:18 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Tue, 3 Dec 2013, Konrad Rzeszutek Wilk wrote:> > > If Konrad and Boris agree that breaking the kernel''s ABI in this way is > > > acceptable in this specific case, I''ll defer to them. > > > > My opinion as Xen on ARM hypervisor maintainer is that this is the right > > thing to do in this case. > > Heh. If somebody can guarantee me that (by testing the right variants and > mentioning this in the git commit) that this does not break x86, then > I am fine. > > And by ''break x86'' I mean that this combination works: > 32-bit domU on 64-bit dom0 > 64-bit domU on 32-bit dom0 > > And perhaps also the obvious: > 64-bit domU on 64-bit dom0 > 32-bit domU on 32-bit dom0 > > Since the xen-blkback has its own version of the structs there is no > need to change change newer and older version of it. > > As long as that works I am OK sticking it in. > > I think from the ARM perspective it is still in ''experimental'' phase > so anything goes to make it work under ARM.Roger, can you please test this patch on x86 as suggested by Konrad and confirm that it doesn''t break anything?
Roger Pau Monné
2013-Dec-11 16:23 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On 11/12/13 17:18, Stefano Stabellini wrote:> On Tue, 3 Dec 2013, Konrad Rzeszutek Wilk wrote: >>>> If Konrad and Boris agree that breaking the kernel''s ABI in this way is >>>> acceptable in this specific case, I''ll defer to them. >>> >>> My opinion as Xen on ARM hypervisor maintainer is that this is the right >>> thing to do in this case. >> >> Heh. If somebody can guarantee me that (by testing the right variants and >> mentioning this in the git commit) that this does not break x86, then >> I am fine. >> >> And by ''break x86'' I mean that this combination works: >> 32-bit domU on 64-bit dom0 >> 64-bit domU on 32-bit dom0 >> >> And perhaps also the obvious: >> 64-bit domU on 64-bit dom0 >> 32-bit domU on 32-bit dom0 >> >> Since the xen-blkback has its own version of the structs there is no >> need to change change newer and older version of it. >> >> As long as that works I am OK sticking it in. >> >> I think from the ARM perspective it is still in ''experimental'' phase >> so anything goes to make it work under ARM. > > > Roger, can you please test this patch on x86 as suggested by Konrad and > confirm that it doesn''t break anything?This is not the right patch, the right one is the one posted by Julien: http://marc.info/?l=linux-kernel&m=138608528604584&w=2
Stefano Stabellini
2013-Dec-11 16:29 UTC
Re: [PATCH RFC] xen-block: correctly define structures in public headers
On Wed, 11 Dec 2013, Roger Pau Monné wrote:> On 11/12/13 17:18, Stefano Stabellini wrote: > > On Tue, 3 Dec 2013, Konrad Rzeszutek Wilk wrote: > >>>> If Konrad and Boris agree that breaking the kernel''s ABI in this way is > >>>> acceptable in this specific case, I''ll defer to them. > >>> > >>> My opinion as Xen on ARM hypervisor maintainer is that this is the right > >>> thing to do in this case. > >> > >> Heh. If somebody can guarantee me that (by testing the right variants and > >> mentioning this in the git commit) that this does not break x86, then > >> I am fine. > >> > >> And by ''break x86'' I mean that this combination works: > >> 32-bit domU on 64-bit dom0 > >> 64-bit domU on 32-bit dom0 > >> > >> And perhaps also the obvious: > >> 64-bit domU on 64-bit dom0 > >> 32-bit domU on 32-bit dom0 > >> > >> Since the xen-blkback has its own version of the structs there is no > >> need to change change newer and older version of it. > >> > >> As long as that works I am OK sticking it in. > >> > >> I think from the ARM perspective it is still in ''experimental'' phase > >> so anything goes to make it work under ARM. > > > > > > Roger, can you please test this patch on x86 as suggested by Konrad and > > confirm that it doesn''t break anything? > > This is not the right patch, the right one is the one posted by Julien: > > http://marc.info/?l=linux-kernel&m=138608528604584&w=2 >Right. In that case, Julien or Roger can you test that it doesn''t break any of the x86 configurations? _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel