Michael S. Tsirkin
2013-May-29 13:24 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
On Wed, May 29, 2013 at 07:52:37AM -0500, Anthony Liguori wrote:> Rusty Russell <rusty at rustcorp.com.au> writes: > > > Anthony Liguori <aliguori at us.ibm.com> writes: > >> "Michael S. Tsirkin" <mst at redhat.com> writes: > >>> + case offsetof(struct virtio_pci_common_cfg, device_feature_select): > >>> + return proxy->device_feature_select; > >> > >> Oh dear no... Please use defines like the rest of QEMU. > > > > It is pretty ugly. > > I think beauty is in the eye of the beholder here... > > Pretty much every device we have has a switch statement like this. > Consistency wins when it comes to qualitative arguments like this. > > > Yet the structure definitions are descriptive, capturing layout, size > > and endianness in natural a format readable by any C programmer. > > >From an API design point of view, here are the problems I see: > > 1) C makes no guarantees about structure layout beyond the first > member. Yes, if it's naturally aligned or has a packed attribute, > GCC does the right thing. But this isn't kernel land anymore, > portability matters and there are more compilers than GCC.You expect a compiler to pad this structure: struct foo { uint8_t a; uint8_t b; uint16_t c; uint32_t d; }; I'm guessing any compiler that decides to waste memory in this way will quickly get dropped by users and then we won't worry about building QEMU with it.> 2) If we every introduce anything like latching, this doesn't work out > so well anymore because it's hard to express in a single C structure > the register layout at that point. Perhaps a union could be used but > padding may make it a bit challenging.Then linux won't use it either.> 3) It suspect it's harder to review because a subtle change could more > easily have broad impact. If someone changed the type of a field > from u32 to u16, it changes the offset of every other field. That's > not terribly obvious in the patch itself unless you understand how > the structure is used elsewhere. > > This may not be a problem for virtio because we all understand that > the structures are part of an ABI, but if we used this pattern more > in QEMU, it would be a lot less obvious.So let's not use it more in QEMU.> > So AFAICT the question is, do we put the required > > > > #define VIRTIO_PCI_CFG_FEATURE_SEL \ > > (offsetof(struct virtio_pci_common_cfg, device_feature_select)) > > > > etc. in the kernel headers or qemu? > > I'm pretty sure we would end up just having our own integer defines. We > carry our own virtio headers today because we can't easily import the > kernel headers.Yes we can easily import them. And at least we copy headers verbatim.> >> Haven't looked at the proposed new ring layout yet. > > > > No change, but there's an open question on whether we should nail it to > > little endian (or define the endian by the transport). > > > > Of course, I can't rule out that the 1.0 standard *may* decide to frob > > the ring layout somehow, > > Well, given that virtio is widely deployed today, I would think the 1.0 > standard should strictly reflect what's deployed today, no? > Any new config layout would be 2.0 material, right?Not as it's currently planned. Devices can choose to support a legacy layout in addition to the new one, and if you look at the patch you will see that that is exactly what it does.> Re: the new config layout, I don't think we would want to use it for > anything but new devices. Forcing a guest driver changeThere's no forcing. If you look at the patches closely, you will see that we still support the old layout on BAR0.> is a really big > deal and I see no reason to do that unless there's a compelling reason > to.There are many a compelling reasons, and they are well known limitations of virtio PCI: - PCI spec compliance (madates device operation with IO memory disabled). - support 64 bit addressing - add more than 32 feature bits. - individually disable queues. - sanely support cross-endian systems. - support very small (<1 PAGE) for virtio rings. - support a separate page for each vq kick. - make each device place config at flexible offset. Addressing any one of these would cause us to add a substantially new way to operate virtio devices. And since it's a guest change anyway, it seemed like a good time to do the new layout and fix everything in one go. And they are needed like yesterday.> So we're stuck with the 1.0 config layout for a very long time. > > Regards, > > Anthony LiguoriAbsolutely. This patch let us support both which will allow for a gradual transition over the next 10 years or so.> > reason. I suggest that's 2.0 material... > > > > Cheers, > > Rusty. > > > > -- > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > the body of a message to majordomo at vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html
Peter Maydell
2013-May-29 13:35 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
On 29 May 2013 14:24, Michael S. Tsirkin <mst at redhat.com> wrote:> On Wed, May 29, 2013 at 07:52:37AM -0500, Anthony Liguori wrote: >> 1) C makes no guarantees about structure layout beyond the first >> member. Yes, if it's naturally aligned or has a packed attribute, >> GCC does the right thing. But this isn't kernel land anymore, >> portability matters and there are more compilers than GCC. > > You expect a compiler to pad this structure: > > struct foo { > uint8_t a; > uint8_t b; > uint16_t c; > uint32_t d; > }; > > I'm guessing any compiler that decides to waste memory in this way > will quickly get dropped by users and then we won't worry > about building QEMU with it.Structure alignment is a platform ABI choice. Oddly enough people choose operating systems on more criteria than the amount of padding their ABI mandates in structures. In any case it's really tedious to read a structure and wade through working out whether it's going to have all its members naturally aligned, especially once it's more than a handful of members in size. And getting it wrong is "silent portability problem". thanks -- PMM
Paolo Bonzini
2013-May-29 13:41 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
Il 29/05/2013 15:24, Michael S. Tsirkin ha scritto:> You expect a compiler to pad this structure: > > struct foo { > uint8_t a; > uint8_t b; > uint16_t c; > uint32_t d; > }; > > I'm guessing any compiler that decides to waste memory in this way > will quickly get dropped by users and then we won't worry > about building QEMU with it.You know the virtio-pci config structures are padded, but not all of them are. For example, virtio_balloon_stat is not padded and indeed has an __attribute__((__packed__)) in the spec. For this reason I prefer to have the attribute everywhere. So people don't have to wonder why it's here and not there. Paolo
Michael S. Tsirkin
2013-May-29 14:02 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
On Wed, May 29, 2013 at 03:41:09PM +0200, Paolo Bonzini wrote:> Il 29/05/2013 15:24, Michael S. Tsirkin ha scritto: > > You expect a compiler to pad this structure: > > > > struct foo { > > uint8_t a; > > uint8_t b; > > uint16_t c; > > uint32_t d; > > }; > > > > I'm guessing any compiler that decides to waste memory in this way > > will quickly get dropped by users and then we won't worry > > about building QEMU with it. > > You know the virtio-pci config structures are padded, but not all of > them are. For example, virtio_balloon_stat is not padded and indeed has > an __attribute__((__packed__)) in the spec. > > For this reason I prefer to have the attribute everywhere. So people > don't have to wonder why it's here and not there. > > PaoloFWIW I think it was a mistake to lay the balloon structure out like that. We should have padded it manually. __attribute__((__packed__)) is really easy to misuse. If you get into such a situation, just use offset enums ... -- MST
Anthony Liguori
2013-May-29 14:16 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
"Michael S. Tsirkin" <mst at redhat.com> writes:> On Wed, May 29, 2013 at 07:52:37AM -0500, Anthony Liguori wrote: >> 1) C makes no guarantees about structure layout beyond the first >> member. Yes, if it's naturally aligned or has a packed attribute, >> GCC does the right thing. But this isn't kernel land anymore, >> portability matters and there are more compilers than GCC. > > You expect a compiler to pad this structure: > > struct foo { > uint8_t a; > uint8_t b; > uint16_t c; > uint32_t d; > }; > > I'm guessing any compiler that decides to waste memory in this way > will quickly get dropped by users and then we won't worry > about building QEMU with it.There are other responses in the thread here and I don't really care to bikeshed on this issue.>> Well, given that virtio is widely deployed today, I would think the 1.0 >> standard should strictly reflect what's deployed today, no? >> Any new config layout would be 2.0 material, right? > > Not as it's currently planned. Devices can choose > to support a legacy layout in addition to the new one, > and if you look at the patch you will see that that > is exactly what it does.Adding a new BAR most certainly requires bumping the revision ID or changing the device ID, no? Didn't we run into this problem with the virtio-win drivers with just the BAR size changing?>> Re: the new config layout, I don't think we would want to use it for >> anything but new devices. Forcing a guest driver change > > There's no forcing. > If you look at the patches closely, you will see that > we still support the old layout on BAR0. > > >> is a really big >> deal and I see no reason to do that unless there's a compelling reason >> to. > > There are many a compelling reasons, and they are well known > limitations of virtio PCI: > > - PCI spec compliance (madates device operation with IO memory > disabled).PCI express spec. We are fully compliant with the PCI spec. And what's the user visible advantage of pointing an emulated virtio device behind a PCI-e bus verses a legacy PCI bus? This is a very good example because if we have to disable BAR0, then it's an ABI breaker plan and simple.> - support 64 bit addressingWe currently support 44-bit addressing for the ring. While I agree we need to bump it, there's no immediate problem with 44-bit addressing.> - add more than 32 feature bits. > - individually disable queues. > - sanely support cross-endian systems. > - support very small (<1 PAGE) for virtio rings. > - support a separate page for each vq kick. > - make each device place config at flexible offset.None of these things are holding us back today. I'm not saying we shouldn't introduce a new device. But adoption of that device will be slow and realistically will be limited to new devices only. We'll be supporting both devices for a very, very long time. Compatibility is the fundamental value that we provide. We need to go out of our way to make sure that existing guests work and work as well as possible. Sticking virtio devices behind a PCI-e bus just for the hell of it isn't a compelling reason to break existing guests. Regards, Anthony Liguori> Addressing any one of these would cause us to add a substantially new > way to operate virtio devices. > > And since it's a guest change anyway, it seemed like a > good time to do the new layout and fix everything in one go. > > And they are needed like yesterday. > > >> So we're stuck with the 1.0 config layout for a very long time. >> >> Regards, >> >> Anthony Liguori > > Absolutely. This patch let us support both which will allow for > a gradual transition over the next 10 years or so. > >> > reason. I suggest that's 2.0 material... >> > >> > Cheers, >> > Rusty. >> > >> > -- >> > To unsubscribe from this list: send the line "unsubscribe kvm" in >> > the body of a message to majordomo at vger.kernel.org >> > More majordomo info at http://vger.kernel.org/majordomo-info.html > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Anthony Liguori
2013-May-29 14:18 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
Paolo Bonzini <pbonzini at redhat.com> writes:> Il 29/05/2013 15:24, Michael S. Tsirkin ha scritto: >> You expect a compiler to pad this structure: >> >> struct foo { >> uint8_t a; >> uint8_t b; >> uint16_t c; >> uint32_t d; >> }; >> >> I'm guessing any compiler that decides to waste memory in this way >> will quickly get dropped by users and then we won't worry >> about building QEMU with it. > > You know the virtio-pci config structures are padded, but not all of > them are. For example, virtio_balloon_stat is not padded and indeed has > an __attribute__((__packed__)) in the spec.Not that these structures are actually used for something. We store the config in these structures so they are actually used for something. The proposed structures only serve as a way to express offsets. You would never actually have a variable of this type. Regards, Anthony Liguori> > For this reason I prefer to have the attribute everywhere. So people > don't have to wonder why it's here and not there. > > Paolo > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo at vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin
2013-May-29 14:30 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
On Wed, May 29, 2013 at 09:16:39AM -0500, Anthony Liguori wrote:> "Michael S. Tsirkin" <mst at redhat.com> writes: > > > On Wed, May 29, 2013 at 07:52:37AM -0500, Anthony Liguori wrote: > >> 1) C makes no guarantees about structure layout beyond the first > >> member. Yes, if it's naturally aligned or has a packed attribute, > >> GCC does the right thing. But this isn't kernel land anymore, > >> portability matters and there are more compilers than GCC. > > > > You expect a compiler to pad this structure: > > > > struct foo { > > uint8_t a; > > uint8_t b; > > uint16_t c; > > uint32_t d; > > }; > > > > I'm guessing any compiler that decides to waste memory in this way > > will quickly get dropped by users and then we won't worry > > about building QEMU with it. > > There are other responses in the thread here and I don't really care to > bikeshed on this issue.Great. Let's make the bikeshed blue then?> >> Well, given that virtio is widely deployed today, I would think the 1.0 > >> standard should strictly reflect what's deployed today, no? > >> Any new config layout would be 2.0 material, right? > > > > Not as it's currently planned. Devices can choose > > to support a legacy layout in addition to the new one, > > and if you look at the patch you will see that that > > is exactly what it does. > > Adding a new BAR most certainly requires bumping the revision ID or > changing the device ID, no?No, why would it? If a device dropped BAR0, that would be a good reason to bump revision ID. We don't do this yet.> Didn't we run into this problem with the virtio-win drivers with just > the BAR size changing?Because they had a bug: they validated BAR0 size. AFAIK they don't care what happens with other bars.> >> Re: the new config layout, I don't think we would want to use it for > >> anything but new devices. Forcing a guest driver change > > > > There's no forcing. > > If you look at the patches closely, you will see that > > we still support the old layout on BAR0. > > > > > >> is a really big > >> deal and I see no reason to do that unless there's a compelling reason > >> to. > > > > There are many a compelling reasons, and they are well known > > limitations of virtio PCI: > > > > - PCI spec compliance (madates device operation with IO memory > > disabled). > > PCI express spec. We are fully compliant with the PCI spec. And what's > the user visible advantage of pointing an emulated virtio device behind > a PCI-e bus verses a legacy PCI bus?Native hotplug support.> This is a very good example because if we have to disable BAR0, then > it's an ABI breaker plan and simple.Not we. The BIOS can disable IO BAR: it can do this already but the device won't be functional.> > - support 64 bit addressing > > We currently support 44-bit addressing for the ring. While I agree we > need to bump it, there's no immediate problem with 44-bit addressing.I heard developers (though not users) complaining.> > - add more than 32 feature bits. > > - individually disable queues. > > - sanely support cross-endian systems. > > - support very small (<1 PAGE) for virtio rings. > > - support a separate page for each vq kick. > > - make each device place config at flexible offset. > > None of these things are holding us back today.All of them do, to bigger or lesser degree.> I'm not saying we shouldn't introduce a new device. But adoption of > that device will be slow and realistically will be limited to new > devices only. > > We'll be supporting both devices for a very, very long time.This is true for any new feature. What are you trying to say here? We won't add new features to old config: for once, we have run out of feature bits.> Compatibility is the fundamental value that we provide. We need to go > out of our way to make sure that existing guests work and work as well > as possible.What are you trying to say? There's nothing here that breaks compatibility. Have you looked at the patch? I'm wasting my time arguing on the mailing list, but once I tear myself away from this occupation, I intend to verify that I can run an old guest on qemu with this patch without issues.> Sticking virtio devices behind a PCI-e bus just for the hell of it isn't > a compelling reason to break existing guests. > > Regards, > > Anthony LiguoriThat's why my patch does not break existing guests.> > > Addressing any one of these would cause us to add a substantially new > > way to operate virtio devices. > > > > And since it's a guest change anyway, it seemed like a > > good time to do the new layout and fix everything in one go. > > > > And they are needed like yesterday. > > > > > >> So we're stuck with the 1.0 config layout for a very long time. > >> > >> Regards, > >> > >> Anthony Liguori > > > > Absolutely. This patch let us support both which will allow for > > a gradual transition over the next 10 years or so. > > > >> > reason. I suggest that's 2.0 material... > >> > > >> > Cheers, > >> > Rusty. > >> > > >> > -- > >> > To unsubscribe from this list: send the line "unsubscribe kvm" in > >> > the body of a message to majordomo at vger.kernel.org > >> > More majordomo info at http://vger.kernel.org/majordomo-info.html > > -- > > To unsubscribe from this list: send the line "unsubscribe kvm" in > > the body of a message to majordomo at vger.kernel.org > > More majordomo info at http://vger.kernel.org/majordomo-info.html
Michael S. Tsirkin
2013-May-30 07:43 UTC
[PATCH RFC] virtio-pci: new config layout: using memory BAR
On Wed, May 29, 2013 at 03:41:09PM +0200, Paolo Bonzini wrote:> Il 29/05/2013 15:24, Michael S. Tsirkin ha scritto: > > You expect a compiler to pad this structure: > > > > struct foo { > > uint8_t a; > > uint8_t b; > > uint16_t c; > > uint32_t d; > > }; > > > > I'm guessing any compiler that decides to waste memory in this way > > will quickly get dropped by users and then we won't worry > > about building QEMU with it. > > You know the virtio-pci config structures are padded, but not all of > them are. For example, virtio_balloon_stat is not padded and indeed has > an __attribute__((__packed__)) in the spec. > > For this reason I prefer to have the attribute everywhere. So people > don't have to wonder why it's here and not there. > > PaoloBTW we don't even do this consistently everywhere in QEMU. It would have been better to have a rule to avoid packed as much as possible, then we'd have found the misaligned field bug in balloon before it's too late. That would be a good rule to adopt I think: any pack if something is misaligned, and document the reason. -- MST