V4V is a copy based inter vm communication system. Please have a look at this thread for more detail about V4V. http://lists.xen.org/archives/html/xen-devel/2012-05/msg01866.html This patch series is work in progress but I wanted to post it early enough so I can get feedback from people. Jean Guyader (5): xen: add ssize_t to types.h xen: Add headers to include/Makefile v4v: Introduce VIRQ_V4V xen: Enforce casting for guest_handle_cast xen: Add V4V implementation xen/arch/x86/hvm/hvm.c | 9 +- xen/arch/x86/x86_32/entry.S | 2 + xen/arch/x86/x86_64/compat/entry.S | 2 + xen/arch/x86/x86_64/entry.S | 2 + xen/common/Makefile | 1 + xen/common/domain.c | 11 +- xen/common/event_channel.c | 1 + xen/common/v4v.c | 1779 ++++++++++++++++++++++++++++++++++++ xen/include/Makefile | 3 +- xen/include/asm-x86/guest_access.h | 2 +- xen/include/public/v4v.h | 544 +++++++++++ xen/include/public/xen.h | 4 +- xen/include/xen/sched.h | 5 + xen/include/xen/types.h | 1 + xen/include/xen/v4v.h | 109 +++ 15 files changed, 2467 insertions(+), 8 deletions(-) create mode 100644 xen/common/v4v.c create mode 100644 xen/include/public/v4v.h create mode 100644 xen/include/xen/v4v.h -- 1.7.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/include/xen/types.h | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Add stdlib.h for size_t Add string.h for memcpy Add sys/types.h for ssize_t Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/include/Makefile | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Introduce the global virq VIRQ_V4V Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/common/event_channel.c | 1 + xen/include/public/xen.h | 2 +- 2 files changed, 2 insertions(+), 1 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
This is useful if you want to cast a guest handle to char * to do pointer aritmetics afterwards with functions like guest_handle_add_offset. Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/include/asm-x86/guest_access.h | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Setup of v4v domains a domain gets created and cleanup when a domain die. Wire up the v4v hypercall. Include v4v internal and public headers. Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/arch/x86/hvm/hvm.c | 9 +- xen/arch/x86/x86_32/entry.S | 2 + xen/arch/x86/x86_64/compat/entry.S | 2 + xen/arch/x86/x86_64/entry.S | 2 + xen/common/Makefile | 1 + xen/common/domain.c | 11 +- xen/common/v4v.c | 1779 ++++++++++++++++++++++++++++++++++++ xen/include/public/v4v.h | 544 +++++++++++ xen/include/public/xen.h | 2 +- xen/include/xen/sched.h | 5 + xen/include/xen/v4v.h | 109 +++ 11 files changed, 2461 insertions(+), 5 deletions(-) create mode 100644 xen/common/v4v.c create mode 100644 xen/include/public/v4v.h create mode 100644 xen/include/xen/v4v.h _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
V4V is a copy based inter vm communication system. Please have a look at this thread for more detail about V4V. http://lists.xen.org/archives/html/xen-devel/2012-05/msg01866.html This patch series is work in progress but I wanted to post it early enough so I can get feedback from people. Jean Guyader (5): xen: add ssize_t to types.h xen: Add headers to include/Makefile v4v: Introduce VIRQ_V4V xen: Enforce casting for guest_handle_cast xen: Add V4V implementation xen/arch/x86/hvm/hvm.c | 9 +- xen/arch/x86/x86_32/entry.S | 2 + xen/arch/x86/x86_64/compat/entry.S | 2 + xen/arch/x86/x86_64/entry.S | 2 + xen/common/Makefile | 1 + xen/common/domain.c | 11 +- xen/common/event_channel.c | 1 + xen/common/v4v.c | 1779 ++++++++++++++++++++++++++++++++++++ xen/include/Makefile | 3 +- xen/include/asm-x86/guest_access.h | 2 +- xen/include/public/v4v.h | 544 +++++++++++ xen/include/public/xen.h | 4 +- xen/include/xen/sched.h | 5 + xen/include/xen/types.h | 1 + xen/include/xen/v4v.h | 109 +++ 15 files changed, 2467 insertions(+), 8 deletions(-) create mode 100644 xen/common/v4v.c create mode 100644 xen/include/public/v4v.h create mode 100644 xen/include/xen/v4v.h -- 1.7.2.5 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/include/xen/types.h | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Add stdlib.h for size_t Add string.h for memcpy Add sys/types.h for ssize_t Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/include/Makefile | 3 ++- 1 files changed, 2 insertions(+), 1 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Introduce the global virq VIRQ_V4V Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/common/event_channel.c | 1 + xen/include/public/xen.h | 2 +- 2 files changed, 2 insertions(+), 1 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
This is useful if you want to cast a guest handle to char * to do pointer aritmetics afterwards with functions like guest_handle_add_offset. Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/include/asm-x86/guest_access.h | 2 +- 1 files changed, 1 insertions(+), 1 deletions(-) _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
Setup of v4v domains a domain gets created and cleanup when a domain die. Wire up the v4v hypercall. Include v4v internal and public headers. Signed-off-by: Jean Guyader <jean.guyader@citrix.com> --- xen/arch/x86/hvm/hvm.c | 9 +- xen/arch/x86/x86_32/entry.S | 2 + xen/arch/x86/x86_64/compat/entry.S | 2 + xen/arch/x86/x86_64/entry.S | 2 + xen/common/Makefile | 1 + xen/common/domain.c | 11 +- xen/common/v4v.c | 1779 ++++++++++++++++++++++++++++++++++++ xen/include/public/v4v.h | 544 +++++++++++ xen/include/public/xen.h | 2 +- xen/include/xen/sched.h | 5 + xen/include/xen/v4v.h | 109 +++ 11 files changed, 2461 insertions(+), 5 deletions(-) create mode 100644 xen/common/v4v.c create mode 100644 xen/include/public/v4v.h create mode 100644 xen/include/xen/v4v.h _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: >--- a/xen/include/xen/types.h >+++ b/xen/include/xen/types.h >@@ -58,5 +58,6 @@ typedef __u64 __le64; > typedef __u64 __be64; > > typedef unsigned long uintptr_t; >+typedef int ssize_t;If you add a new type, it needs to be defined correctly and portably - here, it obviously needs to follow the definition of size_t (which isn''t a typedef of "unsigned int" unilaterally). With ssize_t in particular I have always been wondering what benefit it provides over ptrdiff_t (i.e. whether there are environments where the two could sensibly be different). Jan> > #endif /* __TYPES_H__ */
>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote:> Add stdlib.h for size_t > Add string.h for memcpy > Add sys/types.h for ssize_t > > Signed-off-by: Jean Guyader <jean.guyader@citrix.com>Sorry, I don''t think this is correct: What you''re modifying is the rule to validate public headers, and public headers should never include any of the three above - the only headers consuming defined symbols of which is half-way acceptable are those that exclusively deal with compiler properties (stddef.h, stdint.h, stdbool.h, stdarg.h, and maybe inttypes.h). Plus I don''t see how public headers would legitimately and portably (keep 32-on-64 in mind!) use any of the three symbols mentioned above. Jan
>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: >--- a/xen/common/event_channel.c >+++ b/xen/common/event_channel.c >@@ -107,6 +107,7 @@ static int virq_is_global(uint32_t virq) > case VIRQ_TIMER: > case VIRQ_DEBUG: > case VIRQ_XENOPROF: >+ case VIRQ_V4V:Either the placement here is wrong (the vIRQ being per-vCPU), ...> rc = 0; > break; > case VIRQ_ARCH_0 ... VIRQ_ARCH_7: >--- a/xen/include/public/xen.h >+++ b/xen/include/public/xen.h >@@ -157,7 +157,7 @@ DEFINE_XEN_GUEST_HANDLE(xen_pfn_t); > #define VIRQ_CON_RING 8 /* G. (DOM0) Bytes received on console */ > #define VIRQ_PCPU_STATE 9 /* G. (DOM0) PCPU state changed */ > #define VIRQ_MEM_EVENT 10 /* G. (DOM0) A memory event has occured */ >-#define VIRQ_XC_RESERVED 11 /* G. Reserved for XenClient */ >+#define VIRQ_V4V 11 /* G. V4V event has occurred */... or the comment here is (and was before). This is an ABI property, end hence you can''t really convert a vIRQ defined to be global to a per-vCPU one. So if it turns out the comment was wrong, I would argue whether the change here is really acceptable - our kernel, for example, has built-in knowledge of which vIRQ-s are per-vCPU (but of course this should be benign when the only consumer of the vIRQ lives in userland; otoh, a userland consumer can hardly really make use of a per-vCPU one). Jan> #define VIRQ_ENOMEM 12 /* G. (DOM0) Low on heap memory */ > > /* Architecture-specific VIRQ definitions. */
Jan Beulich
2012-May-31 15:47 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: >--- a/xen/include/asm-x86/guest_access.h >+++ b/xen/include/asm-x86/guest_access.h >@@ -47,7 +47,7 @@ > > /* Cast a guest handle to the specified type of handle. */ > #define guest_handle_cast(hnd, type) ({ \ >- type *_x = (hnd).p; \ >+ type *_x = (type *)(hnd).p; \You would have to explain how this is safe: Without the cast, we get compiler warnings (and hence build failures due to -Werror) if "type *" and typeof((hnd).p) are incompatible. Adding an explicit cast removes that intentional check. Jan> (XEN_GUEST_HANDLE(type)) { _x }; \ > }) >
>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: > Setup of v4v domains a domain gets created and cleanup > when a domain die. Wire up the v4v hypercall. > > Include v4v internal and public headers.Iirc there was discussion about 6-argument hypercalls already, and it was suggested to convert the argument set to a structure instead. Am I mis-remembering, or is there any new reason why introducing these is necessary? Further, the public header doesn''t seem to meet basic requirements. For example, who told you that each and every compiler other than gcc understands #pragma warning(...)? Nor is adding mb() again an option (see io/ring.h), not to speak of adding an MSVC specific implementation (this should be a requirement on the consumers of the header). Nor any other inline functions - they just don''t belong here imo (if you absolutely need those, they should be in a conditional that by default prevents them from being seen by the compiler - with that, some of the other hackery you did in the earlier patches also becomes unnecessary). Structure packing should be achieved using explicit padding fields rather than using compiler- dependent constructs. Jan
>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote:Is there a particular reason this series got resent with no apparent change (and also none mentioned in the description here or in the individual patches)? All of the comments sent yesterday still apply, and I continue to consider patches 1-4 as unnecessary/broken. Jan
On Fri, Jun 1, 2012 at 1:41 PM, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: > > Is there a particular reason this series got resent with no apparent > change (and also none mentioned in the description here or in the > individual patches)? All of the comments sent yesterday still apply, > and I continue to consider patches 1-4 as unnecessary/broken.The date on the second set of e-mails is 15 minutes before the original series. I presume this means that Jean sent the series but it got held up for some reason, so sent it again 15 minutes later; and the second series made it through but the first was held up until today. Never attribute to malice that which can be attributed to computer error. :-) -George
On Thu, 2012-05-31 at 16:07 +0100, Jean Guyader wrote:> V4V is a copy based inter vm communication system. > > Please have a look at this thread for more detail > about V4V. > http://lists.xen.org/archives/html/xen-devel/2012-05/msg01866.html > > This patch series is work in progress but I wanted to > post it early enough so I can get feedback from people.The main thing which is missing here, which makes it rather hard to review, is any kind of design documentation. It would also be great to see the rationale for why things have to be done this way. For example it seems that there is a bunch of stuff being added to the hypervisor which could live in the context of some guest or service domain -- putting stuff like that in the hypervisor needs strong arguments and rationale why it has to be there rather than somewhere else. Likewise perhaps the v4v hypercall could effectively be implemented with a multicall containing some grant copy ops and evtchn manipulations? But in the absence of any descriptions of the whats, hows and whys of v4v its rather hard to have these sorts of conversations. The above are some concrete examples but I''m more interested in seeing the more general descriptions before we get into those. Ian.
On 01/06 02:47, Ian Campbell wrote:> On Thu, 2012-05-31 at 16:07 +0100, Jean Guyader wrote: > > V4V is a copy based inter vm communication system. > > > > Please have a look at this thread for more detail > > about V4V. > > http://lists.xen.org/archives/html/xen-devel/2012-05/msg01866.html > > > > This patch series is work in progress but I wanted to > > post it early enough so I can get feedback from people. > > The main thing which is missing here, which makes it rather hard to > review, is any kind of design documentation. It would also be great to > see the rationale for why things have to be done this way. > > For example it seems that there is a bunch of stuff being added to the > hypervisor which could live in the context of some guest or service > domain -- putting stuff like that in the hypervisor needs strong > arguments and rationale why it has to be there rather than somewhere > else. > > Likewise perhaps the v4v hypercall could effectively be implemented with > a multicall containing some grant copy ops and evtchn manipulations? > > But in the absence of any descriptions of the whats, hows and whys of > v4v its rather hard to have these sorts of conversations. The above are > some concrete examples but I''m more interested in seeing the more > general descriptions before we get into those. >Here is some documentation of the Xen V4V API and the ring structure. Let me know if you have other questions. * Overview ** Architecture v4v has a very simple architecture. The receiving domain registers a ring with xen. The sending domain requests that xen inserts data into the receiving domain''s ring. Notification that there is new data to read is delivered by a VIRQ, and a domain can request an interrupt be generated when sufficient space is available in another domain''s ring. The code that inserts the data into the ring is in xen, and xen writes a header describing the data and where it came from infront of the data. As both the ring manipulation and the header are written by the hypervisor, the receiving domain can trust their content and the format of the ring. ** Addressing v4v_addr_t identifies an endpoint address. typedef struct v4v_addr { uint32_t port; domid_t domain; } v4v_addr_t; struct v4v_ring_id uniquely identifies a ring on the platform. struct v4v_ring_id { struct v4v_addr addr; domid_t partner; }; When a send operation is performed, xen will attempt to find a ring registered by destination domain, with the required port, and with a partner of the sending domain. Failing that it will then search for a ring with the correct port and destination domain, but with partner set to V4V_DOMID_ANY. ** Setup A domain first generates a ring, and a structure describing the pages in the ring. Rings must be page aligned, and contiguous in virtual memory, but need not be contiguous in physical(pfn) or machine(mfn) memory. A ring is uniquely identified on the platform by its struct v4v_ring_id. A ring is contained in a v4v_ring_t typedef struct v4v_ring { uint64_t magic; /* * Identifies ring_id - xen only looks at this during register/unregister * and will fill in id.addr.domain */ struct v4v_ring_id id; /* length of ring[], must be a multiple of 16 */ uint32_t len; /* rx_ptr - modified by domain */ uint32_t rx_ptr; /* tx_ptr - modified by xen */ volatile uint32_t tx_ptr; uint64_t reserved[4]; volatile uint8_t ring[0]; } v4v_ring_t; To register the ring, xen also needs a list of pfn that the ring occupies. This is provided in a v4v_pfn_list_t typedef struct v4v_pfn_list_t { uint64_t magic; uint32_t npage; uint32_t pad; uint64_t reserved[3]; v4v_pfn_t pages[0]; } v4v_pfn_list_t; Registering the ring is done with a hypercall int hypercall(NR_V4V,V4VOP_register_ring,v4v_ring_t *ring,v4v_pfn_list_t *pfn_list); On registering, xen will check that the tx_ptr and rx_ptr values are valid (aligned and within the size of the ring) and modify them if they are not, it will also update id.addr.domain with the calling domain''s domid and take an internal copy of all the relevant information. After registration the only field that xen will read is ring->rx_ptr, xen will write to ring->ring[] and ring->tx_ptr. Thus pfn_list can be freed after the registration call. Critically, registration is idempotent, and all a domain need do after hibernation or migration is re-register all its rings. (NB the id.addr.domain field may change). ** Sending To send data a guest merely calls the send or sendv hypercall int hypercall(NR_V4V,V4VOP_send,v4v_addr_t *src,v4v_addr_t *dst,void *buf, uint32_t len, uint32_t protocol); the caller provides a source address, a destination address, a buffer, a number of bytes to copy and a 32 bit protocol number. Xen ignores src->domain and instead uses the domain id of the caller. Xen will attempt to insert into a ring with id.addr equal to dst, and with id.partner equal to the caller''s domid. Otherwise it will insert into a ring with id.addr equal to dst and id.partner equal to V4V_DOMID_ANY. If the insert is successfull then Xen will trigger the V4V VIRQ line in the receiving domain. ** Reception Xen pads all messages on the ring to 16 bytes (the macro V4V_ROUNDUP is provided for convience), and inserts a message header infront of all messages it copies onto the ring. struct v4v_ring_message_header { uint32_t len; struct v4v_addr source; uint16_t pad; uint32_t protocol; uint8_t data[0]; }; len is the number of bytes in the message including the struct v4v_ring_message_header source specifies the source address from which the message came, source.domain is set by xen, and source.port is taken from the arguments to the send or sendv call. protocol is the protocol value given to the send call. An inline function is provided to make reading from the ring easier: ssize_t v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol, void *_buf, size_t t, int consume) v4v_copy_out will copy at most t bytes from the ring r and place them in buf, if it is non-null. If from and protocol are not NULL pointers then the source address and protocol number will by copied into them. If consume is non-zero then the ring''s receive pointer will be advanced. The function returns the number of bytes of payload that the message has (ie. the number of bytes passed to the original send or sendv call). Thus, v4v_copy_out(r,NULL,NULL,NULL,0,0); will return the number of bytes in the next message, and v4v_copy_out(r,NULL,NULL,NULL,0,1); will return the number of bytes in the next message and delete it from the ring. ** Notification If the receiving ring is full in a send or sendv hypercall, the hypercall will return with the error -EAGAIN. When sufficient space becomes available in the ring xen will raise the V4V VIRQ line. Jean
On 01/06 02:47, Ian Campbell wrote:> On Thu, 2012-05-31 at 16:07 +0100, Jean Guyader wrote: > > V4V is a copy based inter vm communication system. > > > > Please have a look at this thread for more detail > > about V4V. > > http://lists.xen.org/archives/html/xen-devel/2012-05/msg01866.html > > > > This patch series is work in progress but I wanted to > > post it early enough so I can get feedback from people. > > The main thing which is missing here, which makes it rather hard to > review, is any kind of design documentation. It would also be great to > see the rationale for why things have to be done this way. > > For example it seems that there is a bunch of stuff being added to the > hypervisor which could live in the context of some guest or service > domain -- putting stuff like that in the hypervisor needs strong > arguments and rationale why it has to be there rather than somewhere > else. > > Likewise perhaps the v4v hypercall could effectively be implemented with > a multicall containing some grant copy ops and evtchn manipulations? > > But in the absence of any descriptions of the whats, hows and whys of > v4v its rather hard to have these sorts of conversations. The above are > some concrete examples but I''m more interested in seeing the more > general descriptions before we get into those. > > Ian. > >(resent, sorry if you get this message twice) * Overview ** Architecture v4v has a very simple architecture. The receiving domain registers a ring with xen. The sending domain requests that xen inserts data into the receiving domain''s ring. Notification that there is new data to read is delivered by a VIRQ, and a domain can request an interrupt be generated when sufficient space is available in another domain''s ring. The code that inserts the data into the ring is in xen, and xen writes a header describing the data and where it came from infront of the data. As both the ring manipulation and the header are written by the hypervisor, the receiving domain can trust their content and the format of the ring. ** Addressing v4v_addr_t identifies an endpoint address. typedef struct v4v_addr { uint32_t port; domid_t domain; } v4v_addr_t; struct v4v_ring_id uniquely identifies a ring on the platform. struct v4v_ring_id { struct v4v_addr addr; domid_t partner; }; When a send operation is performed, xen will attempt to find a ring registered by destination domain, with the required port, and with a partner of the sending domain. Failing that it will then search for a ring with the correct port and destination domain, but with partner set to V4V_DOMID_ANY. ** Setup A domain first generates a ring, and a structure describing the pages in the ring. Rings must be page aligned, and contiguous in virtual memory, but need not be contiguous in physical(pfn) or machine(mfn) memory. A ring is uniquely identified on the platform by its struct v4v_ring_id. A ring is contained in a v4v_ring_t typedef struct v4v_ring { uint64_t magic; /* * Identifies ring_id - xen only looks at this during register/unregister * and will fill in id.addr.domain */ struct v4v_ring_id id; /* length of ring[], must be a multiple of 16 */ uint32_t len; /* rx_ptr - modified by domain */ uint32_t rx_ptr; /* tx_ptr - modified by xen */ volatile uint32_t tx_ptr; uint64_t reserved[4]; volatile uint8_t ring[0]; } v4v_ring_t; To register the ring, xen also needs a list of pfn that the ring occupies. This is provided in a v4v_pfn_list_t typedef struct v4v_pfn_list_t { uint64_t magic; uint32_t npage; uint32_t pad; uint64_t reserved[3]; v4v_pfn_t pages[0]; } v4v_pfn_list_t; Registering the ring is done with a hypercall int hypercall(NR_V4V,V4VOP_register_ring,v4v_ring_t *ring,v4v_pfn_list_t *pfn_list); On registering, xen will check that the tx_ptr and rx_ptr values are valid (aligned and within the size of the ring) and modify them if they are not, it will also update id.addr.domain with the calling domain''s domid and take an internal copy of all the relevant information. After registration the only field that xen will read is ring->rx_ptr, xen will write to ring->ring[] and ring->tx_ptr. Thus pfn_list can be freed after the registration call. Critically, registration is idempotent, and all a domain need do after hibernation or migration is re-register all its rings. (NB the id.addr.domain field may change). ** Sending To send data a guest merely calls the send or sendv hypercall int hypercall(NR_V4V,V4VOP_send,v4v_addr_t *src,v4v_addr_t *dst,void *buf, uint32_t len, uint32_t protocol); the caller provides a source address, a destination address, a buffer, a number of bytes to copy and a 32 bit protocol number. Xen ignores src->domain and instead uses the domain id of the caller. Xen will attempt to insert into a ring with id.addr equal to dst, and with id.partner equal to the caller''s domid. Otherwise it will insert into a ring with id.addr equal to dst and id.partner equal to V4V_DOMID_ANY. If the insert is successfull then Xen will trigger the V4V VIRQ line in the receiving domain. ** Reception Xen pads all messages on the ring to 16 bytes (the macro V4V_ROUNDUP is provided for convience), and inserts a message header infront of all messages it copies onto the ring. struct v4v_ring_message_header { uint32_t len; struct v4v_addr source; uint16_t pad; uint32_t protocol; uint8_t data[0]; }; len is the number of bytes in the message including the struct v4v_ring_message_header source specifies the source address from which the message came, source.domain is set by xen, and source.port is taken from the arguments to the send or sendv call. protocol is the protocol value given to the send call. An inline function is provided to make reading from the ring easier: ssize_t v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol, void *_buf, size_t t, int consume) v4v_copy_out will copy at most t bytes from the ring r and place them in buf, if it is non-null. If from and protocol are not NULL pointers then the source address and protocol number will by copied into them. If consume is non-zero then the ring''s receive pointer will be advanced. The function returns the number of bytes of payload that the message has (ie. the number of bytes passed to the original send or sendv call). Thus, v4v_copy_out(r,NULL,NULL,NULL,0,0); will return the number of bytes in the next message, and v4v_copy_out(r,NULL,NULL,NULL,0,1); will return the number of bytes in the next message and delete it from the ring. ** Notification If the receiving ring is full in a send or sendv hypercall, the hypercall will return with the error -EAGAIN. When sufficient space becomes available in the ring xen will raise the V4V VIRQ line. Jean
Hi, Thanks for this. Overall, it looks like Xen is doing a few things here: - nameservice for registering services and finding endpoints; - ring manipulation arithmetic; - copying data; and - notifying endpoints. The shared-ring logic was able to do all of these, with a few drawbacks: - The xenstore handshake stuff is really grotty; - grant maps can cause zombie domains; and - it doesn''t do many-to-one multicast. Is that a fair summary? Using copy semantics intead of shared rings fixes the zombie domain problem, and I think we should definitely take that. We talked last week about ways we could improve xenstore to provde the nameservice aspects of v4v; if that can be done I think it should be. The rest of is within spitting distance of a multicall of grant copies and event-channel pokes - except for the many-to-one multicast. That relies on Xen policing the ring headers to make sure the length fields are correct. What''s the use case for that? Do you use it just to negotiate a dedicated ring for each sender or is there a need for multiple data streams to be multiplexed together? I have a few questions on the design, inline below. At 10:42 +0100 on 07 Jun (1339065745), Jean Guyader wrote:> v4v has a very simple architecture. The receiving domain registers a > ring with xen. The sending domain requests that xen inserts data into > the receiving domain''s ring. Notification that there is new data to > read is delivered by a VIRQNot an evtchn?> , and a domain can request an interrupt be generated when sufficient > space is available in another domain''s ring.How is that arranged? It doesn''t look like the receiver notifies Xen when it consumes a ring entry.> The code that inserts > the data into the ring is in xen, and xen writes a header describing > the data and where it came from infront of the data. As both the ring > manipulation and the header are written by the hypervisor, the > receiving domain can trust their content and the format of the ring.Does that save a copy on the receiving side?> ** Addressing > v4v_addr_t identifies an endpoint address. > typedef struct v4v_addr > { > uint32_t port; > domid_t domain; > } v4v_addr_t; > struct v4v_ring_id uniquely identifies a ring on the platform. > > struct v4v_ring_id > { > struct v4v_addr addr; > domid_t partner; > }; > > When a send operation is performed, xen will attempt to find a ring > registered by destination domain, with the required port, and with > a partner of the sending domain. Failing that it will then search > for a ring with the correct port and destination domain, but with > partner set to V4V_DOMID_ANY. > > ** Setup > A domain first generates a ring, and a structure describing the pages > in the ring. Rings must be page aligned, and contiguous in virtual > memory, but need not be contiguous in physical(pfn) or machine(mfn) > memory. A ring is uniquely identified on the platform by its struct > v4v_ring_id. > > A ring is contained in a v4v_ring_t > typedef struct v4v_ring > { > uint64_t magic; > /* > * Identifies ring_id - xen only looks at this during register/unregister > * and will fill in id.addr.domain > */ > struct v4v_ring_id id; > /* length of ring[], must be a multiple of 16 */ > uint32_t len; > /* rx_ptr - modified by domain */ > uint32_t rx_ptr; > /* tx_ptr - modified by xen */ > volatile uint32_t tx_ptr; > uint64_t reserved[4]; > volatile uint8_t ring[0]; > } v4v_ring_t;(Digressing into code review for a minute, I don''t think we should use ''volatile'' in the public headers, and possibly it will be more efficient to use barriers anyway if we''re processing the received data in place).> To register the ring, xen also needs a list of pfn that the ring occupies. > This is provided in a v4v_pfn_list_t > typedef struct v4v_pfn_list_t > { > uint64_t magic; > uint32_t npage; > uint32_t pad; > uint64_t reserved[3]; > v4v_pfn_t pages[0]; > } v4v_pfn_list_t; > > Registering the ring is done with a hypercall > > int hypercall(NR_V4V,V4VOP_register_ring,v4v_ring_t *ring,v4v_pfn_list_t *pfn_list); > > On registering, xen will check that the tx_ptr and rx_ptr values are valid (aligned > and within the size of the ring) and modify them if they are notModify them? Not return an error?>, it will also update > id.addr.domain with the calling domain''s domid and take an internal copy of all the > relevant information. After registration the only field that xen will read is > ring->rx_ptr, xen will write to ring->ring[] and ring->tx_ptr. > Thus pfn_list can be freed after the registration call. Critically, registration is > idempotent, and all a domain need do after hibernation or migration is re-register all > its rings. (NB the id.addr.domain field may change). > > ** Sending > > To send data a guest merely calls the send or sendv hypercall > int hypercall(NR_V4V,V4VOP_send,v4v_addr_t *src,v4v_addr_t *dst,void *buf, uint32_t len, > uint32_t protocol);What does the source address do here? Do I need to register a ring in order to send to another ring? Why the separate ''protocol'' argument when ''port'' is bound into the address struct? A single ring receives messages for a given port but all protocols?> the caller provides a source address, a destination address, a buffer, a number of bytes to copy > and a 32 bit protocol number. Xen ignores src->domain and instead uses the domain id of the caller. > > Xen will attempt to insert into a ring with id.addr equal to dst, and with > id.partner equal to the caller''s domid. Otherwise it will insert into a ring > with id.addr equal to dst and id.partner equal to V4V_DOMID_ANY. If the insert is successfull > then Xen will trigger the V4V VIRQ line in the receiving domain. > > ** Reception > > Xen pads all messages on the ring to 16 bytes (the macro V4V_ROUNDUP is provided > for convience), and inserts a message header infront of all messages it copies onto the ring. > > struct v4v_ring_message_header > { > uint32_t len; > struct v4v_addr source; > uint16_t pad; > uint32_t protocol; > uint8_t data[0]; > }; > > len is the number of bytes in the message including the struct v4v_ring_message_header > source specifies the source address from which the message came, source.domain is > set by xen, and source.port is taken from the arguments to the send or sendv call. > protocol is the protocol value given to the send call. > > An inline function is provided to make reading from the ring easier: > > ssize_t > v4v_copy_out (struct v4v_ring *r, struct v4v_addr *from, uint32_t * protocol, > void *_buf, size_t t, int consume)I guess that answers my question about avoiding a copy on the receiver. :( Cheers, Tim.
At 12:40 +0100 on 07 Jun (1339072831), Tim Deegan wrote:> Hi, > > Thanks for this. > > Overall, it looks like Xen is doing a few things here: > - nameservice for registering services and finding endpoints; > - ring manipulation arithmetic; > - copying data; and > - notifying endpoints. > > The shared-ring logic was able to do all of these, with a few drawbacks: > - The xenstore handshake stuff is really grotty; > - grant maps can cause zombie domains; and > - it doesn''t do many-to-one multicast.We''ve just had a discussion of this in person, and from my notes the two things that stand out are: - yes, you want to do many-to-one multiplexing, in particular to avoid the server needing a ring for every client; and - one reason not to use Xenstore is that there is only one Xenstore page per VM and it may not be possible to safely share it with other users (in particular between a BIOS/EFI user and the main OS). The Xenstore point is understandable, and probably something that ought to be fixed anyway -- we have seen people run into similar problems with BIOS drivers for blkfront and netfront. Using one ring for all clients raises the question of access control and admission control -- in particular how do you avoid DoS from badly-behaved clients? And, given your concerns about sharing an OS with an uncooperative Xenstore client, how do you handle sharing the OS with a badly behaved v4v client? If we _do_ need one ring with multiple writers, and therefore Xen needs to arbitrate writes, there''s still room for the policy-based parts (controlling connection setup, for example) to live outside the hypervisor, openvswitch-style. Cheers, Tim.
On 07/06 04:36, Tim Deegan wrote:> At 12:40 +0100 on 07 Jun (1339072831), Tim Deegan wrote: > > Hi, > > > > Thanks for this. > > > > Overall, it looks like Xen is doing a few things here: > > - nameservice for registering services and finding endpoints; > > - ring manipulation arithmetic; > > - copying data; and > > - notifying endpoints. > > > > The shared-ring logic was able to do all of these, with a few drawbacks: > > - The xenstore handshake stuff is really grotty; > > - grant maps can cause zombie domains; and > > - it doesn''t do many-to-one multicast. > > We''ve just had a discussion of this in person, and from my notes the two > things that stand out are: > > - yes, you want to do many-to-one multiplexing, in particular to > avoid the server needing a ring for every client; and > - one reason not to use Xenstore is that there is only one Xenstore > page per VM and it may not be possible to safely share it with other > users (in particular between a BIOS/EFI user and the main OS). > > The Xenstore point is understandable, and probably something that ought > to be fixed anyway -- we have seen people run into similar problems with > BIOS drivers for blkfront and netfront. > > Using one ring for all clients raises the question of access control and > admission control -- in particular how do you avoid DoS from > badly-behaved clients? > > And, given your concerns about sharing an OS with an uncooperative > Xenstore client, how do you handle sharing the OS with a badly behaved > v4v client? > > If we _do_ need one ring with multiple writers, and therefore Xen needs > to arbitrate writes, there''s still room for the policy-based parts > (controlling connection setup, for example) to live outside the > hypervisor, openvswitch-style. >Thanks for summary Tim. Today the acl check in V4V (not part of the current patch serie) is done for every copy by Xen. Moving the policy control outside of Xen would mean that you still need to have a copy of the acls in Xen and the worst thing that can happen is for the copy to get out of sync. What do you think would be the next step going forward? Jean
At 11:48 +0100 on 13 Jun (1339588138), Jean Guyader wrote:> On 07/06 04:36, Tim Deegan wrote: > > Using one ring for all clients raises the question of access control and > > admission control -- in particular how do you avoid DoS from > > badly-behaved clients? > > > > And, given your concerns about sharing an OS with an uncooperative > > Xenstore client, how do you handle sharing the OS with a badly behaved > > v4v client? > > > > If we _do_ need one ring with multiple writers, and therefore Xen needs > > to arbitrate writes, there''s still room for the policy-based parts > > (controlling connection setup, for example) to live outside the > > hypervisor, openvswitch-style. > > Today the acl check in V4V (not part of the current patch serie) is > done for every copy by Xen.OK; can you describe roughly how it works? Is it a whitelist of good domains, or a blacklist of bad? Does it just do access control or is there rate limiting? Can it detect and block badly-behaved clients, or is that something you do in the server? Have you given any thought to my second question -- if you can''t rely on the existing xenstore code, are there problems with sharing a VM with other v4v drivers? My intuition is that at least it would not be so bad, since non-malicious drivers ought to be able to coexist, but I''d like to know your opinion.> Moving the policy control outside of Xen > would mean that you still need to have a copy of the acls in Xen and > the worst thing that can happen is for the copy to get out of sync.What I meant by openvswitch-style is to have a single low-level ACL in the hypervisor (maybe as a whitelist of known good connections) and fault all unusual behaviour (new domain appears, &c) to the more complex policy engine in user-space. Really it depends on what kind of policy you want to be able to express. If a simple yes/no whitelist is good enough (and always will be) then it may as well live in the hypervisor and we can skip the ''defer to userspace'' part.> What do you think would be the next step going forward?Hopefully, to get some feedback from other people (specifically Ian, Ian and Stefano but anyone else too!) and decide what, if anything, ought to be changed about the design. My feedback has mostly been about the hypervisor side of things -- I''m sure the tools maintainers have some ideas about how this should be merged with libvchan, to avoid having two separate comms interfaces. Cheers, Tim.
On 13 June 2012 12:44, Tim Deegan <tim@xen.org> wrote:> At 11:48 +0100 on 13 Jun (1339588138), Jean Guyader wrote: >> On 07/06 04:36, Tim Deegan wrote: >> > Using one ring for all clients raises the question of access control and >> > admission control -- in particular how do you avoid DoS from >> > badly-behaved clients? >> > >> > And, given your concerns about sharing an OS with an uncooperative >> > Xenstore client, how do you handle sharing the OS with a badly behaved >> > v4v client? >> > >> > If we _do_ need one ring with multiple writers, and therefore Xen needs >> > to arbitrate writes, there''s still room for the policy-based parts >> > (controlling connection setup, for example) to live outside the >> > hypervisor, openvswitch-style. >> >> Today the acl check in V4V (not part of the current patch serie) is >> done for every copy by Xen. > > OK; can you describe roughly how it works? Is it a whitelist of good > domains, or a blacklist of bad? Does it just do access control or is > there rate limiting? Can it detect and block badly-behaved clients, > or is that something you do in the server? >In xen we keep of list of simple acls. Here is the usage of the userspace tools that controls it. This is a straight forward implementation of the rule API. Usage: viptables command [rule] Commands : --append -A Append rule --insert -I <n> Insert rule before rule <n> --list -L List rules --delete -D[<n>] Delete rule <n> or the following rule --flush -F Flush rules --help -h Help Rule options: --dom-in -d <n> Client domid --dom-out -e <n> Server domid --port-in -p <n> Client port --port-out -q <n> Server port --jump -j {ACCEPT|REJECT} What to do> Have you given any thought to my second question -- if you can''t rely on > the existing xenstore code, are there problems with sharing a VM with > other v4v drivers? My intuition is that at least it would not be so > bad, since non-malicious drivers ought to be able to coexist, but I''d > like to know your opinion. >Are you talking about having different version of V4V driver running in the same VM? I don''t think that is a problem they both interact with Xen via hypercall directly so if they follow the v4v hypercall interface it''s all fine.>> Moving the policy control outside of Xen >> would mean that you still need to have a copy of the acls in Xen and >> the worst thing that can happen is for the copy to get out of sync. > > What I meant by openvswitch-style is to have a single low-level ACL in > the hypervisor (maybe as a whitelist of known good connections) and > fault all unusual behaviour (new domain appears, &c) to the more complex > policy engine in user-space. >Yes sure.> Really it depends on what kind of policy you want to be able to express. > If a simple yes/no whitelist is good enough (and always will be) then it > may as well live in the hypervisor and we can skip the ''defer to > userspace'' part. > >> What do you think would be the next step going forward? > > Hopefully, to get some feedback from other people (specifically Ian, Ian > and Stefano but anyone else too!) and decide what, if anything, ought to > be changed about the design. > > My feedback has mostly been about the hypervisor side of things -- I''m > sure the tools maintainers have some ideas about how this should be > merged with libvchan, to avoid having two separate comms interfaces. >Ok. Thanks for your feedback. Jean
On 01/06 02:24, George Dunlap wrote:> On Fri, Jun 1, 2012 at 1:41 PM, Jan Beulich <JBeulich@suse.com> wrote: > >>>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: > > > > Is there a particular reason this series got resent with no apparent > > change (and also none mentioned in the description here or in the > > individual patches)? All of the comments sent yesterday still apply, > > and I continue to consider patches 1-4 as unnecessary/broken. > > The date on the second set of e-mails is 15 minutes before the > original series. I presume this means that Jean sent the series but > it got held up for some reason, so sent it again 15 minutes later; and > the second series made it through but the first was held up until > today. > > Never attribute to malice that which can be attributed to computer error. :-) >George is absolutely spot on, this is exactly what happen. Sorry about that :( Jean
Jean Guyader
2012-Jun-14 14:08 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
On 31/05 04:47, Jan Beulich wrote:> >>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: > >--- a/xen/include/asm-x86/guest_access.h > >+++ b/xen/include/asm-x86/guest_access.h > >@@ -47,7 +47,7 @@ > > > > /* Cast a guest handle to the specified type of handle. */ > > #define guest_handle_cast(hnd, type) ({ \ > >- type *_x = (hnd).p; \ > >+ type *_x = (type *)(hnd).p; \ > > > You would have to explain how this is safe: Without the cast, we > get compiler warnings (and hence build failures due to -Werror) > if "type *" and typeof((hnd).p) are incompatible. Adding an > explicit cast removes that intentional check. >I can''t realy explain how this is safe because I agree it make this function less safe. Maybe I should put here the reason that led me to do something like that. Here is what I''m trying to do: XEN_GUEST_HANDLE (uint8_t) slop_hnd guest_handle_cast (pfn_list_hnd, uint8_t); guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); I need to cast to uint8_t first to get the add_offset to behave correctly. Maybe what I need would need a new macro that would do those two operations. What would be the proper way to doing something like this? Thanks, Jean
Jan Beulich
2012-Jun-14 14:23 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
>>> On 14.06.12 at 16:08, Jean Guyader <jean.guyader@citrix.com> wrote: > On 31/05 04:47, Jan Beulich wrote: >> >>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: >> >--- a/xen/include/asm-x86/guest_access.h >> >+++ b/xen/include/asm-x86/guest_access.h >> >@@ -47,7 +47,7 @@ >> > >> > /* Cast a guest handle to the specified type of handle. */ >> > #define guest_handle_cast(hnd, type) ({ \ >> >- type *_x = (hnd).p; \ >> >+ type *_x = (type *)(hnd).p; \ >> >> >> You would have to explain how this is safe: Without the cast, we >> get compiler warnings (and hence build failures due to -Werror) >> if "type *" and typeof((hnd).p) are incompatible. Adding an >> explicit cast removes that intentional check. >> > > I can''t realy explain how this is safe because I agree it make > this function less safe. > > Maybe I should put here the reason that led me to do something > like that. Here is what I''m trying to do: > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > guest_handle_cast (pfn_list_hnd, uint8_t); > guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); > pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); > > I need to cast to uint8_t first to get the add_offset to behave > correctly. Maybe what I need would need a new macro that would > do those two operations. > > What would be the proper way to doing something like this?Depends on what pfn_list_hnd''s type is. Maybe going through XEN_GUEST_HANDLE(void) would do (perhaps you could even replace uint8_t with void in your code above, making use of gcc''s extension allowing void pointer arithmetic). If that doesn''t help, maybe you could post a complete example, compilable with your modification, and I could look into making this work with what is there already. Jan
Tim Deegan
2012-Jun-14 14:26 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
At 15:08 +0100 on 14 Jun (1339686495), Jean Guyader wrote:> On 31/05 04:47, Jan Beulich wrote: > > >>> On 31.05.12 at 17:07, Jean Guyader <jean.guyader@citrix.com> wrote: > > >--- a/xen/include/asm-x86/guest_access.h > > >+++ b/xen/include/asm-x86/guest_access.h > > >@@ -47,7 +47,7 @@ > > > > > > /* Cast a guest handle to the specified type of handle. */ > > > #define guest_handle_cast(hnd, type) ({ \ > > >- type *_x = (hnd).p; \ > > >+ type *_x = (type *)(hnd).p; \ > > > > > > You would have to explain how this is safe: Without the cast, we > > get compiler warnings (and hence build failures due to -Werror) > > if "type *" and typeof((hnd).p) are incompatible. Adding an > > explicit cast removes that intentional check. > > > > I can''t realy explain how this is safe because I agree it make > this function less safe. > > Maybe I should put here the reason that led me to do something > like that. Here is what I''m trying to do: > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > guest_handle_cast (pfn_list_hnd, uint8_t); > guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); > pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); > > I need to cast to uint8_t first to get the add_offset to behave > correctly. Maybe what I need would need a new macro that would > do those two operations. > > What would be the proper way to doing something like this?You could avoid it altogether by dropping struct v4v_ring_data, and passing a v4v_pfn_t array directly with the ''npage'' as a separate hypercall argument. AFAICS struct v4v_ring_data has no other useful fields. Tim.
Tim Deegan
2012-Jun-14 14:27 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
At 15:26 +0100 on 14 Jun (1339687574), Tim Deegan wrote:> At 15:08 +0100 on 14 Jun (1339686495), Jean Guyader wrote: > > Maybe I should put here the reason that led me to do something > > like that. Here is what I''m trying to do: > > > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > > guest_handle_cast (pfn_list_hnd, uint8_t); > > guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); > > pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); > > > > I need to cast to uint8_t first to get the add_offset to behave > > correctly. Maybe what I need would need a new macro that would > > do those two operations. > > > > What would be the proper way to doing something like this? > > You could avoid it altogether by dropping struct v4v_ring_data, and > passing a v4v_pfn_t array directly with the ''npage'' as a separate > hypercall argument. AFAICS struct v4v_ring_data has no other useful > fields.Excuse me, I meant struct v4v_pfn_list_t. Tim.
Jean Guyader
2012-Jun-14 14:40 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
On 14/06 03:27, Tim Deegan wrote:> At 15:26 +0100 on 14 Jun (1339687574), Tim Deegan wrote: > > At 15:08 +0100 on 14 Jun (1339686495), Jean Guyader wrote: > > > Maybe I should put here the reason that led me to do something > > > like that. Here is what I''m trying to do: > > > > > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > > > guest_handle_cast (pfn_list_hnd, uint8_t); > > > guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); > > > pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); > > > > > > I need to cast to uint8_t first to get the add_offset to behave > > > correctly. Maybe what I need would need a new macro that would > > > do those two operations. > > > > > > What would be the proper way to doing something like this? > > > > You could avoid it altogether by dropping struct v4v_ring_data, and > > passing a v4v_pfn_t array directly with the ''npage'' as a separate > > hypercall argument. AFAICS struct v4v_ring_data has no other useful > > fields. > > Excuse me, I meant struct v4v_pfn_list_t. >You probably mean both, because we doing the same thing with v4v_ring_data_t and v4v_ring_data_ent_t. Jean
At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote:> Are you talking about having different version of V4V driver running > in the same VM?Yes.> I don''t think that is a problem they both interact with Xen via > hypercall directly so if they follow the v4v hypercall interface it''s > all fine.AFAICS if they both try to register the same port then one of them will silently get its ring discarded. And if they both try to communicate with the same remote port their entries on the pending lists will get merged (which is probably not too bad). I think the possibility for confusion depends on how you use the service. Still, it seems better than the xenstore case, anyway. :) Tim.
On 14/06 03:56, Tim Deegan wrote:> At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > Are you talking about having different version of V4V driver running > > in the same VM? > > Yes. > > > I don''t think that is a problem they both interact with Xen via > > hypercall directly so if they follow the v4v hypercall interface it''s > > all fine. > > AFAICS if they both try to register the same port then one of them will > silently get its ring discarded. And if they both try to communicate > with the same remote port their entries on the pending lists will get > merged (which is probably not too bad). I think the possibility for > confusion depends on how you use the service. Still, it seems better > than the xenstore case, anyway. :) >Not silently, register_ring will return an error. Jean
At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote:> On 14/06 03:56, Tim Deegan wrote: > > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > Are you talking about having different version of V4V driver running > > > in the same VM? > > > > Yes. > > > > > I don''t think that is a problem they both interact with Xen via > > > hypercall directly so if they follow the v4v hypercall interface it''s > > > all fine. > > > > AFAICS if they both try to register the same port then one of them will > > silently get its ring discarded. And if they both try to communicate > > with the same remote port their entries on the pending lists will get > > merged (which is probably not too bad). I think the possibility for > > confusion depends on how you use the service. Still, it seems better > > than the xenstore case, anyway. :) > > > > Not silently, register_ring will return an error.Will it? It looks to me like v4v_ring_add just clobbers the old MFN list. Tim.
Jean Guyader
2012-Jun-14 15:39 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
On 14/06 03:40, Jean Guyader wrote:> On 14/06 03:27, Tim Deegan wrote: > > At 15:26 +0100 on 14 Jun (1339687574), Tim Deegan wrote: > > > At 15:08 +0100 on 14 Jun (1339686495), Jean Guyader wrote: > > > > Maybe I should put here the reason that led me to do something > > > > like that. Here is what I''m trying to do: > > > > > > > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > > > > guest_handle_cast (pfn_list_hnd, uint8_t); > > > > guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); > > > > pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); > > > > > > > > I need to cast to uint8_t first to get the add_offset to behave > > > > correctly. Maybe what I need would need a new macro that would > > > > do those two operations. > > > > > > > > What would be the proper way to doing something like this? > > > > > > You could avoid it altogether by dropping struct v4v_ring_data, and > > > passing a v4v_pfn_t array directly with the ''npage'' as a separate > > > hypercall argument. AFAICS struct v4v_ring_data has no other useful > > > fields. > > > > Excuse me, I meant struct v4v_pfn_list_t. > > > > You probably mean both, because we doing the same thing with > v4v_ring_data_t and v4v_ring_data_ent_t. >Actually I don''t want to get rid of the v4v_ring_data_t struct. The idea being this struct is that we might want to extend it in the future so having a wrapper arround with a magic is important to detect which version of the struct is being used. Here are the structs: typedef struct v4v_ring_data_ent { struct v4v_addr ring; uint16_t flags; uint16_t pad0; uint32_t space_required; uint32_t max_message_size; } v4v_ring_data_ent_t; DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_ent_t); typedef struct v4v_ring_data { uint64_t magic; uint32_t nent; uint32_t padding; uint64_t reserved[4]; v4v_ring_data_ent_t ring[0]; } v4v_ring_data_t; DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_t); I get a XEN_GUEST_HANDLE(v4v_ring_data_t) as argument of my hypercall and I would like to access the ring data inside it which is a XEN_GUEST_HANDLE as well. Here is the code that I use for doing that (with explicte cast in guest_handle_cast): XEN_GUEST_HANDLE (v4v_ring_data_ent_t) ring_data_ent_hnd; XEN_GUEST_HANDLE (uint8_t) slop_hnd guest_handle_cast (ring_data_hnd, uint8_t); guest_handle_add_offset (slop_hnd, sizeof (v4v_ring_data_t)); ring_data_ent_hnd guest_handle_cast (slop_hnd, v4v_ring_data_ent_t); ret = v4v_fill_ring_datas (d, ring_data.nent, ring_data_ent_hnd); Thanks for your help. Jean
Tim Deegan
2012-Jun-14 15:50 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
At 16:39 +0100 on 14 Jun (1339691953), Jean Guyader wrote:> On 14/06 03:40, Jean Guyader wrote: > > On 14/06 03:27, Tim Deegan wrote: > > > At 15:26 +0100 on 14 Jun (1339687574), Tim Deegan wrote: > > > > At 15:08 +0100 on 14 Jun (1339686495), Jean Guyader wrote: > > > > > Maybe I should put here the reason that led me to do something > > > > > like that. Here is what I''m trying to do: > > > > > > > > > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > > > > > guest_handle_cast (pfn_list_hnd, uint8_t); > > > > > guest_handle_add_offset (slop_hnd, sizeof (v4v_pfn_list_t)); > > > > > pfn_hnd = guest_handle_cast (slop_hnd, v4v_pfn_t); > > > > > > > > > > I need to cast to uint8_t first to get the add_offset to behave > > > > > correctly. Maybe what I need would need a new macro that would > > > > > do those two operations. > > > > > > > > > > What would be the proper way to doing something like this? > > > > > > > > You could avoid it altogether by dropping struct v4v_ring_data, and > > > > passing a v4v_pfn_t array directly with the ''npage'' as a separate > > > > hypercall argument. AFAICS struct v4v_ring_data has no other useful > > > > fields. > > > > > > Excuse me, I meant struct v4v_pfn_list_t. > > > > > > > You probably mean both, because we doing the same thing with > > v4v_ring_data_t and v4v_ring_data_ent_t. > > > > Actually I don''t want to get rid of the v4v_ring_data_t struct. > > The idea being this struct is that we might want to extend it in the future > so having a wrapper arround with a magic is important to detect which > version of the struct is being used.Do you have concrete ideas for how you might want to extend the header, or is this just generic futureproofing? Since each of these structs is only used for one command you can allocate new command numbers if you need to add more arguments later. That''s worked well enough for other hypercalls in the past. Cheers, Tim.
Jan Beulich
2012-Jun-14 16:00 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
>>> On 14.06.12 at 17:39, Jean Guyader <jean.guyader@citrix.com> wrote: > Here are the structs: > > typedef struct v4v_ring_data_ent > { > struct v4v_addr ring; > uint16_t flags; > uint16_t pad0; > uint32_t space_required; > uint32_t max_message_size; > } v4v_ring_data_ent_t; > DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_ent_t); > > typedef struct v4v_ring_data > { > uint64_t magic; > uint32_t nent; > uint32_t padding; > uint64_t reserved[4]; > v4v_ring_data_ent_t ring[0]; > } v4v_ring_data_t; > DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_t); > > I get a XEN_GUEST_HANDLE(v4v_ring_data_t) as argument of my hypercall and I > would like to access the ring data inside it which is a XEN_GUEST_HANDLE as well. > > Here is the code that I use for doing that (with explicte cast in > guest_handle_cast): > XEN_GUEST_HANDLE (v4v_ring_data_ent_t) ring_data_ent_hnd; > XEN_GUEST_HANDLE (uint8_t) slop_hnd > guest_handle_cast (ring_data_hnd, uint8_t); > guest_handle_add_offset (slop_hnd, sizeof (v4v_ring_data_t)); > ring_data_ent_hnd > guest_handle_cast (slop_hnd, v4v_ring_data_ent_t); > ret = v4v_fill_ring_datas (d, ring_data.nent, ring_data_ent_hnd);So you really don''t want to add anything (as not being type-safe), but instead want to get a guest handle representation for accessing the ring[0] member. That is, I''d introduce a guest_handle_for_field() for this purpose. Let me see whether I can put something together for you early next week. Jan
On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote:> At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: >> On 14/06 03:56, Tim Deegan wrote: >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: >> > > Are you talking about having different version of V4V driver running >> > > in the same VM? >> > >> > Yes. >> > >> > > I don''t think that is a problem they both interact with Xen via >> > > hypercall directly so if they follow the v4v hypercall interface it''s >> > > all fine. >> > >> > AFAICS if they both try to register the same port then one of them will >> > silently get its ring discarded. And if they both try to communicate >> > with the same remote port their entries on the pending lists will get >> > merged (which is probably not too bad). I think the possibility for >> > confusion depends on how you use the service. Still, it seems better >> > than the xenstore case, anyway. :) >> > >> >> Not silently, register_ring will return an error. > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > list. >Ha yes. It does that now but I think it should return an error informing up the stack that a ring has already been registered. Jean
Jean Guyader
2012-Jun-14 21:19 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
On 14/06 05:00, Jan Beulich wrote:> >>> On 14.06.12 at 17:39, Jean Guyader <jean.guyader@citrix.com> wrote: > > Here are the structs: > > > > typedef struct v4v_ring_data_ent > > { > > struct v4v_addr ring; > > uint16_t flags; > > uint16_t pad0; > > uint32_t space_required; > > uint32_t max_message_size; > > } v4v_ring_data_ent_t; > > DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_ent_t); > > > > typedef struct v4v_ring_data > > { > > uint64_t magic; > > uint32_t nent; > > uint32_t padding; > > uint64_t reserved[4]; > > v4v_ring_data_ent_t ring[0]; > > } v4v_ring_data_t; > > DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_t); > > > > I get a XEN_GUEST_HANDLE(v4v_ring_data_t) as argument of my hypercall and I > > would like to access the ring data inside it which is a XEN_GUEST_HANDLE as well. > > > > Here is the code that I use for doing that (with explicte cast in > > guest_handle_cast): > > XEN_GUEST_HANDLE (v4v_ring_data_ent_t) ring_data_ent_hnd; > > XEN_GUEST_HANDLE (uint8_t) slop_hnd > > guest_handle_cast (ring_data_hnd, uint8_t); > > guest_handle_add_offset (slop_hnd, sizeof (v4v_ring_data_t)); > > ring_data_ent_hnd > > guest_handle_cast (slop_hnd, v4v_ring_data_ent_t); > > ret = v4v_fill_ring_datas (d, ring_data.nent, ring_data_ent_hnd); > > So you really don''t want to add anything (as not being type-safe), > but instead want to get a guest handle representation for accessing > the ring[0] member. That is, I''d introduce a guest_handle_for_field() > for this purpose. Let me see whether I can put something together > for you early next week. >Thanks but I''ll follow Tim''s advice and get rid of the wrapper structure. I still think having some macro to get a handle from a field can be quite useful in general. Jean
Jan Beulich
2012-Jun-18 11:36 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
>>> On 14.06.12 at 17:39, Jean Guyader <jean.guyader@citrix.com> wrote: > Here are the structs: > > typedef struct v4v_ring_data_ent > > { > > struct v4v_addr ring; > > uint16_t flags; > > uint16_t pad0; > > uint32_t space_required; > > uint32_t max_message_size; > > } v4v_ring_data_ent_t; > > DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_ent_t); > > > > typedef struct v4v_ring_data > > { > > uint64_t magic; > > uint32_t nent; > > uint32_t padding; > > uint64_t reserved[4]; > > v4v_ring_data_ent_t ring[0]; > > } v4v_ring_data_t; > > DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_t); > > I get a XEN_GUEST_HANDLE(v4v_ring_data_t) as argument of my hypercall and I > would like to access the ring data inside it which is a XEN_GUEST_HANDLE as well. > > Here is the code that I use for doing that (with explicte cast in > guest_handle_cast): > XEN_GUEST_HANDLE (v4v_ring_data_ent_t) ring_data_ent_hnd; > XEN_GUEST_HANDLE (uint8_t) slop_hnd > guest_handle_cast (ring_data_hnd, uint8_t); > guest_handle_add_offset (slop_hnd, sizeof (v4v_ring_data_t)); > ring_data_ent_hnd > guest_handle_cast (slop_hnd, v4v_ring_data_ent_t); > ret = v4v_fill_ring_datas (d, ring_data.nent, ring_data_ent_hnd);Something as simple as #define guest_handle_for_field(hnd, type, fld) \ ((XEN_GUEST_HANDLE(type)) { &(hnd).p->fld }) works quite fine for me is an example like int v4v_test(struct domain *d, XEN_GUEST_HANDLE(v4v_ring_data_t) urp) { v4v_ring_data_t ring_data; XEN_GUEST_HANDLE(v4v_ring_data_ent_t) ring_data_ent_hnd; copy_from_guest(&ring_data, urp, 1); ring_data_ent_hnd = guest_handle_for_field(urp, v4v_ring_data_ent_t, ring[0]); return v4v_fill_ring_datas(d, ring_data.nent, ring_data_ent_hnd); } Jan
Jean Guyader
2012-Jun-18 12:50 UTC
Re: [PATCH 4/5] xen: Enforce casting for guest_handle_cast
On 18 June 2012 12:36, Jan Beulich <JBeulich@suse.com> wrote:>>>> On 14.06.12 at 17:39, Jean Guyader <jean.guyader@citrix.com> wrote: >> Here are the structs: >> >> typedef struct v4v_ring_data_ent >> >> { >> >> struct v4v_addr ring; >> >> uint16_t flags; >> >> uint16_t pad0; >> >> uint32_t space_required; >> >> uint32_t max_message_size; >> >> } v4v_ring_data_ent_t; >> >> DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_ent_t); >> >> >> >> typedef struct v4v_ring_data >> >> { >> >> uint64_t magic; >> >> uint32_t nent; >> >> uint32_t padding; >> >> uint64_t reserved[4]; >> >> v4v_ring_data_ent_t ring[0]; >> >> } v4v_ring_data_t; >> >> DEFINE_XEN_GUEST_HANDLE (v4v_ring_data_t); >> >> I get a XEN_GUEST_HANDLE(v4v_ring_data_t) as argument of my hypercall and I >> would like to access the ring data inside it which is a XEN_GUEST_HANDLE as well. >> >> Here is the code that I use for doing that (with explicte cast in >> guest_handle_cast): >> XEN_GUEST_HANDLE (v4v_ring_data_ent_t) ring_data_ent_hnd; >> XEN_GUEST_HANDLE (uint8_t) slop_hnd >> guest_handle_cast (ring_data_hnd, uint8_t); >> guest_handle_add_offset (slop_hnd, sizeof (v4v_ring_data_t)); >> ring_data_ent_hnd >> guest_handle_cast (slop_hnd, v4v_ring_data_ent_t); >> ret = v4v_fill_ring_datas (d, ring_data.nent, ring_data_ent_hnd); > > Something as simple as > > #define guest_handle_for_field(hnd, type, fld) \ > ((XEN_GUEST_HANDLE(type)) { &(hnd).p->fld }) > > works quite fine for me is an example like > > int v4v_test(struct domain *d, XEN_GUEST_HANDLE(v4v_ring_data_t) urp) { > v4v_ring_data_t ring_data; > XEN_GUEST_HANDLE(v4v_ring_data_ent_t) ring_data_ent_hnd; > > copy_from_guest(&ring_data, urp, 1); > ring_data_ent_hnd = guest_handle_for_field(urp, v4v_ring_data_ent_t, ring[0]); > return v4v_fill_ring_datas(d, ring_data.nent, ring_data_ent_hnd); > } >Thanks! That works for me too. I''ll replace the code in my patch series. Jean
At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote:> On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > >> On 14/06 03:56, Tim Deegan wrote: > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > >> > > Are you talking about having different version of V4V driver running > >> > > in the same VM? > >> > > >> > Yes. > >> > > >> > > I don''t think that is a problem they both interact with Xen via > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > >> > > all fine. > >> > > >> > AFAICS if they both try to register the same port then one of them will > >> > silently get its ring discarded. And if they both try to communicate > >> > with the same remote port their entries on the pending lists will get > >> > merged (which is probably not too bad). I think the possibility for > >> > confusion depends on how you use the service. Still, it seems better > >> > than the xenstore case, anyway. :) > >> > > >> > >> Not silently, register_ring will return an error. > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > list. > > > > Ha yes. It does that now but I think it should return an error > informing up the stack that a ring has already been registered.Actually, I think it''s deliberate, to allow a guest to re-register all its rings after a suspend/resume or migration, without having to worry about whether it was actually migrated into a new domain or not. That raises the question of how a v4v client ought to handle migration; doesn''t it have to rely on other OS components to notify it that one has happened? Cheers, Tim.
Hi, Sorry it''s taken me so long to get round to responding to this. On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote:> At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > >> On 14/06 03:56, Tim Deegan wrote: > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > >> > > Are you talking about having different version of V4V driver running > > >> > > in the same VM? > > >> > > > >> > Yes. > > >> > > > >> > > I don''t think that is a problem they both interact with Xen via > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > >> > > all fine. > > >> > > > >> > AFAICS if they both try to register the same port then one of them will > > >> > silently get its ring discarded. And if they both try to communicate > > >> > with the same remote port their entries on the pending lists will get > > >> > merged (which is probably not too bad). I think the possibility for > > >> > confusion depends on how you use the service. Still, it seems better > > >> > than the xenstore case, anyway. :) > > >> > > > >> > > >> Not silently, register_ring will return an error. > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > list. > > > > > > > Ha yes. It does that now but I think it should return an error > > informing up the stack that a ring has already been registered. > > Actually, I think it''s deliberate, to allow a guest to re-register all > its rings after a suspend/resume or migration, without having to worry > about whether it was actually migrated into a new domain or not.Which takes us back to the original issue Tim asked about with cohabitation of multiple (perhaps just plain buggy or even malicious) v4v clients in a single domain, doesn''t it? If one of the reasons for not using xenstore is the inability for multiple clients to co-exist then there is no point moving to a different scheme with the same (even if lesser) issue. Up until now v4v has only been used by XenClient so it has avoided this issue but if we take it upstream then presumably the potential for this sort of problem to creep in comes back. Some other things from my notes... Can you remind me of the reasoning behind using VIRQ_V4V instead of regular event channels? I can''t see any reason why these shouldn''t/couldn''t be regular event channels demuxed in the usual way. Are you going to submit any client code? I think the most important of these would be code to make libvchan able to use v4v if it is available (and both ends agree), but any other client code (like the sockets interface overlay I''ve heard about) would be interesting to see too. Have you had any contact from any one else who is interested in building stuff with these interfaces? (This is just curiosity about potential wider usage) I think you and Tim have been discussing access control -- can we get a summary of what this is going to look like going forward. I suppose there will be tools for manipulating this stuff -- can you post them? The patches need plenty of documentation adding to them, both in the interface docs in xen/include/public (marked up such that it comes out nicely in docs/html aka http://xenbits.xen.org/docs/unstable/hypercall/index.html) as well as design docs and any other useful material you have under docs itself. Are we generally happy that we could run e.g. block or net traffic over this channel? As it stands for example the single VIRQ would seem to provide a scalability limit to how many disks and nics a domain could have. Are there any other considerations we need to take into account here? Lastly -- have you given any thoughts to making the copy operation asynchronous? This might allow us to take advantage of copy offload hardware in the future? Ian.
On 25/06 10:05, Tim Deegan wrote:> At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > >> On 14/06 03:56, Tim Deegan wrote: > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > >> > > Are you talking about having different version of V4V driver running > > >> > > in the same VM? > > >> > > > >> > Yes. > > >> > > > >> > > I don''t think that is a problem they both interact with Xen via > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > >> > > all fine. > > >> > > > >> > AFAICS if they both try to register the same port then one of them will > > >> > silently get its ring discarded. And if they both try to communicate > > >> > with the same remote port their entries on the pending lists will get > > >> > merged (which is probably not too bad). I think the possibility for > > >> > confusion depends on how you use the service. Still, it seems better > > >> > than the xenstore case, anyway. :) > > >> > > > >> > > >> Not silently, register_ring will return an error. > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > list. > > > > > > > Ha yes. It does that now but I think it should return an error > > informing up the stack that a ring has already been registered. > > Actually, I think it''s deliberate, to allow a guest to re-register all > its rings after a suspend/resume or migration, without having to worry > about whether it was actually migrated into a new domain or not. >Yes your are right. The driver will still try to register but a correct error code could tell it weather or not it has been done.> That raises the question of how a v4v client ought to handle migration; > doesn''t it have to rely on other OS components to notify it that one has > happened? >Migration will cause the connection to close and the event will propagated up the stack. Jean
On 26/06 03:38, Ian Campbell wrote:> Hi, > > Sorry it''s taken me so long to get round to responding to this. > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > >> On 14/06 03:56, Tim Deegan wrote: > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > >> > > Are you talking about having different version of V4V driver running > > > >> > > in the same VM? > > > >> > > > > >> > Yes. > > > >> > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > >> > > all fine. > > > >> > > > > >> > AFAICS if they both try to register the same port then one of them will > > > >> > silently get its ring discarded. And if they both try to communicate > > > >> > with the same remote port their entries on the pending lists will get > > > >> > merged (which is probably not too bad). I think the possibility for > > > >> > confusion depends on how you use the service. Still, it seems better > > > >> > than the xenstore case, anyway. :) > > > >> > > > > >> > > > >> Not silently, register_ring will return an error. > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > list. > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > informing up the stack that a ring has already been registered. > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > its rings after a suspend/resume or migration, without having to worry > > about whether it was actually migrated into a new domain or not. > > Which takes us back to the original issue Tim asked about with > cohabitation of multiple (perhaps just plain buggy or even malicious) > v4v clients in a single domain, doesn''t it? >There is nothing wrong the two v4v driver running in the same guest. The probably that Tim reported was about trying to create two connections on the same port. Today with the code that I''ve submited in the RFC one will overwrite the other silently which isn''t a good thing, that can easily be changed to notify which one got registered up the stack.> If one of the reasons for not using xenstore is the inability for > multiple clients to co-exist then there is no point moving to a > different scheme with the same (even if lesser) issue. Up until now v4v > has only been used by XenClient so it has avoided this issue but if we > take it upstream then presumably the potential for this sort of problem > to creep in comes back. > > Some other things from my notes... > > Can you remind me of the reasoning behind using VIRQ_V4V instead of > regular event channels? I can''t see any reason why these > shouldn''t/couldn''t be regular event channels demuxed in the usual way. >No good reason, the virq can be changed to a event channel. We thought that event channels required other machinery to function. If we can deal with the event channel in the v4v driver directly I don''t have any problem changing it to a event channel. What I have in mind is if one would want to use v4v in a rombios for instance where you can''t rely on any of the xen pv driver code to be present.> Are you going to submit any client code? I think the most important of > these would be code to make libvchan able to use v4v if it is available > (and both ends agree), but any other client code (like the sockets > interface overlay I''ve heard about) would be interesting to see too. >Yes. I still have to submit the kernel driver code and the v4v library that talks to it. But now that you''ve pointed out that we might be able to use an event channel instead of a virq, I think we might event be able to have a driver in userspace only.> Have you had any contact from any one else who is interested in building > stuff with these interfaces? (This is just curiosity about potential > wider usage) >Today in XenClient we use this interface for: - USB transport between dom0 and guest - RPC bus beween dom0 and domU - Cross domain Citrix ICA connections. The good thing that this interface can offer is a transport for all the existing networking programs. I haven''t had any contact from people interested in building stuff with these interfaces but I think V4V has quite a lot to offer already in term of thing at the top that can use it.> I think you and Tim have been discussing access control -- can we get a > summary of what this is going to look like going forward. I suppose > there will be tools for manipulating this stuff -- can you post them? >I''ll post a new series this week that will include some of a feedback and the access control.> The patches need plenty of documentation adding to them, both in the > interface docs in xen/include/public (marked up such that it comes out > nicely in docs/html aka > http://xenbits.xen.org/docs/unstable/hypercall/index.html) as well as > design docs and any other useful material you have under docs itself. >I agree such change need a lot of document. Once we agreed on the interfaces I will start working on a complete documentation for them.> Are we generally happy that we could run e.g. block or net traffic over > this channel? As it stands for example the single VIRQ would seem to > provide a scalability limit to how many disks and nics a domain could > have. Are there any other considerations we need to take into account > here? > > Lastly -- have you given any thoughts to making the copy operation > asynchronous? This might allow us to take advantage of copy offload > hardware in the future? >A CPU copy already has to happen once in the guest to put it in the ring so the second copy that is done in Xen will be really cheap because it''s very linkely going to be in the cache. I don''t doing async copy in Xen will have a huge impact on the performance. Thanks for taking the time to review. Jean
At 11:38 +0100 on 28 Jun (1340883538), Jean Guyader wrote:> On 26/06 03:38, Ian Campbell wrote: > > Lastly -- have you given any thoughts to making the copy operation > > asynchronous? This might allow us to take advantage of copy offload > > hardware in the future? > > A CPU copy already has to happen once in the guest to put it in the > ringI don''t understand -- isn''t the vector-send operation designed to have Xen do a single copy from scattered sources into the destination ring? So the sender could be zero-copy?> so the second copy that is done in Xen will be really cheap because > it''s very linkely going to be in the cache. I don''t doing async copy > in Xen will have a huge impact on the performance.Using a DMA engine to do the copy could free up CPU cycles that Xen could use for other vcpus, so even if there isn''t a throughput increase on the v4v channel, there could be a benefit to the system as a whole. We could do that with synchronous copies, pausing the vcpu while the copy happens, but I think to get full throughput we''d want to pipeline a few writes ahead. Anyway, that''s all very forward-looking: AFAICS there''s no need to address it right now except to be careful not to bake things into the interface that would stop us doing this later on. Cheers, Tim.
On 28/06 11:50, Tim Deegan wrote:> At 11:38 +0100 on 28 Jun (1340883538), Jean Guyader wrote: > > On 26/06 03:38, Ian Campbell wrote: > > > Lastly -- have you given any thoughts to making the copy operation > > > asynchronous? This might allow us to take advantage of copy offload > > > hardware in the future? > > > > A CPU copy already has to happen once in the guest to put it in the > > ring > > I don''t understand -- isn''t the vector-send operation designed to have > Xen do a single copy from scattered sources into the destination ring? > So the sender could be zero-copy? >Yes sorry that is right. On send a get a pointer from the guest and then Xen memcpy the data on the receiver''s ring. So the second copy that will probably happened between the ring and some memory in the receiver will be mostly free.> > so the second copy that is done in Xen will be really cheap because > > it''s very linkely going to be in the cache. I don''t doing async copy > > in Xen will have a huge impact on the performance. > > Using a DMA engine to do the copy could free up CPU cycles that Xen > could use for other vcpus, so even if there isn''t a throughput increase > on the v4v channel, there could be a benefit to the system as a whole. > We could do that with synchronous copies, pausing the vcpu while the > copy happens, but I think to get full throughput we''d want to pipeline a > few writes ahead. >Maybe if you ring is big enough and if the guest sends very big buffer to be copied that will makes sence to use async copy like i/oat or DMA.> Anyway, that''s all very forward-looking: AFAICS there''s no need to > address it right now except to be careful not to bake things into the > interface that would stop us doing this later on. >Jean
On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote:> On 26/06 03:38, Ian Campbell wrote: > > Hi, > > > > Sorry it''s taken me so long to get round to responding to this. > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > >> > > Are you talking about having different version of V4V driver running > > > > >> > > in the same VM? > > > > >> > > > > > >> > Yes. > > > > >> > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > >> > > all fine. > > > > >> > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > >> > with the same remote port their entries on the pending lists will get > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > >> > than the xenstore case, anyway. :) > > > > >> > > > > > >> > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > list. > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > informing up the stack that a ring has already been registered. > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > its rings after a suspend/resume or migration, without having to worry > > > about whether it was actually migrated into a new domain or not. > > > > Which takes us back to the original issue Tim asked about with > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > v4v clients in a single domain, doesn''t it? > > > > There is nothing wrong the two v4v driver running in the same guest. > The probably that Tim reported was about trying to create two connections > on the same port. Today with the code that I''ve submited in the RFC > one will overwrite the other silently which isn''t a good thing, that can > easily be changed to notify which one got registered up the stack.So they''d somehow need to randomise (and retry) their use of source ports in order to co-exist? Speaking of ports, is there a registry somewhere of the well known port numbers and/or any scheme for administering these? (a text file in the repo would be find by me).> > If one of the reasons for not using xenstore is the inability for > > multiple clients to co-exist then there is no point moving to a > > different scheme with the same (even if lesser) issue. Up until now v4v > > has only been used by XenClient so it has avoided this issue but if we > > take it upstream then presumably the potential for this sort of problem > > to creep in comes back. > > > > Some other things from my notes... > > > > Can you remind me of the reasoning behind using VIRQ_V4V instead of > > regular event channels? I can''t see any reason why these > > shouldn''t/couldn''t be regular event channels demuxed in the usual way. > > > > No good reason, the virq can be changed to a event channel. We thought > that event channels required other machinery to function. If we can > deal with the event channel in the v4v driver directly I don''t have any > problem changing it to a event channel. What I have in mind is if one > would want to use v4v in a rombios for instance where you can''t rely > on any of the xen pv driver code to be present.I think that will be ok -- under the hood a VIRQ is just an evtchn too so if you can make the VIRQ work you can a regular evtchn work too.> > Are you going to submit any client code? I think the most important of > > these would be code to make libvchan able to use v4v if it is available > > (and both ends agree), but any other client code (like the sockets > > interface overlay I''ve heard about) would be interesting to see too. > > > > Yes. I still have to submit the kernel driver code and the v4v library that > talks to it. But now that you''ve pointed out that we might be able to use > an event channel instead of a virq, I think we might event be able to have > a driver in userspace only.That would be a nice property to have!> > The patches need plenty of documentation adding to them, both in the > > interface docs in xen/include/public (marked up such that it comes out > > nicely in docs/html aka > > http://xenbits.xen.org/docs/unstable/hypercall/index.html) as well as > > design docs and any other useful material you have under docs itself. > > > > I agree such change need a lot of document. Once we agreed on the interfaces > I will start working on a complete documentation for them.Thanks! Ian.
On 28/06 12:34, Ian Campbell wrote:> On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > > On 26/06 03:38, Ian Campbell wrote: > > > Hi, > > > > > > Sorry it''s taken me so long to get round to responding to this. > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > > >> > > Are you talking about having different version of V4V driver running > > > > > >> > > in the same VM? > > > > > >> > > > > > > >> > Yes. > > > > > >> > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > > >> > > all fine. > > > > > >> > > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > > >> > with the same remote port their entries on the pending lists will get > > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > > >> > than the xenstore case, anyway. :) > > > > > >> > > > > > > >> > > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > > list. > > > > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > > informing up the stack that a ring has already been registered. > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > > its rings after a suspend/resume or migration, without having to worry > > > > about whether it was actually migrated into a new domain or not. > > > > > > Which takes us back to the original issue Tim asked about with > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > > v4v clients in a single domain, doesn''t it? > > > > > > > There is nothing wrong the two v4v driver running in the same guest. > > The probably that Tim reported was about trying to create two connections > > on the same port. Today with the code that I''ve submited in the RFC > > one will overwrite the other silently which isn''t a good thing, that can > > easily be changed to notify which one got registered up the stack. > > So they''d somehow need to randomise (and retry) their use of source > ports in order to co-exist? >That can be assimilated to two userspace programs trying to bind to the same TCP port. I think it''s not v4v''s responsability to solve this problem.> Speaking of ports, is there a registry somewhere of the well known port > numbers and/or any scheme for administering these? (a text file in the > repo would be find by me). >Port numbers are 32 bits, by convention the first 65535 will match the TCP onces, then for the rest we can have a file in the repo to reference them.> > > If one of the reasons for not using xenstore is the inability for > > > multiple clients to co-exist then there is no point moving to a > > > different scheme with the same (even if lesser) issue. Up until now v4v > > > has only been used by XenClient so it has avoided this issue but if we > > > take it upstream then presumably the potential for this sort of problem > > > to creep in comes back. > > > > > > Some other things from my notes... > > > > > > Can you remind me of the reasoning behind using VIRQ_V4V instead of > > > regular event channels? I can''t see any reason why these > > > shouldn''t/couldn''t be regular event channels demuxed in the usual way. > > > > > > > No good reason, the virq can be changed to a event channel. We thought > > that event channels required other machinery to function. If we can > > deal with the event channel in the v4v driver directly I don''t have any > > problem changing it to a event channel. What I have in mind is if one > > would want to use v4v in a rombios for instance where you can''t rely > > on any of the xen pv driver code to be present. > > I think that will be ok -- under the hood a VIRQ is just an evtchn too > so if you can make the VIRQ work you can a regular evtchn work too. > > > > Are you going to submit any client code? I think the most important of > > > these would be code to make libvchan able to use v4v if it is available > > > (and both ends agree), but any other client code (like the sockets > > > interface overlay I''ve heard about) would be interesting to see too. > > > > > > > Yes. I still have to submit the kernel driver code and the v4v library that > > talks to it. But now that you''ve pointed out that we might be able to use > > an event channel instead of a virq, I think we might event be able to have > > a driver in userspace only. > > That would be a nice property to have! > > > > The patches need plenty of documentation adding to them, both in the > > > interface docs in xen/include/public (marked up such that it comes out > > > nicely in docs/html aka > > > http://xenbits.xen.org/docs/unstable/hypercall/index.html) as well as > > > design docs and any other useful material you have under docs itself. > > > > > > > I agree such change need a lot of document. Once we agreed on the interfaces > > I will start working on a complete documentation for them. > > Thanks! >Jean
On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote:> On 28/06 12:34, Ian Campbell wrote: > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > > > On 26/06 03:38, Ian Campbell wrote: > > > > Hi, > > > > > > > > Sorry it''s taken me so long to get round to responding to this. > > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > > > >> > > Are you talking about having different version of V4V driver running > > > > > > >> > > in the same VM? > > > > > > >> > > > > > > > >> > Yes. > > > > > > >> > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > > > >> > > all fine. > > > > > > >> > > > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > > > >> > with the same remote port their entries on the pending lists will get > > > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > > > >> > than the xenstore case, anyway. :) > > > > > > >> > > > > > > > >> > > > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > > > list. > > > > > > > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > > > informing up the stack that a ring has already been registered. > > > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > > > its rings after a suspend/resume or migration, without having to worry > > > > > about whether it was actually migrated into a new domain or not. > > > > > > > > Which takes us back to the original issue Tim asked about with > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > > > v4v clients in a single domain, doesn''t it? > > > > > > > > > > There is nothing wrong the two v4v driver running in the same guest. > > > The probably that Tim reported was about trying to create two connections > > > on the same port. Today with the code that I''ve submited in the RFC > > > one will overwrite the other silently which isn''t a good thing, that can > > > easily be changed to notify which one got registered up the stack. > > > > So they''d somehow need to randomise (and retry) their use of source > > ports in order to co-exist? > > > > That can be assimilated to two userspace programs trying to bind to the > same TCP port. I think it''s not v4v''s responsability to solve this problem.An application using TCP doesn''t need to worry about choosing its own source port though. Or does this only effect destination / listening ports?> > > Speaking of ports, is there a registry somewhere of the well known port > > numbers and/or any scheme for administering these? (a text file in the > > repo would be find by me). > > > > Port numbers are 32 bits, by convention the first 65535 will match the TCP onces, > then for the rest we can have a file in the repo to reference them.OK. Ian.
On 28/06 12:58, Ian Campbell wrote:> On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote: > > On 28/06 12:34, Ian Campbell wrote: > > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > > > > On 26/06 03:38, Ian Campbell wrote: > > > > > Hi, > > > > > > > > > > Sorry it''s taken me so long to get round to responding to this. > > > > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > > > > >> > > Are you talking about having different version of V4V driver running > > > > > > > >> > > in the same VM? > > > > > > > >> > > > > > > > > >> > Yes. > > > > > > > >> > > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > > > > >> > > all fine. > > > > > > > >> > > > > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > > > > >> > with the same remote port their entries on the pending lists will get > > > > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > > > > >> > than the xenstore case, anyway. :) > > > > > > > >> > > > > > > > > >> > > > > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > > > > list. > > > > > > > > > > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > > > > informing up the stack that a ring has already been registered. > > > > > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > > > > its rings after a suspend/resume or migration, without having to worry > > > > > > about whether it was actually migrated into a new domain or not. > > > > > > > > > > Which takes us back to the original issue Tim asked about with > > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > > > > v4v clients in a single domain, doesn''t it? > > > > > > > > > > > > > There is nothing wrong the two v4v driver running in the same guest. > > > > The probably that Tim reported was about trying to create two connections > > > > on the same port. Today with the code that I''ve submited in the RFC > > > > one will overwrite the other silently which isn''t a good thing, that can > > > > easily be changed to notify which one got registered up the stack. > > > > > > So they''d somehow need to randomise (and retry) their use of source > > > ports in order to co-exist? > > > > > > > That can be assimilated to two userspace programs trying to bind to the > > same TCP port. I think it''s not v4v''s responsability to solve this problem. > > An application using TCP doesn''t need to worry about choosing its own > source port though. > > Or does this only effect destination / listening ports? >The guest v4v driver knows which port are in used so if you put port 0 we will pick a random unused number for the source port. Jean> > > > > Speaking of ports, is there a registry somewhere of the well known port > > > numbers and/or any scheme for administering these? (a text file in the > > > repo would be find by me). > > > > > > > Port numbers are 32 bits, by convention the first 65535 will match the TCP onces, > > then for the rest we can have a file in the repo to reference them. > > OK. > > Ian. > >
On Thu, 2012-06-28 at 13:10 +0100, Jean Guyader wrote:> On 28/06 12:58, Ian Campbell wrote: > > On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote: > > > On 28/06 12:34, Ian Campbell wrote: > > > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > > > > > On 26/06 03:38, Ian Campbell wrote: > > > > > > Hi, > > > > > > > > > > > > Sorry it''s taken me so long to get round to responding to this. > > > > > > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > > > > > >> > > Are you talking about having different version of V4V driver running > > > > > > > > >> > > in the same VM? > > > > > > > > >> > > > > > > > > > >> > Yes. > > > > > > > > >> > > > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > > > > > >> > > all fine. > > > > > > > > >> > > > > > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > > > > > >> > with the same remote port their entries on the pending lists will get > > > > > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > > > > > >> > than the xenstore case, anyway. :) > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > > > > > list. > > > > > > > > > > > > > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > > > > > informing up the stack that a ring has already been registered. > > > > > > > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > > > > > its rings after a suspend/resume or migration, without having to worry > > > > > > > about whether it was actually migrated into a new domain or not. > > > > > > > > > > > > Which takes us back to the original issue Tim asked about with > > > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > > > > > v4v clients in a single domain, doesn''t it? > > > > > > > > > > > > > > > > There is nothing wrong the two v4v driver running in the same guest. > > > > > The probably that Tim reported was about trying to create two connections > > > > > on the same port. Today with the code that I''ve submited in the RFC > > > > > one will overwrite the other silently which isn''t a good thing, that can > > > > > easily be changed to notify which one got registered up the stack. > > > > > > > > So they''d somehow need to randomise (and retry) their use of source > > > > ports in order to co-exist? > > > > > > > > > > That can be assimilated to two userspace programs trying to bind to the > > > same TCP port. I think it''s not v4v''s responsability to solve this problem. > > > > An application using TCP doesn''t need to worry about choosing its own > > source port though. > > > > Or does this only effect destination / listening ports? > > > > The guest v4v driver knows which port are in used so if you put port 0 > we will pick a random unused number for the source port.Except when there are two such drivers each doesn''t know which the other one is using. Ian.> > Jean > > > > > > > > Speaking of ports, is there a registry somewhere of the well known port > > > > numbers and/or any scheme for administering these? (a text file in the > > > > repo would be find by me). > > > > > > > > > > Port numbers are 32 bits, by convention the first 65535 will match the TCP onces, > > > then for the rest we can have a file in the repo to reference them. > > > > OK. > > > > Ian. > > > >
On 28/06 01:36, Ian Campbell wrote:> On Thu, 2012-06-28 at 13:10 +0100, Jean Guyader wrote: > > On 28/06 12:58, Ian Campbell wrote: > > > On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote: > > > > On 28/06 12:34, Ian Campbell wrote: > > > > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > > > > > > On 26/06 03:38, Ian Campbell wrote: > > > > > > > Hi, > > > > > > > > > > > > > > Sorry it''s taken me so long to get round to responding to this. > > > > > > > > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > > > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > > > > > > >> > > Are you talking about having different version of V4V driver running > > > > > > > > > >> > > in the same VM? > > > > > > > > > >> > > > > > > > > > > >> > Yes. > > > > > > > > > >> > > > > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > > > > > > >> > > all fine. > > > > > > > > > >> > > > > > > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > > > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > > > > > > >> > with the same remote port their entries on the pending lists will get > > > > > > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > > > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > > > > > > >> > than the xenstore case, anyway. :) > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > > > > > > list. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > > > > > > informing up the stack that a ring has already been registered. > > > > > > > > > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > > > > > > its rings after a suspend/resume or migration, without having to worry > > > > > > > > about whether it was actually migrated into a new domain or not. > > > > > > > > > > > > > > Which takes us back to the original issue Tim asked about with > > > > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > > > > > > v4v clients in a single domain, doesn''t it? > > > > > > > > > > > > > > > > > > > There is nothing wrong the two v4v driver running in the same guest. > > > > > > The probably that Tim reported was about trying to create two connections > > > > > > on the same port. Today with the code that I''ve submited in the RFC > > > > > > one will overwrite the other silently which isn''t a good thing, that can > > > > > > easily be changed to notify which one got registered up the stack. > > > > > > > > > > So they''d somehow need to randomise (and retry) their use of source > > > > > ports in order to co-exist? > > > > > > > > > > > > > That can be assimilated to two userspace programs trying to bind to the > > > > same TCP port. I think it''s not v4v''s responsability to solve this problem. > > > > > > An application using TCP doesn''t need to worry about choosing its own > > > source port though. > > > > > > Or does this only effect destination / listening ports? > > > > > > > The guest v4v driver knows which port are in used so if you put port 0 > > we will pick a random unused number for the source port. > > Except when there are two such drivers each doesn''t know which the other > one is using. >Then the kernel will try to register the ring and the hypercall will fail because it''s already registered. Jean
On Thu, 2012-06-28 at 14:43 +0100, Jean Guyader wrote:> On 28/06 01:36, Ian Campbell wrote: > > On Thu, 2012-06-28 at 13:10 +0100, Jean Guyader wrote: > > > On 28/06 12:58, Ian Campbell wrote: > > > > On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote: > > > > > On 28/06 12:34, Ian Campbell wrote: > > > > > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > > > > > > > On 26/06 03:38, Ian Campbell wrote: > > > > > > > > Hi, > > > > > > > > > > > > > > > > Sorry it''s taken me so long to get round to responding to this. > > > > > > > > > > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > > > > > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > > > > > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > > > > > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > > > > > > > > > > >> On 14/06 03:56, Tim Deegan wrote: > > > > > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > > > > > > > > > > >> > > Are you talking about having different version of V4V driver running > > > > > > > > > > >> > > in the same VM? > > > > > > > > > > >> > > > > > > > > > > > >> > Yes. > > > > > > > > > > >> > > > > > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > > > > > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > > > > > > > > > > >> > > all fine. > > > > > > > > > > >> > > > > > > > > > > > >> > AFAICS if they both try to register the same port then one of them will > > > > > > > > > > >> > silently get its ring discarded. And if they both try to communicate > > > > > > > > > > >> > with the same remote port their entries on the pending lists will get > > > > > > > > > > >> > merged (which is probably not too bad). I think the possibility for > > > > > > > > > > >> > confusion depends on how you use the service. Still, it seems better > > > > > > > > > > >> > than the xenstore case, anyway. :) > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> Not silently, register_ring will return an error. > > > > > > > > > > > > > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > > > > > > > > > > > list. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Ha yes. It does that now but I think it should return an error > > > > > > > > > > informing up the stack that a ring has already been registered. > > > > > > > > > > > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > > > > > > > > > its rings after a suspend/resume or migration, without having to worry > > > > > > > > > about whether it was actually migrated into a new domain or not. > > > > > > > > > > > > > > > > Which takes us back to the original issue Tim asked about with > > > > > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > > > > > > > > v4v clients in a single domain, doesn''t it? > > > > > > > > > > > > > > > > > > > > > > There is nothing wrong the two v4v driver running in the same guest. > > > > > > > The probably that Tim reported was about trying to create two connections > > > > > > > on the same port. Today with the code that I''ve submited in the RFC > > > > > > > one will overwrite the other silently which isn''t a good thing, that can > > > > > > > easily be changed to notify which one got registered up the stack. > > > > > > > > > > > > So they''d somehow need to randomise (and retry) their use of source > > > > > > ports in order to co-exist? > > > > > > > > > > > > > > > > That can be assimilated to two userspace programs trying to bind to the > > > > > same TCP port. I think it''s not v4v''s responsability to solve this problem. > > > > > > > > An application using TCP doesn''t need to worry about choosing its own > > > > source port though. > > > > > > > > Or does this only effect destination / listening ports? > > > > > > > > > > The guest v4v driver knows which port are in used so if you put port 0 > > > we will pick a random unused number for the source port. > > > > Except when there are two such drivers each doesn''t know which the other > > one is using. > > > > Then the kernel will try to register the ring and the hypercall will fail > because it''s already registered.At which point what happens? How do two unrelated V4V drivers co-exist given this? Ian.
On 28 June 2012 14:47, Ian Campbell <Ian.Campbell@citrix.com> wrote:> On Thu, 2012-06-28 at 14:43 +0100, Jean Guyader wrote: >> On 28/06 01:36, Ian Campbell wrote: >> > On Thu, 2012-06-28 at 13:10 +0100, Jean Guyader wrote: >> > > On 28/06 12:58, Ian Campbell wrote: >> > > > On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote: >> > > > > On 28/06 12:34, Ian Campbell wrote: >> > > > > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: >> > > > > > > On 26/06 03:38, Ian Campbell wrote: >> > > > > > > > Hi, >> > > > > > > > >> > > > > > > > Sorry it''s taken me so long to get round to responding to this. >> > > > > > > > >> > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: >> > > > > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: >> > > > > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: >> > > > > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: >> > > > > > > > > > >> On 14/06 03:56, Tim Deegan wrote: >> > > > > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: >> > > > > > > > > > >> > > Are you talking about having different version of V4V driver running >> > > > > > > > > > >> > > in the same VM? >> > > > > > > > > > >> > >> > > > > > > > > > >> > Yes. >> > > > > > > > > > >> > >> > > > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via >> > > > > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s >> > > > > > > > > > >> > > all fine. >> > > > > > > > > > >> > >> > > > > > > > > > >> > AFAICS if they both try to register the same port then one of them will >> > > > > > > > > > >> > silently get its ring discarded. And if they both try to communicate >> > > > > > > > > > >> > with the same remote port their entries on the pending lists will get >> > > > > > > > > > >> > merged (which is probably not too bad). I think the possibility for >> > > > > > > > > > >> > confusion depends on how you use the service. Still, it seems better >> > > > > > > > > > >> > than the xenstore case, anyway. :) >> > > > > > > > > > >> > >> > > > > > > > > > >> >> > > > > > > > > > >> Not silently, register_ring will return an error. >> > > > > > > > > > > >> > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN >> > > > > > > > > > > list. >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > Ha yes. It does that now but I think it should return an error >> > > > > > > > > > informing up the stack that a ring has already been registered. >> > > > > > > > > >> > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all >> > > > > > > > > its rings after a suspend/resume or migration, without having to worry >> > > > > > > > > about whether it was actually migrated into a new domain or not. >> > > > > > > > >> > > > > > > > Which takes us back to the original issue Tim asked about with >> > > > > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) >> > > > > > > > v4v clients in a single domain, doesn''t it? >> > > > > > > > >> > > > > > > >> > > > > > > There is nothing wrong the two v4v driver running in the same guest. >> > > > > > > The probably that Tim reported was about trying to create two connections >> > > > > > > on the same port. Today with the code that I''ve submited in the RFC >> > > > > > > one will overwrite the other silently which isn''t a good thing, that can >> > > > > > > easily be changed to notify which one got registered up the stack. >> > > > > > >> > > > > > So they''d somehow need to randomise (and retry) their use of source >> > > > > > ports in order to co-exist? >> > > > > > >> > > > > >> > > > > That can be assimilated to two userspace programs trying to bind to the >> > > > > same TCP port. I think it''s not v4v''s responsability to solve this problem. >> > > > >> > > > An application using TCP doesn''t need to worry about choosing its own >> > > > source port though. >> > > > >> > > > Or does this only effect destination / listening ports? >> > > > >> > > >> > > The guest v4v driver knows which port are in used so if you put port 0 >> > > we will pick a random unused number for the source port. >> > >> > Except when there are two such drivers each doesn''t know which the other >> > one is using. >> > >> >> Then the kernel will try to register the ring and the hypercall will fail >> because it''s already registered. > > At which point what happens? How do two unrelated V4V drivers co-exist > given this? >It happens when both driver will start registering their rings and if they are using the same port. One will get a success out of the hypercall the other one a failure. If we are in the case that user space used 0 for port we will retry with another random port number until it succeed. Jean
On Thu, 2012-06-28 at 17:35 +0100, Jean Guyader wrote:> On 28 June 2012 14:47, Ian Campbell <Ian.Campbell@citrix.com> wrote: > > On Thu, 2012-06-28 at 14:43 +0100, Jean Guyader wrote: > >> On 28/06 01:36, Ian Campbell wrote: > >> > On Thu, 2012-06-28 at 13:10 +0100, Jean Guyader wrote: > >> > > On 28/06 12:58, Ian Campbell wrote: > >> > > > On Thu, 2012-06-28 at 12:43 +0100, Jean Guyader wrote: > >> > > > > On 28/06 12:34, Ian Campbell wrote: > >> > > > > > On Thu, 2012-06-28 at 11:38 +0100, Jean Guyader wrote: > >> > > > > > > On 26/06 03:38, Ian Campbell wrote: > >> > > > > > > > Hi, > >> > > > > > > > > >> > > > > > > > Sorry it''s taken me so long to get round to responding to this. > >> > > > > > > > > >> > > > > > > > On Mon, 2012-06-25 at 10:05 +0100, Tim Deegan wrote: > >> > > > > > > > > At 22:14 +0100 on 14 Jun (1339712061), Jean Guyader wrote: > >> > > > > > > > > > On 14 June 2012 16:35, Tim Deegan <tim@xen.org> wrote: > >> > > > > > > > > > > At 16:10 +0100 on 14 Jun (1339690244), Jean Guyader wrote: > >> > > > > > > > > > >> On 14/06 03:56, Tim Deegan wrote: > >> > > > > > > > > > >> > At 11:55 +0100 on 14 Jun (1339674908), Jean Guyader wrote: > >> > > > > > > > > > >> > > Are you talking about having different version of V4V driver running > >> > > > > > > > > > >> > > in the same VM? > >> > > > > > > > > > >> > > >> > > > > > > > > > >> > Yes. > >> > > > > > > > > > >> > > >> > > > > > > > > > >> > > I don''t think that is a problem they both interact with Xen via > >> > > > > > > > > > >> > > hypercall directly so if they follow the v4v hypercall interface it''s > >> > > > > > > > > > >> > > all fine. > >> > > > > > > > > > >> > > >> > > > > > > > > > >> > AFAICS if they both try to register the same port then one of them will > >> > > > > > > > > > >> > silently get its ring discarded. And if they both try to communicate > >> > > > > > > > > > >> > with the same remote port their entries on the pending lists will get > >> > > > > > > > > > >> > merged (which is probably not too bad). I think the possibility for > >> > > > > > > > > > >> > confusion depends on how you use the service. Still, it seems better > >> > > > > > > > > > >> > than the xenstore case, anyway. :) > >> > > > > > > > > > >> > > >> > > > > > > > > > >> > >> > > > > > > > > > >> Not silently, register_ring will return an error. > >> > > > > > > > > > > > >> > > > > > > > > > > Will it? It looks to me like v4v_ring_add just clobbers the old MFN > >> > > > > > > > > > > list. > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > Ha yes. It does that now but I think it should return an error > >> > > > > > > > > > informing up the stack that a ring has already been registered. > >> > > > > > > > > > >> > > > > > > > > Actually, I think it''s deliberate, to allow a guest to re-register all > >> > > > > > > > > its rings after a suspend/resume or migration, without having to worry > >> > > > > > > > > about whether it was actually migrated into a new domain or not. > >> > > > > > > > > >> > > > > > > > Which takes us back to the original issue Tim asked about with > >> > > > > > > > cohabitation of multiple (perhaps just plain buggy or even malicious) > >> > > > > > > > v4v clients in a single domain, doesn''t it? > >> > > > > > > > > >> > > > > > > > >> > > > > > > There is nothing wrong the two v4v driver running in the same guest. > >> > > > > > > The probably that Tim reported was about trying to create two connections > >> > > > > > > on the same port. Today with the code that I''ve submited in the RFC > >> > > > > > > one will overwrite the other silently which isn''t a good thing, that can > >> > > > > > > easily be changed to notify which one got registered up the stack. > >> > > > > > > >> > > > > > So they''d somehow need to randomise (and retry) their use of source > >> > > > > > ports in order to co-exist? > >> > > > > > > >> > > > > > >> > > > > That can be assimilated to two userspace programs trying to bind to the > >> > > > > same TCP port. I think it''s not v4v''s responsability to solve this problem. > >> > > > > >> > > > An application using TCP doesn''t need to worry about choosing its own > >> > > > source port though. > >> > > > > >> > > > Or does this only effect destination / listening ports? > >> > > > > >> > > > >> > > The guest v4v driver knows which port are in used so if you put port 0 > >> > > we will pick a random unused number for the source port. > >> > > >> > Except when there are two such drivers each doesn''t know which the other > >> > one is using. > >> > > >> > >> Then the kernel will try to register the ring and the hypercall will fail > >> because it''s already registered. > > > > At which point what happens? How do two unrelated V4V drivers co-exist > > given this? > > > > It happens when both driver will start registering their rings and if they are > using the same port. One will get a success out of the hypercall the other one > a failure. If we are in the case that user space used 0 for port we > will retry with > another random port number until it succeed.So the upshot is that for two v4v clients to co-exist then they must both be prepared to do this retry loop as well as to handle unexpected failures to bind to particular ports which they might have thought were free. If a particular client is buggy in this regard then the only impact is it to itself and not other, correct, clients (which is scant consolation if it is the one built into your OS and driving your network). So, not exactly ideal but I suppose it is somewhat better than XenStore (which currently has only one client which must provide a suitable API to any other XS users). Ian.