Hi Samuel and Daniel
I sent a patch to switch cxenstored''s event loop from using select to
using poll several weeks ago, however this would break xenstore-stubdom
as Mini-OS has no poll(2) implementation at the moment.
I think implementing poll(2) for Mini-OS could be a good idea, but I
don''t know how far I should go. I''m not familiar with
xenstore-stubdom,
and I tried setting it up but it didn''t work so I gave up. :-(
To my understanding we only care about the interface but not the
implementation. I looked into Mini-OS''s lib/sys.c this morning,
noticing
that the internal file abstraction only supports up to 32 files. Is this
xenstore-stubdom only for DomU? If it is for Dom0, how can it handle
more than 32 fds?
My main question is, is it possible to just wrap around select(2) in
Mini-OS to implement poll(2)? (as shown in the conceptual patch)
Wei.
conceptual patch like this:
-----8<----
diff --git a/extras/mini-os/lib/sys.c b/extras/mini-os/lib/sys.c
index 3cc3340..d463231 100644
--- a/extras/mini-os/lib/sys.c
+++ b/extras/mini-os/lib/sys.c
@@ -678,6 +678,29 @@ static void dump_set(int nfds, fd_set *readfds, fd_set
*writefds, fd_set *except
 #define dump_set(nfds, readfds, writefds, exceptfds, timeout)
 #endif
 
+#ifdef LIBC_DEBUG
+static void dump_pollfds(struct pollfd *pfd, int nfds, int timeout)
+{
+    int i, comma, fd;
+
+    printk("[");
+    comma = 0;
+    for (i = 0; i < nfds; i++) {
+	 fd = pfd[i].fd;
+	 if (comma)
+	      printk(", ");
+	 printk("%d(%c)/%02x", fd, file_types[files[fd].type],
+		pfd[i].events);
+	    comma = 1;
+    }
+    printk("]");
+
+    printk(", %d, %d", nfds, timeout);
+}
+#else
+#define dump_pollfds(pfds, nfds, timeout)
+#endif
+
 /* Just poll without blocking */
 static int select_poll(int nfds, fd_set *readfds, fd_set *writefds, fd_set
*exceptfds)
 {
@@ -983,6 +1006,53 @@ out:
     return ret;
 }
 
+/*
+ * As Mini-OS only supports NOFILE files, we can just wrap around
+ * select?
+ */
+int poll(struct pollfd pfd[], int nfds, int timeout)
+{
+    int ret;
+    int i, fd;
+    struct timeval _timeo;
+    fd_set rfds, wfds;
+
+    DEBUG("poll(");
+    dump_pollfds(pfd, nfds, timeout);
+    DEBUG(")\n");
+
+    /* Timeout in poll is in second. */
+    _timeo.tv_sec  = timeout;
+    _timeo.tv_usec = 0;
+
+    FD_ZERO(&rfds);
+    FD_ZERO(&wfds);
+
+    for (i = 0; i < nfds; i++) {
+        fd = pfd[i].fd;
+	if (pfd[i].events & POLLIN)
+            FD_SET(fd, &rfds);
+	if (pfd[i].events & POLLOUT)
+            FD_SET(fd, &wfds);
+    }
+
+    ret = select(&rfds, &wfds, NULL, &_timeo);
+
+    for (i = 0; i < nfds; i++) {
+        fd = pfd[i].fd;
+	if (FD_ISSET(fd, &rfds))
+            pfd[i].revents |= POLLIN;
+	if (FD_ISSET(fd, &wfds))
+            pfd[i].revents |= POLLOUT;
+    }
+
+    DEBUG("poll(");
+    dump_pollfds(pfd, nfds, timeout);
+    DEBUG(")\n");
+
+    return ret;
+}
+
 #ifdef HAVE_LWIP
 int socket(int domain, int type, int protocol)
 {
@@ -1360,7 +1430,6 @@ unsupported_function(int, tcgetattr, 0);
 unsupported_function(int, grantpt, -1);
 unsupported_function(int, unlockpt, -1);
 unsupported_function(char *, ptsname, NULL);
-unsupported_function(int, poll, -1);
 
 /* net/if.h */
 unsupported_function_log(unsigned int, if_nametoindex, -1);
On Mon, 2013-02-18 at 15:40 +0000, Wei Liu wrote:> Hi Samuel and Daniel > > I sent a patch to switch cxenstored''s event loop from using select to > using poll several weeks ago, however this would break xenstore-stubdom > as Mini-OS has no poll(2) implementation at the moment. > > I think implementing poll(2) for Mini-OS could be a good idea, but I > don''t know how far I should go. I''m not familiar with xenstore-stubdom, > and I tried setting it up but it didn''t work so I gave up. :-( > > To my understanding we only care about the interface but not the > implementation. I looked into Mini-OS''s lib/sys.c this morning, noticing > that the internal file abstraction only supports up to 32 files. Is this > xenstore-stubdom only for DomU? If it is for Dom0, how can it handle > more than 32 fds?What do you mean? A stubdom must necessarily be dom!=0 but it should be able to service all other domains, including dom0 and other domUs. Do we use an fd per evtchn or only one in xenstore? If the former then that''s a bit of a limitation of the xenstore stubdom!> My main question is, is it possible to just wrap around select(2) in > Mini-OS to implement poll(2)? (as shown in the conceptual patch) > > > > Wei. > > conceptual patch like this: > -----8<---- > diff --git a/extras/mini-os/lib/sys.c b/extras/mini-os/lib/sys.c > index 3cc3340..d463231 100644 > --- a/extras/mini-os/lib/sys.c > +++ b/extras/mini-os/lib/sys.c > @@ -678,6 +678,29 @@ static void dump_set(int nfds, fd_set *readfds, fd_set *writefds, fd_set *except > #define dump_set(nfds, readfds, writefds, exceptfds, timeout) > #endif > > +#ifdef LIBC_DEBUG > +static void dump_pollfds(struct pollfd *pfd, int nfds, int timeout) > +{ > + int i, comma, fd; > + > + printk("["); > + comma = 0; > + for (i = 0; i < nfds; i++) { > + fd = pfd[i].fd; > + if (comma) > + printk(", "); > + printk("%d(%c)/%02x", fd, file_types[files[fd].type], > + pfd[i].events); > + comma = 1; > + } > + printk("]"); > + > + printk(", %d, %d", nfds, timeout); > +} > +#else > +#define dump_pollfds(pfds, nfds, timeout) > +#endif > + > /* Just poll without blocking */ > static int select_poll(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds) > { > @@ -983,6 +1006,53 @@ out: > return ret; > } > > +/* > + * As Mini-OS only supports NOFILE files, we can just wrap around > + * select? > + */ > +int poll(struct pollfd pfd[], int nfds, int timeout) > +{ > + int ret; > + int i, fd; > + struct timeval _timeo; > + fd_set rfds, wfds; > + > + DEBUG("poll("); > + dump_pollfds(pfd, nfds, timeout); > + DEBUG(")\n"); > + > + /* Timeout in poll is in second. */ > + _timeo.tv_sec = timeout; > + _timeo.tv_usec = 0; > + > + FD_ZERO(&rfds); > + FD_ZERO(&wfds); > + > + for (i = 0; i < nfds; i++) { > + fd = pfd[i].fd; > + if (pfd[i].events & POLLIN) > + FD_SET(fd, &rfds); > + if (pfd[i].events & POLLOUT) > + FD_SET(fd, &wfds); > + } > + > + ret = select(&rfds, &wfds, NULL, &_timeo); > + > + for (i = 0; i < nfds; i++) { > + fd = pfd[i].fd; > + if (FD_ISSET(fd, &rfds)) > + pfd[i].revents |= POLLIN; > + if (FD_ISSET(fd, &wfds)) > + pfd[i].revents |= POLLOUT; > + } > + > + DEBUG("poll("); > + dump_pollfds(pfd, nfds, timeout); > + DEBUG(")\n"); > + > + return ret; > +} > + > #ifdef HAVE_LWIP > int socket(int domain, int type, int protocol) > { > @@ -1360,7 +1430,6 @@ unsupported_function(int, tcgetattr, 0); > unsupported_function(int, grantpt, -1); > unsupported_function(int, unlockpt, -1); > unsupported_function(char *, ptsname, NULL); > -unsupported_function(int, poll, -1); > > /* net/if.h */ > unsupported_function_log(unsigned int, if_nametoindex, -1); > > > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xen.org > http://lists.xen.org/xen-devel
On Mon, 2013-02-18 at 16:06 +0000, Ian Campbell wrote:> On Mon, 2013-02-18 at 15:40 +0000, Wei Liu wrote: > > Hi Samuel and Daniel > > > > I sent a patch to switch cxenstored''s event loop from using select to > > using poll several weeks ago, however this would break xenstore-stubdom > > as Mini-OS has no poll(2) implementation at the moment. > > > > I think implementing poll(2) for Mini-OS could be a good idea, but I > > don''t know how far I should go. I''m not familiar with xenstore-stubdom, > > and I tried setting it up but it didn''t work so I gave up. :-( > > > > To my understanding we only care about the interface but not the > > implementation. I looked into Mini-OS''s lib/sys.c this morning, noticing > > that the internal file abstraction only supports up to 32 files. Is this > > xenstore-stubdom only for DomU? If it is for Dom0, how can it handle > > more than 32 fds? > > What do you mean? A stubdom must necessarily be dom!=0 but it should be > able to service all other domains, including dom0 and other domUs. >Hah? This is my first impression, but we still need a xenstored running in Dom0, right? Otherwise what''s the output of `xenstore-ls` in Dom0?> Do we use an fd per evtchn or only one in xenstore? If the former then > that''s a bit of a limitation of the xenstore stubdom! >Xenstore has a struct connection which has one fd for each connection. So if there are too many connections, how can xenstore stubdom handle this situation? As I can see in a xenstore process running in Dom0, it can certainly opens more than 32 fds. Wei.> > My main question is, is it possible to just wrap around select(2) in > > Mini-OS to implement poll(2)? (as shown in the conceptual patch) > > > > > > > > Wei. > > > > conceptual patch like this: > > -----8<---- > > diff --git a/extras/mini-os/lib/sys.c b/extras/mini-os/lib/sys.c > > index 3cc3340..d463231 100644 > > --- a/extras/mini-os/lib/sys.c > > +++ b/extras/mini-os/lib/sys.c > > @@ -678,6 +678,29 @@ static void dump_set(int nfds, fd_set *readfds, fd_set *writefds, fd_set *except > > #define dump_set(nfds, readfds, writefds, exceptfds, timeout) > > #endif > > > > +#ifdef LIBC_DEBUG > > +static void dump_pollfds(struct pollfd *pfd, int nfds, int timeout) > > +{ > > + int i, comma, fd; > > + > > + printk("["); > > + comma = 0; > > + for (i = 0; i < nfds; i++) { > > + fd = pfd[i].fd; > > + if (comma) > > + printk(", "); > > + printk("%d(%c)/%02x", fd, file_types[files[fd].type], > > + pfd[i].events); > > + comma = 1; > > + } > > + printk("]"); > > + > > + printk(", %d, %d", nfds, timeout); > > +} > > +#else > > +#define dump_pollfds(pfds, nfds, timeout) > > +#endif > > + > > /* Just poll without blocking */ > > static int select_poll(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds) > > { > > @@ -983,6 +1006,53 @@ out: > > return ret; > > } > > > > +/* > > + * As Mini-OS only supports NOFILE files, we can just wrap around > > + * select? > > + */ > > +int poll(struct pollfd pfd[], int nfds, int timeout) > > +{ > > + int ret; > > + int i, fd; > > + struct timeval _timeo; > > + fd_set rfds, wfds; > > + > > + DEBUG("poll("); > > + dump_pollfds(pfd, nfds, timeout); > > + DEBUG(")\n"); > > + > > + /* Timeout in poll is in second. */ > > + _timeo.tv_sec = timeout; > > + _timeo.tv_usec = 0; > > + > > + FD_ZERO(&rfds); > > + FD_ZERO(&wfds); > > + > > + for (i = 0; i < nfds; i++) { > > + fd = pfd[i].fd; > > + if (pfd[i].events & POLLIN) > > + FD_SET(fd, &rfds); > > + if (pfd[i].events & POLLOUT) > > + FD_SET(fd, &wfds); > > + } > > + > > + ret = select(&rfds, &wfds, NULL, &_timeo); > > + > > + for (i = 0; i < nfds; i++) { > > + fd = pfd[i].fd; > > + if (FD_ISSET(fd, &rfds)) > > + pfd[i].revents |= POLLIN; > > + if (FD_ISSET(fd, &wfds)) > > + pfd[i].revents |= POLLOUT; > > + } > > + > > + DEBUG("poll("); > > + dump_pollfds(pfd, nfds, timeout); > > + DEBUG(")\n"); > > + > > + return ret; > > +} > > + > > #ifdef HAVE_LWIP > > int socket(int domain, int type, int protocol) > > { > > @@ -1360,7 +1430,6 @@ unsupported_function(int, tcgetattr, 0); > > unsupported_function(int, grantpt, -1); > > unsupported_function(int, unlockpt, -1); > > unsupported_function(char *, ptsname, NULL); > > -unsupported_function(int, poll, -1); > > > > /* net/if.h */ > > unsupported_function_log(unsigned int, if_nametoindex, -1); > > > > > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xen.org > > http://lists.xen.org/xen-devel > >
On Mon, 2013-02-18 at 16:18 +0000, Wei Liu wrote:> On Mon, 2013-02-18 at 16:06 +0000, Ian Campbell wrote: > > On Mon, 2013-02-18 at 15:40 +0000, Wei Liu wrote: > > > Hi Samuel and Daniel > > > > > > I sent a patch to switch cxenstored''s event loop from using select to > > > using poll several weeks ago, however this would break xenstore-stubdom > > > as Mini-OS has no poll(2) implementation at the moment. > > > > > > I think implementing poll(2) for Mini-OS could be a good idea, but I > > > don''t know how far I should go. I''m not familiar with xenstore-stubdom, > > > and I tried setting it up but it didn''t work so I gave up. :-( > > > > > > To my understanding we only care about the interface but not the > > > implementation. I looked into Mini-OS''s lib/sys.c this morning, noticing > > > that the internal file abstraction only supports up to 32 files. Is this > > > xenstore-stubdom only for DomU? If it is for Dom0, how can it handle > > > more than 32 fds? > > > > What do you mean? A stubdom must necessarily be dom!=0 but it should be > > able to service all other domains, including dom0 and other domUs. > > > > Hah? This is my first impression, but we still need a xenstored running > in Dom0, right?No> Otherwise what''s the output of `xenstore-ls` in Dom0?It talks to the remote stubdom, in the same way that a domU normally would. There is only one xenstored in the entire system. (Ignoring extreme disaggregation like the XOAR paper).> > Do we use an fd per evtchn or only one in xenstore? If the former then > > that''s a bit of a limitation of the xenstore stubdom! > > > > Xenstore has a struct connection which has one fd for each connection. > So if there are too many connections, how can xenstore stubdom handle > this situation? As I can see in a xenstore process running in Dom0, it > can certainly opens more than 32 fds.In theory it could share the same /dev/xen/evtchn handle between all connections. Ian.
On Mon, 2013-02-18 at 16:25 +0000, Ian Campbell wrote:> On Mon, 2013-02-18 at 16:18 +0000, Wei Liu wrote: > > On Mon, 2013-02-18 at 16:06 +0000, Ian Campbell wrote: > > > On Mon, 2013-02-18 at 15:40 +0000, Wei Liu wrote: > > > > Hi Samuel and Daniel > > > > > > > > I sent a patch to switch cxenstored''s event loop from using select to > > > > using poll several weeks ago, however this would break xenstore-stubdom > > > > as Mini-OS has no poll(2) implementation at the moment. > > > > > > > > I think implementing poll(2) for Mini-OS could be a good idea, but I > > > > don''t know how far I should go. I''m not familiar with xenstore-stubdom, > > > > and I tried setting it up but it didn''t work so I gave up. :-( > > > > > > > > To my understanding we only care about the interface but not the > > > > implementation. I looked into Mini-OS''s lib/sys.c this morning, noticing > > > > that the internal file abstraction only supports up to 32 files. Is this > > > > xenstore-stubdom only for DomU? If it is for Dom0, how can it handle > > > > more than 32 fds? > > > > > > What do you mean? A stubdom must necessarily be dom!=0 but it should be > > > able to service all other domains, including dom0 and other domUs. > > > > > > > Hah? This is my first impression, but we still need a xenstored running > > in Dom0, right? > > No > > > Otherwise what''s the output of `xenstore-ls` in Dom0? > > It talks to the remote stubdom, in the same way that a domU normally > would. > > There is only one xenstored in the entire system. (Ignoring extreme > disaggregation like the XOAR paper). >OK.> > > Do we use an fd per evtchn or only one in xenstore? If the former then > > > that''s a bit of a limitation of the xenstore stubdom! > > > > > > > Xenstore has a struct connection which has one fd for each connection. > > So if there are too many connections, how can xenstore stubdom handle > > this situation? As I can see in a xenstore process running in Dom0, it > > can certainly opens more than 32 fds. > > In theory it could share the same /dev/xen/evtchn handle between all > connections. >I think the real magic is that in cxenstore''s implementation there is some ifdef around specific code to avoid using too many fds, but I haven''t gone too deep into that... Wei.> Ian. >
On Mon, 2013-02-18 at 17:05 +0000, Wei Liu wrote:> On Mon, 2013-02-18 at 16:25 +0000, Ian Campbell wrote: > > On Mon, 2013-02-18 at 16:18 +0000, Wei Liu wrote: > > > On Mon, 2013-02-18 at 16:06 +0000, Ian Campbell wrote: > > > > On Mon, 2013-02-18 at 15:40 +0000, Wei Liu wrote: > > > > > Hi Samuel and Daniel > > > > > > > > > > I sent a patch to switch cxenstored''s event loop from using select to > > > > > using poll several weeks ago, however this would break xenstore-stubdom > > > > > as Mini-OS has no poll(2) implementation at the moment. > > > > > > > > > > I think implementing poll(2) for Mini-OS could be a good idea, but I > > > > > don''t know how far I should go. I''m not familiar with xenstore-stubdom, > > > > > and I tried setting it up but it didn''t work so I gave up. :-( > > > > > > > > > > To my understanding we only care about the interface but not the > > > > > implementation. I looked into Mini-OS''s lib/sys.c this morning, noticing > > > > > that the internal file abstraction only supports up to 32 files. Is this > > > > > xenstore-stubdom only for DomU? If it is for Dom0, how can it handle > > > > > more than 32 fds? > > > > > > > > What do you mean? A stubdom must necessarily be dom!=0 but it should be > > > > able to service all other domains, including dom0 and other domUs. > > > > > > > > > > Hah? This is my first impression, but we still need a xenstored running > > > in Dom0, right? > > > > No > > > > > Otherwise what''s the output of `xenstore-ls` in Dom0? > > > > It talks to the remote stubdom, in the same way that a domU normally > > would. > > > > There is only one xenstored in the entire system. (Ignoring extreme > > disaggregation like the XOAR paper). > > > > OK. > > > > > Do we use an fd per evtchn or only one in xenstore? If the former then > > > > that''s a bit of a limitation of the xenstore stubdom! > > > > > > > > > > Xenstore has a struct connection which has one fd for each connection. > > > So if there are too many connections, how can xenstore stubdom handle > > > this situation? As I can see in a xenstore process running in Dom0, it > > > can certainly opens more than 32 fds. > > > > In theory it could share the same /dev/xen/evtchn handle between all > > connections. > > > > I think the real magic is that in cxenstore''s implementation there is > some ifdef around specific code to avoid using too many fds, but I > haven''t gone too deep into that... >Oops, hit send for a mis-composted email, please ignore this... I''m still investigating this. Wei.
Wei Liu, le Mon 18 Feb 2013 15:40:56 +0000, a écrit :> I sent a patch to switch cxenstored''s event loop from using select to > using poll several weeks ago,What is the rationale BTW? Efficiency?> My main question is, is it possible to just wrap around select(2) in > Mini-OS to implement poll(2)? (as shown in the conceptual patch)Yes, except that there are evil small differences between poll and select, which we need to take care of. About the 32 fd limitation, it indeed comes from the days when we always had a bounded number of things to open. We need to drop that limitation, but it should be easy, by turning the `files'' array into a reallocable pointer. There is just one important thing: the xenbus_event_queue events field has to be made a pointer to a dynamically-allocated queue, otherwise it will get moved by the reallocation and things will go wrong very badly.> +int poll(struct pollfd pfd[], int nfds, int timeout) > +{...> + > + /* Timeout in poll is in second. */ > + _timeo.tv_sec = timeout;FIXME: timeout is in ms, not sec.> + _timeo.tv_usec = 0;Samuel
On Mon, 2013-02-18 at 17:12 +0000, Samuel Thibault wrote:> Wei Liu, le Mon 18 Feb 2013 15:40:56 +0000, a écrit : > > I sent a patch to switch cxenstored's event loop from using select to > > using poll several weeks ago, > > What is the rationale BTW? Efficiency? >Oh, the rationale is that select can only support 1024 fds in Linux. If we run more than 1024 guest (in fact the real limit is like 1022), xenstore hangs and makes the whole system unusable.> > My main question is, is it possible to just wrap around select(2) in > > Mini-OS to implement poll(2)? (as shown in the conceptual patch) > > Yes, except that there are evil small differences between poll and > select, which we need to take care of. >Sure. :-)> About the 32 fd limitation, it indeed comes from the days when we always > had a bounded number of things to open. We need to drop that limitation, > but it should be easy, by turning the `files' array into a reallocable > pointer. There is just one important thing: the xenbus_event_queue > events field has to be made a pointer to a dynamically-allocated queue, > otherwise it will get moved by the reallocation and things will go wrong > very badly. >So apart from the modification needed, is this "32" really a bottleneck for xenstore-stubdom? How many domains can a xenstore-stubdom serve? (I think these two questions are for Daniel) Wei. _______________________________________________ Xen-devel mailing list Xen-devel@lists.xen.org http://lists.xen.org/xen-devel
On Mon, 2013-02-18 at 17:21 +0000, Wei Liu wrote:> So apart from the modification needed, is this "32" really a bottleneck > for xenstore-stubdom? How many domains can a xenstore-stubdom serve?There''s no reason in principal to expect this to be any less than xenstored running in dom0 can support, issues like the one you''ve found notwithstanding. Ian.
Wei Liu, le Mon 18 Feb 2013 17:21:02 +0000, a écrit :> So apart from the modification needed, is this "32" really a bottleneck > for xenstore-stubdom?Actually I don''t think it is: from what I see in the xenstore source, different FDs are only used for connexions on the unix-connexion socket, but for stubdoms, such socket does not make sense and is thus disabled anyway. So NOFILE being 32 is actually not an issue at all for xenstore-stubdom. Samuel
Wei Liu, le Mon 18 Feb 2013 15:40:56 +0000, a écrit :> + ret = select(&rfds, &wfds, NULL, &_timeo); > + > + for (i = 0; i < nfds; i++) { > + fd = pfd[i].fd;Here we have to set revents to 0 before adding some bits. The caller may not have cleared revents already.> + if (FD_ISSET(fd, &rfds)) > + pfd[i].revents |= POLLIN; > + if (FD_ISSET(fd, &wfds)) > + pfd[i].revents |= POLLOUT; > + }Samuel