Bamvor Jian Zhang
2013-Oct-24  07:30 UTC
[PATCH] improve the error message in "xl list --long"
This is a bug reported by a customer. xl outputs the following message
when "xl list --long domain_name" is issued:
Domain name must be specified.
The error message is completely unclear. The error occurs because the
domain was not created by xl itself: xl can see the domain name but
cannot get the domain configuration.
Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
---
Changes since RFC:
Raise the error in list_domains_details instead of modifying
libxl_userdata_retrieve itself.
Four commands invoke libxl_userdata_retrieve: create, save, migrate and
list. Issuing the create command means the domain was not created by
another toolstack, and save and migrate already check the return value
of libxl_userdata_retrieve. So the only place that needs the check is
list_domains_details.
 tools/libxl/xl_cmdimpl.c | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
index a8261be..81bdb91 100644
--- a/tools/libxl/xl_cmdimpl.c
+++ b/tools/libxl/xl_cmdimpl.c
@@ -3064,6 +3064,11 @@ static void list_domains_details(const libxl_dominfo *info, int nb_domain)
         rc = libxl_userdata_retrieve(ctx, info[i].domid, "xl", &data, &len);
         if (rc)
             continue;
+        if (len == 0) {
+            fprintf(stderr, "No xl userdata found. Is domain id"
+                    "'%u' owned by another toolstack?", info[i].domid);
+            continue;
+        }
         CHK_ERRNO(asprintf(&config_source, "<domid %d data>", info[i].domid));
         libxl_domain_config_init(&d_config);
         parse_config_data(config_source, (char *)data, len, &d_config, NULL);
-- 
1.8.1.4
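The message in the hunk above is built from two adjacent string literals. C joins adjacent literals with nothing in between, so as posted the text would read "Is domain id'5' owned by another toolstack?" and would end without a newline. A minimal sketch of the same message with the space and the newline written out follows; the helper name `format_no_userdata_warning` is hypothetical and is not part of xl or libxl.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of xl or libxl: format the warning for a
 * domain that has no stored xl userdata. Returns what snprintf() returns.
 */
int format_no_userdata_warning(char *buf, size_t size, uint32_t domid)
{
    assert(buf != NULL && size > 0);
    /*
     * Adjacent string literals are joined with nothing in between, so the
     * space before the quoted id and the trailing newline are spelled out.
     */
    return snprintf(buf, size,
                    "No xl userdata found. Is domain id "
                    "'%u' owned by another toolstack?\n",
                    (unsigned int)domid);
}
```

In the patch itself the equivalent change would presumably be limited to the two string literals passed to fprintf.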
On Thu, Oct 24, 2013 at 03:30:57PM +0800, Bamvor Jian Zhang wrote:
> This is a bug reported by a customer. xl outputs the following message
> when "xl list --long domain_name" is issued:
> Domain name must be specified.
>
> The error message is completely unclear. The error occurs because the
> domain was not created by xl itself: xl can see the domain name but
> cannot get the domain configuration.
>
The root cause is that there is no stored config file in xl's private
storage. This can be caused by the reason you stated above.
Another possibility is that Xen fails to clean up a domain after its
death. That domain remains visible in Xen, however at that time the
stored config file is already deleted.
IMHO we can skip this domain without printing this info. Or this info
needs to cover all the situations.
Wei.
> Signed-off-by: Bamvor Jian Zhang <bjzhang@suse.com>
> ---
> Changes since RFC:
> Raise the error in list_domains_details instead of modifying
> libxl_userdata_retrieve itself.
> Four commands invoke libxl_userdata_retrieve: create, save, migrate and
> list. Issuing the create command means the domain was not created by
> another toolstack, and save and migrate already check the return value
> of libxl_userdata_retrieve. So the only place that needs the check is
> list_domains_details.
>
>  tools/libxl/xl_cmdimpl.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/tools/libxl/xl_cmdimpl.c b/tools/libxl/xl_cmdimpl.c
> index a8261be..81bdb91 100644
> --- a/tools/libxl/xl_cmdimpl.c
> +++ b/tools/libxl/xl_cmdimpl.c
> @@ -3064,6 +3064,11 @@ static void list_domains_details(const libxl_dominfo *info, int nb_domain)
>          rc = libxl_userdata_retrieve(ctx, info[i].domid, "xl", &data, &len);
>          if (rc)
>              continue;
> +        if (len == 0) {
> +            fprintf(stderr, "No xl userdata found. Is domain id"
> +                    "'%u' owned by another toolstack?", info[i].domid);
> +            continue;
> +        }
>          CHK_ERRNO(asprintf(&config_source, "<domid %d data>", info[i].domid));
>          libxl_domain_config_init(&d_config);
>          parse_config_data(config_source, (char *)data, len, &d_config, NULL);
> --
> 1.8.1.4
Ian Jackson
2013-Oct-28  15:19 UTC
Re: [PATCH] improve the error message in "xl list --long"
Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> Another possibility is that Xen fails to clean up a domain after its
> death. That domain remains visible in Xen, however at that time the
> stored config file is already deleted.
>
> IMHO we can skip this domain without printing this info. Or this info
> needs to cover all the situations.
I think for now (that is, until we can reconstruct the domain config
from the running state) we should not skip the domain, but instead
simply not include the missing info in the long output.
Ian.
On Mon, Oct 28, 2013 at 03:19:36PM +0000, Ian Jackson wrote:
> Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> > Another possibility is that Xen fails to clean up a domain after its
> > death. That domain remains visible in Xen, however at that time the
> > stored config file is already deleted.
> >
> > IMHO we can skip this domain without printing this info. Or this info
> > needs to cover all the situations.
>
> I think for now (that is, until we can reconstruct the domain config
> from the running state) we should not skip the domain, but instead
> simply not include the missing info in the long output.
>
TBH I'm confused... You seemed to agree to skip in another patch.
<21094.40745.754203.842615@mariner.uk.xensource.com>
Wei.
> Ian.
Ian Jackson
2013-Oct-28  16:22 UTC
Re: [PATCH] improve the error message in "xl list --long"
Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> On Mon, Oct 28, 2013 at 03:19:36PM +0000, Ian Jackson wrote:
> > Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> > > Another possibility is that Xen fails to clean up a domain after its
> > > death. That domain remains visible in Xen, however at that time the
> > > stored config file is already deleted.
> > >
> > > IMHO we can skip this domain without printing this info. Or this info
> > > needs to cover all the situations.
> >
> > I think for now (that is, until we can reconstruct the domain config
> > from the running state) we should not skip the domain, but instead
> > simply not include the missing info in the long output.
>
> TBH I'm confused... You seemed to agree to skip in another patch.
> <21094.40745.754203.842615@mariner.uk.xensource.com>
Yes, sorry, but, I think I was wrong earlier.
I don't think missing an existing domain out of the listing can be
right.  That would make it just appear to vanish!
Ian.
On Mon, Oct 28, 2013 at 04:22:20PM +0000, Ian Jackson wrote:
> Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> > On Mon, Oct 28, 2013 at 03:19:36PM +0000, Ian Jackson wrote:
> > > Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> > > > Another possibility is that Xen fails to clean up a domain after its
> > > > death. That domain remains visible in Xen, however at that time the
> > > > stored config file is already deleted.
> > > >
> > > > IMHO we can skip this domain without printing this info. Or this info
> > > > needs to cover all the situations.
> > >
> > > I think for now (that is, until we can reconstruct the domain config
> > > from the running state) we should not skip the domain, but instead
> > > simply not include the missing info in the long output.
> >
> > TBH I'm confused... You seemed to agree to skip in another patch.
> > <21094.40745.754203.842615@mariner.uk.xensource.com>
>
> Yes, sorry, but, I think I was wrong earlier.
>
> I don't think missing an existing domain out of the listing can be
> right. That would make it just appear to vanish!
>
Rereading your earlier comment.
"we should not skip the domain, but instead not include the missing info
in the long output"
This is what <21094.40745.754203.842615@mariner.uk.xensource.com> and
this patch do -- only that this patch prints one more line to inform the
user of the situation. The problem I'm seeing with this patch is that the
info doesn't seem to cover all the cases.
This patch and the one I proposed skip domains with zero-length userdata
in long output. You can still see the domain in short output -- it
doesn't vanish. :-)
Wei.
> Ian.
Ian Jackson
2013-Oct-28  16:45 UTC
Re: [PATCH] improve the error message in "xl list --long"
Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> "we should not skip the domain, but instead not include the missing info
> in the long output"
>
> This is what <21094.40745.754203.842615@mariner.uk.xensource.com> and
> this patch do -- only that this patch prints one more line to inform the
> user of the situation. The problem I'm seeing with this patch is that the
> info doesn't seem to cover all the cases.
>
> This patch and the one I proposed skip domains with zero-length userdata
> in long output. You can still see the domain in short output -- it
> doesn't vanish. :-)
What I mean is that I think that domains shouldn't appear in the short
output but not the long output.
Ian.
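One way to read the position above is that the long listing should emit an entry for every domain the short listing shows, leaving out only the configuration when none is stored. A rough sketch under that assumption follows; `struct dom_entry`, `format_long_entry` and the output text are all hypothetical and do not reflect xl's real data structures or output format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for one domain as seen by the listing code. */
struct dom_entry {
    uint32_t domid;
    const char *config;  /* stored config text, or NULL if none was found */
    int config_len;      /* 0 when no xl userdata exists for the domain */
};

/*
 * Format one long-listing entry into buf. A domain without stored config
 * is still listed; only the config part is replaced by a placeholder.
 * Returns what snprintf() returns.
 */
int format_long_entry(char *buf, size_t size, const struct dom_entry *d)
{
    assert(buf != NULL && size > 0 && d != NULL);
    if (d->config == NULL || d->config_len == 0)
        return snprintf(buf, size, "domid %u: (no stored xl config)\n",
                        (unsigned int)d->domid);
    return snprintf(buf, size, "domid %u: %.*s\n",
                    (unsigned int)d->domid, d->config_len, d->config);
}
```

With this shape no domain is dropped from the long output, so the long and short listings cover the same set of domains.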
Bamvor Jian Zhang
2013-Oct-30  06:44 UTC
Re: [PATCH] improve the error message in "xl list --long"
Hi, Wei
> On Thu, Oct 24, 2013 at 03:30:57PM +0800, Bamvor Jian Zhang wrote:
> > This is a bug reported by a customer. xl outputs the following message
> > when "xl list --long domain_name" is issued:
> > Domain name must be specified.
> >
> > The error message is completely unclear. The error occurs because the
> > domain was not created by xl itself: xl can see the domain name but
> > cannot get the domain configuration.
> >
>
> The root cause is that there is no stored config file in xl's private
> storage. This can be caused by the reason you stated above.
>
> Another possibility is that Xen fails to clean up a domain after its
> death. That domain remains visible in Xen, however at that time the
> stored config file is already deleted.
>
> IMHO we can skip this domain without printing this info. Or this info
> needs to cover all the situations.
>
Sorry for the late reply. This issue was reported to us because our user
was using libvirt and xl at the same time. In the xend days it was fine
to combine libvirt with xend. We should tell the user not to do this
with xl. As I said in my first email, unlike the other commands,
"xl list --long" gives no message when the config is missing. How about
printing something like this:
    fprintf(stderr, "No xl userdata found. Is domain id"
            "'%u' owned by another toolstack or fail in cleanup? ", info[i].domid);
Bamvor
> Wei.
Matt Wilson
2013-Nov-10  23:26 UTC
Re: [PATCH] improve the error message in "xl list --long"
On Mon, Oct 28, 2013 at 03:19:36PM +0000, Ian Jackson wrote:
> Wei Liu writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> > Another possibility is that Xen fails to clean up a domain after its
> > death. That domain remains visible in Xen, however at that time the
> > stored config file is already deleted.
> >
> > IMHO we can skip this domain without printing this info. Or this info
> > needs to cover all the situations.
>
> I think for now (that is, until we can reconstruct the domain config
> from the running state) we should not skip the domain, but instead
> simply not include the missing info in the long output.
I agree. What can't be reconstructed today? Is there a list somewhere?
--msw
Ian Jackson
2013-Nov-11  14:52 UTC
Re: [PATCH] improve the error message in "xl list --long"
Matt Wilson writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> On Mon, Oct 28, 2013 at 03:19:36PM +0000, Ian Jackson wrote:
> > I think for now (that is, until we can reconstruct the domain config
> > from the running state) we should not skip the domain, but instead
> > simply not include the missing info in the long output.
>
> I agree. What can't be reconstructed today? Is there a list somewhere?
Currently there is no code at all that attempts to do this.  Instead,
we rely on saving the text of the config file.
Ian.
Ian Campbell
2013-Nov-11  14:59 UTC
Re: [PATCH] improve the error message in "xl list --long"
On Mon, 2013-11-11 at 14:52 +0000, Ian Jackson wrote:
> Matt Wilson writes ("Re: [Xen-devel] [PATCH] improve the error message in "xl list --long""):
> > On Mon, Oct 28, 2013 at 03:19:36PM +0000, Ian Jackson wrote:
> > > I think for now (that is, until we can reconstruct the domain config
> > > from the running state) we should not skip the domain, but instead
> > > simply not include the missing info in the long output.
> >
> > I agree. What can't be reconstructed today? Is there a list somewhere?
>
> Currently there is no code at all that attempts to do this.
I made a hacky start in <1382369724.1657.39.camel@hastur.hellion.org.uk>.
I think most of what is required is there in xenstore or the results of
hypercalls, or can be easily enough stashed somewhere in xenstore on
create, but we won't know until someone keeps running with it...
Ian.