Tony Breeds
2006-Nov-30 05:59 UTC
[Xen-devel] [RFC][PATCH 2/3] [TOOLS][XENTRACE] Update tools to write data to disk in a known endian'ness.
Use network byte order when writing data to disk, making the data portable
to any machine.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---

 tools/xentrace/xentrace.c      |   47 ++++++++++++++++++++++++++++++++++-------
 tools/xentrace/xentrace_format |    7 +++---
 2 files changed, 44 insertions(+), 10 deletions(-)

Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace.c
+++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
@@ -28,6 +28,20 @@
 
 #include <xenctrl.h>
 
+#include <arpa/inet.h> /* hton*(), ntoh*() */
+#include <endian.h>
+
+/* There is no 64-bit htonll, so create one */
+#if __BYTE_ORDER == __LITTLE_ENDIAN
+#define htonll(x) ( (((uint64_t)htonl(x)) << 32) + htonl(x >> 32) )
+#define ntohll(x) ( (((uint64_t)ntohl(x)) << 32) + ntohl(x >> 32) )
+#else
+#define htonll(x) (x)
+#define ntohll(x) (x)
+#endif
+
+#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
+
 #define PERROR(_m, _a...)                                       \
 do {                                                            \
     int __saved_errno = errno;                                  \
@@ -90,12 +104,30 @@ struct timespec millis_to_timespec(unsig
  * Outputs the trace record to a filestream, prepending the CPU ID of the
  * source trace buffer.
  */
-void write_rec(unsigned int cpu, struct t_rec *rec, FILE *out)
+void write_rec(uint32_t cpu, struct t_rec *rec, FILE *out)
 {
     size_t written = 0;
-    written += fwrite(&cpu, sizeof(cpu), 1, out);
-    written += fwrite(rec, sizeof(*rec), 1, out);
-    if ( written != 2 )
+    int i;
+    /* Place network byte order representation in temp vars, rather than
+     * write back into kernel/xen memory */
+    uint64_t tmp64;
+    uint32_t tmp32;
+
+    tmp32 = htonl(cpu);
+    written += fwrite(&tmp32, sizeof(tmp32), 1, out);
+
+    tmp64 = htonll(rec->cycles);
+    written += fwrite(&tmp64, sizeof(tmp64), 1, out);
+
+    tmp32 = htonl(rec->event);
+    written += fwrite(&tmp32, sizeof(tmp32), 1, out);
+
+    for ( i=0; i<ARRAY_SIZE(rec->data); i++ ) {
+        tmp64 = htonl(rec->data[i]);
+        written += fwrite(&tmp64, sizeof(tmp64), 1, out);
+    }
+
+    if ( written != 8 )
     {
         PERROR("Failed to write trace record");
         exit(EXIT_FAILURE);
@@ -147,6 +179,7 @@ struct t_buf *map_tbufs(unsigned long tb
         exit(EXIT_FAILURE);
     }
 
+    /* On PPC (At least) the DOMID arg is ignored in dom0 */
     tbufs_mapped = xc_map_foreign_range(xc_handle, DOMID_XEN,
                                         size * num, PROT_READ | PROT_WRITE,
                                         tbufs_mfn);
@@ -253,7 +286,7 @@ struct t_rec **init_rec_ptrs(struct t_bu
 /**
  * get_num_cpus - get the number of logical CPUs
  */
-unsigned int get_num_cpus(void)
+uint32_t get_num_cpus(void)
 {
     xc_physinfo_t physinfo;
     int xc_handle = xc_interface_open();
@@ -282,14 +315,14 @@ unsigned int get_num_cpus(void)
  */
 int monitor_tbufs(FILE *logfile)
 {
-    int i;
+    uint32_t i;
 
     void *tbufs_mapped;          /* pointer to where the tbufs are mapped    */
     struct t_buf **meta;         /* pointers to the trace buffer metadata    */
     struct t_rec **data;         /* pointers to the trace buffer data areas
                                   * where they are mapped into user space.   */
     unsigned long tbufs_mfn;     /* mfn of the tbufs                         */
-    unsigned int  num;           /* number of trace buffers / logical CPUS   */
+    uint32_t num;                /* number of trace buffers / logical CPUS   */
     unsigned long size;          /* size of a single trace buffer            */
 
     int size_in_recs;
Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace_format
+++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
@@ -84,10 +84,11 @@ interrupted = 0
 defs = read_defs(arg[0])
 
 # structure of trace record + prepended CPU id (as output by xentrace):
-# CPU(I) TSC(Q) EVENT(L) D1(L) D2(L) D3(L) D4(L) D5(L)
+# CPU(L) TSC(Q) EVENT(L) D1(Q) D2(Q) D3(Q) D4(Q) D5(Q)
 # read CPU id separately to avoid structure packing problems on 64-bit arch.
-CPUREC = "I"
-TRCREC = "QLLLLLL"
+# Force network byte order.
+CPUREC = "!L"
+TRCREC = "!QLQQQQQ"
 
 last_tsc = [0]
 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
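As a rough illustration of the on-disk layout this patch describes, here is a minimal Python 3 sketch that packs and unpacks one record using the same struct format strings as the patched xentrace_format. The helper names pack_record and unpack_record are hypothetical and are not part of the xentrace tools. The sketch also assumes that every 64-bit field, including D1 to D5, is written with a full 64-bit byte-order conversion.

```python
import struct

# Format strings as they appear in the patched xentrace_format.
# "!" selects network (big-endian) byte order with standard sizes
# and no padding, so the layout is the same on every host.
CPUREC = "!L"        # CPU id: 32-bit unsigned
TRCREC = "!QLQQQQQ"  # TSC (64-bit), event (32-bit), D1..D5 (64-bit each)


def pack_record(cpu, tsc, event, data):
    """Hypothetical helper: build the bytes for one on-disk record."""
    return struct.pack(CPUREC, cpu) + struct.pack(TRCREC, tsc, event, *data)


def unpack_record(buf):
    """Hypothetical helper: split one on-disk record into its fields."""
    cpu_size = struct.calcsize(CPUREC)
    (cpu,) = struct.unpack(CPUREC, buf[:cpu_size])
    tsc, event, *data = struct.unpack(TRCREC, buf[cpu_size:])
    return cpu, tsc, event, data


record = pack_record(1, 0x0102030405060708, 0x2F001, [10, 20, 30, 40, 50])
print(len(record), "bytes per record")
print(unpack_record(record))
```

Because the byte order is fixed by the format string rather than by the host, the same bytes decode identically on ppc64 and x86_32.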
Tony Breeds
2006-Nov-30 05:59 UTC
[Xen-devel] [RFC][PATCH 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---

 xen/common/trace.c         |    6 +++---
 xen/include/public/trace.h |    2 +-
 xen/include/xen/trace.h    |   14 +++++++-------
 3 files changed, 11 insertions(+), 11 deletions(-)

Index: xen-unstable.hg-mainline.xentrace/xen/common/trace.c
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/xen/common/trace.c
+++ xen-unstable.hg-mainline.xentrace/xen/common/trace.c
@@ -46,7 +46,7 @@ static int nr_recs;
 static int t_buf_highwater;
 
 /* Number of records lost due to per-CPU trace buffer being full. */
-static DEFINE_PER_CPU(unsigned long, lost_records);
+static DEFINE_PER_CPU(uint64_t, lost_records);
 
 /* a flag recording whether initialization has been done */
 /* or more properly, if the tbuf subsystem is enabled right now */
@@ -228,8 +228,8 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc
  * failure, otherwise 0.  Failure occurs only if the trace buffers are not yet
  * initialised.
  */
-void trace(u32 event, unsigned long d1, unsigned long d2,
-           unsigned long d3, unsigned long d4, unsigned long d5)
+void trace(uint32_t event, uint64_t d1, uint64_t d2, uint64_t d3, uint64_t d4,
+           uint64_t d5)
 {
     struct t_buf *buf;
     struct t_rec *rec;
Index: xen-unstable.hg-mainline.xentrace/xen/include/public/trace.h
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/xen/include/public/trace.h
+++ xen-unstable.hg-mainline.xentrace/xen/include/public/trace.h
@@ -76,7 +76,7 @@
 struct t_rec {
     uint64_t cycles;          /* cycle counter timestamp */
     uint32_t event;           /* event ID                */
-    unsigned long data[5];    /* event data items        */
+    uint64_t data[5];         /* event data items        */
 };
 
 /*
Index: xen-unstable.hg-mainline.xentrace/xen/include/xen/trace.h
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/xen/include/xen/trace.h
+++ xen-unstable.hg-mainline.xentrace/xen/include/xen/trace.h
@@ -33,19 +33,19 @@ void init_trace_bufs(void);
 /* used to retrieve the physical address of the trace buffers */
 int tb_control(struct xen_sysctl_tbuf_op *tbc);
 
-void trace(u32 event, unsigned long d1, unsigned long d2,
-           unsigned long d3, unsigned long d4, unsigned long d5);
+void trace(uint32_t event, uint64_t d1, uint64_t d2, uint64_t d3, uint64_t d4,
+           uint64_t d5);
 
 /* Avoids troubling the caller with casting their arguments to a trace macro */
 #define trace_do_casts(e,d1,d2,d3,d4,d5) \
     do {                                 \
         if ( unlikely(tb_init_done) )    \
             trace(e,                     \
-                 (unsigned long)d1,      \
-                 (unsigned long)d2,      \
-                 (unsigned long)d3,      \
-                 (unsigned long)d4,      \
-                 (unsigned long)d5);     \
+                 (uint64_t)d1,           \
+                 (uint64_t)d2,           \
+                 (uint64_t)d3,           \
+                 (uint64_t)d4,           \
+                 (uint64_t)d5);          \
     } while ( 0 )
 
 /* Convenience macros for calling the trace function. */
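To see why a shared header benefits from explicitly sized fields, the following Python 3 sketch mirrors the old and new struct t_rec layouts with ctypes. The class names TRecOld and TRecNew are made up for this illustration. The size of c_ulong follows the platform the script runs on, while c_uint64 is always 8 bytes.

```python
import ctypes


class TRecOld(ctypes.Structure):
    """Illustrative mirror of struct t_rec before the patch."""
    _fields_ = [
        ("cycles", ctypes.c_uint64),
        ("event", ctypes.c_uint32),
        ("data", ctypes.c_ulong * 5),   # width depends on the platform ABI
    ]


class TRecNew(ctypes.Structure):
    """Illustrative mirror of struct t_rec after the patch."""
    _fields_ = [
        ("cycles", ctypes.c_uint64),
        ("event", ctypes.c_uint32),
        ("data", ctypes.c_uint64 * 5),  # always 5 * 8 bytes
    ]


print("unsigned long is", ctypes.sizeof(ctypes.c_ulong), "bytes on this platform")
print("old data[5]:", TRecOld.data.size, "bytes")
print("new data[5]:", TRecNew.data.size, "bytes")
```

On a platform where unsigned long is 4 bytes the old data array occupies 20 bytes, and where it is 8 bytes it occupies 40, so hypervisor and tools built for different word sizes would disagree about the record layout. With uint64_t the data array is 40 bytes everywhere.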
Tony Breeds
2006-Nov-30 05:59 UTC
[Xen-devel] [RFC][PATCH 0/3] [V2] Update xentrace to be compatible with PPC.
Hello All,

This patch series updates xentrace so that it will compile and work correctly (assuming underlying hypervisor support) on all architectures. It also makes trace files captured on one architecture portable to other systems with no changes to the tools.

Essentially the patches do:

1. Use uint64_t instead of unsigned long. I did attempt to leave "data" as unsigned long, but in 64-bit mode this will not compile on ppc. Using explicitly sized types in shared hypervisor/userspace header files is arguably cleaner IMO.
2. Make the xentrace tool (xentrace copies the xentrace buffers from Xen memory to disk) write data in network byte order.
3. Various cleanups. This patch is optional.

These patches have been tested on ppc64 and x86_32, and data files have been transported between those hosts.
Tony Breeds
2006-Nov-30 05:59 UTC
[Xen-devel] [RFC][PATCH 3/3] [TOOLS][XENTRACE] Various tidyups to xentrace tools.
- use err/errx/warn, instead of PERROR, perror and fprintf
- Make () placement consistent in these files
- Use consistent tabs and whitespacing.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---

 tools/xentrace/xentrace.c      |  137 ++++++++++++++---------------------------
 tools/xentrace/xentrace_format |   51 +++++++--------
 2 files changed, 73 insertions(+), 115 deletions(-)

Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace.c
+++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
@@ -22,6 +22,7 @@
 #include <signal.h>
 #include <inttypes.h>
 #include <string.h>
+#include <err.h>
 
 #include <xen/xen.h>
 #include <xen/trace.h>
@@ -42,16 +43,6 @@
 
 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
 
-#define PERROR(_m, _a...)                                       \
-do {                                                            \
-    int __saved_errno = errno;                                  \
-    fprintf(stderr, "ERROR: " _m " (%d = %s)\n" , ## _a ,       \
-            __saved_errno, strerror(__saved_errno));            \
-    errno = __saved_errno;                                      \
-} while (0)
-
-extern FILE *stderr;
-
 /***** Compile time configuration of defaults ********************************/
 
 /* when we've got more records than this waiting, we log it to the output */
@@ -88,7 +79,7 @@ void close_handler(int signal)
 struct timespec millis_to_timespec(unsigned long millis)
 {
     struct timespec spec;
-
+
     spec.tv_sec = millis / 1000;
     spec.tv_nsec = (millis % 1000) * 1000;
 
@@ -128,10 +119,7 @@ void write_rec(uint32_t cpu, struct t_re
     }
 
     if ( written != 8 )
-    {
-        PERROR("Failed to write trace record");
-        exit(EXIT_FAILURE);
-    }
+        err(EXIT_FAILURE, "Failed to write trace record");
 }
 
 static void get_tbufs(unsigned long *mfn, unsigned long *size)
@@ -139,21 +127,16 @@ static void get_tbufs(unsigned long *mfn
     int xc_handle = xc_interface_open();
     int ret;
 
-    if ( xc_handle < 0 )
-    {
-        exit(EXIT_FAILURE);
-    }
+    if ( xc_handle < 0 )
+        errx(EXIT_FAILURE, "Unable to open the xc interface");
 
-    if(!opts.tbuf_size)
-      opts.tbuf_size = DEFAULT_TBUF_SIZE;
+    if ( !opts.tbuf_size )
+        opts.tbuf_size = DEFAULT_TBUF_SIZE;
 
     ret = xc_tbuf_enable(xc_handle, opts.tbuf_size, mfn, size);
 
     if ( ret != 0 )
-    {
-        perror("Couldn't enable trace buffers");
-        exit(1);
-    }
+        err(EXIT_FAILURE, "Couldn't enable trace buffers");
 
     xc_interface_close(xc_handle);
 }
@@ -174,10 +157,8 @@ struct t_buf *map_tbufs(unsigned long tb
 
     xc_handle = xc_interface_open();
 
-    if ( xc_handle < 0 )
-    {
-        exit(EXIT_FAILURE);
-    }
+    if ( xc_handle < 0 )
+        errx(EXIT_FAILURE, "Unable to open the xc interface");
 
     /* On PPC (At least) the DOMID arg is ignored in dom0 */
     tbufs_mapped = xc_map_foreign_range(xc_handle, DOMID_XEN,
@@ -186,18 +167,15 @@ struct t_buf *map_tbufs(unsigned long tb
 
     xc_interface_close(xc_handle);
 
-    if ( tbufs_mapped == 0 )
-    {
-        PERROR("Failed to mmap trace buffers");
-        exit(EXIT_FAILURE);
-    }
+    if ( tbufs_mapped == 0 )
+        err(EXIT_FAILURE, "Failed to mmap trace buffers");
 
     return tbufs_mapped;
 }
 
 /**
  * set_mask - set the cpu/event mask in HV
- * @mask:           the new mask
+ * @mask:           the new mask
  * @type:           the new mask type,0-event mask, 1-cpu mask
  *
  */
@@ -206,21 +184,19 @@ void set_mask(uint32_t mask, int type)
     int ret = 0;
     int xc_handle = xc_interface_open(); /* for accessing control interface */
 
-    if (type == 1) {
+    if ( type == 1 ) {
         ret = xc_tbuf_set_cpu_mask(xc_handle, mask);
-        fprintf(stderr, "change cpumask to 0x%x\n", mask);
-    } else if (type == 0) {
+        warnx("change cpumask to 0x%x\n", mask);
+    } else if ( type == 0 ) {
         ret = xc_tbuf_set_evt_mask(xc_handle, mask);
-        fprintf(stderr, "change evtmask to 0x%x\n", mask);
+        warnx("change evtmask to 0x%x\n", mask);
     }
 
     xc_interface_close(xc_handle);
 
     if ( ret != 0 )
-    {
-        PERROR("Failure to get trace buffer pointer from Xen and set the new mask");
-        exit(EXIT_FAILURE);
-    }
+        err(EXIT_FAILURE, "Failure to get trace buffer pointer from "
+            "Xen and set the new mask");
 }
 
 /**
@@ -240,11 +216,8 @@ struct t_buf **init_bufs_ptrs(void *bufs
 
     user_ptrs = (struct t_buf **)calloc(num, sizeof(struct t_buf *));
     if ( user_ptrs == NULL )
-    {
-        PERROR( "Failed to allocate memory for buffer pointers\n");
-        exit(EXIT_FAILURE);
-    }
-
+        err(EXIT_FAILURE, "Failed to allocate memory for buffer pointers");
+
     /* initialise pointers to the trace buffers - given the size of a trace
      * buffer and the value of bufs_maped, we can easily calculate these */
     for ( i = 0; i<num; i++ )
@@ -269,13 +242,10 @@ struct t_rec **init_rec_ptrs(struct t_bu
 {
     int i;
     struct t_rec **data;
-
+
     data = calloc(num, sizeof(struct t_rec *));
     if ( data == NULL )
-    {
-        PERROR("Failed to allocate memory for data pointers\n");
-        exit(EXIT_FAILURE);
-    }
+        err(EXIT_FAILURE, "Failed to allocate memory for data pointers");
 
     for ( i = 0; i < num; i++ )
         data[i] = (struct t_rec *)(meta[i] + 1);
@@ -291,14 +261,11 @@ uint32_t get_num_cpus(void)
     xc_physinfo_t physinfo;
     int xc_handle = xc_interface_open();
     int ret;
-
+
     ret = xc_physinfo(xc_handle, &physinfo);
-
+
     if ( ret != 0 )
-    {
-        PERROR("Failure to get logical CPU count from Xen");
-        exit(EXIT_FAILURE);
-    }
+        errx(EXIT_FAILURE, "Failure to get logical CPU count from Xen");
 
     xc_interface_close(xc_handle);
 
@@ -377,17 +344,17 @@ int parse_evtmask(char *arg, struct argp
     char *inval;
 
     /* search filtering class */
-    if (strcmp(arg, "gen") == 0){
+    if ( strcmp(arg, "gen") == 0 )
         setup->evt_mask |= TRC_GEN;
-    } else if(strcmp(arg, "sched") == 0){
+    else if ( strcmp(arg, "sched") == 0 )
         setup->evt_mask |= TRC_SCHED;
-    } else if(strcmp(arg, "dom0op") == 0){
+    else if ( strcmp(arg, "dom0op") == 0 )
         setup->evt_mask |= TRC_DOM0OP;
-    } else if(strcmp(arg, "vmx") == 0){
+    else if ( strcmp(arg, "vmx") == 0 )
         setup->evt_mask |= TRC_VMX;
-    } else if(strcmp(arg, "all") == 0){
+    else if ( strcmp(arg, "all") == 0 )
         setup->evt_mask |= TRC_ALL;
-    } else {
+    else {
         setup->evt_mask = strtol(arg, &inval, 0);
         if ( inval == arg )
             argp_usage(state);
@@ -430,13 +397,13 @@ error_t cmd_parser(int key, char *arg, s
             argp_usage(state);
     }
     break;
-
+
     case 'e': /* set new event mask for filtering*/
     {
         parse_evtmask(arg, state);
     }
     break;
-
+
     case 'S': /* set tbuf size (given in pages) */
     {
         char *inval;
@@ -454,7 +421,7 @@ error_t cmd_parser(int key, char *arg, s
             argp_usage(state);
     }
     break;
-
+
     default:
         return ARGP_ERR_UNKNOWN;
     }
@@ -473,16 +440,16 @@ const struct argp_option cmd_opts[]
       "(default " xstr(NEW_DATA_THRESH) ")." },
 
     { .name = "poll-sleep", .key='s', .arg="p",
-      .doc =
+      .doc =
       "Set sleep time, p, in milliseconds between polling the trace buffer "
       "for new data (default " xstr(POLL_SLEEP_MILLIS) ")." },
 
     { .name = "cpu-mask", .key='c', .arg="c",
-      .doc =
+      .doc =
       "set cpu-mask " },
 
     { .name = "evt-mask", .key='e', .arg="e",
-      .doc =
+      .doc =
       "set evt-mask " },
 
     { .name = "trace-buf-size", .key='S', .arg="N",
@@ -504,7 +471,7 @@ const struct argp parser_def
     "\v"
     "This tool is used to capture trace buffer data from Xen. The data is "
     "output in a binary format, in the following order:\n\n"
-    "  CPU(uint) TSC(uint64_t) EVENT(uint32_t) D1 D2 D3 D4 D5 "
+    "  CPU(uint32_t) TSC(uint64_t) EVENT(uint32_t) D1 D2 D3 D4 D5 "
     "(all uint32_t)\n\n"
     "The output should be parsed using the tool xentrace_format, which can "
     "produce human-readable output in ASCII format."
@@ -513,8 +480,8 @@ const struct argp parser_def
 
 const char *argp_program_version     = "xentrace v1.1";
 const char *argp_program_bug_address = "<mark.a.williamson@intel.com>";
-
-
+
+
 int main(int argc, char **argv)
 {
     int outfd = 1, ret;
@@ -529,31 +496,23 @@ int main(int argc, char **argv)
 
     argp_parse(&parser_def, argc, argv, 0, 0, &opts);
 
-    if (opts.evt_mask != 0) {
+    if ( opts.evt_mask != 0 )
         set_mask(opts.evt_mask, 0);
-    }
 
-    if (opts.cpu_mask != 0) {
+    if ( opts.cpu_mask != 0 )
         set_mask(opts.cpu_mask, 1);
-    }
 
     if ( opts.outfile )
         outfd = open(opts.outfile, O_WRONLY | O_CREAT | O_LARGEFILE, 0644);
 
-    if(outfd < 0)
-    {
-        perror("Could not open output file");
-        exit(EXIT_FAILURE);
-    }
+    if ( outfd < 0 )
+        err(EXIT_FAILURE, "Could not open output file");
 
-    if(isatty(outfd))
-    {
-        fprintf(stderr, "Cannot output to a TTY, specify a log file.\n");
-        exit(EXIT_FAILURE);
-    }
+    if ( isatty(outfd) )
+        errx(EXIT_FAILURE, "Cannot output to a TTY, specify a log file.\n");
 
     logfile = fdopen(outfd, "w");
-
+
     /* ensure that if we get a signal, we'll do cleanup, then exit */
     act.sa_handler = close_handler;
     act.sa_flags = 0;
Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace_format
+++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
@@ -24,7 +24,7 @@ def usage():
           Which correspond to the CPU number, event ID, timestamp counter and
           the 5 data fields from the trace record.  There should be one such
           rule for each type of event.
-
+
           Depending on your system and the volume of trace buffer data,
           this script may not be able to keep up with the output of xentrace
           if it is piped directly.  In these circumstances you should have
@@ -34,7 +34,7 @@ def usage():
 
 def read_defs(defs_file):
     defs = {}
-
+
     fd = open(defs_file)
 
     reg = re.compile('(\S+)\s+(\S.*)')
@@ -43,14 +43,14 @@ def read_defs(defs_file):
         line = fd.readline()
         if not line:
             break
-
-        if line[0] == '#' or line[0] == '\n':
-            continue
-
+
+        if line[0] == '#' or line[0] == '\n':
+            continue
+
         m = reg.match(line)
 
         if not m: print >> sys.stderr, "Bad format file" ; sys.exit(1)
-
+
         defs[str(eval(m.group(1)))] = m.group(2)
 
     return defs
@@ -70,7 +70,7 @@ try:
     opts, arg = getopt.getopt(sys.argv[1:], "c:" )
 
     for opt in opts:
-        if opt[0] == '-c' : mhz = int(opt[1])
+        if opt[0] == '-c' : mhz = int(opt[1])
 
 except getopt.GetoptError:
     usage()
@@ -96,7 +96,7 @@ i=0
 
 while not interrupted:
     try:
-        i=i+1
+        i=i+1
         line = sys.stdin.read(struct.calcsize(CPUREC))
         if not line:
             break
@@ -108,19 +108,20 @@ while not interrupted:
 
         (tsc, event, d1, d2, d3, d4, d5) = struct.unpack(TRCREC, line)
 
-        #tsc = (tscH<<32) | tscL
+        #tsc = (tscH<<32) | tscL
 
-        #print i, tsc
+        #print i, tsc
 
         if cpu >= len(last_tsc):
             last_tsc += [0] * (cpu - len(last_tsc) + 1)
-        elif tsc < last_tsc[cpu]:
-            print "TSC stepped backward cpu %d !  %d %d" % (cpu,tsc,last_tsc[cpu])
+        elif tsc < last_tsc[cpu]:
+            print "TSC stepped backward cpu %d !  %d %d" % (cpu, tsc,
+                                                            last_tsc[cpu])
 
-        last_tsc[cpu] = tsc
+        last_tsc[cpu] = tsc
 
-        if mhz:
-            tsc = tsc / (mhz*1000000.0)
+        if mhz:
+            tsc = tsc / (mhz*1000000.0)
 
         args = {'cpu'   : cpu,
                 'tsc'   : tsc,
@@ -131,15 +132,13 @@ while not interrupted:
                 '4'     : d4,
                 '5'     : d5    }
 
-        try:
-
-            if defs.has_key(str(event)):
-                print defs[str(event)] % args
-            else:
-                if defs.has_key(str(0)): print defs[str(0)] % args
-        except TypeError:
-            print defs[str(event)]
-            print args
-
+        try:
+            if defs.has_key(str(event)):
+                print defs[str(event)] % args
+            else:
+                if defs.has_key(str(0)): print defs[str(0)] % args
+        except TypeError:
+            print defs[str(event)]
+            print args
 
     except IOError, struct.error: sys.exit()
Mark Williamson
2006-Nov-30 12:57 UTC
Re: [Xen-devel] [RFC][PATCH 3/3] [TOOLS][XENTRACE] Various tidyups to xentrace tools.
> - use err/errx/warn, instead of PERROR, perror and fprintf

Good stuff! I like this.

> - Make () placement consistent in these files

Thanks for doing this. While you're at it, would you like to fix the bracing? Some of it is in K&R style:

    if ( aoeuaoeu ) {

Whereas other Xen code uses:

    if ( aoeuaoeu )
    {

Braces always start on a new line, for all types of block.

> - Use consistent tabs and whitespacing.

Thanks for doing this.

Cheers,
Mark
Mark Williamson
2006-Nov-30 13:02 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
It's worth noting that this patch will (roughly) halve the number of trace
records a buffer of a given size can hold. Probably nobody should be relying
on trace buffer size anymore, so this shouldn't be an issue.

No objections here.

Cheers,
Mark

On Thursday 30 November 2006 05:59, Tony Breeds wrote:
> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> ---
>
>  xen/common/trace.c         |    6 +++---
>  xen/include/public/trace.h |    2 +-
>  xen/include/xen/trace.h    |   14 +++++++-------
>  3 files changed, 11 insertions(+), 11 deletions(-)
>
> Index: xen-unstable.hg-mainline.xentrace/xen/common/trace.c
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/xen/common/trace.c
> +++ xen-unstable.hg-mainline.xentrace/xen/common/trace.c
> @@ -46,7 +46,7 @@ static int nr_recs;
>  static int t_buf_highwater;
>
>  /* Number of records lost due to per-CPU trace buffer being full. */
> -static DEFINE_PER_CPU(unsigned long, lost_records);
> +static DEFINE_PER_CPU(uint64_t, lost_records);
>
>  /* a flag recording whether initialization has been done */
>  /* or more properly, if the tbuf subsystem is enabled right now */
> @@ -228,8 +228,8 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc
>   * failure, otherwise 0. Failure occurs only if the trace buffers are not yet
>   * initialised.
>   */
> -void trace(u32 event, unsigned long d1, unsigned long d2,
> -           unsigned long d3, unsigned long d4, unsigned long d5)
> +void trace(uint32_t event, uint64_t d1, uint64_t d2, uint64_t d3, uint64_t d4,
> +           uint64_t d5)
>  {
>      struct t_buf *buf;
>      struct t_rec *rec;
> Index: xen-unstable.hg-mainline.xentrace/xen/include/public/trace.h
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/xen/include/public/trace.h
> +++ xen-unstable.hg-mainline.xentrace/xen/include/public/trace.h
> @@ -76,7 +76,7 @@
>  struct t_rec {
>      uint64_t cycles;          /* cycle counter timestamp */
>      uint32_t event;           /* event ID */
> -    unsigned long data[5];    /* event data items */
> +    uint64_t data[5];         /* event data items */
>  };
>
>  /*
> Index: xen-unstable.hg-mainline.xentrace/xen/include/xen/trace.h
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/xen/include/xen/trace.h
> +++ xen-unstable.hg-mainline.xentrace/xen/include/xen/trace.h
> @@ -33,19 +33,19 @@ void init_trace_bufs(void);
>  /* used to retrieve the physical address of the trace buffers */
>  int tb_control(struct xen_sysctl_tbuf_op *tbc);
>
> -void trace(u32 event, unsigned long d1, unsigned long d2,
> -           unsigned long d3, unsigned long d4, unsigned long d5);
> +void trace(uint32_t event, uint64_t d1, uint64_t d2, uint64_t d3, uint64_t d4,
> +           uint64_t d5);
>
>  /* Avoids troubling the caller with casting their arguments to a trace macro */
>  #define trace_do_casts(e,d1,d2,d3,d4,d5) \
>      do {                                 \
>          if ( unlikely(tb_init_done) )    \
>              trace(e,                     \
> -                 (unsigned long)d1,      \
> -                 (unsigned long)d2,      \
> -                 (unsigned long)d3,      \
> -                 (unsigned long)d4,      \
> -                 (unsigned long)d5);     \
> +                 (uint64_t)d1,           \
> +                 (uint64_t)d2,           \
> +                 (uint64_t)d3,           \
> +                 (uint64_t)d4,           \
> +                 (uint64_t)d5);          \
>      } while ( 0 )
>
>  /* Convenience macros for calling the trace function. */

--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
Mark Williamson
2006-Nov-30 13:12 UTC
Re: [Xen-devel] [RFC][PATCH] 2/3] [TOOLS][XENTRACE] Update tools to write data to disk in a known endian'ness.
> Use network byte order when writing data to disk, making the data portable
> to any machine.

Thanks for splitting this from the formatting changes - it's a big help.

No objections to this here, seems like a nice improvement. I did wonder if
perhaps some sort of trace file header would be useful, giving metadata. For
instance, a record of what dom0 / Xen builds were being used, and type of
machine might be useful. A future extensible format would be a plus too, I
guess. None of this is strictly necessary, it just might help with
organising trace files.

Cheers,
Mark

> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> ---
>
>  tools/xentrace/xentrace.c      |   47 ++++++++++++++++++++++++++++++++++-------
>  tools/xentrace/xentrace_format |    7 +++---
>  2 files changed, 44 insertions(+), 10 deletions(-)
>
> Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace.c
> +++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
> @@ -28,6 +28,20 @@
>
>  #include <xenctrl.h>
>
> +#include <arpa/inet.h> /* hton*(), ntoh*() */
> +#include <endian.h>
> +
> +/* There is no 64-bit htonll, so create one */
> +#if __BYTE_ORDER == __LITTLE_ENDIAN
> +#define htonll(x) ( (((uint64_t)htonl(x)) << 32) + htonl(x >> 32) )
> +#define ntohll(x) ( (((uint64_t)ntohl(x)) << 32) + ntohl(x >> 32) )
> +#else
> +#define htonll(x) (x)
> +#define ntohll(x) (x)
> +#endif
> +
> +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
> +
>  #define PERROR(_m, _a...)                                       \
>  do {                                                            \
>      int __saved_errno = errno;                                  \
> @@ -90,12 +104,30 @@ struct timespec millis_to_timespec(unsig
>   * Outputs the trace record to a filestream, prepending the CPU ID of the
>   * source trace buffer.
>   */
> -void write_rec(unsigned int cpu, struct t_rec *rec, FILE *out)
> +void write_rec(uint32_t cpu, struct t_rec *rec, FILE *out)
>  {
>      size_t written = 0;
> -    written += fwrite(&cpu, sizeof(cpu), 1, out);
> -    written += fwrite(rec, sizeof(*rec), 1, out);
> -    if ( written != 2 )
> +    int i;
> +    /* Place network byte order representation in temp vars, rather than
> +     * write back into kernel/xen memory */
> +    uint64_t tmp64;
> +    uint32_t tmp32;
> +
> +    tmp32 = htonl(cpu);
> +    written += fwrite(&tmp32, sizeof(tmp32), 1, out);
> +
> +    tmp64 = htonll(rec->cycles);
> +    written += fwrite(&tmp64, sizeof(tmp64), 1, out);
> +
> +    tmp32 = htonl(rec->event);
> +    written += fwrite(&tmp32, sizeof(tmp32), 1, out);
> +
> +    for ( i=0; i<ARRAY_SIZE(rec->data); i++ ) {
> +        tmp64 = htonl(rec->data[i]);
> +        written += fwrite(&tmp64, sizeof(tmp64), 1, out);
> +    }
> +
> +    if ( written != 8 )
>      {
>          PERROR("Failed to write trace record");
>          exit(EXIT_FAILURE);
> @@ -147,6 +179,7 @@ struct t_buf *map_tbufs(unsigned long tb
>          exit(EXIT_FAILURE);
>      }
>
> +    /* On PPC (At least) the DOMID arg is ignored in dom0 */
>      tbufs_mapped = xc_map_foreign_range(xc_handle, DOMID_XEN,
>                                          size * num, PROT_READ | PROT_WRITE,
>                                          tbufs_mfn);
> @@ -253,7 +286,7 @@ struct t_rec **init_rec_ptrs(struct t_bu
>  /**
>   * get_num_cpus - get the number of logical CPUs
>   */
> -unsigned int get_num_cpus(void)
> +uint32_t get_num_cpus(void)
>  {
>      xc_physinfo_t physinfo;
>      int xc_handle = xc_interface_open();
> @@ -282,14 +315,14 @@ unsigned int get_num_cpus(void)
>   */
>  int monitor_tbufs(FILE *logfile)
>  {
> -    int i;
> +    uint32_t i;
>
>      void *tbufs_mapped;          /* pointer to where the tbufs are mapped */
>      struct t_buf **meta;         /* pointers to the trace buffer metadata */
>      struct t_rec **data;         /* pointers to the trace buffer data areas
>                                    * where they are mapped into user space. */
>      unsigned long tbufs_mfn;     /* mfn of the tbufs */
> -    unsigned int num;            /* number of trace buffers / logical CPUS */
> +    uint32_t num;                /* number of trace buffers / logical CPUS */
>      unsigned long size;          /* size of a single trace buffer */
>
>      int size_in_recs;
> Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace_format
> +++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
> @@ -84,10 +84,11 @@ interrupted = 0
>  defs = read_defs(arg[0])
>
>  # structure of trace record + prepended CPU id (as output by xentrace):
> -# CPU(I) TSC(Q) EVENT(L) D1(L) D2(L) D3(L) D4(L) D5(L)
> +# CPU(L) TSC(Q) EVENT(L) D1(Q) D2(Q) D3(Q) D4(Q) D5(Q)
>  # read CPU id separately to avoid structure packing problems on 64-bit arch.
> -CPUREC = "I"
> -TRCREC = "QLLLLLL"
> +# Force network byte order.
> +CPUREC = "!L"
> +TRCREC = "!QLQQQQQ"
>
>  last_tsc = [0]
>

--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
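For readers following the format change, the on-disk layout the patch aims for can be sketched in a few lines. This is only an illustration, written in Python 3 rather than the Python 2 of xentrace_format: the two format strings come from the patch, while `pack_record` and `unpack_record` are hypothetical helper names that do not exist in the xentrace tools.

```python
import struct

# Formats taken from the patched xentrace_format. The "!" prefix forces
# network (big-endian) byte order, standard sizes and no padding.
CPUREC = "!L"        # CPU id, written ahead of each record
TRCREC = "!QLQQQQQ"  # TSC, event id, five 64-bit data words


def pack_record(cpu, tsc, event, data):
    """Serialise one record in the layout the patch intends (hypothetical helper)."""
    if len(data) != 5:
        raise ValueError("a trace record carries exactly 5 data words")
    return struct.pack(CPUREC, cpu) + struct.pack(TRCREC, tsc, event, *data)


def unpack_record(blob):
    """Decode one record; returns (cpu, tsc, event, [d1..d5])."""
    cpu_len = struct.calcsize(CPUREC)
    (cpu,) = struct.unpack(CPUREC, blob[:cpu_len])
    tsc, event, *data = struct.unpack(TRCREC, blob[cpu_len:])
    return cpu, tsc, event, data
```

Because the byte order is fixed by the format string, the same bytes decode to the same values on any host, which is the portability the patch description is after.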
George Dunlap
2006-Nov-30 16:27 UTC
Re: [Xen-devel] [RFC][PATCH] 2/3] [TOOLS][XENTRACE] Update tools to write data to disk in a known endian'ness.
Do you have an idea how much extra cpu this takes for high-bandwidth
tracing? I'm using xentrace for performance analysis, and it's
already got a measurable impact due to the disk bandwidth and copying.
I'd like to keep that as minimal as possible.

 -George

On 11/30/06, Tony Breeds <tony@bakeyournoodle.com> wrote:
> Use network byte order when writing data to disk, making the data portable to
> any machine.
>
> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> ---
>
>  tools/xentrace/xentrace.c      |   47 ++++++++++++++++++++++++++++++++++-------
>  tools/xentrace/xentrace_format |    7 +++---
>  2 files changed, 44 insertions(+), 10 deletions(-)
>
> Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace.c
> +++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
> @@ -28,6 +28,20 @@
>
>  #include <xenctrl.h>
>
> +#include <arpa/inet.h> /* hton*(), ntoh*() */
> +#include <endian.h>
> +
> +/* There is no 64-bit htonll, so create one */
> +#if __BYTE_ORDER == __LITTLE_ENDIAN
> +#define htonll(x) ( (((uint64_t)htonl(x)) << 32) + htonl(x >> 32) )
> +#define ntohll(x) ( (((uint64_t)ntohl(x)) << 32) + ntohl(x >> 32) )
> +#else
> +#define htonll(x) (x)
> +#define ntohll(x) (x)
> +#endif
> +
> +#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
> +
>  #define PERROR(_m, _a...)                                       \
>  do {                                                            \
>      int __saved_errno = errno;                                  \
> @@ -90,12 +104,30 @@ struct timespec millis_to_timespec(unsig
>   * Outputs the trace record to a filestream, prepending the CPU ID of the
>   * source trace buffer.
>   */
> -void write_rec(unsigned int cpu, struct t_rec *rec, FILE *out)
> +void write_rec(uint32_t cpu, struct t_rec *rec, FILE *out)
>  {
>      size_t written = 0;
> -    written += fwrite(&cpu, sizeof(cpu), 1, out);
> -    written += fwrite(rec, sizeof(*rec), 1, out);
> -    if ( written != 2 )
> +    int i;
> +    /* Place network byte order representation in temp vars, rather than
> +     * write back into kernel/xen memory */
> +    uint64_t tmp64;
> +    uint32_t tmp32;
> +
> +    tmp32 = htonl(cpu);
> +    written += fwrite(&tmp32, sizeof(tmp32), 1, out);
> +
> +    tmp64 = htonll(rec->cycles);
> +    written += fwrite(&tmp64, sizeof(tmp64), 1, out);
> +
> +    tmp32 = htonl(rec->event);
> +    written += fwrite(&tmp32, sizeof(tmp32), 1, out);
> +
> +    for ( i=0; i<ARRAY_SIZE(rec->data); i++ ) {
> +        tmp64 = htonl(rec->data[i]);
> +        written += fwrite(&tmp64, sizeof(tmp64), 1, out);
> +    }
> +
> +    if ( written != 8 )
>      {
>          PERROR("Failed to write trace record");
>          exit(EXIT_FAILURE);
> @@ -147,6 +179,7 @@ struct t_buf *map_tbufs(unsigned long tb
>          exit(EXIT_FAILURE);
>      }
>
> +    /* On PPC (At least) the DOMID arg is ignored in dom0 */
>      tbufs_mapped = xc_map_foreign_range(xc_handle, DOMID_XEN,
>                                          size * num, PROT_READ | PROT_WRITE,
>                                          tbufs_mfn);
> @@ -253,7 +286,7 @@ struct t_rec **init_rec_ptrs(struct t_bu
>  /**
>   * get_num_cpus - get the number of logical CPUs
>   */
> -unsigned int get_num_cpus(void)
> +uint32_t get_num_cpus(void)
>  {
>      xc_physinfo_t physinfo;
>      int xc_handle = xc_interface_open();
> @@ -282,14 +315,14 @@ unsigned int get_num_cpus(void)
>   */
>  int monitor_tbufs(FILE *logfile)
>  {
> -    int i;
> +    uint32_t i;
>
>      void *tbufs_mapped;          /* pointer to where the tbufs are mapped */
>      struct t_buf **meta;         /* pointers to the trace buffer metadata */
>      struct t_rec **data;         /* pointers to the trace buffer data areas
>                                    * where they are mapped into user space. */
>      unsigned long tbufs_mfn;     /* mfn of the tbufs */
> -    unsigned int num;            /* number of trace buffers / logical CPUS */
> +    uint32_t num;                /* number of trace buffers / logical CPUS */
>      unsigned long size;          /* size of a single trace buffer */
>
>      int size_in_recs;
> Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace_format
> +++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
> @@ -84,10 +84,11 @@ interrupted = 0
>  defs = read_defs(arg[0])
>
>  # structure of trace record + prepended CPU id (as output by xentrace):
> -# CPU(I) TSC(Q) EVENT(L) D1(L) D2(L) D3(L) D4(L) D5(L)
> +# CPU(L) TSC(Q) EVENT(L) D1(Q) D2(Q) D3(Q) D4(Q) D5(Q)
>  # read CPU id separately to avoid structure packing problems on 64-bit arch.
> -CPUREC = "I"
> -TRCREC = "QLLLLLL"
> +# Force network byte order.
> +CPUREC = "!L"
> +TRCREC = "!QLQQQQQ"
>
>  last_tsc = [0]
>
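One detail in the quoted write_rec() loop may be worth checking before measuring its cost: each data item is converted with htonl() rather than the new htonll(), although data[] becomes uint64_t with patch 1/3. Since htonl() operates on 32 bits, the sketch below emulates, in Python 3 and assuming a little-endian host, what the two conversions would put on disk and what the "!Q" reader would get back. All function names here are made up for the illustration.

```python
import struct

MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def bswap32(x):
    """Emulate htonl() on a little-endian host: keep the low 32 bits, swap them."""
    return struct.unpack("<L", struct.pack(">L", x & MASK32))[0]


def htonll_le(x):
    """The patch's htonll() macro as it expands on a little-endian host."""
    return ((bswap32(x) << 32) + bswap32(x >> 32)) & MASK64


def on_disk(tmp64):
    """Bytes fwrite() would emit for a uint64_t held on a little-endian host."""
    return struct.pack("<Q", tmp64)


def read_back(raw):
    """Value the "!Q" format in xentrace_format decodes from those bytes."""
    return struct.unpack("!Q", raw)[0]
```

Under these assumptions `read_back(on_disk(htonll_le(v)))` returns `v`, whereas going through `bswap32` alone turns a data value of 1 into 1 << 32 on the reading side. If that reading of the loop is right, using htonll() for the data words would be the consistent choice.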
George Dunlap
2006-Nov-30 16:58 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
Hmm... this has the unfortunate side-effect of doubling the size of
the trace, and effectively halving the effectiveness of the trace
buffer in avoiding drops. My moderate-length traces are already in
the gigabyte range, and I occasionally lose trace records even with a
buffer size of 256. It would be really nice if we could avoid that.

I happen to be using the VMENTER/VMEXIT tracing, which could be
consolidated into one record if we went to a 64-bit trace. Is anyone
else doing high-bandwidth tracing that this would affect in a
significantly negative way?

 -George

On 11/30/06, Tony Breeds <tony@bakeyournoodle.com> wrote:
> Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
> ---
>
>  xen/common/trace.c         |    6 +++---
>  xen/include/public/trace.h |    2 +-
>  xen/include/xen/trace.h    |   14 +++++++-------
>  3 files changed, 11 insertions(+), 11 deletions(-)
>
> Index: xen-unstable.hg-mainline.xentrace/xen/common/trace.c
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/xen/common/trace.c
> +++ xen-unstable.hg-mainline.xentrace/xen/common/trace.c
> @@ -46,7 +46,7 @@ static int nr_recs;
>  static int t_buf_highwater;
>
>  /* Number of records lost due to per-CPU trace buffer being full. */
> -static DEFINE_PER_CPU(unsigned long, lost_records);
> +static DEFINE_PER_CPU(uint64_t, lost_records);
>
>  /* a flag recording whether initialization has been done */
>  /* or more properly, if the tbuf subsystem is enabled right now */
> @@ -228,8 +228,8 @@ int tb_control(xen_sysctl_tbuf_op_t *tbc
>   * failure, otherwise 0. Failure occurs only if the trace buffers are not yet
>   * initialised.
>   */
> -void trace(u32 event, unsigned long d1, unsigned long d2,
> -           unsigned long d3, unsigned long d4, unsigned long d5)
> +void trace(uint32_t event, uint64_t d1, uint64_t d2, uint64_t d3, uint64_t d4,
> +           uint64_t d5)
>  {
>      struct t_buf *buf;
>      struct t_rec *rec;
> Index: xen-unstable.hg-mainline.xentrace/xen/include/public/trace.h
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/xen/include/public/trace.h
> +++ xen-unstable.hg-mainline.xentrace/xen/include/public/trace.h
> @@ -76,7 +76,7 @@
>  struct t_rec {
>      uint64_t cycles;          /* cycle counter timestamp */
>      uint32_t event;           /* event ID */
> -    unsigned long data[5];    /* event data items */
> +    uint64_t data[5];         /* event data items */
>  };
>
>  /*
> Index: xen-unstable.hg-mainline.xentrace/xen/include/xen/trace.h
> ===================================================================
> --- xen-unstable.hg-mainline.xentrace.orig/xen/include/xen/trace.h
> +++ xen-unstable.hg-mainline.xentrace/xen/include/xen/trace.h
> @@ -33,19 +33,19 @@ void init_trace_bufs(void);
>  /* used to retrieve the physical address of the trace buffers */
>  int tb_control(struct xen_sysctl_tbuf_op *tbc);
>
> -void trace(u32 event, unsigned long d1, unsigned long d2,
> -           unsigned long d3, unsigned long d4, unsigned long d5);
> +void trace(uint32_t event, uint64_t d1, uint64_t d2, uint64_t d3, uint64_t d4,
> +           uint64_t d5);
>
>  /* Avoids troubling the caller with casting their arguments to a trace macro */
>  #define trace_do_casts(e,d1,d2,d3,d4,d5) \
>      do {                                 \
>          if ( unlikely(tb_init_done) )    \
>              trace(e,                     \
> -                 (unsigned long)d1,      \
> -                 (unsigned long)d2,      \
> -                 (unsigned long)d3,      \
> -                 (unsigned long)d4,      \
> -                 (unsigned long)d5);     \
> +                 (uint64_t)d1,           \
> +                 (uint64_t)d2,           \
> +                 (uint64_t)d3,           \
> +                 (uint64_t)d4,           \
> +                 (uint64_t)d5);          \
>      } while ( 0 )
>
>  /* Convenience macros for calling the trace function. */
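To put rough numbers on the size concern, the record payload before and after the change can be computed from the struct t_rec fields in the patch. The sketch below is Python 3 and rests on stated assumptions: a 32-bit build for the old layout, a 4096-byte page, a buffer size given in pages, and no allowance for the trace buffer's own metadata or for compiler padding in the C struct.

```python
import struct

# "=" means standard sizes and no padding, so these are payload sizes only;
# sizeof(struct t_rec) in C may be larger because of alignment padding.
OLD_ILP32 = "=QLLLLLL"  # cycles, event, five 4-byte longs (32-bit build)
NEW_FIXED = "=QLQQQQQ"  # cycles, event, five uint64_t (any build)

OLD_BYTES = struct.calcsize(OLD_ILP32)  # 8 + 4 + 5 * 4
NEW_BYTES = struct.calcsize(NEW_FIXED)  # 8 + 4 + 5 * 8

PAGE_BYTES = 4096  # assumed page size


def records_per_buffer(pages, rec_bytes):
    """Whole records that fit in `pages` pages, ignoring buffer metadata."""
    return (pages * PAGE_BYTES) // rec_bytes
```

With these assumptions the payload grows from 32 to 52 bytes, so "doubling" and "halving" are round figures for a 32-bit build. On a build where unsigned long is already 8 bytes the payload does not change with this patch.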
Keir Fraser
2006-Nov-30 17:03 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
On 30/11/06 16:58, "George Dunlap" <dunlapg@umich.edu> wrote:

> Hmm... this has the unfortunate side-effect of doubling the size of
> the trace, and effectively halving the effectiveness of the trace
> buffer in avoiding drops. My moderate-length traces are already in
> the gigabyte range, and I occasionally lose trace records even with a
> buffer size of 256. It would be really nice if we could avoid that.
>
> I happen to be using the VMENTER/VMEXIT tracing, which could be
> consolidated into one record if we went to a 64-bit trace. Is anyone
> else doing high-bandwidth tracing that this would affect in a
> significantly negative way?

As we move increasingly towards x86/64 this is an issue that will need to be
addressed even if we leave the tracing fields as longs.

 -- Keir
Mark Williamson
2006-Dec-01 02:29 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
I guess one possibility would be to continue using unsigned longs but store
the machine word size and endianness in a header in the trace file. This
gets us platform independence without adding extra overhead on the fast
path; the extra processing can happen offline (and probably not at all in
the common case that you're on the same endianness / word size as the trace
was collected on).

Another alternative would be to allow some combination of 32-bit or (fewer)
64-bit words in the record. This would let us keep the same record size, but
have a bit more flexibility.

Going the whole hog, we could even make the trace data opaque to trace.c -
have a char[] for the data, and deal with the semantics in terms of "longs",
"u64" etc. in macros in the traced code, and in xentrace_format. If we did
this, the logical extension would be to have variable length trace records
with a fixed-size header giving the full length.

I think this would be a good direction to go in, and would ensure that we
maximise use of the trace buffer space. It shouldn't be that hard to modify
the system to do this - most of the work may even be in making it nice to
use!

Cheers,
Mark

On Thursday 30 November 2006 17:03, Keir Fraser wrote:
> On 30/11/06 16:58, "George Dunlap" <dunlapg@umich.edu> wrote:
> > Hmm... this has the unfortunate side-effect of doubling the size of
> > the trace, and effectively halving the effectiveness of the trace
> > buffer in avoiding drops. My moderate-length traces are already in
> > the gigabyte range, and I occasionally lose trace records even with a
> > buffer size of 256. It would be really nice if we could avoid that.
> >
> > I happen to be using the VMENTER/VMEXIT tracing, which could be
> > consolidated into one record if we went to a 64-bit trace. Is anyone
> > else doing high-bandwidth tracing that this would affect in a
> > significantly negative way?
>
> As we move increasingly towards x86/64 this is an issue that will need to
> be addressed even if we leave the tracing fields as longs.
>
> -- Keir

--
Dave: Just a question. What use is a unicyle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
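The first suggestion above, a file header recording word size and endianness so that records can stay in native form, might look something like the following on the reading side. This is a sketch in Python 3 under loud assumptions: the header layout, the `XTRC` magic and both function names are invented here and are not part of any existing xentrace format, and records are assumed to be written field by field without padding.

```python
import struct

# Hypothetical file header (not an existing xentrace format):
#   4-byte magic | word size in bytes | byte order flag | 2 pad bytes
HEADER = "!4sBBxx"
MAGIC = b"XTRC"  # made-up magic value for this sketch
LITTLE, BIG = 0, 1


def make_header(word_size, byte_order):
    """Build the sketched header for a trace taken on a given machine."""
    if word_size not in (4, 8) or byte_order not in (LITTLE, BIG):
        raise ValueError("unsupported word size or byte order")
    return struct.pack(HEADER, MAGIC, word_size, byte_order)


def record_format(header):
    """Derive the struct format for one record from the file header.

    Assumes records are written field by field without padding:
    64-bit cycles, 32-bit event, then five machine words.
    """
    magic, word_size, byte_order = struct.unpack(HEADER, header)
    if magic != MAGIC:
        raise ValueError("not a trace file with the sketched header")
    prefix = ">" if byte_order == BIG else "<"
    word = {4: "L", 8: "Q"}[word_size]
    return prefix + "QL" + word * 5
```

The writer pays for the header once per file and then dumps records unchanged, and the formatter picks the matching `struct` format offline, which is where the conversion cost ends up in this scheme.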
Tony Breeds
2006-Dec-01 05:23 UTC
Re: [Xen-devel] [RFC][PATCH] 2/3] [TOOLS][XENTRACE] Update tools to write data to disk in a known endian'ness.
On Thu, Nov 30, 2006 at 01:12:43PM +0000, Mark Williamson wrote:

> No objections to this here, seems like a nice improvement.

That's what we like to hear ;P

> I did wonder if
> perhaps some sort of trace file header would be useful, giving metadata. For
> instance, a record of what dom0 / Xen builds were being used, and type of
> machine might be useful.

I did think about doing something similar. I thought I could steal the
high 16 bits of the first CPU number in the file as a xentrace format
version number. Sure, this would limit us to 64k CPUs, but we can deal with
that limit later ;P

> A future extensible format would be a plus too, I
> guess. None of this is strictly necessary, it just might help with
> organising trace files.

You spoke in more detail about that in another email so I'll keep the
discussion there.

Yours Tony

   linux.conf.au        http://linux.conf.au/ || http://lca2007.linux.org.au/
   Jan 15-20 2007       The Australian Linux Technical Conference!
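The version-in-the-CPU-field idea can be sketched as follows. It is an illustration in Python 3 only: the helper names are invented, and it assumes the CPU field is the 32-bit network-order word from patch 2/3.

```python
import struct

VERSION_SHIFT = 16
LOW16 = 0xFFFF


def tag_first_cpu(cpu, version):
    """Pack the first CPU field with a format version in its high 16 bits."""
    if not 0 <= cpu <= LOW16:
        raise ValueError("CPU number must fit in 16 bits")
    if not 0 <= version <= LOW16:
        raise ValueError("version must fit in 16 bits")
    return struct.pack("!L", (version << VERSION_SHIFT) | cpu)


def split_first_cpu(raw):
    """Return (version, cpu) from the first 4 bytes of a trace file."""
    (word,) = struct.unpack("!L", raw)
    return word >> VERSION_SHIFT, word & LOW16
```

One consequence of this scheme is that a version of 0 yields the same bytes as an untagged CPU number, so a tagged format would presumably start numbering at 1.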
Tony Breeds
2006-Dec-01 05:39 UTC
Re: [Xen-devel] [RFC][PATCH] 3/3] [TOOLS][XENTRACE] Various tidyups to xentrace tools.
On Thu, Nov 30, 2006 at 12:57:53PM +0000, Mark Williamson wrote:

> Thanks for doing this. While you're at it, would you like to fix the bracing?
> Some of it is in K&R style:

Sure. The version below fixes the couple of places that needed it; it needs
[2/3] to apply cleanly.

I'll post a complete series tomorrow, pending a decision on the 32/64bit
discussion.

---
From: Tony Breeds <tony@bakeyournoodle.com>
Subject: [RFC][PATCH] 3/3] [TOOLS][XENTRACE] Various tidyups to xentrace tools.

- use err/errx/warn, instead of PERROR, perror and fprintf
- Make () and {} placement consistent in this file, and with the xen
  coding style.
- Use consistent tabs and whitespacing.

Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
---

 tools/xentrace/xentrace.c      |  142 +++++++++++++++--------------------------
 tools/xentrace/xentrace_format |   51 +++++++-------
 2 files changed, 78 insertions(+), 115 deletions(-)

Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace.c
+++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace.c
@@ -22,6 +22,7 @@
 #include <signal.h>
 #include <inttypes.h>
 #include <string.h>
+#include <err.h>

 #include <xen/xen.h>
 #include <xen/trace.h>
@@ -42,16 +43,6 @@

 #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

-#define PERROR(_m, _a...)                                       \
-do {                                                            \
-    int __saved_errno = errno;                                  \
-    fprintf(stderr, "ERROR: " _m " (%d = %s)\n" , ## _a ,       \
-            __saved_errno, strerror(__saved_errno));            \
-    errno = __saved_errno;                                      \
-} while (0)
-
-extern FILE *stderr;
-
 /***** Compile time configuration of defaults ********************************/

 /* when we've got more records than this waiting, we log it to the output */
@@ -88,7 +79,7 @@ void close_handler(int signal)
 struct timespec millis_to_timespec(unsigned long millis)
 {
     struct timespec spec;
-
+
     spec.tv_sec = millis / 1000;
     spec.tv_nsec = (millis % 1000) * 1000;

@@ -122,16 +113,14 @@ void write_rec(uint32_t cpu, struct t_re
     tmp32 = htonl(rec->event);
     written += fwrite(&tmp32, sizeof(tmp32), 1, out);

-    for ( i=0; i<ARRAY_SIZE(rec->data); i++ ) {
+    for ( i=0; i<ARRAY_SIZE(rec->data); i++ )
+    {
         tmp64 = htonl(rec->data[i]);
         written += fwrite(&tmp64, sizeof(tmp64), 1, out);
     }

     if ( written != 8 )
-    {
-        PERROR("Failed to write trace record");
-        exit(EXIT_FAILURE);
-    }
+        err(EXIT_FAILURE, "Failed to write trace record");
 }

 static void get_tbufs(unsigned long *mfn, unsigned long *size)
@@ -139,21 +128,16 @@ static void get_tbufs(unsigned long *mfn
     int xc_handle = xc_interface_open();
     int ret;

-    if ( xc_handle < 0 )
-    {
-        exit(EXIT_FAILURE);
-    }
+    if ( xc_handle < 0 )
+        errx(EXIT_FAILURE, "Unable to open the xc interface");

-    if(!opts.tbuf_size)
-      opts.tbuf_size = DEFAULT_TBUF_SIZE;
+    if ( !opts.tbuf_size )
+        opts.tbuf_size = DEFAULT_TBUF_SIZE;

     ret = xc_tbuf_enable(xc_handle, opts.tbuf_size, mfn, size);

     if ( ret != 0 )
-    {
-        perror("Couldn't enable trace buffers");
-        exit(1);
-    }
+        err(EXIT_FAILURE, "Couldn't enable trace buffers");

     xc_interface_close(xc_handle);
 }
@@ -174,10 +158,8 @@ struct t_buf *map_tbufs(unsigned long tb

     xc_handle = xc_interface_open();

-    if ( xc_handle < 0 )
-    {
-        exit(EXIT_FAILURE);
-    }
+    if ( xc_handle < 0 )
+        errx(EXIT_FAILURE, "Unable to open the xc interface");

     /* On PPC (At least) the DOMID arg is ignored in dom0 */
     tbufs_mapped = xc_map_foreign_range(xc_handle, DOMID_XEN,
@@ -186,18 +168,15 @@ struct t_buf *map_tbufs(unsigned long tb

     xc_interface_close(xc_handle);

-    if ( tbufs_mapped == 0 )
-    {
-        PERROR("Failed to mmap trace buffers");
-        exit(EXIT_FAILURE);
-    }
+    if ( tbufs_mapped == 0 )
+        err(EXIT_FAILURE, "Failed to mmap trace buffers");

     return tbufs_mapped;
 }

 /**
  * set_mask - set the cpu/event mask in HV
- * @mask:           the new mask
+ * @mask:           the new mask
  * @type:           the new mask type,0-event mask, 1-cpu mask
  *
  */
@@ -206,21 +185,22 @@ void set_mask(uint32_t mask, int type)
     int ret = 0;
     int xc_handle = xc_interface_open(); /* for accessing control interface */

-    if (type == 1) {
+    if ( type == 1 )
+    {
         ret = xc_tbuf_set_cpu_mask(xc_handle, mask);
-        fprintf(stderr, "change cpumask to 0x%x\n", mask);
-    } else if (type == 0) {
+        warnx("change cpumask to 0x%x\n", mask);
+    }
+    else if ( type == 0 )
+    {
         ret = xc_tbuf_set_evt_mask(xc_handle, mask);
-        fprintf(stderr, "change evtmask to 0x%x\n", mask);
+        warnx("change evtmask to 0x%x\n", mask);
     }

     xc_interface_close(xc_handle);

     if ( ret != 0 )
-    {
-        PERROR("Failure to get trace buffer pointer from Xen and set the new mask");
-        exit(EXIT_FAILURE);
-    }
+        err(EXIT_FAILURE, "Failure to get trace buffer pointer from "
+            "Xen and set the new mask");
 }

 /**
@@ -240,11 +220,8 @@ struct t_buf **init_bufs_ptrs(void *bufs

     user_ptrs = (struct t_buf **)calloc(num, sizeof(struct t_buf *));
     if ( user_ptrs == NULL )
-    {
-        PERROR( "Failed to allocate memory for buffer pointers\n");
-        exit(EXIT_FAILURE);
-    }
-
+        err(EXIT_FAILURE, "Failed to allocate memory for buffer pointers");
+
     /* initialise pointers to the trace buffers - given the size of a trace
      * buffer and the value of bufs_maped, we can easily calculate these */
     for ( i = 0; i<num; i++ )
@@ -269,13 +246,10 @@ struct t_rec **init_rec_ptrs(struct t_bu
 {
     int i;
     struct t_rec **data;
-
+
     data = calloc(num, sizeof(struct t_rec *));
     if ( data == NULL )
-    {
-        PERROR("Failed to allocate memory for data pointers\n");
-        exit(EXIT_FAILURE);
-    }
+        err(EXIT_FAILURE, "Failed to allocate memory for data pointers");

     for ( i = 0; i < num; i++ )
         data[i] = (struct t_rec *)(meta[i] + 1);
@@ -291,14 +265,11 @@ uint32_t get_num_cpus(void)
     xc_physinfo_t physinfo;
     int xc_handle = xc_interface_open();
     int ret;
-
+
     ret = xc_physinfo(xc_handle, &physinfo);
-
+
     if ( ret != 0 )
-    {
-        PERROR("Failure to get logical CPU count from Xen");
-        exit(EXIT_FAILURE);
-    }
+        errx(EXIT_FAILURE, "Failure to get logical CPU count from Xen");

     xc_interface_close(xc_handle);

@@ -377,17 +348,18 @@ int parse_evtmask(char *arg, struct argp
     char *inval;

     /* search filtering class */
-    if (strcmp(arg, "gen") == 0){
+    if ( strcmp(arg, "gen") == 0 )
         setup->evt_mask |= TRC_GEN;
-    } else if(strcmp(arg, "sched") == 0){
+    else if ( strcmp(arg, "sched") == 0 )
         setup->evt_mask |= TRC_SCHED;
-    } else if(strcmp(arg, "dom0op") == 0){
+    else if ( strcmp(arg, "dom0op") == 0 )
         setup->evt_mask |= TRC_DOM0OP;
-    } else if(strcmp(arg, "vmx") == 0){
+    else if ( strcmp(arg, "vmx") == 0 )
         setup->evt_mask |= TRC_VMX;
-    } else if(strcmp(arg, "all") == 0){
+    else if ( strcmp(arg, "all") == 0 )
         setup->evt_mask |= TRC_ALL;
-    } else {
+    else
+    {
         setup->evt_mask = strtol(arg, &inval, 0);
         if ( inval == arg )
             argp_usage(state);
@@ -430,13 +402,13 @@ error_t cmd_parser(int key, char *arg, s
             argp_usage(state);
     }
     break;
-
+
     case 'e': /* set new event mask for filtering*/
     {
         parse_evtmask(arg, state);
     }
     break;
-
+
     case 'S': /* set tbuf size (given in pages) */
     {
         char *inval;
@@ -454,7 +426,7 @@ error_t cmd_parser(int key, char *arg, s
             argp_usage(state);
     }
     break;
-
+
     default:
         return ARGP_ERR_UNKNOWN;
     }
@@ -473,16 +445,16 @@ const struct argp_option cmd_opts[]
       "(default " xstr(NEW_DATA_THRESH) ")." },

     { .name = "poll-sleep", .key='s', .arg="p",
-      .doc =
+      .doc =
       "Set sleep time, p, in milliseconds between polling the trace buffer "
       "for new data (default " xstr(POLL_SLEEP_MILLIS) ")." },

     { .name = "cpu-mask", .key='c', .arg="c",
-      .doc =
+      .doc =
       "set cpu-mask " },

     { .name = "evt-mask", .key='e', .arg="e",
-      .doc =
+      .doc =
       "set evt-mask " },

     { .name = "trace-buf-size", .key='S', .arg="N",
@@ -514,8 +486,8 @@ const struct argp parser_def

 const char *argp_program_version = "xentrace v1.1";
 const char *argp_program_bug_address = "<mark.a.williamson@intel.com>";
-
-
+
+
 int main(int argc, char **argv)
 {
     int outfd = 1, ret;
@@ -530,31 +502,23 @@ int main(int argc, char **argv)

     argp_parse(&parser_def, argc, argv, 0, 0, &opts);

-    if (opts.evt_mask != 0) {
+    if ( opts.evt_mask != 0 )
         set_mask(opts.evt_mask, 0);
-    }

-    if (opts.cpu_mask != 0) {
+    if ( opts.cpu_mask != 0 )
         set_mask(opts.cpu_mask, 1);
-    }

     if ( opts.outfile )
         outfd = open(opts.outfile, O_WRONLY | O_CREAT | O_LARGEFILE, 0644);

-    if(outfd < 0)
-    {
-        perror("Could not open output file");
-        exit(EXIT_FAILURE);
-    }
+    if ( outfd < 0 )
+        err(EXIT_FAILURE, "Could not open output file");

-    if(isatty(outfd))
-    {
-        fprintf(stderr, "Cannot output to a TTY, specify a log file.\n");
-        exit(EXIT_FAILURE);
-    }
+    if ( isatty(outfd) )
+        errx(EXIT_FAILURE, "Cannot output to a TTY, specify a log file.\n");

     logfile = fdopen(outfd, "w");
-
+
     /* ensure that if we get a signal, we'll do cleanup, then exit */
     act.sa_handler = close_handler;
     act.sa_flags = 0;
Index: xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
===================================================================
--- xen-unstable.hg-mainline.xentrace.orig/tools/xentrace/xentrace_format
+++ xen-unstable.hg-mainline.xentrace/tools/xentrace/xentrace_format
@@ -24,7 +24,7 @@ def usage():
   Which correspond to the CPU number, event ID, timestamp counter
   and the 5 data fields from the trace record. There should be one such rule
   for each type of event.
-
+
   Depending on your system and the volume of trace buffer data,
   this script may not be able to keep up with the output of
   xentrace if it is piped directly. In these circumstances you should have
@@ -34,7 +34,7 @@ def usage():

 def read_defs(defs_file):
     defs = {}
-
+
     fd = open(defs_file)

     reg = re.compile('(\S+)\s+(\S.*)')
@@ -43,14 +43,14 @@ def read_defs(defs_file):
         line = fd.readline()
         if not line:
             break
-
-        if line[0] == '#' or line[0] == '\n':
-            continue
-
+
+        if line[0] == '#' or line[0] == '\n':
+            continue
+
         m = reg.match(line)

         if not m: print >> sys.stderr, "Bad format file" ; sys.exit(1)
-
+
         defs[str(eval(m.group(1)))] = m.group(2)

     return defs
@@ -70,7 +70,7 @@ try:
     opts, arg = getopt.getopt(sys.argv[1:], "c:" )

     for opt in opts:
-        if opt[0] == '-c' : mhz = int(opt[1])
+        if opt[0] == '-c' : mhz = int(opt[1])

 except getopt.GetoptError:
     usage()
@@ -96,7 +96,7 @@ i=0

 while not interrupted:
     try:
-        i=i+1
+        i=i+1
         line = sys.stdin.read(struct.calcsize(CPUREC))
         if not line:
             break
@@ -108,19 +108,20 @@ while not interrupted:

         (tsc, event, d1, d2, d3, d4, d5) = struct.unpack(TRCREC, line)

-        #tsc = (tscH<<32) | tscL
+        #tsc = (tscH<<32) | tscL

-        #print i, tsc
+        #print i, tsc

         if cpu >= len(last_tsc):
             last_tsc += [0] * (cpu - len(last_tsc) + 1)
-        elif tsc < last_tsc[cpu]:
-            print "TSC stepped backward cpu %d ! %d %d" % (cpu,tsc,last_tsc[cpu])
+        elif tsc < last_tsc[cpu]:
+            print "TSC stepped backward cpu %d ! %d %d" % (cpu, tsc,
+                                                            last_tsc[cpu])

-        last_tsc[cpu] = tsc
+        last_tsc[cpu] = tsc

-        if mhz:
-            tsc = tsc / (mhz*1000000.0)
+        if mhz:
+            tsc = tsc / (mhz*1000000.0)

         args = {'cpu' : cpu,
                 'tsc' : tsc,
@@ -131,15 +132,13 @@ while not interrupted:
                 '4' : d4,
                 '5' : d5 }

-        try:
-
-            if defs.has_key(str(event)):
-                print defs[str(event)] % args
-            else:
-                if defs.has_key(str(0)): print defs[str(0)] % args
-        except TypeError:
-            print defs[str(event)]
-            print args
-
+        try:
+            if defs.has_key(str(event)):
+                print defs[str(event)] % args
+            else:
+                if defs.has_key(str(0)): print defs[str(0)] % args
+        except TypeError:
+            print defs[str(event)]
+            print args

     except IOError, struct.error: sys.exit()

Yours Tony

   linux.conf.au        http://linux.conf.au/ || http://lca2007.linux.org.au/
   Jan 15-20 2007       The Australian Linux Technical Conference!
Tony Breeds
2006-Dec-01 05:43 UTC
Re: [Xen-devel] [RFC][PATCH] 2/3] [TOOLS][XENTRACE] Update tools to write data to disk in a known endian'ness.
On Thu, Nov 30, 2006 at 11:27:27AM -0500, George Dunlap wrote:
> Do you have an idea how much extra cpu this takes for high-bandwidth
> tracing?

I don't imagine that hton*() are that expensive, and fwrite already
buffers internally for 128k writes, so with these patches you won't be
touching the disk any more often.

> I'm using xentrace for performance analysis, and it's already got a
> measurable impact due to the disk bandwidth and copying. I'd like to
> keep that as minimal as possible.

To be honest I'm not doing any tracing; this patchset is the start
required to use xentrace on PPC.

Yours Tony
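
For readers following the cost argument above, here is a minimal sketch of what a 64-bit host-to-network conversion amounts to. It is not the patch's htonll() macro: the helper names put_be32(), put_be64() and get_be64() are hypothetical, and the code stores bytes one at a time so that it behaves the same on little- and big-endian hosts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers (not part of the patch): store integers in
 * network (big-endian) byte order one byte at a time, so the result
 * does not depend on the host's endianness. */
static void put_be32(unsigned char *dst, uint32_t v)
{
    dst[0] = (unsigned char)(v >> 24);
    dst[1] = (unsigned char)(v >> 16);
    dst[2] = (unsigned char)(v >> 8);
    dst[3] = (unsigned char)v;
}

static void put_be64(unsigned char *dst, uint64_t v)
{
    put_be32(dst, (uint32_t)(v >> 32));  /* high 32 bits first */
    put_be32(dst + 4, (uint32_t)v);      /* then the low 32 bits */
}

/* Read back a 64-bit value that was stored with put_be64(). */
static uint64_t get_be64(const unsigned char *src)
{
    uint64_t v = 0;
    size_t i;

    assert(src != NULL);
    for (i = 0; i < 8; i++)
        v = (v << 8) | src[i];
    return v;
}
```

Each conversion is a handful of shifts and stores per field, which is the sense in which the per-record cost is expected to be small next to the fwrite() call itself; that expectation is not measured here.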
Jimi Xenidis
2006-Dec-01 11:32 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
I'll go one (huge) step further... should we adopt LTTng?
http://ltt.polymtl.ca/

The trace stanzas are well defined and it comes with a bunch of
visualization and analysis tools. I know a lot of people at IBM have
been using it for performance studies of various operating systems,
and I have been asked how Xen could support it, not only for Xen
itself but for domains as well.
(http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/xen-port.txt)

-JX

On Nov 30, 2006, at 9:29 PM, Mark Williamson wrote:
> I guess one possibility would be to continue using unsigned longs but
> store the machine word size and endianness in a header in the trace
> file. This gets us platform independence.
>
> This avoids adding extra overhead on the fast path, the extra
> processing can happen offline (and probably not at all in the common
> case that you're on the same endianness / word size as the trace was
> collected on).
>
> Another alternative would be to allow some combination of 32-bit or
> (fewer) 64-bit words in the record. This would let us keep the same
> record size, but have a bit more flexibility.
>
> Going the whole hog, we could even make the trace data opaque to
> trace.c - have a char[] for the data, and deal with the semantics in
> terms of "longs" "u64" etc in macros in the traced code, and in
> xentrace_format.
>
> If we did this, the logical extension would be to have variable
> length trace records with a fixed-size header giving the full length.
> I think this would be a good direction to go in, and would ensure
> that we maximise use of the trace buffer space. It shouldn't be that
> hard to modify the system to do this - most of the work may even be
> in making it nice to use!
>
> Cheers,
> Mark
>
> On Thursday 30 November 2006 17:03, Keir Fraser wrote:
> > On 30/11/06 16:58, "George Dunlap" <dunlapg@umich.edu> wrote:
> > > Hmm... this has the unfortunate side-effect of doubling the size
> > > of the trace, and effectively halving the effectiveness of the
> > > trace buffer in avoiding drops. My moderate-length traces are
> > > already in the gigabyte range, and I occasionally lose trace
> > > records even with a buffer size of 256. It would be really nice
> > > if we could avoid that.
> > >
> > > I happen to be using the VMENTER/VMEXIT tracing, which could be
> > > consolidated into one record if we went to a 64-bit trace. Is
> > > anyone else doing high-bandwidth tracing that this would affect
> > > in a significantly negative way?
> >
> > As we move increasingly towards x86/64 this is an issue that will
> > need to be addressed even if we leave the tracing fields as longs.
> >
> > -- Keir
>
> --
> Dave: Just a question. What use is a unicycle with no seat? And no
> pedals!
> Mark: To answer a question with a question: What use is a skateboard?
> Dave: Skateboards have wheels.
> Mark: My wheel has a wheel!
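
The header idea quoted above (record the writer's word size and endianness once, convert offline only when needed) could look roughly like the sketch below. Everything here is an assumption for illustration: struct trace_file_header, the "XTRC" magic and the helper names are hypothetical and are not an existing xentrace format.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical on-disk header; not an existing xentrace structure. */
struct trace_file_header {
    unsigned char magic[4];    /* "XTRC": an assumed marker */
    unsigned char big_endian;  /* 1 if the writer was big-endian */
    unsigned char word_size;   /* sizeof(unsigned long) on the writer */
    unsigned char reserved[2];
};

/* Detect the byte order of the machine running this code. */
static int host_is_big_endian(void)
{
    const uint16_t probe = 0x0102;
    unsigned char first;

    memcpy(&first, &probe, 1);
    return first == 0x01;
}

/* Writer side: describe the host once, at the start of the file. */
static void fill_header(struct trace_file_header *h)
{
    memset(h, 0, sizeof(*h));
    memcpy(h->magic, "XTRC", 4);
    h->big_endian = (unsigned char)host_is_big_endian();
    h->word_size = (unsigned char)sizeof(unsigned long);
}

/* Reader side: decode one raw word using the byte order and width
 * recorded in the header, whatever the reader's own architecture. */
static uint64_t read_word(const struct trace_file_header *h,
                          const unsigned char *p)
{
    uint64_t v = 0;
    unsigned int i;

    assert(h->word_size == 4 || h->word_size == 8);
    for (i = 0; i < h->word_size; i++) {
        /* Visit bytes from most significant to least significant. */
        unsigned int idx = h->big_endian ? i : h->word_size - 1u - i;
        v = (v << 8) | p[idx];
    }
    return v;
}
```

With this shape the writer dumps records unmodified, so the fast path pays nothing; only the offline reader calls read_word(), which matches the trade-off described in the quoted message.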
George Dunlap
2006-Dec-01 17:54 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
Variable-length trace buffers would certainly be nice. Right now, for
some records, I'm squeezing bits into things, and just *one* more byte
would be great... but for other records, I'm storing 4 words of 0, and
there just doesn't seem to be any other useful information to add.

Keir, is the simpler, "one trace size fits all" method just because it
was easier to implement originally, or is the simplicity expected to
greatly reduce overhead and/or bugginess? If the former, then there
seems enough interest in making the tracing more flexible to be worth
changing; if the latter, then we should probably choose something and
live with it, or perhaps a compromise (i.e., two record sizes).

-George

On 11/30/06, Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote:
> I guess one possibility would be to continue using unsigned longs but
> store the machine word size and endianness in a header in the trace
> file. This gets us platform independence.
>
> This avoids adding extra overhead on the fast path, the extra
> processing can happen offline (and probably not at all in the common
> case that you're on the same endianness / word size as the trace was
> collected on).
>
> Another alternative would be to allow some combination of 32-bit or
> (fewer) 64-bit words in the record. This would let us keep the same
> record size, but have a bit more flexibility.
>
> Going the whole hog, we could even make the trace data opaque to
> trace.c - have a char[] for the data, and deal with the semantics in
> terms of "longs" "u64" etc in macros in the traced code, and in
> xentrace_format.
>
> If we did this, the logical extension would be to have variable
> length trace records with a fixed-size header giving the full length.
> I think this would be a good direction to go in, and would ensure
> that we maximise use of the trace buffer space. It shouldn't be that
> hard to modify the system to do this - most of the work may even be
> in making it nice to use!
>
> Cheers,
> Mark
>
> On Thursday 30 November 2006 17:03, Keir Fraser wrote:
> > On 30/11/06 16:58, "George Dunlap" <dunlapg@umich.edu> wrote:
> > > Hmm... this has the unfortunate side-effect of doubling the size
> > > of the trace, and effectively halving the effectiveness of the
> > > trace buffer in avoiding drops. My moderate-length traces are
> > > already in the gigabyte range, and I occasionally lose trace
> > > records even with a buffer size of 256. It would be really nice
> > > if we could avoid that.
> > >
> > > I happen to be using the VMENTER/VMEXIT tracing, which could be
> > > consolidated into one record if we went to a 64-bit trace. Is
> > > anyone else doing high-bandwidth tracing that this would affect
> > > in a significantly negative way?
> >
> > As we move increasingly towards x86/64 this is an issue that will
> > need to be addressed even if we leave the tracing fields as longs.
> >
> > -- Keir
Keir Fraser
2006-Dec-01 18:37 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
On 1/12/06 5:54 pm, "George Dunlap" <dunlapg@umich.edu> wrote:
> Variable-length trace buffers would certainly be nice. Right now, for
> some records, I'm squeezing bits into things, and just *one* more
> byte would be great... but for other records, I'm storing 4 words of
> 0, and there just doesn't seem to be any other useful information to
> add.
>
> Keir, is the simpler, "one trace size fits all" method just because
> it was easier to implement originally, or is the simplicity expected
> to greatly reduce overhead and/or bugginess? If the former, then
> there seems enough interest in making the tracing more flexible to be
> worth changing; if the latter, then we should probably choose
> something and live with it, or perhaps a compromise (i.e., two record
> sizes).

There's no reason not to make the trace format more flexible. There's a
question about how you represent trace points in the Xen code though,
when the format is no longer a list of fixed size integers.

 -- Keir
Mark Williamson
2006-Dec-04 18:20 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
> Variable-length trace buffers would certainly be nice. Right now, for
> some records, I'm squeezing bits into things, and just *one* more
> byte would be great... but for other records, I'm storing 4 words of
> 0, and there just doesn't seem to be any other useful information to
> add.

Yep, this is really something we should look at improving now.

> Keir, is the simpler, "one trace size fits all" method just because
> it was easier to implement originally, or is the simplicity expected
> to greatly reduce overhead and/or bugginess?

It's just there because it was simpler to implement. I'm confident that
we can come up with a more flexible format that won't have a
detrimental effect on performance / stability.

> If the former, then there seems enough interest in making the tracing
> more flexible to be worth changing; if the latter, then we should
> probably choose something and live with it, or perhaps a compromise
> (i.e., two record sizes).

Two record sizes might be an interesting compromise, but I suggest we
go the whole hog and spec a design for fully variable data lengths with
a fixed length header.

Cheers,
Mark

> -George
>
> On 11/30/06, Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote:
> > I guess one possibility would be to continue using unsigned longs
> > but store the machine word size and endianness in a header in the
> > trace file. This gets us platform independence.
> >
> > This avoids adding extra overhead on the fast path, the extra
> > processing can happen offline (and probably not at all in the
> > common case that you're on the same endianness / word size as the
> > trace was collected on).
> >
> > Another alternative would be to allow some combination of 32-bit or
> > (fewer) 64-bit words in the record. This would let us keep the same
> > record size, but have a bit more flexibility.
> >
> > Going the whole hog, we could even make the trace data opaque to
> > trace.c - have a char[] for the data, and deal with the semantics
> > in terms of "longs" "u64" etc in macros in the traced code, and in
> > xentrace_format.
> >
> > If we did this, the logical extension would be to have variable
> > length trace records with a fixed-size header giving the full
> > length. I think this would be a good direction to go in, and would
> > ensure that we maximise use of the trace buffer space. It shouldn't
> > be that hard to modify the system to do this - most of the work may
> > even be in making it nice to use!
> >
> > Cheers,
> > Mark
> >
> > On Thursday 30 November 2006 17:03, Keir Fraser wrote:
> > > On 30/11/06 16:58, "George Dunlap" <dunlapg@umich.edu> wrote:
> > > > Hmm... this has the unfortunate side-effect of doubling the
> > > > size of the trace, and effectively halving the effectiveness of
> > > > the trace buffer in avoiding drops. My moderate-length traces
> > > > are already in the gigabyte range, and I occasionally lose
> > > > trace records even with a buffer size of 256. It would be
> > > > really nice if we could avoid that.
> > > >
> > > > I happen to be using the VMENTER/VMEXIT tracing, which could be
> > > > consolidated into one record if we went to a 64-bit trace. Is
> > > > anyone else doing high-bandwidth tracing that this would affect
> > > > in a significantly negative way?
> > >
> > > As we move increasingly towards x86/64 this is an issue that will
> > > need to be addressed even if we leave the tracing fields as
> > > longs.
> > >
> > > -- Keir
Mark Williamson
2006-Dec-04 18:22 UTC
Re: [Xen-devel] [RFC][PATCH] 2/3] [TOOLS][XENTRACE] Update tools to write data to disk in a known endian'ness.
> > I did wonder if perhaps some sort of trace file header would be
> > useful, giving metadata. For instance, a record of what dom0 / Xen
> > builds were being used, and type of machine might be useful.
>
> I did think about doing something similar. I thought I could steal
> the high 16 bits of the first CPU number in the file as a xentrace
> format version number. Sure this would limit us to 64k CPUs but we
> can deal with that limit later ;P

Since traces are typically not going to be long-lived but used for
problem-specific debugging work, I think we can probably just make a
clean break and define a new format.

Is anybody really relying on the trace format being stable? Please
speak up now :-)

Cheers,
Mark

> > A future extensible format would be a plus too, I guess. None of
> > this is strictly necessary, it just might help with organising
> > trace files.
>
> You spoke in more detail about that in another email so I'll keep the
> discussion there.
>
> Yours Tony
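
For illustration only, the quoted idea of carrying a format version in the high 16 bits of the first CPU number could be sketched as below. The macro and function names are hypothetical, and the thread leans towards a clean format break instead, so this is not a proposal.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout of the first 32-bit CPU word in the file:
 * bits 31..16 = format version, bits 15..0 = CPU number. */
#define XT_VERSION_SHIFT 16
#define XT_CPU_MASK      0xFFFFu

static uint32_t pack_first_cpu_word(uint16_t version, uint32_t cpu)
{
    assert(cpu <= XT_CPU_MASK);  /* the "64k CPUs" limit */
    return ((uint32_t)version << XT_VERSION_SHIFT) | cpu;
}

static uint16_t word_version(uint32_t word)
{
    return (uint16_t)(word >> XT_VERSION_SHIFT);
}

static uint32_t word_cpu(uint32_t word)
{
    return word & XT_CPU_MASK;
}
```

One consequence of this layout is that version 0 is indistinguishable from a file written by the old tools, so a versioned format would have to start numbering at 1.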
Mark Williamson
2006-Dec-05 16:54 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
> There's no reason not to make the trace format more flexible. There's
> a question about how you represent trace points in the Xen code
> though, when the format is no longer a list of fixed size integers.

I can see two main possibilities. One involving a variadic function and
one involving mega macros of doom.

One possibility would be a trace() function taking a variable number of
arguments, i.e.

    void trace(type, unsigned char data1, unsigned char data2, ... etc)

And a set of arch-defined macros (or at least bitness / endian defined
macros). E.g. on x86 we might have:

    #define TRACE_U16(d) ((unsigned char)(d & 255)), ((unsigned char)(d >> 8))

We'd need to verify whether the extra processing had a measurable
performance impact, however.

Another alternative would be to make the array of trace buffers
globally accessible and then use a set of macros for the trace()
instead of an inline function. The macros could then look something
like (pseudocode):

    struct trace_record {
        u32 type;
        u32 data_len;
        char data[]
    };

    char *trace_buffer[NR_CPUS]

    #define open_trace(type)                                  \
        do {                                                  \
            disable local irqs                                \
            struct trace_record *record =                     \
                &trace_buffer[cpu][producer_idx];             \
            record->type = (u32)type                          \
            record->data_len = 0;

    #define trace_u16(data) *(u16 *)record->data[record->data_len] = data \
                            record->data_len += sizeof(u16)

    ... etc for different data types, with appropriate variations if
    necessary for different platforms ...

    #define close_trace()                                     \
            inc producer counter by sizeof(struct             \
              trace_record) + record->data_len for userspace  \
              to see                                          \
            re-enable local irqs                              \
        } while(0)

Things become unhappy here because there'd need to be some kind of
bounds checking in here to determine whether we need to wrap to the
beginning of the trace buffer again. The alternatives as I see them
would be either:

a) include code in each data macro to check if we'd reached the end of
   the buffer and wrap the data appropriately, or
b) include code that'll simply copy everything we've built so far to
   the beginning of the trace buffer and start again.

Either way is going to be ugly and unpleasant. Also, we have the
problem of not knowing whether we're going to wrap OR run out of space
until we're part way through the trace record, although in this
instance, I guess we could just change to create a "missed data"
record.

I think the first approach (variadic function) above is probably nicer.
We can always make a few macros to make common cases (e.g. recording a
type and a single u64 of data) less verbose.

Any thoughts?

Cheers,
Mark
George Dunlap
2006-Dec-05 20:05 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
I see two other options:
* Pre-allocate a block of data and fill it in
* Allocate a struct on the stack, and copy it all at once.

In the first case you'd do something like the following:

    struct {
        [trace layout]
    } *trec;

    trec=trace_var(TRC_TYPE, sizeof(*trec), [maybe some other info]);

    /* Fill in trec->* */

The second case looks similar:

    struct {
        [trace layout]
    } trec;

    /* Fill in trec.* */

    trace_var(TRC_TYPE, &trec, sizeof(trec), [maybe other info]);

The second case involves an extra copy, but that shouldn't be a big
deal. It has the advantage of being self-contained, and the trace
code can make the record "wrap around" transparently.

The first means no copying, but it also means no "wrap around"; if
there's not enough room at the end of a buffer, the space would just
have to be left empty. That's probably not such a big deal, though.
The bigger problem comes if several "open" trace records happen at
once. It's technically possible that the trace buffer will wrap
around before a function is done writing to its original buffer.

In both of these cases, the common "TRACE_nD" macros can be left, I
think. We might want to add "TRACE_nDL" for 64-bit values, and then
let those who need more flexible trace structures call trace_var()
directly.

This way of doing things also has the advantage that the trace record
can be defined in a public header somewhere, and used by user-space
analysis tools as well as the hypervisor tracing code.

Thoughts?

-George

On 12/5/06, Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote:
> > There's no reason not to make the trace format more flexible.
> > There's a question about how you represent trace points in the Xen
> > code though, when the format is no longer a list of fixed size
> > integers.
>
> I can see two main possibilities. One involving a variadic function
> and one involving mega macros of doom.
>
> One possibility would be a trace() function taking a variable number
> of arguments, i.e.
>
>     void trace(type, unsigned char data1, unsigned char data2, ... etc)
>
> And a set of arch-defined macros (or at least bitness / endian
> defined macros). E.g. on x86 we might have:
>
>     #define TRACE_U16(d) ((unsigned char)(d & 255)), ((unsigned char)(d >> 8))
>
> We'd need to verify whether the extra processing had a measurable
> performance impact, however.
>
> Another alternative would be to make the array of trace buffers
> globally accessible and then use a set of macros for the trace()
> instead of an inline function. The macros could then look something
> like (pseudocode):
>
>     struct trace_record {
>         u32 type;
>         u32 data_len;
>         char data[]
>     };
>
>     char *trace_buffer[NR_CPUS]
>
>     #define open_trace(type)                                  \
>         do {                                                  \
>             disable local irqs                                \
>             struct trace_record *record =                     \
>                 &trace_buffer[cpu][producer_idx];             \
>             record->type = (u32)type                          \
>             record->data_len = 0;
>
>     #define trace_u16(data) *(u16 *)record->data[record->data_len] = data \
>                             record->data_len += sizeof(u16)
>
>     ... etc for different data types, with appropriate variations if
>     necessary for different platforms ...
>
>     #define close_trace()                                     \
>             inc producer counter by sizeof(struct             \
>               trace_record) + record->data_len for userspace  \
>               to see                                          \
>             re-enable local irqs                              \
>         } while(0)
>
> Things become unhappy here because there'd need to be some kind of
> bounds checking in here to determine whether we need to wrap to the
> beginning of the trace buffer again. The alternatives as I see them
> would be either:
>
> a) include code in each data macro to check if we'd reached the end
>    of the buffer and wrap the data appropriately, or
> b) include code that'll simply copy everything we've built so far to
>    the beginning of the trace buffer and start again.
>
> Either way is going to be ugly and unpleasant. Also, we have the
> problem of not knowing whether we're going to wrap OR run out of
> space until we're part way through the trace record, although in this
> instance, I guess we could just change to create a "missed data"
> record.
>
> I think the first approach (variadic function) above is probably
> nicer. We can always make a few macros to make common cases (e.g.
> recording a type and a single u64 of data) less verbose.
>
> Any thoughts?
>
> Cheers,
> Mark
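
The second option above (build the record on the stack, then copy it in one call that handles wrap-around) might be sketched as follows. This is a single-CPU toy under stated assumptions: struct tbuf, struct trec_hdr, TBUF_SIZE and this particular trace_var() signature are hypothetical, and the sketch ignores IRQ masking, the consumer index and lost-record accounting.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TBUF_SIZE 64u  /* deliberately tiny, to make wrapping easy to see */

struct tbuf {
    unsigned char data[TBUF_SIZE];
    uint32_t prod;     /* producer offset, always < TBUF_SIZE */
};

struct trec_hdr {
    uint32_t type;
    uint32_t len;      /* payload length in bytes */
};

/* Copy n bytes into the ring, splitting the copy at the buffer end. */
static void ring_copy(struct tbuf *b, const void *src, uint32_t n)
{
    uint32_t first;

    if (n == 0)
        return;
    assert(b->prod < TBUF_SIZE);
    first = TBUF_SIZE - b->prod;  /* room before the end of the buffer */
    if (first > n)
        first = n;
    memcpy(b->data + b->prod, src, first);
    memcpy(b->data, (const unsigned char *)src + first, n - first);
    b->prod = (b->prod + n) % TBUF_SIZE;
}

/* Write a fixed-size header followed by a variable-length payload.
 * Returns 0 on success, -1 if the record can never fit in the ring. */
static int trace_var(struct tbuf *b, uint32_t type,
                     const void *payload, uint32_t len)
{
    struct trec_hdr hdr;

    if (len > TBUF_SIZE - sizeof(hdr))
        return -1;
    hdr.type = type;
    hdr.len = len;
    ring_copy(b, &hdr, sizeof(hdr));
    ring_copy(b, payload, len);
    return 0;
}
```

Because the caller hands over a finished record, the wrap is invisible at the trace point, which is the "self-contained" property mentioned above; the price is the extra copy from the stack into the buffer.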
Ian Pratt
2006-Dec-06 16:10 UTC
RE: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
> I see two other options:
> * Pre-allocate a block of data and fill it in
> * Allocate a struct on the stack, and copy it all at once.
>
> In the first case you'd do something like the following:
>
>     struct {
>         [trace layout]
>     } *trec;
>
>     trec=trace_var(TRC_TYPE, sizeof(*trec), [maybe some other info]);
>
>     /* Fill in trec->* */
>
> The second case looks similar:
>
>     struct {
>         [trace layout]
>     } trec;
>
>     /* Fill in trec.* */
>
>     trace_var(TRC_TYPE, &trec, sizeof(trec), [maybe other info]);

The ultimate best way of doing it would be to have trace functions that
took a format string and a variable number of arguments. The actual
trace record written in the buffer would just contain the record type
and the length of the record, followed by the variable data. The format
string would be written out in a separate segment, enabling it to be
extracted and used by the trace post-process tool to pretty print the
records.

Ian

> The second case involves an extra copy, but that shouldn't be a big
> deal. It has the advantage of being self-contained, and the trace
> code can make the record "wrap around" transparently.
>
> The first means no copying, but it also means no "wrap around"; if
> there's not enough room at the end of a buffer, the space would just
> have to be left empty. That's probably not such a big deal, though.
> The bigger problem comes if several "open" trace records happen at
> once. It's technically possible that the trace buffer will wrap
> around before a function is done writing to its original buffer.
>
> In both of these cases, the common "TRACE_nD" macros can be left, I
> think. We might want to add "TRACE_nDL" for 64-bit values, and then
> let those who need more flexible trace structures call trace_var()
> directly.
>
> This way of doing things also has the advantage that the trace record
> can be defined in a public header somewhere, and used by user-space
> analysis tools as well as the hypervisor tracing code.
>
> Thoughts?
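
A format-string-driven trace call of the kind described in this message could be sketched as below. The mini format language ('w' for a 32-bit value, 'q' for a 64-bit value), struct var_rec, MAX_PAYLOAD and trace_fmt() are all assumptions made up for this illustration; the thread does not define any of them. The record holds only type, length and raw data, and the format string itself would be stored elsewhere for the post-processing tool.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_PAYLOAD 32u

struct var_rec {
    uint32_t type;
    uint32_t len;                     /* bytes used in data[] */
    unsigned char data[MAX_PAYLOAD];
};

/* Assumed mini format language: 'w' = uint32_t, 'q' = uint64_t.
 * Callers must pass arguments of exactly those types.
 * Returns 0 on success, -1 on an unknown code or a full payload. */
static int trace_fmt(struct var_rec *r, uint32_t type, const char *fmt, ...)
{
    va_list ap;
    int rc = 0;

    assert(r != NULL && fmt != NULL);
    r->type = type;
    r->len = 0;

    va_start(ap, fmt);
    for (; *fmt != '\0' && rc == 0; fmt++) {
        if (*fmt == 'w' && r->len + 4 <= MAX_PAYLOAD) {
            uint32_t v = va_arg(ap, uint32_t);
            memcpy(r->data + r->len, &v, sizeof(v));
            r->len += (uint32_t)sizeof(v);
        } else if (*fmt == 'q' && r->len + 8 <= MAX_PAYLOAD) {
            uint64_t v = va_arg(ap, uint64_t);
            memcpy(r->data + r->len, &v, sizeof(v));
            r->len += (uint32_t)sizeof(v);
        } else {
            rc = -1;                  /* bad format code or no room left */
        }
    }
    va_end(ap);
    return rc;
}
```

A trace point would then read something like trace_fmt(&rec, type, "wq", (uint32_t)vcpu, (uint64_t)tsc), and a decoder holding the same "wq" string can walk data[] without any per-event knowledge compiled in.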
Mark Williamson
2006-Dec-07 02:43 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
> The ultimate best way of doing it would be to have trace functions
> that took a format string and a variable number of arguments.

Agreed.

> The actual trace record written in the buffer would just contain the
> record type and the length of the record, followed by the variable
> data. The format string would be written out in a separate segment,
> enabling it to be extracted and used by the trace post-process tool
> to pretty print the records.

Cool. It would be fairly trivial to declare a name for each "type" and
allow userspace to query the type name <-> type ID mapping.

This then removes all responsibility for determining output formatting
from userspace, and should make everything neater. Nice.

All we need is a coder - anyone interested?

Cheers,
Mark

> Ian
>
> > The second case involves an extra copy, but that shouldn't be a big
> > deal. It has the advantage of being self-contained, and the trace
> > code can make the record "wrap around" transparently.
> >
> > The first means no copying, but it also means no "wrap around"; if
> > there's not enough room at the end of a buffer, the space would
> > just have to be left empty. That's probably not such a big deal,
> > though. The bigger problem comes if several "open" trace records
> > happen at once. It's technically possible that the trace buffer
> > will wrap around before a function is done writing to its original
> > buffer.
> >
> > In both of these cases, the common "TRACE_nD" macros can be left, I
> > think. We might want to add "TRACE_nDL" for 64-bit values, and then
> > let those who need more flexible trace structures call trace_var()
> > directly.
> >
> > This way of doing things also has the advantage that the trace
> > record can be defined in a public header somewhere, and used by
> > user-space analysis tools as well as the hypervisor tracing code.
> >
> > Thoughts?
> > > > -George > > > > On 12/5/06, Mark Williamson <mark.williamson@cl.cam.ac.uk> wrote: > > > > There''s no reason not to make the trace format more flexible. > > There''s a > > > > > question about how you represent trace points in the Xen code > > though, > > > when > > > > > > the format is no longer a list of fixed size integers. > > > > > > I can see two main possibilities. One involving a variadic function > > and > > > one > > > > > involving mega macros of doom. > > > > > > One possibility would be a trace() function taking a variable number > > of > > > > arguments, i.e. > > > > > > void trace(type, unsigned char data1, unsigned char data2, ... etc) > > > > > > And a set of arch-defined macros (or at least bitness / endian > > defined > > > > macros). Eg. on x86 we might have: > > > > > > #define TRACE_U16(d) ((unsigned char)(d & 255)), ((unsigned char)(d > > > > 8)) > > > > > We''d need to verify whether the extra processing had a measurable > > > > performance > > > > > impact, however. > > > > > > Another alternative would be to make the array of trace buffers > > globally > > > > accessible and then use a set of macros for the trace() instead of > > an > > > inline > > > > > function. The macros could then look something like (pseudocode): > > > > > > struct trace_record { > > > u32 type; > > > u32 data_len; > > > char data[] > > > }; > > > > > > char *trace_buffer[NR_CPUS] > > > > > > #define open_trace(type) \ > > > do { \ > > > disable local irqs > > > > \ > > > > > struct trace_record *record > > > > \ > > > > &trace_buffer[cpu][producer_idx]; \ > > > > > record->type = (u32)type > > > > \ > > > > > record->data_len = 0; > > > > > > #define trace_u16(data) *(u16 *)record->data[record->data_len] > > > > > data \ > > > > > record->data_len += sizeof(u16) > > > > > > ... etc for different data types, with appropriate variations if > > > > necessary for > > > > > different platforms ... 
> > > > > > #define close_trace() \ > > > inc producer counter by sizeof(struct > > > > \ > > > > > trace_record) + record->data_len for > > > > userspace \ > > > > > to see \ > > > re-enable local irqs > > > > \ > > > > > } while(0) > > > > > > > > > Things become unhappy here because there''d need to be some kind of > > bounds > > > > checking in here to determine whether we need to wrap to the > > beginning of > > > the > > > > > trace buffer again. The alternatives as I see them would be either: > > > > > > a) include code in each data macro to check if we''d reached the end > > of > > > the > > > > > buffer and wrap the data appropriately, or > > > b) include code that''ll simply copy everything we''ve built so far to > > the > > > > beginning of the trace buffer and start again. > > > > > > Either way is going to be ugly and unpleasant. Also, we have the > > problem > > > of > > > > > not knowing whether we''re going to wrap OR run out of space until > > we''re > > > part > > > > > way through the trace record, although in this instance, I guess we > > could > > > > just change to create a "missed data" record. > > > > > > I think the first approach (variadic function) above is probably > > nicer. > > > We > > > > > can always make a few macros to make common cases (e.g. recording a > > type > > > and > > > > > a single u64 of data) less verbose. > > > > > > Any thoughts? > > > > > > Cheers, > > > Mark > > > > > > -- > > > Dave: Just a question. What use is a unicyle with no seat? And no > > > > pedals! > > > > > Mark: To answer a question with a question: What use is a > > skateboard? > > > > Dave: Skateboards have wheels. > > > Mark: My wheel has a wheel! > > > > _______________________________________________ > > Xen-devel mailing list > > Xen-devel@lists.xensource.com > > http://lists.xensource.com/xen-devel-- Dave: Just a question. What use is a unicyle with no seat? And no pedals! Mark: To answer a question with a question: What use is a skateboard? 
Dave: Skateboards have wheels. Mark: My wheel has a wheel! _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
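George's "second case" (build the record first, then copy it so it wraps transparently) can be sketched roughly as follows. This is only an illustration under stated assumptions, not code from Xen: the name trace_var() comes from the thread, but its signature, the struct ring layout, RING_SIZE and the helper names are all made up here, and the sketch is single-threaded, so it ignores the IRQ and memory-barrier questions discussed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 64u  /* hypothetical size; must be a power of two */

struct ring {
    uint8_t  data[RING_SIZE];
    uint32_t prod;   /* free-running count of bytes produced */
    uint32_t cons;   /* free-running count of bytes consumed */
};

/* Copy len bytes into the ring at logical position pos, splitting the
 * copy in two if it runs past the end of the array. */
static void ring_put(struct ring *r, uint32_t pos, const void *src, size_t len)
{
    size_t off = pos % RING_SIZE;
    size_t first = RING_SIZE - off;

    assert(len <= RING_SIZE);
    if (first > len)
        first = len;
    memcpy(r->data + off, src, first);
    memcpy(r->data, (const uint8_t *)src + first, len - first);
}

/* Mirror image of ring_put(): copy len bytes out of the ring. */
static void ring_get(const struct ring *r, uint32_t pos, void *dst, size_t len)
{
    size_t off = pos % RING_SIZE;
    size_t first = RING_SIZE - off;

    assert(len <= RING_SIZE);
    if (first > len)
        first = len;
    memcpy(dst, r->data + off, first);
    memcpy((uint8_t *)dst + first, r->data, len - first);
}

/* Append one record: a {type, len} header followed by len payload bytes.
 * Returns 0 on success, -1 if the record does not fit in the free space. */
int trace_var(struct ring *r, uint32_t type, const void *payload, uint32_t len)
{
    uint32_t hdr[2] = { type, len };
    uint32_t used = r->prod - r->cons;

    if (len > RING_SIZE - sizeof(hdr) || sizeof(hdr) + len > RING_SIZE - used)
        return -1;
    ring_put(r, r->prod, hdr, sizeof(hdr));
    ring_put(r, r->prod + (uint32_t)sizeof(hdr), payload, len);
    r->prod += (uint32_t)sizeof(hdr) + len;  /* publish the whole record last */
    return 0;
}

/* Pop the oldest record. Returns its payload length, or -1 if the ring is
 * empty or the caller's buffer (cap bytes) is too small. */
int trace_read(struct ring *r, uint32_t *type, void *payload, uint32_t cap)
{
    uint32_t hdr[2];

    if (r->prod - r->cons < sizeof(hdr))
        return -1;
    ring_get(r, r->cons, hdr, sizeof(hdr));
    if (hdr[1] > cap)
        return -1;
    ring_get(r, r->cons + (uint32_t)sizeof(hdr), payload, hdr[1]);
    r->cons += (uint32_t)sizeof(hdr) + hdr[1];
    *type = hdr[0];
    return (int)hdr[1];
}
```

Because the producer index only moves once the whole record is in place, a reader never sees a half-written record, which is the property the "open record" macro approach struggles to give.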
Tony Breeds
2006-Dec-07 05:29 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
On Thu, Dec 07, 2006 at 02:43:58AM +0000, Mark Williamson wrote:
> > The ultimate best way of doing it would be to have trace functions that
> > took a format string and a variable number of arguments.
>
> Agreed.
>
> > The actual
> > trace record written in the buffer would just contain the record type
> > and the length of the record, followed by the variable data. The format
> > string would be written out in a separate segment, enabling it to be
> > extracted and used by the trace post-process tool to pretty print the
> > records.
>
> Cool.
>
> It would be fairly trivial to declare a name for each "type" and allow
> userspace to query the type name <-> type ID mapping. This then removes all
> responsibility for determining output formatting from userspace, and should
> make everything neater.
>
> Nice. All we need is a coder - anyone interested?

I'll hack something together to match /my/ understanding of the current
proposals. We can go from there.

Briefly, I think we're talking about using a variable-length structure, with
tracepoints looking like a printf()-style function. So

--
    TRACE_4D(TRC_SCHED_SWITCH,
             prev->domain->domain_id, prev->vcpu_id,
             next->domain->domain_id, next->vcpu_id);
--

would become something like:

--
    trace(TRC_SCHED_SWITCH, "%d%d%d%d",
          prev->domain->domain_id, prev->vcpu_id,
          next->domain->domain_id, next->vcpu_id);
--

Trace would store a lookup table for TRC_SCHED_SWITCH => "%d%d%d%d". The
data would get stored in a fashion similar to Mark's suggestion in
http://lists.xensource.com/archives/html/xen-devel/2006-12/msg00179.html

The xentrace tool would write the record and format specifier to disk, in
network byte order (until then data remains in native byte order). There
would of course be an implicit 64-bit TSC and 32-bit event type prefixed to
each record.

xentrace_format will pretty-print the data file.

I like the idea of stepping away from printf codes and using explicit size
args (%8, %16, %32 and %64); does anyone disagree?

Yours Tony

   linux.conf.au        http://linux.conf.au/ || http://lca2007.linux.org.au/
   Jan 15-20 2007       The Australian Linux Technical Conference!
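To make the explicit-size idea concrete, here is a minimal sketch of how a format string built from %8, %16, %32 and %64 could drive the copy into a record. It is not Tony's patch: trace_pack() and its signature are hypothetical, the data stays in native byte order as described above, and the code assumes unsigned int is 32 bits wide.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of Xen: pack the arguments described by
 * fmt into buf in native byte order. fmt is a sequence of the tokens
 * %8, %16, %32 and %64. Returns the number of bytes written, or -1 on an
 * unknown token or if buf is too small. */
int trace_pack(uint8_t *buf, size_t cap, const char *fmt, ...)
{
    va_list ap;
    size_t off = 0;
    int rc = 0;

    assert(buf != NULL && fmt != NULL);
    va_start(ap, fmt);
    while (*fmt != '\0') {
        size_t width;

        if (*fmt != '%') { rc = -1; break; }
        fmt++;
        if (strncmp(fmt, "16", 2) == 0)      { width = 2; fmt += 2; }
        else if (strncmp(fmt, "32", 2) == 0) { width = 4; fmt += 2; }
        else if (strncmp(fmt, "64", 2) == 0) { width = 8; fmt += 2; }
        else if (*fmt == '8')                { width = 1; fmt += 1; }
        else { rc = -1; break; }

        if (width > cap - off) { rc = -1; break; }

        if (width == 8) {
            uint64_t v = va_arg(ap, uint64_t);
            memcpy(buf + off, &v, sizeof(v));
        } else {
            /* 8-, 16- and 32-bit arguments are all fetched as unsigned
             * int: narrower types arrive promoted to int by the caller. */
            unsigned int u = va_arg(ap, unsigned int);
            if (width == 1)      { uint8_t v = (uint8_t)u;   memcpy(buf + off, &v, 1); }
            else if (width == 2) { uint16_t v = (uint16_t)u; memcpy(buf + off, &v, 2); }
            else                 { uint32_t v = (uint32_t)u; memcpy(buf + off, &v, 4); }
        }
        off += width;
    }
    va_end(ap);
    return rc < 0 ? -1 : (int)off;
}
```

A call such as trace_pack(buf, sizeof(buf), "%16%16%16%16", ...) for the scheduler switch record would then use 8 bytes of payload rather than 16, which is where explicit sizes pay off over a blanket "%d".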
Keir Fraser
2006-Dec-07 10:32 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
On 6/12/06 16:10, "Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk> wrote:

> The ultimate best way of doing it would be to have trace functions that
> took a format string and a variable number of arguments. The actual
> trace record written in the buffer would just contain the record type
> and the length of the record, followed by the variable data. The format
> string would be written out in a separate segment, enabling it to be
> extracted and used by the trace post-process tool to pretty print the
> records.

I agree this is a good way to go. We certainly don't want lots of little
single use trace-record structures, one per trace point in Xen! That would
be overkill -- format strings are a good middle ground.

 -- Keir
Ian Pratt
2006-Dec-07 14:21 UTC
RE: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
> On 6/12/06 16:10, "Ian Pratt" <m+Ian.Pratt@cl.cam.ac.uk> wrote:
>
> > The ultimate best way of doing it would be to have trace functions that
> > took a format string and a variable number of arguments. The actual
> > trace record written in the buffer would just contain the record type
> > and the length of the record, followed by the variable data. The format
> > string would be written out in a separate segment, enabling it to be
> > extracted and used by the trace post-process tool to pretty print the
> > records.
>
> I agree this is a good way to go. We certainly don't want lots of little
> single use trace-record structures, one per trace point in Xen! That would
> be overkill -- format strings are a good middle ground.

If the trace record writing function was inlined, I wonder whether gcc
would be smart enough to skip all the var args stuff and just write the
appropriate parameters into the trace buffer? I think that's probably
being optimistic. It might be possible to do something horrendous with
the C pre-processor, matching various static format strings for common
patterns (e.g. all %x), else falling back to a trivial parser.

Ideally the format string would include the 'pretty print' format for
the post processing tools (e.g. put the strings as literals in a
separate segment and extract them later in the build process). That's not
attractive if we have to parse the strings at run time, though. It could
easily be fixed by pre-processing all source files with M4 or perl, but
I don't think we want to go there.

Ian
George Dunlap
2006-Dec-07 18:25 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
On 12/7/06, Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> wrote:
> Ideally the format string would include the 'pretty print' format for
> the post processing tools (e.g. put the strings as literals in a
> separate segment and extract them later in the build process). That's not
> attractive if we have to parse the strings at run time, though. It could
> easily be fixed by pre-processing all source files with M4 or perl, but
> I don't think we want to go there.

I should think that parsing the strings at runtime should be pretty
quick -- we don't have to actually make text out of them, we just need
to scan through looking for % tokens, determine the size, and copy to
the trace buffer. We should be able to make more optimized tracing
functions for hot paths that copy data directly into the trace buffer.

Just for the record, my idea wasn't that every single trace record
would have a struct associated with it. I think that for most trace
records, a list of u32 or u64 would work just fine. I was envisioning
a class of functions:

    trace_1d(int type, u32 val1);
    trace_2d(int type, u32 val1, u32 val2);
    /* etc */
    trace_nd(int type, int n, ...);  /* Take N unsigned longs */
    trace_1l(int type, u64 val1);
    /* etc */
    trace_v(int type, size_t bytes, char *data);  /* Take a struct of size 'bytes' */

The final one would only be used in cases where layout is important,
mainly where space is critical. An advantage of this method is that
we can use sub-byte-sized fields (e.g., 3 bits) and let the compiler
do the work of shifting things around. If we adopt the format-string
model, I suspect that in most cases, no one is going to worry about
bits anyway, but just put up "%d" (which will probably translate to
u32) even for data that only need 3 bits.

If we adopt the format-string model, we need to make sure it's
unambiguous. For instance, does "%lx" translate to u32 on both 32-
and 64-bit builds? How do I specify u16 and u8? Can I print strings?

Pretty-printing might be nice, but I plan on using the binary format
anyway for my analysis tools, so it's not that big a deal to me.

 -George
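The family of functions George lists could all be thin wrappers around trace_v(). The sketch below shows one way that might look; it is an assumption-laden illustration rather than anything proposed in the thread. The flat tbuf array, TBUF_BYTES, the {type, length} header and the struct sched_rec layout are invented for the example, types are uint32_t/uint64_t instead of Xen's u32/u64, and a full buffer simply rejects the record.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TBUF_BYTES 128u            /* hypothetical buffer size */

static uint8_t tbuf[TBUF_BYTES];   /* stand-in for one per-CPU trace buffer */
static size_t  tbuf_used;

/* Generic writer: {type, length} header, then 'bytes' bytes of payload.
 * Returns 0 on success, -1 if the record does not fit. */
int trace_v(uint32_t type, size_t bytes, const void *data)
{
    uint32_t hdr[2];

    assert(data != NULL || bytes == 0);
    if (bytes > TBUF_BYTES - sizeof(hdr) ||
        sizeof(hdr) + bytes > TBUF_BYTES - tbuf_used)
        return -1;
    hdr[0] = type;
    hdr[1] = (uint32_t)bytes;
    memcpy(tbuf + tbuf_used, hdr, sizeof(hdr));
    if (bytes != 0)
        memcpy(tbuf + tbuf_used + sizeof(hdr), data, bytes);
    tbuf_used += sizeof(hdr) + bytes;
    return 0;
}

/* Fixed-shape helpers for the common cases. */
int trace_1d(uint32_t type, uint32_t val1)
{
    return trace_v(type, sizeof(val1), &val1);
}

int trace_2d(uint32_t type, uint32_t val1, uint32_t val2)
{
    uint32_t vals[2] = { val1, val2 };
    return trace_v(type, sizeof(vals), vals);
}

int trace_1l(uint32_t type, uint64_t val1)
{
    return trace_v(type, sizeof(val1), &val1);
}

/* Example of a layout-sensitive record with sub-byte fields; the compiler
 * does the shifting and masking. The field choice is purely illustrative. */
struct sched_rec {
    unsigned int domid    : 16;
    unsigned int vcpu     : 3;
    unsigned int runstate : 3;
};
```

A caller that cares about space would fill in a struct sched_rec and pass it to trace_v(type, sizeof(rec), &rec); everyone else would use the fixed-shape helpers.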
Ian Pratt
2006-Dec-08 10:52 UTC
RE: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
> On 12/7/06, Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> wrote:
> > Ideally the format string would include the 'pretty print' format for
> > the post processing tools (e.g. put the strings as literals in a
> > separate segment and extract them later in the build process). That's not
> > attractive if we have to parse the strings at run time, though. It could
> > easily be fixed by pre-processing all source files with M4 or perl, but
> > I don't think we want to go there.
>
> I should think that parsing the strings at runtime should be pretty
> quick -- we don't have to actually make text out of them, we just need
> to scan through looking for % tokens, determine the size, and copy to
> the trace buffer.

It's still a bit gross given that the string doesn't even need to be in
the executable and we knew all this stuff at compile time anyhow.

If we did this with macros we could use sizeof(_x) to determine how many
bytes to write to the trace buffer. Or is the intention to deliberately
truncate some 64b quantities to 32b to save trace buffer space?

It may be ugly, but it may actually be possible to do this with the C
pre-processor.

Ian
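Ian's sizeof(_x) suggestion might look something like the sketch below. It is one possible reading, not his implementation: TRACE_FIELD, struct trace_rec and REC_MAX are hypothetical names, and the macro relies on the GCC/Clang __typeof__ extension so that an arbitrary expression can be copied through a temporary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define REC_MAX 64u   /* hypothetical upper bound on payload bytes */

struct trace_rec {
    uint32_t type;
    uint32_t data_len;
    uint8_t  data[REC_MAX];
};

/* Append the value of _x to the record, using exactly sizeof(_x) bytes.
 * The size is known at compile time, so no format string is parsed at
 * run time. __typeof__ is a GCC/Clang extension. */
#define TRACE_FIELD(rec, _x)                                        \
    do {                                                            \
        __typeof__(_x) _v = (_x);                                   \
        assert((rec)->data_len + sizeof(_v) <= REC_MAX);            \
        memcpy((rec)->data + (rec)->data_len, &_v, sizeof(_v));     \
        (rec)->data_len += (uint32_t)sizeof(_v);                    \
    } while (0)
```

With this shape, deliberate truncation is just a cast at the call site: TRACE_FIELD(&rec, (uint32_t)cycles) writes four bytes where TRACE_FIELD(&rec, cycles) would write eight. One caveat is that plain integer literals have type int, so TRACE_FIELD(&rec, 1) writes sizeof(int) bytes.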
George Dunlap
2006-Dec-08 20:28 UTC
Re: [Xen-devel] [RFC][PATCH] 1/3] [XEN] Use explicit bit sized fields for exported xentrace data.
On 12/8/06, Ian Pratt <m+Ian.Pratt@cl.cam.ac.uk> wrote:
> If we did this with macros we could use sizeof(_x) to determine how many
> bytes to write to the trace buffer. Or is the intention to deliberately
> truncate some 64b quantities to 32b to save trace buffer space?

Yes. What initiated this discussion was that changing the trace
buffers to u64 would double the size of traces. My typical traces are
already 1GB+, with longer ones reaching 10G+. Since at this point I'm
looking mainly at 32-bit VMX code, it would be nice if I could avoid
that.

However, as long as the /number/ of 64b quantities was variable, I
could make do with either size.

 -George
Hi,

My name is Mathieu Desnoyers, I am the current maintainer of the Linux Trace
Toolkit project, known as LTTng. This is a tracer for the 2.6 Linux kernels
oriented towards high performance and real-time applications.

I have read your tracing thread and I am surprised to see how many of the
things you would like in a tracer are already implemented and tested in
LTTng. I am currently porting my tracer to Xen, so I think it might be useful
for you to know what it provides. My goal is to avoid duplicating the effort
and to save everyone some time.

Here follows some key features of LTTng :

Architecture independent data types
Extensible event records
Self-describing traces
Variable size records
Fast (200 ns per event record)
Highly reentrant
Does not disable interrupts
Does not take lock on the critical path
Supports NMI tracing
Analysis/visualization tool (LTTV)

Looking at the integration of the existing LTTng implementation into Xen, I
came up with those two points for my Christmas wishlist :

Additional functionality that would be nice to have in Xen :

- RCU-style updates : would allow freeing the buffers without impact on
  tracing.
    * I guess I could currently use :
        for_each_domain( d )
          for_each_vcpu( d, v )
            vcpu_sleep_sync(v);
      I think it will have a huge impact on the system, but it would only be
      performed before trace buffers free.

- Polling for data in Xen from a dom0 process.
  Xentrace currently polls the hypervisor every 100ms to see if there is data
  that needs to be consumed. Instead of active polling, it would be nice to
  use the dom0 OS capability to put a process to sleep while waiting for a
  resource. It would imply creating a module, loaded in dom0, that would wait
  for a specific virq coming from the Hypervisor to wake up such processes.
  We could think of exporting a complete poll() interface through sysfs or
  procfs that would be a directory filled with the resources exported from the
  Hypervisor to dom0 (which could include wait for resource freed, useful when
  shutting down a domU instead of busy looping). It would help dom0 to schedule
  other processes while a process is waiting for the Hypervisor.

You might also be interested in looking at :
- the website (http://ltt.polymtl.ca)
- LTTng Xen port design document (this one is different from the one posted by
  Jimi)
  (http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/lttng-xen.txt)
- OLS 2006 paper "The LTTng tracer : A Low Impact Performance and Behavior
  Monitor for GNU/Linux"
  (http://ltt.polymtl.ca/papers/desnoyers-ols2006.pdf)

Questions and constructive comments are welcome.

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
> I have read your tracing thread and I am surprised to see how much things
> you would like in a tracer are already implemented and tested in LTTng. I
> am currently porting my tracer to Xen, so I think it might be useful for
> you to know what it provides. My goal is to do not duplicate the effort
> and save everyone some time.

I like the work you've done with LTTng, but we have to be careful not to
go too overboard with how fancy we make the solution for Xen. We don't
particularly need dynamic registering of trace types (though dynamically
turn off-and-on-able is good), and I'd like to keep as much complexity
compile time as possible. Having per CPU buffers using the TSC as the
timestamp is perfectly adequate (provided we drop in an appropriate
synchronization record whenever the Xen TSC/wall clock calibration code
runs on each CPU).

> - Polling for data in Xen from a dom0 process.
>   Xentrace currently polls the hypervisor each 100ms to see if there is
>   data that needs to be consumed. Instead of an active polling, it would be
>   nice to use the dom0 OS capability to put a process to sleep while waiting
>   for a resource. It would imply creating a module, loaded in dom0, that
>   would wait for a specific virq coming from the Hypervisor to wake up such
>   processes. We could think of exporting a complete poll() interface through
>   sysfs or procfs that would be a directory filled with the resources
>   exported from the Hypervisor to dom0 (which could include wait for
>   resource freed, useful when shutting down a domU instead of busy looping).
>   It would help dom0 to schedule other processes while a process is waiting
>   for the Hypervisor.

I really thought we already had the functionality to enable the trace
writer to block on the trace buffer(s) becoming half full -- I thought
Rob Gardener fixed this ages ago. He certainly *promised* a patch to do
it :)

Thanks,
Ian
* Ian Pratt (m+Ian.Pratt@cl.cam.ac.uk) wrote:
> > I have read your tracing thread and I am surprised to see how much things
> > you would like in a tracer are already implemented and tested in LTTng. I
> > am currently porting my tracer to Xen, so I think it might be useful for
> > you to know what it provides. My goal is to do not duplicate the effort
> > and save everyone some time.
>
> I like the work you've done with LTTng, but we have to be careful not to
> go too overboard with how fancy we make the solution for Xen. We don't
> particularly need dynamic registering of trace types (though dynamically
> turn off-and-on-able is good), and I'd like to keep as much complexity
> compile time as possible. Having per CPU buffers using the TSC as the
> timestamp is perfectly adequate (provided we drop in an appropriate
> synchronization record whenever the Xen TSC/wall clock calibration code
> runs on each CPU).
>

Hi Ian,

The good thing about being flexible is that we can easily trim down the
unneeded features like dynamic registering of trace types if you don't like
them (they are implemented in the ltt-facilities.c module which could be
hacked to take a statically known set of facilities). Some features, like a
small control channel which records information about facilities, event
types, arch type sizes and endianness, are still interesting even without
dynamically loadable facilities though, as we can expect the developers to
extend the set of events to add their own, and it provides portability.

> > - Polling for data in Xen from a dom0 process.
> >   Xentrace currently polls the hypervisor each 100ms to see if there is
> >   data that needs to be consumed. Instead of an active polling, it would
> >   be nice to use the dom0 OS capability to put a process to sleep while
> >   waiting for a resource. It would imply creating a module, loaded in
> >   dom0, that would wait for a specific virq coming from the Hypervisor to
> >   wake up such processes. We could think of exporting a complete poll()
> >   interface through sysfs or procfs that would be a directory filled with
> >   the resources exported from the Hypervisor to dom0 (which could include
> >   wait for resource freed, useful when shutting down a domU instead of
> >   busy looping). It would help dom0 to schedule other processes while a
> >   process is waiting for the Hypervisor.
>
> I really thought we already had the functionality to enable the trace
> writer to block on the trace buffer(s) becoming half full -- I thought
> Rob Gardener fixed this ages ago. He certainly *promised* a patch to do
> it :)
>

As of the current mercurial tree :
http://lxr.xensource.com/lxr/source/tools/xentrace/xentrace.c

int monitor_tbufs(FILE *logfile)

    /* now, scan buffers for events */
    while ( !interrupted )
    {
        for ( i = 0; (i < num) && !interrupted; i++ )
        {
            while ( meta[i]->cons != meta[i]->prod )
            {
                rmb(); /* read prod, then read item. */
                write_rec(i, data[i] + meta[i]->cons % size_in_recs, logfile);
                mb(); /* read item, then update cons. */
                meta[i]->cons++;
            }
        }

        nanosleep(&opts.poll_sleep, NULL);
    }

So it seems like the implementation must be hiding either in someone's head or
in his mercurial tree. :) I guess it would be easier to implement if there was
support for the dom0 OS to block a process waiting for a Hypervisor resource.
Or maybe it is already implemented but eluded my attention ?

Regards,

Mathieu

OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
On Tue, Dec 12, 2006 at 05:14:02PM -0500, Mathieu Desnoyers wrote:
> As from the current mercurial tree :
> http://lxr.xensource.com/lxr/source/tools/xentrace/xentrace.c
>
> int monitor_tbufs(FILE *logfile)
>
>     /* now, scan buffers for events */
>     while ( !interrupted )
>     {
>         for ( i = 0; (i < num) && !interrupted; i++ )
>         {
>             while ( meta[i]->cons != meta[i]->prod )
>             {
>                 rmb(); /* read prod, then read item. */
>                 write_rec(i, data[i] + meta[i]->cons % size_in_recs, logfile);
>                 mb(); /* read item, then update cons. */
>                 meta[i]->cons++;
>             }
>         }
>
>         nanosleep(&opts.poll_sleep, NULL);
>     }
>
> So it seems like the implementation must be hiding either in someone's head or
> in his mercurial tree. :)

I suspect that xenbaked uses it,
http://lxr.xensource.com/lxr/source/tools/xenmon/xenbaked.c
---
int eventchn_init(void)
{
    int rc;

    // to revert to old way:
    if (0)
        return -1;

    xce_handle = xc_evtchn_open();

    if (xce_handle < 0)
        perror("Failed to open evtchn device");

    if ((rc = xc_evtchn_bind_virq(xce_handle, VIRQ_TBUF)) == -1)
        perror("Failed to bind to domain exception virq port");
    virq_port = rc;

    return xce_handle;
}
---

With appropriate triggering, in xen/common/trace.c

Yours Tony

   linux.conf.au        http://linux.conf.au/ || http://lca2007.linux.org.au/
   Jan 15-20 2007       The Australian Linux Technical Conference!
> I suspect that xenbaked uses it,
> http://lxr.xensource.com/lxr/source/tools/xenmon/xenbaked.c
> ---
> int eventchn_init(void)
> {
>     int rc;
>
>     // to revert to old way:
>     if (0)
>         return -1;
>
>     xce_handle = xc_evtchn_open();
>
>     if (xce_handle < 0)
>         perror("Failed to open evtchn device");
>
>     if ((rc = xc_evtchn_bind_virq(xce_handle, VIRQ_TBUF)) == -1)
>         perror("Failed to bind to domain exception virq port");
>     virq_port = rc;
>
>     return xce_handle;
> }
> ---
>
> With appropriate triggering, in xen/common/trace.c

Ah yes. It should be fairly trivial to copy this code into xentrace.c and
then replace the nanosleep with a wait on that event.

This would probably be worthwhile as an interim solution until we figure out
what next gen tracing ought to look like.

Cheers,
Mark
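Replacing the nanosleep() with a wait on the event could be sketched as below. This is a generic POSIX illustration, not xentrace code: wait_for_trace_event() and ack_trace_event() are hypothetical names, and the descriptor is assumed to be whatever file descriptor the event channel handle exposes (the example treats it as an ordinary pollable fd and does not call any libxenctrl function). The timeout keeps the old periodic behaviour as a fallback.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Sleep until fd becomes readable or timeout_ms elapses.
 * Returns 1 if an event is pending, 0 on timeout or if a signal
 * interrupted the wait (so the caller can recheck its 'interrupted'
 * flag), and -1 on error. */
int wait_for_trace_event(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    assert(fd >= 0);
    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return (errno == EINTR) ? 0 : -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Consume one pending notification so the next poll() blocks again.
 * In this sketch a single byte stands in for whatever token a real
 * event channel device would hand back. Returns 0 on success. */
int ack_trace_event(int fd)
{
    char token;

    return read(fd, &token, 1) == 1 ? 0 : -1;
}
```

In the monitor_tbufs() loop quoted earlier in the thread, the call to nanosleep(&opts.poll_sleep, NULL) would become a call to wait_for_trace_event() followed by ack_trace_event() when it returns 1; the buffer-draining code itself would not need to change.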
> I have read your tracing thread and I am surprised to see how much things
> you would like in a tracer are already implemented and tested in LTTng.

The Xen tracer was originally a fairly simple hack, although it has been
updated a number of times to improve the scope of its functionality. A more
advanced tracing tool would be nice.

> I
> am currently porting my tracer to Xen, so I think it might be useful for
> you to know what it provides. My goal is to do not duplicate the effort
> and save everyone some time.
>
> Here follows some key features of LTTng :
>
> Architecture independant data types

This should please those developing with cross compilers, so would be nice.

> Extensible event records

Definitely good.

> Self-describing traces
> Variable size records

Definitely good. Self-describing traces would be particularly useful - for
our current xentrace, I had proposed adding a hypercall that would return
event ID -> event description string mappings.

> Fast (200 ns per event record)
> Highly reentrant
> Does not disable interrupts
> Does not take lock on the critical path

Cool. We've been running with xentrace compiled by default for about a year
now, although trace buffers are not usually allocated on production systems.
It would be nice to keep the overhead low enough to continue to do this.
Refinements like being able to enable specific event types might be nice,
but aren't critical.

> Supports NMI tracing
> Analysis/visualization tool (LTTV)

Visualisation would be really useful - poring through trace files (even once
formatted) is pretty hairy. Some kind of timeline visualisation tool (I think
I've seen one for LTT at some point) would rock.

> - Polling for data in Xen from a dom0 process.
>   Xentrace currently polls the hypervisor each 100ms to see if there is
>   data that needs to be consumed. Instead of an active polling, it would be
>   nice to use the dom0 OS capability to put a process to sleep while waiting
>   for a resource. It would imply creating a module, loaded in dom0, that
>   would wait for a specific virq coming from the Hypervisor to wake up such
>   processes. We could think of exporting a complete poll() interface through
>   sysfs or procfs that would be a directory filled with the resources
>   exported from the Hypervisor to dom0 (which could include wait for
>   resource freed, useful when shutting down a domU instead of busy looping).
>   It would help dom0 to schedule other processes while a process is waiting
>   for the Hypervisor.

We can actually do this already, it's just that xentrace doesn't implement
it ;-) See xenbaked's code: it sleeps waiting for an event channel, which
gets fired by Xen when the buffer has had a chance to fill up a bit.

The nice bit is that the /dev/evtchn driver is compiled into Xen by default
*anyhow*, so a new kernel module isn't required.

> You might also be interested in looking at :
> - the website (http://ltt.polymtl.ca)
> - LTTng Xen port design document (this one is different from the one
>   posted by Jimi)
>   (http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/lttng-xen.txt)
> - OLS 2006 paper "The LTTng tracer : A Low Impact Performance and Behavior
>   Monitor for GNU/Linux"
>   (http://ltt.polymtl.ca/papers/desnoyers-ols2006.pdf)

A few questions: How much commonality will this have with the Linux LTTng?
Will there be direct code sharing, or a patchqueue, or simply code copying?
Will existing userspace LTTng tools be applicable? Is there already a
working visualiser? Is there a source repository online that we could take a
look at?

It would be really great if you came up with something interesting on the
tracing front: quite a lot of people seem to use it these days.

Cheers,
Mark

> Questions and constructive comments are welcome.
>
> Mathieu
Mathieu Desnoyers
2007-Mar-09 01:20 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
Hello,

I made a working version of the LTTng tracer for xen-unstable for x86.
Here is the pointer to my repository so you can try it out :

hg clone http://ltt.polymtl.ca/cgi-bin/hgweb.cgi xen-unstable-lttng.hg

Basic usage :

(see lttctl-xen -h)

lttctl-xen -c

(in a different console)
lttd-xen -t /tmp/xentrace1

(in the 1st console)
lttctl-xen -s

(tracing is active)

lttctl-xen -q
lttctl-xen -r

lttd-xen should automatically quit after writing the last buffers as
soon as lttctl-xen -r is issued.

Then, you must copy the XML facilities :
(see the http://ltt.polymtl.ca > QUICKSTART to see how to install the
ltt-control package which contains the XML facilities in your system)

lttctl-xen -e -t /tmp/xentrace1

View in the visualiser : (see the QUICKSTART to see how to install it)

lttv -m textDump -t /tmp/xentrace1

(not tested yet) : to visualize a dom0 trace with the xen hypervisor
information, one would have to collect the dom0 kernel trace and the Xen
trace and open them together with :

lttv -m textDump -t /tmp/xentrace1 -t /tmp/dom0trace

The current Linux kernel instrumentation is for 2.6.20. A backport might
be needed to 2.6.18 if there is no proper Xen support in 2.6.20 (I have
not followed the recent developments).

Currently broken/missing :

- Resources are not freed when the trace channels are destroyed. So you
  basically have to reboot between taking different traces.
- My code in the hypervisor complains to the console that subbuffers
  have not been fully read when the trace channels are destroyed. The
  error printing is just done too fast : lttd-xen is still there and
  reading the buffers at that point. It will get fixed with proper
  resource usage tracking of both Xen and lttd-xen (same as the first
  point above).
- x86_64 not tested, powerpc local.h and ltt.h missing (should be ripped
  from my Linux kernel LTTng).
Cheers,

Mathieu

* Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote:
> Hi,
>
> My name is Mathieu Desnoyers, I am the current maintainer of the Linux
> Trace Toolkit project, known as LTTng. This is a tracer for the 2.6
> Linux kernels oriented towards high performance and real-time
> applications.
>
> I have read your tracing thread and I am surprised to see how many of
> the things you would like in a tracer are already implemented and
> tested in LTTng. I am currently porting my tracer to Xen, so I think
> it might be useful for you to know what it provides. My goal is to
> avoid duplicating the effort and to save everyone some time.
>
> Here are some key features of LTTng:
>
>   Architecture-independent data types
>   Extensible event records
>   Self-describing traces
>   Variable size records
>   Fast (200 ns per event record)
>   Highly reentrant
>   Does not disable interrupts
>   Does not take a lock on the critical path
>   Supports NMI tracing
>   Analysis/visualization tool (LTTV)
>
> Looking at the integration of the existing LTTng implementation into
> Xen, I came up with these two points for my Christmas wishlist.
>
> Additional functionality that would be nice to have in Xen:
>
> - RCU-style updates: would allow freeing the buffers without impact
>   on tracing.
>   * I guess I could currently use:
>       for_each_domain( d )
>           for_each_vcpu( d, v )
>               vcpu_sleep_sync(v);
>     I think it will have a huge impact on the system, but it would
>     only be performed before the trace buffers are freed.
>
> - Polling for data in Xen from a dom0 process.
>   Xentrace currently polls the hypervisor every 100ms to see if there
>   is data that needs to be consumed. Instead of active polling, it
>   would be nice to use the dom0 OS capability to put a process to
>   sleep while waiting for a resource. It would imply creating a
>   module, loaded in dom0, that would wait for a specific virq coming
>   from the Hypervisor to wake up such processes. We could think of
>   exporting a complete poll() interface through sysfs or procfs that
>   would be a directory filled with the resources exported from the
>   Hypervisor to dom0 (which could include waiting for a resource to
>   be freed, useful when shutting down a domU instead of busy
>   looping). It would help dom0 to schedule other processes while a
>   process is waiting for the Hypervisor.
>
> You might also be interested in looking at:
> - the website (http://ltt.polymtl.ca)
> - LTTng Xen port design document (this one is different from the one
>   posted by Jimi)
>   (http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/lttng-xen.txt)
> - OLS 2006 paper "The LTTng tracer : A Low Impact Performance and
>   Behavior Monitor for GNU/Linux"
>   (http://ltt.polymtl.ca/papers/desnoyers-ols2006.pdf)
>
> Questions and constructive comments are welcome.
>
> Mathieu
>
> OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg
> Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
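The wishlist item about replacing the 100ms polling loop with a sleeping
wait can be illustrated with a small C sketch. It is not taken from
xentrace or LTTng: the pipe is only a stand-in for whatever file
descriptor a dom0 module would export, and wait_for_data() and
demo_blocking_wait() are hypothetical names. Only the standard POSIX
poll(), pipe(), write() and close() calls are used.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/* Sleep until fd becomes readable or timeout_ms elapses.
 * Returns 1 if data is ready, 0 on timeout, -1 on error.
 * The process is descheduled while waiting, so no CPU time is spent
 * re-checking the buffer at a fixed interval. */
static int wait_for_data(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int rc = poll(&pfd, 1, timeout_ms);

    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : -1;
}

/* Demo: a pipe plays the role of the (assumed) notification channel.
 * Returns 0 if an empty pipe reports "no data" and a pipe with one
 * byte written reports "data ready". */
static int demo_blocking_wait(void)
{
    int fds[2];
    char byte = 'x';
    int before, after;

    assert(pipe(fds) == 0);

    before = wait_for_data(fds[0], 0);      /* nothing written yet */
    assert(write(fds[1], &byte, 1) == 1);   /* "producer" signals data */
    after = wait_for_data(fds[0], 1000);    /* wakes up immediately */

    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : -1;
}
```

A consumer built this way would call wait_for_data() in its main loop
with a long or infinite timeout and only read the trace buffers when the
call reports that data is ready.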
INAKOSHI Hiroya
2007-Jun-25 08:43 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
Hi Mathieu,

I am interested in LTTng-xen because I thought that it would be nice if
I can get traces on both xen and guest linux at the same time. I
reviewed LTTng-xen and found that

* LTTng and LTTng-xen have a quite similar structure,
* a trace buffer resides in a hypervisor for LTTng-xen,
* it is currently impossible to get traces from guest linux because
  there is no LTTng for 2.6.18-xen kernel, as you mentioned.

I had coarsely ported LTTng to 2.6.18-xen, though it is only for
i386. Now I can get traces on xen and guest linux simultaneously, even
though they put records in different trace buffers. Then I thought that
it would be more useful if they put records in xen's trace buffer and I
can analyze events from xen and linux guests with a single lttd and
lttctl running on Domain-0. Do you have an opinion about that?

Regards,
Hiroya
Mark Williamson
2007-Jun-27 00:37 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
Mathieu,

Nice one, thank you for getting involved in this. Are you planning to
eventually submit this for inclusion in the Xen mainline?

Is there anything we (the community) can do to help? E.g. patch review /
advice / etc?

Cheers,
Mark

--
Dave: Just a question. What use is a unicycle with no seat? And no pedals!
Mark: To answer a question with a question: What use is a skateboard?
Dave: Skateboards have wheels.
Mark: My wheel has a wheel!
Mathieu Desnoyers
2007-Jun-27 16:14 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
* INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote:
> Hi Mathieu,
>
> I am interested in LTTng-xen because I thought that it would be nice if
> I can get traces on both xen and guest linux at the same time. I
> reviewed LTTng-xen and found that
>
> * LTTng and LTTng-xen have a quite similar structure,
> * a trace buffer resides in a hypervisor for LTTng-xen,
> * it is currently impossible to get traces from guest linux because
>   there is no LTTng for 2.6.18-xen kernel, as you mentioned.
>
> I had coarsely ported LTTng to 2.6.18-xen, though it is only for
> i386. Now I can get traces on xen and guest linux simultaneously, even
> though they put records in different trace buffers.

Hi Inakoshi,

We did the same kind of coarse 2.6.18 port internally at our lab to get
traces from both Linux and Xen. The fact that the traces are recorded in
different buffers does not change the fact that those trace files can be
copied into the same trace directory so they can be parsed together by
LTTV (traces coming from dom0, domUs and hypervisor). They are
synchronized by using the TSCs (hopefully, you will configure your
system to get a reliable TSC on AMD and older Intel CPUs; see the
ltt-test-tsc kernel module in recent LTTng versions and the
ltt.polymtl.ca website for info on that matter).

> Then I thought that
> it would be more useful if they put records in xen's trace buffer and I
> can analyze events

LTTV merges the information from all the valid trace files that appear
within the trace directory, so the analysis can be done on data coming
from userspace, kernels and hypervisor.

> from xen and linux guests with a single lttd and
> lttctl running on Domain-0. Do you have an opinion about that?

lttctl-xen and lttd-xen, although quite similar to lttd and lttctl, use
hypercalls to get the data. The standard lttctl/lttd use debugfs files
as a hook to the trace buffers.

As a distribution matter, I prefer to leave both separate for now,
because lttctl-xen and lttd-xen are highly tied to the Xen tree.

Also, merging the information within the buffers between Xen and dom0
is not such a great idea: the Hypervisor and dom0 can have a different
number of CPUs (Xen: real CPUs, dom0: vcpus). Since I use per-cpu
buffers, it does not fit. Also, I don't want dom0 to overwrite data from
the Xen buffers easily: it's better if we keep some protection between
dom0 and the Hypervisor.

Thanks for looking into this, don't hesitate to ask further questions,

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
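The point that traces kept in separate buffers can still be analyzed
together, as long as every record carries a TSC timestamp, can be
sketched in C as a timestamp-ordered merge. This is only an
illustration, not LTTV's code: struct trace_event, its fields and
merge_by_tsc() are hypothetical, and the sketch assumes that both inputs
are already sorted and that their TSC values are comparable (the
"reliable TSC" condition mentioned in the message).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified event record. */
struct trace_event {
    uint64_t tsc;       /* timestamp counter value when recorded */
    uint32_t source;    /* illustrative tag: 0 = hypervisor, 1 = dom0 */
    uint32_t event_id;  /* illustrative event identifier */
};

/* Merge two event arrays, each already sorted by tsc, into out.
 * out must have room for na + nb entries. On equal timestamps the
 * event from a is emitted first. Returns the number of events written. */
static size_t merge_by_tsc(const struct trace_event *a, size_t na,
                           const struct trace_event *b, size_t nb,
                           struct trace_event *out)
{
    size_t i = 0, j = 0, k = 0;

    assert(out != NULL);

    while (i < na && j < nb)
        out[k++] = (a[i].tsc <= b[j].tsc) ? a[i++] : b[j++];
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];
    return k;
}
```

Because the merge happens at analysis time, the hypervisor and each
guest can keep their own per-CPU buffers, which matches the separation
argued for in the message.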
Mathieu Desnoyers
2007-Jun-27 16:22 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
* Mark Williamson (mark.williamson@cl.cam.ac.uk) wrote:
> Mathieu,
>
> Nice one, thank you for getting involved in this. Are you planning to
> eventually submit this for inclusion in the Xen mainline?

I did the original port, but my current focus is more oriented towards
having LTTng included in Linux mainline. As you can probably guess, it
is changing quite a bit in the process. I plan to first get the kernel
tracing polished, and afterwards to port this "polished" tracer to
userspace tracing and the hypervisor. Since a first port has already
been done, doing it again will not be too much work.

> Is there anything we (the community) can do to help? E.g. patch review /
> advise / etc?

You could do a code review of the Xen port I have done. Not the tracing
infrastructure itself, since it is scheduled for major changes, but
mostly the low-level calls I did in lttd-xen, lttctl-xen and the Xen
ltt-tracer. Note that you have to expect the current "Linux Kernel
Markers" to be ported to the Xen hypervisor eventually.

I would also welcome some advice on how I currently (fail to) do the
teardown of traces correctly: I depend on synchronize_sched() under
Linux to provide a race-less, efficient teardown (using read-copy-update
for my data structures). It is a core concept I depend on to provide
fast and efficient tracing. Has RCU finally been implemented in Xen
since the last time I checked?

Thanks,

Mathieu

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
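The teardown problem described above (free the trace buffers only once
no tracing code can still be using them) can be sketched in C. This is
a deliberately simplified stand-in, not RCU and not the LTTng or Xen
implementation: it counts active users with a shared C11 atomic, which
real RCU avoids on the fast path, and synchronize_sched() is replaced by
a spin loop. All names (trace_buf, trace_setup, trace_enter, trace_exit,
trace_teardown) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

struct trace_buf { char data[64]; };

/* Currently published buffer, or NULL when tracing is torn down. */
static _Atomic(struct trace_buf *) active_buf;
/* Number of callers between trace_enter() and trace_exit(). */
static atomic_int users_inside;

/* Allocate and publish a buffer. Returns 0 on success, -1 on failure. */
static int trace_setup(void)
{
    struct trace_buf *buf = calloc(1, sizeof(*buf));

    if (buf == NULL)
        return -1;
    atomic_store(&active_buf, buf);
    return 0;
}

/* Tracing path: announce ourselves, then fetch the buffer pointer.
 * Returns NULL if tracing has been torn down. Every call must be
 * paired with trace_exit(), even when NULL is returned. */
static struct trace_buf *trace_enter(void)
{
    atomic_fetch_add(&users_inside, 1);
    return atomic_load(&active_buf);
}

static void trace_exit(void)
{
    int prev = atomic_fetch_sub(&users_inside, 1);

    assert(prev > 0);   /* unbalanced trace_exit() */
}

/* Teardown: unpublish the buffer so new callers see NULL, wait until
 * every caller that might hold the old pointer has left, then free. */
static void trace_teardown(void)
{
    struct trace_buf *old =
        atomic_exchange(&active_buf, (struct trace_buf *)NULL);

    while (atomic_load(&users_inside) != 0)
        ;   /* spin; a real implementation would sleep or yield */
    free(old);
}
```

The ordering is the part that carries over to RCU-style teardown: the
pointer is unpublished first, the wait comes second, and the memory is
released last, so a caller that obtained the old pointer is never left
holding freed memory.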
Keir Fraser
2007-Jun-27 16:25 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
On 27/6/07 17:22, "Mathieu Desnoyers" <compudj@krystal.dyndns.org> wrote:

> Some advice about how I currently (fail) to do the teardown of traces
> correctly: I depend on synchronize_sched() under Linux to provide a
> race-less efficient teardown (using read-copy-update for my data
> structures). It is a core concept I depend on to provide fast and
> efficient tracing. Has RCU finally been implemented in Xen since the last
> time I checked?

Yes.

K.
INAKOSHI Hiroya
2007-Jun-28 05:07 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
Hi Mathieu,

thanks for your reply. I can understand your opinion very well but a
concern is that cpu ids on a guest OS are different from those on Xen
because they are virtualized. The number of vcpus in a guest OS is also
different from that of pcpus as you mentioned. I wondered if the two
traces could be merged directly. If you translate vcpu ids to pcpu ids
in writing records in the trace buffer in Xen, this concern is solved in
a natural way.

Mathieu Desnoyers wrote:
> * INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote:
>> Hi Mathieu,
>>
>> I am interested in LTTng-xen because I thought that it would be nice if
>> I can get traces on both xen and guest linux at the same time. I
>> reviewed LTTng-xen and found that
>>
>> * LTTng and LTTng-xen have a quite similar structure,
>> * a trace buffer resides in a hypervisor for LTTng-xen,
>> * it is currently impossible to get traces from guest linux because
>> there is no LTTng for 2.6.18-xen kernel, as you mentioned.
>>
>> I had coarsely ported LTTng to 2.6.18-xen, though it is only for
>> i386. Now I can get traces on xen and guest linux simultaneously, even
>> though they put records in different trace buffers.
>
> Hi Ikanoski,
>
> We did the same kind of coarse 2.6.18 port at our lab internally to get
> traces from both Linux and Xen. The fact that the traces are recorded in
> different buffers does not change anything to the fact that those trace
> files can be copied in the same trace directory so they can be parsed
> together by LTTV (traces coming from dom0, domUs and hypervisor). They
> are synchronized by using the TSCs (hopefully, you will configure your
> system to get a reliable TSC on AMD and older intels, see the
> ltt-test-tsc kernel module in recent LTTng versions and ltt.polymtl.ca
> website for info on that matter).
>
>> Then I thought that
>> it would be more useful if they put records in xen's trace buffer and I
>> can analyze events
>
> LTTV merges the information from all the valid trace files that appears
> within the trace directory, so the analysis can be done on data coming
> from userspace, kernels and hypervisor.
>
>> from xen and linux guests with a single lttd and
>> lttctl running on Domain-0. Do you have an opinion about that?
>
> lttctl-xen and lttd-xen, although being quite similar to lttd and
> lttctl, use hypercalls to get the data. The standard lttctl/lttd uses
> debugfs files as a hook to the trace buffers.
>
> As a distribution matter, I prefer to leave both separate for now,
> because lttctl-xen and lttd-xen is highly tied to the Xen tree.
>
> Also, merging the information within the buffers between Xen and Dom0 is
> not such a great idea: The Hypervisor and dom0 can have a different
> number of CPUs (Xen : real CPUs, dom0: vcpus). Since I use per-cpu
> buffers, it does not fit.
>
> Also, I don't want dom0 to overwrite data from the Xen buffers easily:
> it's better if we keep some protection between dom0 and the Hypervisor.
>
> Thanks for looking into this, don't hesitate to ask further questions,
>
> Mathieu
>
>> Regards,
>> Hiroya
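As a rough illustration of the approach described in the quoted reply, where hypervisor and guest traces stay in separate buffers and are ordered by TSC at analysis time, here is a minimal sketch in C. It is not LTTV code: the `struct ev` layout, the `source` field and the `merge_by_tsc` name are assumptions made up for this example, and it presumes both inputs are already sorted by a TSC that is reliable across CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal event record; not the LTTng or xentrace layout. */
struct ev {
    uint64_t tsc;     /* timestamp counter value when the event fired */
    uint32_t source;  /* assumed tag: 0 = hypervisor trace, 1 = guest trace */
    uint32_t id;      /* event identifier */
};

/*
 * Merge two TSC-sorted event arrays into `out`, which must have room for
 * na + nb records. Returns the number of records written.
 */
size_t merge_by_tsc(const struct ev *a, size_t na,
                    const struct ev *b, size_t nb,
                    struct ev *out)
{
    size_t i = 0, j = 0, k = 0;

    assert(out != NULL);

    while (i < na && j < nb) {
        /* "<=" keeps events from `a` first when timestamps tie. */
        if (a[i].tsc <= b[j].tsc)
            out[k++] = a[i++];
        else
            out[k++] = b[j++];
    }
    /* Copy whatever remains in either input. */
    while (i < na)
        out[k++] = a[i++];
    while (j < nb)
        out[k++] = b[j++];

    return k;
}
```

With this kind of merge, the number of vcpus and pcpus does not need to match, since each buffer is only read in its own order and the timestamp decides the interleaving.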
Mathieu Desnoyers
2007-Jun-28 06:58 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
* INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote:
> Hi Mathieu,
>
> thanks for your reply. I can understand your opinion very well but a
> concern is that cpu ids on a guest OS are different from those on Xen
> because they are virtualized. The number of vcpus in a guest OS is also
> different from that of pcpus as you mentioned. I wondered if the two
> traces could be merged directly. If you translate vcpu ids to pcpu ids
> in writing records in the trace buffer in Xen, this concern is solved in
> a natural way.
>

When you are executing code in dom0 or domUs, how do you plan to get the
physical CPU number on which the tracing is done ?

--
Mathieu Desnoyers
Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal
OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68
INAKOSHI Hiroya
2007-Jun-28 12:32 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
Mathieu Desnoyers wrote:
> * INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote:
>> Hi Mathieu,
>>
>> thanks for your reply. I can understand your opinion very well but a
>> concern is that cpu ids on a guest OS are different from those on Xen
>> because they are virtualized. The number of vcpus in a guest OS is also
>> different from that of pcpus as you mentioned. I wondered if the two
>> traces could be merged directly. If you translate vcpu ids to pcpu ids
>> in writing records in the trace buffer in Xen, this concern is solved in
>> a natural way.
>>
>
> When you are executing code in dom0 or domUs, how do you plan to get the
> physical CPU number on which the tracing is done ?

I am considering an approach where dom0 or domUs issue hypercalls to write
records into Xen's trace buffer. In this setting, the vcpu info is
located on the Xen kernel stack and the pcpu is the one performing the
hypercall. So, I can resolve the mapping between vcpu id and pcpu id.

Regards,
Hiroya
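The hypercall idea in the message above can be sketched roughly as follows, in plain user-space C. Nothing here is a real Xen interface: `trace_hypercall`, `struct guest_rec`, `NR_PCPUS` and `BUF_SLOTS` are hypothetical names chosen for illustration. In a real hypervisor the pcpu, domain id and vcpu id would come from the current execution context rather than from parameters, and the sketch says nothing about the cost of entering the hypervisor for every event.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_PCPUS  4   /* assumed number of physical CPUs */
#define BUF_SLOTS 8   /* assumed records per per-pcpu buffer */

/* Hypothetical record layout; not the LTTng or xentrace format. */
struct guest_rec {
    uint64_t tsc;    /* timestamp of the event */
    uint32_t event;  /* event identifier supplied by the guest */
    uint32_t pcpu;   /* physical CPU that handled the call */
    uint16_t domid;  /* calling domain */
    uint16_t vcpu;   /* calling virtual CPU inside that domain */
};

struct pcpu_buf {
    struct guest_rec slot[BUF_SLOTS];
    size_t used;
};

/* One buffer per physical CPU, as with per-cpu trace buffers. */
static struct pcpu_buf bufs[NR_PCPUS];

/*
 * Stand-in for the hypervisor side of a "write trace record" call.
 * The record lands in the buffer of the pcpu handling the call and
 * carries both ids, which gives the vcpu-to-pcpu mapping per event.
 * Returns 0 on success, -1 if that pcpu's buffer is full.
 */
int trace_hypercall(uint32_t pcpu, uint16_t domid, uint16_t vcpu,
                    uint32_t event, uint64_t tsc)
{
    struct pcpu_buf *b;
    struct guest_rec *r;

    assert(pcpu < NR_PCPUS);

    b = &bufs[pcpu];
    if (b->used == BUF_SLOTS)
        return -1;

    r = &b->slot[b->used++];
    r->tsc = tsc;
    r->event = event;
    r->pcpu = pcpu;
    r->domid = domid;
    r->vcpu = vcpu;
    return 0;
}
```

For example, `trace_hypercall(2, 1, 0, 42, 1000)` would record that vcpu 0 of domain 1 emitted event 42 while running on pcpu 2.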
Mathieu Desnoyers
2007-Jun-28 15:08 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
* INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote:
> Mathieu Desnoyers wrote:
> > * INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote:
> >> Hi Mathieu,
> >>
> >> thanks for your reply. I can understand your opinion very well but a
> >> concern is that cpu ids on a guest OS are different from those on Xen
> >> because they are virtualized. The number of vcpus in a guest OS is also
> >> different from that of pcpus as you mentioned. I wondered if the two
> >> traces could be merged directly. If you translate vcpu ids to pcpu ids
> >> in writing records in the trace buffer in Xen, this concern is solved in
> >> a natural way.
> >>
> >
> > When you are executing code in dom0 or domUs, how do you plan to get the
> > physical CPU number on which the tracing is done ?
>
> I am considering the way where dom0 or domUs call hypercalls to write
> records in the Xen's trace buffer. In this setting, the vcpu info is
> located in the xen kernel stack and the pcpu is the one performing the
> hypercall. So, I can resolve the mapping between vcpu id and pcpu id.
>

The performance hit that goes with going through a hypercall for each
traced event would be too high. Typically, changing ring level involves
executing an interrupt routine, which takes a few thousand nanoseconds.
My tracing probes run within the traced ring in about 270 ns (as tested on
a Pentium 4 3GHz).

Mathieu

> Regards, > Hiroya > > > > >> Mathieu Desnoyers wrote: > >>> * INAKOSHI Hiroya (inakoshi.hiroya@jp.fujitsu.com) wrote: > >>>> Hi Mathieu, > >>>> > >>>> I am interested in LTTng-xen because I thought that it would be nice if > >>>> I can get traces on both xen and guest linux at the same time.
I > >>>> reviewed LTTng-xen and found that > >>>> > >>>> * LTTng and LTTng-xen have a quite similar structure, > >>>> * a trace buffer resides in a hypervisor for LTTng-xen, > >>>> * it is currently impossible to get traces from guest linux because > >>>> there is no LTTng for 2.6.18-xen kernel, as you mentioned. > >>>> > >>>> I had coarsely ported LTTng to 2.6.18-xen, though it is only for > >>>> i386. Now I can get traces on xen and guest linux simultaneously, even > >>>> though they put records in different trace buffers. > >>> Hi Ikanoski, > >>> > >>> We did the same kind of coarse 2.6.18 port at our lab internally to get > >>> traces from both Linux and Xen. The fact that the traces are recorded in > >>> different buffers does not change anything to the fact that those trace > >>> files can be copied in the same trace directory so they can be parsed > >>> together by LTTV (traces coming from dom0, domUs and hypervisor). They > >>> are synchronized by using the TSCs (hopefully, you will configure your > >>> system to get a reliable TSC on AMD and older intels, see the > >>> ltt-test-tsc kernel module in recent LTTng versions and ltt.polymtl.ca > >>> website for info on that matter). > >>> > >>> > >>>> Then I thought that > >>>> it would be more useful if they put records in xen''s trace buffer and I > >>>> can analyze events > >>> LTTV merges the information from all the valid trace files that appears > >>> within the trace directory, so the analysis can be done on data coming > >>> from userspace, kernels and hypervisor. > >>> > >>>> from xen and linux guests with a single lttd and > >>>> lttctl running on Domain-0. Do you have an opinion about that? > >>>> > >>> lttctl-xen and lttd-xen, although being quite similar to lttd and > >>> lttctl, use hypercalls to get the data. The standard lttctl/lttd uses > >>> debugfs files as a hook to the trace buffers. 
> >>> > >>> As a distribution matter, I prefer to leave both separate for now, > >>> because lttctl-xen and lttd-xen is highly tied to the Xen tree. > >>> > >>> Also, merging the information within the buffers between Xen and Dom0 is > >>> not such a great idea: The Hypervisor and dom0 can have a different > >>> number of CPUs (Xen : real CPUs, dom0: vcpus). Since I use per-cpu > >>> buffers, it does not fit. > >>> > >>> Also, I don''t want dom0 to overwrite data from the Xen buffers easily: > >>> it''s better if we keep some protection between dom0 and the Hypervisor. > >>> > >>> Thanks for looking into this, don''t hesitate to ask further questions, > >>> > >>> Mathieu > >>> > >>>> Regards, > >>>> Hiroya > >>>> > >>>> > >>>> Mathieu Desnoyers wrote: > >>>>> Hello, > >>>>> > >>>>> I made a working version of the LTTng tracer for xen-unstable for x86. > >>>>> Here is the pointer to my repository so you can try it out : > >>>>> > >>>>> hg clone http://ltt.polymtl.ca/cgi-bin/hgweb.cgi xen-unstable-lttng.hg > >>>>> > >>>>> Basic usage : > >>>>> > >>>>> (see lttctl-xen -h) > >>>>> > >>>>> lttctl-xen -c > >>>>> > >>>>> (in a different console) > >>>>> lttd-xen -t /tmp/xentrace1 > >>>>> > >>>>> (in the 1st console) > >>>>> lttctl-xen -s > >>>>> > >>>>> (tracing is active) > >>>>> > >>>>> lttctl-xen -q > >>>>> lttctl-xen -r > >>>>> > >>>>> lttd-xen should automatically quit after writing the last buffers as > >>>>> soon as lttctl-xen -r is issued. 
> >>>>> > >>>>> Then, you must copy the XML facilities : > >>>>> > >>>>> (see the http://ltt.polymtl.ca > QUICKSTART to see how to install the > >>>>> ltt-control package which contains the XML facilities in your system) > >>>>> > >>>>> lttctl-xen -e -t /tmp/xentrace1 > >>>>> > >>>>> View in the visualiser : (see the QUICKSTART to see how to install it) > >>>>> > >>>>> lttv -m textDump -t /tmp/xentrace1 > >>>>> > >>>>> (not tested yet) : to visualize a dom0 trace with the xen hypervisor > >>>>> information, one would have to collect the dom0 kernel trace and the Xen > >>>>> trace and open them together with : > >>>>> lttv -m textDump -t /tmp/xentrace1 -t /tmp/dom0trace > >>>>> > >>>>> The current Linux kernel instrumentation is for 2.6.20. A backport might > >>>>> be needed to 2.6.18 if there is no proper Xen support in 2.6.20 (I have > >>>>> not followed the recent developments). > >>>>> > >>>>> > >>>>> Currently broken/missing : > >>>>> > >>>>> - Ressources are not freed when the trace channels are destroyed. So you > >>>>> basically have to reboot between taking different traces. > >>>>> - My code in the hypervisor complains to the console that subbuffers > >>>>> have not been fully read when the trace channels are destroyed. The > >>>>> error printing is just done too fast : lttd-xen is still there and > >>>>> reading the buffers at that point. It will get fixed with proper > >>>>> ressource usage tracking of both Xen and lttd-xen (same as the first > >>>>> point above). > >>>>> - x86_64 not tested, powerpc local.h and ltt.h missing (should be ripped > >>>>> from my Linux kernel LTTng). > >>>>> > >>>>> > >>>>> Cheers, > >>>>> > >>>>> Mathieu > >>>>> > >>>>> > >>>>> > >>>>> * Mathieu Desnoyers (compudj@krystal.dyndns.org) wrote: > >>>>>> Hi, > >>>>>> > >>>>>> My name is Mathieu Desnoyers, I am the current maintainer of the Linux Trace > >>>>>> Toolkit project, known as LTTng. 
This is a tracer for the 2.6 Linux kernels > >>>>>> oriented towards high performance and real-time applications. > >>>>>> > >>>>>> I have read your tracing thread and I am surprised to see how much things > >>>>>> you would like in a tracer are already implemented and tested in LTTng. I am > >>>>>> currently porting my tracer to Xen, so I think it might be useful for you to > >>>>>> know what it provides. My goal is to do not duplicate the effort and save > >>>>>> everyone some time. > >>>>>> > >>>>>> Here follows some key features of LTTng : > >>>>>> > >>>>>> Architecture independant data types > >>>>>> Extensible event records > >>>>>> Self-describing traces > >>>>>> Variable size records > >>>>>> Fast (200 ns per event record) > >>>>>> Highly reentrant > >>>>>> Does not disable interrupts > >>>>>> Does not take lock on the critical path > >>>>>> Supports NMI tracing > >>>>>> Analysis/visualization tool (LTTV) > >>>>>> > >>>>>> Looking at the integration of the existing LTTng implementation into Xen, I > >>>>>> came up with those two points for my Christmas whichlist : > >>>>>> > >>>>>> Additionnal functionnalities that would be nice to have in Xen : > >>>>>> > >>>>>> - RCU-style updates : would allow freeing the buffers without impact on tracing. > >>>>>> * I guess I could currently use : > >>>>>> for_each_domain( d ) > >>>>>> for_each_vcpu( d, v ) > >>>>>> vcpu_sleep_sync(v); > >>>>>> I think it will have a huge impact on the system, but it would only be > >>>>>> performed before trace buffers free. > >>>>>> > >>>>>> - Polling for data in Xen from a dom0 process. > >>>>>> Xentrace currently polls the hypervisor each 100ms to see if there is data > >>>>>> that needs to be consumed. Instead of an active polling, it would be nice to > >>>>>> use the dom0 OS capability to put a process to sleep while waiting for a > >>>>>> resource. 
It would imply creating a module, loaded in dom0, that would wait > >>>>>> for a specific virq coming from the Hypervisor to wake up such processes. > >>>>>> We could think of exporting a complete poll() interface through sysfs or > >>>>>> procfs that would be a directory filled with the resources exported from the > >>>>>> Hypervisor to dom0 (which could include wait for resource freed, useful when > >>>>>> shutting down a domU instead of busy looping). It would help dom0 to schedule > >>>>>> other processes while a process is waiting for the Hypervisor. > >>>>>> > >>>>>> > >>>>>> You might also be interested in looking at : > >>>>>> - the website (http://ltt.polymtl.ca) > >>>>>> - LTTng Xen port design document (this one is different from the one posted by > >>>>>> Jimi) > >>>>>> (http://ltt.polymtl.ca/svn/ltt/branches/poly/doc/developer/lttng-xen.txt) > >>>>>> - OLS 2006 paper "The LTTng tracer : A Low Impact Performance and Behavior > >>>>>> Monitor for GNU/Linux" > >>>>>> (http://ltt.polymtl.ca/papers/desnoyers-ols2006.pdf) > >>>>>> > >>>>>> > >>>>>> Questions and constructive comments are welcome. > >>>>>> > >>>>>> Mathieu > >>>>>> > >>>>>> > >>>>>> OpenPGP public key: http://krystal.dyndns.org:8080/key/compudj.gpg > >>>>>> Key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 > >>>>>> > >>>>>> _______________________________________________ > >>>>>> Xen-devel mailing list > >>>>>> Xen-devel@lists.xensource.com > >>>>>> http://lists.xensource.com/xen-devel > >>>>>> > >>>> > >> > > > >-- Mathieu Desnoyers Computer Engineering Ph.D. Student, Ecole Polytechnique de Montreal OpenPGP key fingerprint: 8CD5 52C3 8E3C 4140 715F BA06 3F25 A8FE 3BAE 9A68 _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
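To put the two per-event figures from the message above side by side, here is a rough arithmetic sketch in C. The function name `tracing_ns_per_sec`, the event rate of 100,000 events per second, and the figure of 2,000 ns for a hypercall ("a few thousand nanoseconds" in the message) are all assumptions chosen for illustration; only the 270 ns probe cost comes from the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Nanoseconds of CPU time spent recording events during one second of
 * wall-clock time, given an event rate and a per-event cost.
 * Hypothetical helper for illustration; not part of LTTng or Xen.
 */
uint64_t tracing_ns_per_sec(uint64_t events_per_sec, uint64_t ns_per_event)
{
    /* Guard against overflow of the 64-bit product. */
    assert(ns_per_event == 0 || events_per_sec <= UINT64_MAX / ns_per_event);
    return events_per_sec * ns_per_event;
}
```

Under these assumptions, 100,000 events per second at 270 ns each costs 27,000,000 ns (27 ms, or 2.7% of one CPU), while the same rate at an assumed 2,000 ns per hypercall costs 200,000,000 ns (200 ms, or 20% of one CPU).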
INAKOSHI Hiroya
2007-Jun-29 00:19 UTC
Re: [Xen-devel] LTTng Xen port : finally in a repository near you
Hi Mathieu,

I see, thanks. I'm looking forward to LTTng being merged into the Linux
kernel first, and then into Xen.

Regards,
Hiroya

Mathieu Desnoyers wrote:
> The performance hit that goes with going through a hypercall for each
> traced event would be too high. Typically, changing ring level involves
> executing an interrupt routine, which takes a few thousand nanoseconds.
> My tracing probes run within the traced ring in about 270ns (as tested
> on a Pentium 4 3GHz).
>
> Mathieu
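Earlier in this thread, Mathieu notes that traces recorded in separate buffers (hypervisor, dom0, domUs) can still be analyzed together because they are ordered by TSC. A minimal sketch of that idea in C follows, assuming each source is already time-ordered and that the TSC values are comparable across sources (the thread itself warns this needs a reliable TSC). The record layout and the names `trace_rec`, `trace_stream` and `merge_by_tsc` are hypothetical and are not LTTng's or LTTV's actual format or API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record layout for illustration only. */
struct trace_rec {
    uint64_t tsc;     /* timestamp counter value when the event was recorded */
    uint32_t source;  /* which buffer the record came from (Xen, dom0, ...) */
    uint32_t event;   /* event identifier */
};

/* One time-ordered input, e.g. one per-CPU buffer from one source. */
struct trace_stream {
    const struct trace_rec *recs;
    size_t len;
    size_t pos;       /* index of the next unread record */
};

/*
 * Merge n time-ordered streams into out, smallest TSC first.
 * Returns the number of records written (at most out_cap).
 */
size_t merge_by_tsc(struct trace_stream *streams, size_t n,
                    struct trace_rec *out, size_t out_cap)
{
    size_t written = 0;

    assert(streams != NULL && out != NULL);

    while (written < out_cap) {
        size_t best = n;  /* n means "no stream has records left" */

        for (size_t i = 0; i < n; i++) {
            if (streams[i].pos >= streams[i].len)
                continue;
            if (best == n ||
                streams[i].recs[streams[i].pos].tsc <
                streams[best].recs[streams[best].pos].tsc)
                best = i;
        }
        if (best == n)
            break;
        out[written++] = streams[best].recs[streams[best].pos++];
    }
    return written;
}
```

Because the merge happens at analysis time, the hypervisor and each guest keep their own per-CPU buffers, which matches the separation between dom0 and Xen buffers argued for in the thread. The linear scan over streams is fine for a handful of buffers; a heap would be the usual choice for many.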