Peri Hankey
2004-Sep-24 11:05 UTC
[Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
Hello

I have previously mentioned a floating point exception in rpm which seemed at one point to be connected with block-device handling in xenU systems, as both problems occurred at the same time. As I now get only the rpm floating point exception (sporadically), I have examined it further:

... # strace rpm

[lots of strace output omitted]

gettimeofday({1096021305, 741150}, NULL) = 0
nanosleep({0, 20000000}, {1076798912, 1075195904}) = 0
gettimeofday({1096021305, 741150}, NULL) = 0
--- SIGFPE (Floating point exception) @ 0 (0) ---
+++ killed by SIGFPE +++

Does anyone else have problems of this kind? It may of course be a bug in rpm, but I would then expect it to appear in xen0 as well. This did happen, but only in xen2.0-20040909 and xen2.0-20040910.

So the implication (to me) is that there is a problem which sometimes causes rpm to crash in gettimeofday, which did affect both xen0 and xenU in xen2.0-20040909 and xen2.0-20040910, and which now only affects xenU.

Any ideas?

-------------------------------------------------------
This SF.Net email is sponsored by: YOU BE THE JUDGE. Be one of 170 Project Admins to receive an Apple iPod Mini FREE for your judgement on who ports your project to Linux PPC the best. Sponsored by IBM. Deadline: Sept. 24. Go here: http://sf.net/ppc_contest.php
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/xen-devel
Ian Pratt
2004-Sep-24 12:41 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
I'd certainly recommend upgrading to a more recent xen2.0 release and trying to repeat.

You're not sending large amounts of output from dom0 to a serial line, are you? I've observed this causing time (in all domains) to be somewhat jerky, but this is to be expected as the serial line is intended only for synchronous debug output.

Are dom0 and domU both on the same CPU? I presume you're using 2.6?

Ian
Peri Hankey
2004-Sep-24 12:53 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
I am running the xen-2.0 dated 23 Sep 2004 (yesterday), on 2.6.8.1, single CPU, no serial line connected. Changes since my build didn't look very relevant to me, but I'll rebuild and try again.

Thanks
Peri
Flavio Leitner
2004-Sep-24 13:22 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
On Fri, Sep 24, 2004 at 01:53:32PM +0100, Peri Hankey wrote:
> I am running the xen-2.0 dated 23 Sep 2004 (yesterday), on 2.6.8.1,
> single CPU, no serial line connected. Changes since my build didn't look
> very relevant to me, but I'll rebuild and try again.

The same setup here, the same bug.

open("/usr/lib/rpm/rpmrc", O_RDONLY|O_LARGEFILE) = 3
fcntl64(3, F_SETFD, FD_CLOEXEC) = 0
fstat64(3, {st_mode=S_IFREG|0644, st_size=12460, ...}) = 0
old_mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x404d7000
select(4, [3], NULL, NULL, {1, 0}) = 1 (in [3], left {1, 0})
gettimeofday({1096035236, 887402}, NULL) = 0
nanosleep({0, 20000000}, {1075431556, 1075070360}) = 0
gettimeofday({1096035236, 887402}, NULL) = 0
--- SIGFPE (Floating point exception) @ 0 (0) ---
+++ killed by SIGFPE +++

Here xen0 and xenU are not running on the same processor.

--
Flávio Bruno Leitner <fbl@conectiva.com.br>
[ E74B 0BD0 5E05 C385 239E 531C BC17 D670 7FF0 A9E0 ]
Ian Pratt
2004-Sep-25 10:34 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
> I have previously mentioned a floating point exception in rpm which
> seemed at one point to be connected with block-device handling in xenU
> systems, as both problems occurred at the same time. As I now get only
> the rpm floating point exception (sporadically), I have examined it further:
>
> gettimeofday({1096021305, 741150}, NULL) = 0
> nanosleep({0, 20000000}, {1076798912, 1075195904}) = 0
> gettimeofday({1096021305, 741150}, NULL) = 0
> --- SIGFPE (Floating point exception) @ 0 (0) ---
> +++ killed by SIGFPE +++

I've tried hard to reproduce this but failed. It might be worth having a look at the rpm source to check that the nanosleep is wrapped in a while loop, and where the presumed division by zero is really coming from.

Since the strace prints the arguments that gettimeofday is passed rather than what it returns, we can't really tell from the trace what gettimeofday is actually returning. If there's a following call in the trace using the same struct we can probably deduce it.

I've appended the test program I've used to debug this and other time issues. You might want to run it on your system just to check.

Thanks,
Ian

#include <stdlib.h>
#include <stdio.h>
#include <string.h>     /* strstr, strpbrk */
#include <sys/time.h>
#include <time.h>

/**************************************************************************/

/* rpcc: get full 64-bit Pentium TSC value */
static __inline__ unsigned long long int rpcc(void)
{
    unsigned int __h, __l;
    __asm__ __volatile__ ("rdtsc" :"=a" (__l), "=d" (__h));
    return (((unsigned long long)__h) << 32) + __l;
}

/*
 * find_cpu_speed:
 *   Interrogates /proc/cpuinfo for the processor clock speed.
 *
 *   Returns: speed of processor in MHz, rounded down to nearest whole MHz.
 */
#define MAX_LINE_LEN 50
int find_cpu_speed(void)
{
    FILE *f;
    char s[MAX_LINE_LEN], *a, *b;
    int mhz = 2400;

    if ( (f = fopen("/proc/cpuinfo", "r")) == NULL ) goto out;

    while ( fgets(s, MAX_LINE_LEN, f) )
    {
        if ( strstr(s, "cpu MHz") )
        {
            /* Find the start of the speed value, and stop at the dec point. */
            if ( !(a=strpbrk(s,"0123456789")) || !(b=strpbrk(a,".")) ) break;
            *b = '\0';
            fclose(f);
            return(atoi(a));
        }
    }
    fclose(f);   /* no usable "cpu MHz" line: fall through to the default */

 out:
    fprintf(stderr, "find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume %d Mhz",mhz);
    return mhz;
}

int mhz;

int looper (int N)
{
    int i,a=12345;
    for(i=0;i<N;i++)
        a*=a;
    return a;
}

int main(void)
{
    int stride;
    unsigned long long start=0,stop=0, last=0, now, gt_firsteventtime;
    unsigned long long xnow, xlast, skip;
    unsigned long long firsteventtime=0, lasteventtime, lastfirsteventtime = 0;
    struct timeval a,b;
    int count=0, okcount=0;

    /* Required in order to print intermediate results at fixed period. */
    mhz = find_cpu_speed();
    printf("CPU speed = %d MHz\n", mhz);

#define SLEEP (20*1000)

    /*
     * Each while(1) below never exits, so only the first one runs as
     * written; the later loops are alternative tests.
     */

    /* Test 1: does gettimeofday advance by at least the time slept? */
    while(1)
    {
        struct timespec z = { 0, SLEEP*1000 };

        gettimeofday(&a, NULL);
        while(nanosleep(&z,&z)); // loop until success
        gettimeofday(&b, NULL);

        last = (((long long)a.tv_sec) * 1000000) + a.tv_usec;
        now = (((long long)b.tv_sec) * 1000000) + b.tv_usec;

        if ( now - last < SLEEP )
        {
            printf("nanosleep(%dus): gtod %llu\n",
                   SLEEP, now-last );
        }

        fprintf(stderr,".");
    }

    /* Test 2: as test 1, but also report the TSC delta in microseconds. */
    gettimeofday(&a, NULL);
    xlast = rpcc();
    last = (((long long)a.tv_sec) * 1000000) + a.tv_usec;
    while(1)
    {
        struct timespec z = { 0, SLEEP*1000 };

        while(nanosleep(&z,&z)); // loop until success
        gettimeofday(&a, NULL);
        xnow = rpcc();
        now = (((long long)a.tv_sec) * 1000000) + a.tv_usec;

        if ( now - last < SLEEP )
        {
            printf("nanosleep(%d): gtod %llu rpcc %llu\n",
                   SLEEP, now-last, (xnow-xlast)/mhz );
        }

        last = now;
        xlast = xnow;

        if((count++ % 64) == 0)fprintf(stderr,".");
    }

    /* Test 3: look for time going backwards or repeating. */
    start = rpcc();
    while(1)
    {
        gettimeofday(&a, NULL);
        xnow = rpcc();

        if(xnow < xlast)
            printf("** %lld %lld **\n", xnow, xlast);

        now = (((long long)a.tv_sec) * 1000000) + a.tv_usec;

        if(now<last)
        {
            printf("backwards!\n");
            exit(-1);
        }

        if(now==last)
        {
            count++;
            lasteventtime = rpcc();
            if( firsteventtime == 0 )
            {
                firsteventtime = lasteventtime;
                gt_firsteventtime = now;
                skip = (xnow-xlast)/mhz;
            }
        }

        if(now>last)
        {
            if(count>5)
            {
                printf("[%lld.%lld %lld] duplicates= % 5d (% 5lldus)\t prev OK= % 6d\t fe %lldus (d=%lldus) skip = %lld\n",
                       gt_firsteventtime/1000000,gt_firsteventtime%1000000,
                       firsteventtime,
                       count, (lasteventtime-firsteventtime)/mhz,
                       okcount,(firsteventtime-start)/mhz,
                       (firsteventtime-lastfirsteventtime)/mhz,
                       skip);
                okcount = 0;
                lastfirsteventtime = firsteventtime;
            }
            count = 0;
            firsteventtime = 0;
            okcount++;
        }

        last = now;
        xlast = xnow;
    }

#if 0
    while(1)
    {
        int i;
        gettimeofday(&a, NULL);
        for(i=0;i<10000;i++);
        gettimeofday(&b, NULL);
        start = (((long long)a.tv_sec) * 1000000) + a.tv_usec;
        stop = (((long long)b.tv_sec) * 1000000) + b.tv_usec;

        if(1||stop<start) printf("crap : %llu\n",stop-start);
    }

    while(1)
    {
        gettimeofday(&a, NULL);
        start = (((long long)a.tv_sec) * 1000000) + a.tv_usec;

        looper(1000000);

        gettimeofday(&a, NULL);
        stop = (((long long)a.tv_sec) * 1000000) + a.tv_usec;

        printf("%llu\n",stop-start);
    }
#endif
}
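The failure mode suspected in this message can be sketched as follows. This is an illustration only and not rpm's actual code: the names `elapsed_usec`, `rate_per_second` and `sleep_fully` are hypothetical, and the assumption is that the caller divides some quantity by the time measured between two gettimeofday calls. On x86 Linux an integer division by zero is delivered as SIGFPE, which would fit the trace if both calls returned the same time.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/time.h>
#include <time.h>

/* Microseconds from a to b; negative if the clock appears to run backwards. */
long long elapsed_usec(const struct timeval *a, const struct timeval *b)
{
    return ((long long)b->tv_sec - a->tv_sec) * 1000000LL
         + ((long long)b->tv_usec - a->tv_usec);
}

/*
 * Hypothetical rate calculation. Dividing by an elapsed time of zero would
 * be an integer division by zero, so a non-positive interval is reported
 * as "no measurement" (-1) instead.
 */
long long rate_per_second(long long units, long long usec)
{
    if (usec <= 0)
        return -1;
    return units * 1000000LL / usec;
}

/* Sleep for the whole interval, resuming if a signal interrupts the sleep. */
int sleep_fully(long nsec)
{
    struct timespec req = { 0, nsec }, rem;

    assert(nsec >= 0 && nsec < 1000000000L);
    while (nanosleep(&req, &rem) == -1) {
        if (errno != EINTR)
            return -1;      /* a real error, not an interruption */
        req = rem;          /* continue with the time still owed */
    }
    return 0;
}
```

A caller would take one timestamp before `sleep_fully`, one after, and pass `elapsed_usec` of the pair to `rate_per_second`. The guard only hides the symptom; a clock that does not advance across a 20 ms sleep is still a bug underneath.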
Peri Hankey
2004-Sep-25 18:48 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
Thanks for sending the test - one point of interest is that my /proc/cpuinfo doesn't match your expectations - presumably an AMD Athlon(tm) XP 2400+ is running at 2400MHz. The main point however is that the results are indeed different in the two domains:

---------------- xen0 ------------------------
[me@xen0 testing]$ ./time-test
find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400 MhzCPU speed = 2400 MHz
...........................................................................................................
...........................................................................................................
... (lots of dots)

[me@xen0 testing]$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm) XP 2400+
stepping : 1
cache size : 256 KB
fdiv_bug : no
hlt_bug : yes
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu msr pae mce cx8 apic sep mca cmov pat pse36 mmx sse syscall mmxext 3dnowext 3dnow
bogomips : 4010.80

---------------- xenU ------------------------
[me@xenU testing]$ ./time-test
find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400 MhzCPU speed = 2400 MHz
nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
... (lots of gtod 0s)

[me@xenU testing]$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 8
model name : AMD Athlon(tm) XP 2400+
stepping : 1
cache size : 256 KB
fdiv_bug : no
hlt_bug : yes
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu msr pae mce cx8 apic sep mca cmov pat pse36 mmx sse syscall mmxext 3dnowext 3dnow
bogomips : 3263.69

I hope this will help nail it down - I notice also that the bogomips measure is different.

No time for more just now - but I have a more general question about lvm2 cows and snapshots, which I will put in a separate message. 'A snapshot is not a cow' as the liftman said to Babar the Elephant. In fact the problem is that snapshots hold on to each other's tails.

Thanks for your help.
Peri
Ian Pratt
2004-Sep-27 15:54 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
> Thanks for sending the test - one point of interest is that my
> /proc/cpuinfo doesn't match your expectations - presumably an AMD
> Athlon(tm) XP 2400+ is running at 2400MHz. The main point however is
> that the results are indeed different in the two domains:

> [me@xenU testing]$ ./time-test
> find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
> MhzCPU speed = 2400 MHz
> nanosleep(20000us): gtod 0
> .nanosleep(20000us): gtod 0
> .nanosleep(20000us): gtod 0

Can you try a newer version of Xen/Linux? There's been a tsc flag change that may influence this.

It may be an Athlon-only bug, as we can't reproduce on Opteron or Xeon.

Thanks,
Ian
Flavio Leitner
2004-Sep-27 16:17 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
On Sat, Sep 25, 2004 at 07:48:44PM +0100, Peri Hankey wrote:
> ---------------- xen0 ------------------------
> [me@xen0 testing]$ ./time-test
> find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
> MhzCPU speed = 2400 MHz
> ...........................................................................................................
> ...........................................................................................................
> ... (lots of dots)
> ---------------- xenU ------------------------
> [me@xenU testing]$ ./time-test
> find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
> MhzCPU speed = 2400 MHz
> nanosleep(20000us): gtod 0
> .nanosleep(20000us): gtod 0
> .nanosleep(20000us): gtod 0
> .nanosleep(20000us): gtod 0

Running on xen0 or xenU doesn't make any difference, just lots of dots and that parser error.

That's funny, because so far I can no longer reproduce that bug even with rpm.

--
Flávio Bruno Leitner <fbl@conectiva.com.br>
[ E74B 0BD0 5E05 C385 239E 531C BC17 D670 7FF0 A9E0 ]
Ian Pratt
2004-Sep-27 16:21 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
> On Sat, Sep 25, 2004 at 07:48:44PM +0100, Peri Hankey wrote:
> > ---------------- xen0 ------------------------
> > [me@xen0 testing]$ ./time-test
> > find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
> > MhzCPU speed = 2400 MHz
> > ...........................................................................................................
> > ...........................................................................................................
> > ... (lots of dots)
> > ---------------- xenU ------------------------
> > [me@xenU testing]$ ./time-test
> > find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
> > MhzCPU speed = 2400 MHz
> > nanosleep(20000us): gtod 0
> > .nanosleep(20000us): gtod 0
> > .nanosleep(20000us): gtod 0
> > .nanosleep(20000us): gtod 0
>
> Running on xen0 or xenU doesn't make any difference, just lots
> of dots

That's the behaviour we see on all of our machines, i.e. nanosleep works fine. I think Peri's problem may be Athlon related.

> and that parser error.

The find_cpu_speed printf shouldn't happen any more -- we no longer clear the TSC bit in the set of features the CPU reports, hence there should be a 'MHz' line in /proc/cpuinfo.

Ian
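The parser error discussed here comes down to whether /proc/cpuinfo contains a "cpu MHz" line at all. A minimal sketch of parsing one such line is given below; `parse_cpu_mhz` is a hypothetical helper, not part of the posted test program, and it assumes the usual "cpu MHz : 2008.123" layout with the field name at the start of the line.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return the whole-MHz value from a "cpu MHz : 2008.123" line, or -1 if
 * the line is a different field or carries no number.
 */
int parse_cpu_mhz(const char *line)
{
    const char *colon;
    char *end;
    long mhz;

    assert(line != NULL);

    if (strncmp(line, "cpu MHz", 7) != 0)
        return -1;                       /* some other /proc/cpuinfo field */

    colon = strchr(line, ':');
    if (colon == NULL)
        return -1;

    /* strtol skips leading blanks and stops at the decimal point. */
    mhz = strtol(colon + 1, &end, 10);
    if (end == colon + 1 || mhz <= 0)
        return -1;                       /* nothing numeric after the colon */

    return (int)mhz;
}
```

Returning -1 rather than a built-in default leaves it to the caller to decide what a missing line means, which matters when the result is later used as a divisor for TSC deltas.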
Flavio Leitner
2004-Sep-27 16:28 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
On Mon, Sep 27, 2004 at 05:21:52PM +0100, Ian Pratt wrote:
> > On Sat, Sep 25, 2004 at 07:48:44PM +0100, Peri Hankey wrote:
> > Running on xen0 or xenU doesn't make any difference, just lots
> > of dots
>
> That's the behaviour we see on all of our machines i.e. nanosleep
> works fine. I think Peri's problem may be Athlon related,

Just remember that the test machine showed the same problem before, and it's a Pentium III. I'll try to reproduce it again tomorrow.

--
Flávio Bruno Leitner <fbl@conectiva.com.br>
[ E74B 0BD0 5E05 C385 239E 531C BC17 D670 7FF0 A9E0 ]
Peri Hankey
2004-Sep-27 17:04 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
The timer question seems to involve more than just the processor type:

------------ today's build: xen0 running Mandrake 10: OK
[me@a4 testing]$ ./time-test
CPU speed = 2008 MHz
...............................................
[me@a4 testing]$ rpm -qa
mktemp-1.5-11mdk
libpwdb0-0.61.2-3mdk
make-3.80-5mdk
gettext-base-0.13.1-1mdk
ifplugd-0.21b-1mdk
libgdbm2-1.8.0-24mdk
...

------------- today's build: a41: xenU running Mandrake 10.0: FAIL
[me@a41 testing]$ ./time-test
CPU speed = 2008 MHz
nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
.nanosleep(20000us): gtod 0
[me@a41 testing]$ rpm -qa
Floating point exception (core dumped)
[peri@a41 testing]$

--------------- today's build: a37: xenU running PLD Linux: OK
[me@a37 testing]$ ./time-test
CPU speed = 2008 MHz
............................................................
[peri@a37 testing]$ rpm -qa
FHS-2.3-1
basesystem-1.99-2
acl-2.2.22-2
cracklib-2.7-18
cracklib-dicts-2.7-18
make-3.80-5
pam-0.77.3-11
...

I don't think I've yet seen the rpm crash on the PLD Linux distribution. So it looks like a combination of Athlon with some subtle difference in the way the libraries are compiled.

But Flavio, are you saying that you have the rpm problem that I have but don't see the same test results as me?

And by the way, I was very interested to hear about your conectiva rpm for xen - is it downloadable?

Thanks to all
Peri
Flavio Leitner
2004-Sep-27 17:18 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
On Mon, Sep 27, 2004 at 06:04:31PM +0100, Peri Hankey wrote:
> But Flavio, are you saying that you have the rpm problem that I have but
> don't see the same test results as me?

Right now I can't reproduce the rpm crash as before, but tomorrow I'll try again when I have more time. Probably the reason I can't reproduce it any more is the same reason that the test program didn't fail.

> And by the way, I was very interested to hear about your conectiva rpm
> for xen - is it downloadable?

Yes, it should be on one of the mirrors with our snapshot. The latest is xen-2.0-68430cl.i686.rpm.

--
Flávio Bruno Leitner <fbl@conectiva.com.br>
[ E74B 0BD0 5E05 C385 239E 531C BC17 D670 7FF0 A9E0 ]
Peri Hankey
2004-Sep-28 10:33 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
Is there any kind of workaround for this nanosleep problem?

-- Peri

Ian Pratt wrote:

>>On Sat, Sep 25, 2004 at 07:48:44PM +0100, Peri Hankey wrote:
>>
>>>---------------- xen0 ------------------------
>>>[me@xen0 testing]$ ./time-test
>>>find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
>>>MhzCPU speed = 2400 MHz
>>>...........................................................................................................
>>>...........................................................................................................
>>>... (lots of dots)
>>>---------------- xenU ------------------------
>>>[me@xenU testing]$ ./time-test
>>>find_cpu_speed: error parsing /proc/cpuinfo for cpu MHz. Assume 2400
>>>MhzCPU speed = 2400 MHz
>>>nanosleep(20000us): gtod 0
>>>.nanosleep(20000us): gtod 0
>>>.nanosleep(20000us): gtod 0
>>>.nanosleep(20000us): gtod 0
>>
>>Running on xen0 or xenU doesn't make any difference, just lots
>>of dots
>
>That's the behaviour we see on all of our machines i.e. nanosleep
>works fine. I think Peri's problem may be Athlon related,
>
>>and that parser error.
>
>The find_cpu_speed printf shouldn't happen any more -- we no
>longer clear the TSC bit in the set of features the CPU reports,
>hence there should be a 'MHz' line in /proc/cpuinfo
>
>Ian
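The failing xenU output quoted above shows gettimeofday reporting no elapsed time across a 20000us nanosleep. The source of the time-test program is not shown in this thread, so the following is only a minimal sketch of that kind of check, not the actual program. It assumes Python's time.sleep and time.time as stand-ins for nanosleep and gettimeofday, and the names check_sleep_advances_clock and report are made up for illustration. The output format merely imitates the lines quoted above.

```python
import time


def check_sleep_advances_clock(sleep_us=20000, rounds=5,
                               sleep=time.sleep, clock=time.time):
    """Sleep `rounds` times and return the wall-clock delta of each sleep.

    Deltas are in microseconds. `sleep` and `clock` are injectable so the
    check can be exercised with a deliberately frozen clock.
    """
    deltas = []
    for _ in range(rounds):
        before = clock()
        sleep(sleep_us / 1_000_000)
        after = clock()
        deltas.append(int(round((after - before) * 1_000_000)))
    return deltas


def report(deltas, sleep_us=20000):
    """Return one line per sleep: a dot if time advanced, a complaint if not."""
    lines = []
    for delta in deltas:
        if delta <= 0:
            # The clock did not move across the sleep: the symptom in the thread.
            lines.append(f"nanosleep({sleep_us}us): gtod {delta}")
        else:
            lines.append(".")
    return lines


if __name__ == "__main__":
    print("\n".join(report(check_sleep_advances_clock())))
```

On a healthy system this should print only dots. On x86 Linux an integer division by zero is delivered as SIGFPE, so a program that divides by such a zero interval would die in the way the strace earlier in the thread shows. Whether rpm actually does that is not established here.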
Ian Pratt
2004-Sep-28 12:39 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
> Is there any kind of workaround for this nanosleep problem?

It's a bug, and one that's probably trivial to fix if we can pin down the combination of CPU and libraries etc that provoke it.

Are you using the latest unstable tree? Does my test program still indicate a problem for you? If so, please post the output of /proc/cpuinfo, and information about your distribution and libraries.

Like the only other pending bug (the network panic bug) it seems only to affect a minority of users, and we can't seem to reproduce it locally. Until we get more information I'm afraid there's not a lot we can do beyond simply reviewing the code.

Thanks,
Ian
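The details requested here (the /proc/cpuinfo contents, plus kernel and library versions) can be gathered in one step. The sketch below is only one possible way to do it: the function names cpuinfo_fields and describe_system are hypothetical, and the set of /proc/cpuinfo keys it keeps is an assumption about what is most relevant to this bug (the "cpu MHz" line and the "tsc" flag mentioned earlier in the thread).

```python
import platform


def cpuinfo_fields(text, wanted=("vendor_id", "model name", "cpu MHz", "flags")):
    """Pick the first occurrence of each wanted key out of /proc/cpuinfo text."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key = key.strip()
        if sep and key in wanted and key not in fields:
            fields[key] = value.strip()
    return fields


def describe_system(cpuinfo_path="/proc/cpuinfo"):
    """Collect kernel release, libc version and selected CPU details."""
    try:
        with open(cpuinfo_path) as handle:
            cpu = cpuinfo_fields(handle.read())
    except OSError:
        cpu = {}  # not Linux, or /proc is not mounted
    return {
        "kernel": platform.release(),
        "libc": " ".join(platform.libc_ver()),
        "cpu": cpu,
        # True only if the CPU feature flags advertise the time stamp counter.
        "has_tsc": "tsc" in cpu.get("flags", "").split(),
    }
```

Calling print(describe_system()) in each domain gives a compact record to post alongside the full /proc/cpuinfo output. A missing "cpu MHz" entry would match the find_cpu_speed parse error seen earlier.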
Peri Hankey
2004-Sep-28 13:57 UTC
Re: [Xen-devel] xen-2.0 20040923 and previous: rpm crash in xenU
Here's a summary of results so far. I wasn't able to try your test on the fedora xenU system as it was a minimal install - I was planning to install extras as needed using poldek (like yum or urpmi, from PLD, minimal external dependencies, very handy - but it relies on rpm to do the actual installs).

using xen-20040927 (changelog in today's unstable doesn't suggest there would be much difference):

System                                time-test  rpm            /lib/libc.so.6 ->
xen0 (mandrake 10.0 - 2.6.8.1-xen0)   OK         OK             libc-2.3.3.so
xenU (mandrake 10.0 - 2.6.8.1-xenU)   FAIL       CRASH          libc-2.3.3.so
xenU (fedora-2 - 2.6.8.1-xenU)        ????       CRASH          libc-2.3.3.so
xenU (PLD Linux - 2.6.8.1-xenU)       OK         OK             libc-2.3.3.so
xenU (Debian Sarge - 2.6.8.1-xenU)    OK         NOT AVAILABLE  libc-2.3.2.so

No very clear pattern - on the face of it it looks as if the same library version produces different results. I suppose that could be a result of different compiler flags and optimisation levels.

I don't know if that helps much.

-- Peri

Ian Pratt wrote:

>>Is there any kind of workaround for this nanosleep problem?
>
>It's a bug, and one that's probably trivial to fix if we can pin
>down the combination of CPU and libraries etc that provoke it.
>
>Are you using the latest unstable tree? Does my test program
>still indicate a problem for you? If so, please post the output
>of /proc/cpuinfo, and information about your distribution and
>libraries.
>
>Like the only other pending bug (the network panic bug) it seems
>only to affect a minority of users, and we can't seem to
>reproduce it locally. Until we get more information I'm afraid
>there's not a lot we can do beyond simply reviewing the code.
>
>Thanks,
>Ian