Hi list

Maybe everybody already knows this, but today many of our domUs got crazy - unusually high load on machines doing nothing (e.g. a load of 200 on a machine which usually has a load of 2-4).

A simple command:

date; date `date +"%m%d%H%M%C%y.%S"`; date

magically makes peace ...

[shocked]

GB
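For anyone unpicking the one-liner above: the inner date call prints the current time in the MMDDhhmmCCYY.ss form that date accepts as a "set the clock" argument, so the outer call sets the clock to the time it already shows. Presumably it is the act of setting the clock, not the value, that matters. The sketch below only builds and prints that string instead of setting anything (setting the clock needs root); the helper name stamp is made up for illustration.

```shell
#!/bin/sh
# stamp: hypothetical helper, not part of the original post.
# Prints the current local time as MMDDhhmmCCYY.ss, the form `date`
# accepts as a positional argument when setting the clock.
stamp() {
    date +"%m%d%H%M%C%y.%S"
}

# Show the command the one-liner would run, without running it
# (actually setting the clock requires root).
echo "would run: date $(stamp)"
```

The two bare date calls on either side of the original command just print the time before and after, so you can see that the clock did not jump.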
Yeah, that was fun. According to my monitoring, context switches and interrupts increased by a factor of 10 or more when the leap second was added.

On 07/01/2012 08:59 AM, G.Bakalarski@icm.edu.pl wrote:
> today many of our domUs got crazy - unusually
> high load on machines doing nothing (e.g. load of 200
> on machine which usually has load 2-4).
>
> _______________________________________________
> Xen-users mailing list
> Xen-users@lists.xen.org
> http://lists.xen.org/xen-users

-- 
Tony Lill, OCT, Tony.Lill@AJLC.Waterloo.ON.CA
President, A. J. Lill Consultants (519) 650 0660
539 Grand Valley Dr., Cambridge, Ont. N3H 2S2 (519) 241 2461
--------------- http://www.ajlc.waterloo.on.ca/ ----------------
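The context switch and interrupt rates mentioned above can also be checked by hand on a Linux box. A minimal sketch, assuming the kernel exposes the cumulative ctxt (context switches) and intr (interrupts) counters in /proc/stat; read_counter and the STAT_FILE override are names invented for this example, not part of any monitoring tool.

```shell
#!/bin/sh
# read_counter KEY FILE: hypothetical helper that prints the second field
# of the first line in FILE whose first field is KEY.
read_counter() {
    awk -v key="$1" '$1 == key { print $2; exit }' "$2"
}

# STAT_FILE is an assumption made for testing; it defaults to /proc/stat.
stat_file=${STAT_FILE:-/proc/stat}

if [ -r "$stat_file" ]; then
    c1=$(read_counter ctxt "$stat_file"); i1=$(read_counter intr "$stat_file")
    sleep 1
    c2=$(read_counter ctxt "$stat_file"); i2=$(read_counter intr "$stat_file")
    # Both counters are cumulative since boot, so the difference over
    # one second is a per-second rate.
    echo "context switches/s: $((c2 - c1))"
    echo "interrupts/s:       $((i2 - i1))"
fi
```

Running it once before and once after resetting the clock should show whether the rates drop back to their usual level.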
Apparently this is a known bug:

http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/

On 07/01/2012 11:23 AM, Tony Lill wrote:
> Yeah, that was fun. According to my monitoring, context switches
> and interrupts increased by a factor of 10 or more when the leap
> second was added.

-- 
Tony Lill, OCT, Tony.Lill@AJLC.Waterloo.ON.CA
On Sun, 2012-07-01 at 16:32 +0100, Tony Lill wrote:
> Apparently this is a known bug
>
> http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/

AFAIK there isn't anything Xen specific here, is there?

If there is then we'll need more details about exactly what was running (dom0 and domU OS, userspace workloads etc) on the machines in question.

Ian.
> On Sun, 2012-07-01 at 16:32 +0100, Tony Lill wrote:
>> Apparently this is a known bug
>>
>> http://blog.mozilla.org/it/2012/06/30/mysql-and-the-leap-second-high-cpu-and-the-fix/
>
> AFAIK there isn't anything Xen specific here, is there?

No and yes. At first we thought it was a Xen-only issue. Our Xen domUs were mostly unresponsive. With the xm console domU command we could log in to a domU only after waiting 20+ minutes. Most machines had a load in the range 150-300 and fully loaded CPUs while doing really nothing.

But today we noticed that bare metal servers were also affected, though not as hard. E.g. a bare metal machine which normally uses 1-2 CPUs with a correspondingly normal load today used more than 10 CPUs with a load of 10+, yet its effective performance was some 5 times lower (i.e. it could process 5 times fewer records per minute). But it was responsive, which is why we did not notice the problem yesterday.

After the trick with the date command the behaviour returned to normal.

> If there is then we'll need more details about exactly what was running
> (dom0 and domU OS, userspace workloads etc) on the machines in question.

At first we were convinced it was mostly Xen 4.1 + kernel 3.2.0-1 (the first 3.2 kernel in Debian wheezy). But today we found that older machines with the Debian squeeze kernel 2.6.32-5-xen were affected too. And, as I wrote above, some bare metal machines with kernel 3.2.0-2 were also affected.

Our main applications are Java and PostgreSQL based, with Java generating most of the mess. But simple long-running scp jobs (overnight copies) were also broken, probably due to the leap second or the extremely high load.

Maybe context switching is the reason why the Xen machines were affected so much?

GB
Nah, I think it's just a coincidence that all the affected machines were Xen VMs. Guess I should go through them all and update the kernels.

On 07/02/2012 05:41 AM, Ian Campbell wrote:
> AFAIK there isn't anything Xen specific here, is there?
>
> If there is then we'll need more details about exactly what was
> running (dom0 and domU OS, userspace workloads etc) on the machines
> in question.

-- 
Tony Lill, OCT, Tony.Lill@AJLC.Waterloo.ON.CA