Just in case anyone else on this list is running into similar issues, I can confirm that the patch appears to have resolved this.

I've opened https://bugs.centos.org/view.php?id=13713

It was so bad that having the system under load (with rpmbuild) and opening another ssh window or two would almost always cause the oops.

Cheers,
Nathan

From: CentOS-virt [mailto:centos-virt-bounces at centos.org] On Behalf Of Nathan March
Sent: Wednesday, August 23, 2017 3:32 PM
To: 'Discussion about the virtualization on CentOS' <centos-virt at centos.org>
Subject: Re: [CentOS-virt] Major stability problems with xen 4.6.6

This appears to be a centos kernel issue rather than a xen one.

https://lkml.org/lkml/2016/5/17/440

I've dug through the posts and it's not clear why this never made it upstream.

I'm going to apply that patch to my systems and see if it resolves the crashes, but won't know for certain until a week or two of stability goes by.

- Nathan

From: CentOS-virt [mailto:centos-virt-bounces at centos.org] On Behalf Of Nathan March
Sent: Wednesday, August 23, 2017 2:48 PM
To: centos-virt at centos.org
Subject: [CentOS-virt] Major stability problems with xen 4.6.6

Hi,

I'm seeing numerous crashes on the xen 4.6.6-1 / 4.6.6-2 releases, on both the 4.9.34-29 and 4.9.39-29 kernels.

I've attached a txt with output from two different servers.

Xen-028: This crashed this morning while running 4.6.6-1 and 4.9.39-29
Xen-001: This crashed shortly after being upgraded to 4.6.6-2 and 4.9.34-29

The two are on different hardware platforms, and both have had a long history of being stable until these upgrades.

It sounds potentially related to https://kernel.googlesource.com/pub/scm/linux/kernel/git/tiwai/sound-unstable/+/9ce119f318ba1a07c29149301f1544b6c4bea52a%5E%21/ but I've confirmed this patch is in the above kernels.

Any suggestions / thoughts?

Cheers,
Nathan
Pasi Kärkkäinen
2017-Aug-29 19:45 UTC
[CentOS-virt] Major stability problems with xen 4.6.6
Hi,

On Thu, Aug 24, 2017 at 03:45:46PM -0700, Nathan March wrote:
> Just in case anyone else on this list is running into similar issues, I
> can confirm that the patch appears to have resolved this.
>
> I've opened https://bugs.centos.org/view.php?id=13713
>
> It was so bad that having the system under load (with rpmbuild) and
> opening another ssh window or two would almost always cause the oops.

It seems the patch you mentioned was merged to upstream Linux here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71472fa9c52b1da27663c275d416d8654b905f05

and then reverted/removed here:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=896d81fefe5d1919537db2c2150ab6384e4a6610

Do you know if there has been a proper/fixed patch after that? Has it been merged to the upstream Linux kernel already?

Thanks,

-- Pasi
> It seems the patch you mentioned was merged to upstream Linux here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=71472fa9c52b1da27663c275d416d8654b905f05
>
> and then reverted/removed here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=896d81fefe5d1919537db2c2150ab6384e4a6610
>
> Do you know if there has been proper/fixed patch after that? has it been
> merged to upstream Linux kernel already?

Interesting! I didn't come across that when digging into this. It looks like this hasn't been followed up on at all since April:

https://lists.gt.net/engine?list=linux;do=search_results;search_type=AND;search_forum=forum_1;search_string=ldisc%20reopened&sb=post_time

Currently I've got ~40 dom0's running with the patch on 4.9.44-39 and it's resolved all stability issues. Previously I was seeing multiple crashes a week.

Cheers,
Nathan