What does "Controller" mean: the OpenStack controller node, or something
else (like an HBA)?

Your picture says it's SAN, but the text says multi-path mount. SAN would
mean block devices, so I am assuming you have redundant block devices on
the compute host, and you are mkfs'ing them and then creating bricks for
Gluster?

The stack trace looks like you hit a kernel bug, and glusterfsd happened to
be running on the CPU at the time... my 2 cents.

thanx,
deepak

On Tue, Jan 20, 2015 at 11:29 AM, chamara samarakoon <chthsa123 at gmail.com> wrote:
> Hi All,
>
> We have setup Openstack cloud as below. And the "/va/lib/nova/instances"
> is a Gluster volume.
>
> CentOS - 6.5
> Kernel - 2.6.32-431.29.2.el6.x86_64
> GlusterFS - glusterfs 3.5.2 built on Jul 31 2014 18:47:54
> OpenStack - RDO using Packstack
>
> Recently Controller node freezes with following error (Which required hard
> reboot), as a result Gluster volumes on compute node can not reach the
> controller and due to that all the instances on compute nodes become to
> read-only status which causes to restart all instances.
>
> *BUG: scheduling while atomic : glusterfsd/42725/0xffffffff*
> *BUG: unable to handle kernel paging request at 0000000038a60d0a8*
> *IP: [<fffffffff81058e5d>] task_rq_lock+0x4d/0xa0*
> *PGD 1065525067 PUD 0*
> *Oops: 0000 [#1] SMP*
> *last sysfs file :
> /sys/device/pci0000:80/0000:80:02.0/0000:86:00.0/host2/port-2:0/end_device-2:0/target2:0:0/2:0:0:1/state*
> *CPU 0*
> *Modules linked in : xtconntrack iptable_filter ip_tables ipt_REDIRECT
> fuse ipv openvswitch vxlan iptable_mangle *
>
> Please advice on above incident , also feedback on the Openstack +
> GlusterFS setup is appreciated.
>
> Thank You,
> Chamara

Attachment: openstak-gluster.png (image/png, 240027 bytes)
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150120/98c29d0e/attachment-0001.png>
chamara samarakoon
2015-Jan-20 06:41 UTC
[Gluster-users] CentOS Freeze with GlusterFS Error
Hi Deepak,

Yes, it is the OpenStack controller, and the SAN is configured for
redundancy through multi-path.

Regards,
Chamara

--
chthsa
On Tue, Jan 20, 2015 at 11:55:40AM +0530, Deepak Shetty wrote:
> What does "Controller" mean, the openstack controller node or somethign
> else (like HBA ) ?
> You picture says its SAN but the text says multi-path mount.. SAN would
> mean block devices, so I am assuming you have redundant block devices on
> the compute host, mkfs'ing it and then creating bricks for gluster ?
>
> The stack trace looks like you hit a kernel bug and glusterfsd happens to
> be running on the CPU at the time... my 2 cents

That definitely is a kernel issue. You should contact your OS support
vendor about this. The bits you copy/pasted are not sufficient to see what
caused it. The glusterfsd process is just a casualty of the kernel issue,
and it is not likely this can be fixed in Gluster. I suspect you need a
kernel patch/update.

Niels
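To illustrate why both replies treat glusterfsd as a bystander, here is a minimal Python sketch that pulls apart two lines of the posted trace. It assumes the usual layout of these kernel messages, which the thread itself does not spell out: the "scheduling while atomic" line names the task that was current as comm/pid/preempt count, and the IP line names the kernel symbol being executed as symbol+offset/size. The function name `summarize_oops` and the regular expressions are hypothetical helpers written for this example, not part of any Gluster or kernel tooling.

```python
import re

# Two lines copied from the report in this thread (bold markers removed).
OOPS_LINES = [
    "BUG: scheduling while atomic : glusterfsd/42725/0xffffffff",
    "IP: [<fffffffff81058e5d>] task_rq_lock+0x4d/0xa0",
]

# Assumed layout: <comm>/<pid>/<preempt count>, i.e. the task that was current.
TASK_RE = re.compile(
    r"scheduling while atomic\s*:\s*"
    r"(?P<comm>[^/]+)/(?P<pid>\d+)/(?P<preempt>0x[0-9a-fA-F]+)"
)

# Assumed layout: [<address>] <symbol>+<offset>/<size>, i.e. the kernel code at fault.
IP_RE = re.compile(
    r"IP:\s*\[<(?P<addr>[0-9a-fA-F]+)>\]\s*"
    r"(?P<symbol>\w+)\+(?P<offset>0x[0-9a-fA-F]+)/(?P<size>0x[0-9a-fA-F]+)"
)


def summarize_oops(lines):
    """Return the current task and the faulting kernel symbol, when present."""
    summary = {}
    for line in lines:
        task = TASK_RE.search(line)
        if task:
            summary["task"] = task.group("comm")
            summary["pid"] = int(task.group("pid"))
        ip = IP_RE.search(line)
        if ip:
            summary["symbol"] = ip.group("symbol")
            summary["offset"] = int(ip.group("offset"), 16)
    return summary


if __name__ == "__main__":
    print(summarize_oops(OOPS_LINES))
```

Read this way, the trace says the fault occurred inside the kernel function task_rq_lock while glusterfsd (PID 42725) happened to be the running task. As Niels notes, the pasted fragment is still not enough to identify the root cause; the complete oops output would be needed for that.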