In addition, I would also like to add that I do suspect (just my hunch)
that it could be related to multipath.
If you can try without multipath and the problem doesn't reproduce, I think
that would be a good data point for the kernel/OS vendor to debug further.

My 2 cents again :)

thanx,
deepak

On Tue, Jan 20, 2015 at 2:32 PM, Niels de Vos <ndevos at redhat.com> wrote:

> On Tue, Jan 20, 2015 at 11:55:40AM +0530, Deepak Shetty wrote:
> > What does "Controller" mean, the OpenStack controller node or something
> > else (like an HBA)?
> > Your picture says it's SAN but the text says multi-path mount. SAN would
> > mean block devices, so I am assuming you have redundant block devices on
> > the compute host, mkfs'ing them and then creating bricks for Gluster?
> >
> > The stack trace looks like you hit a kernel bug and glusterfsd happened
> > to be running on the CPU at the time... my 2 cents
>
> That definitely is a kernel issue. You should contact your OS support
> vendor about this.
>
> The bits you copy/pasted are not sufficient to see what caused it. The
> glusterfsd process is just a casualty of the kernel issue, and it is not
> likely this can be fixed in Gluster. I suspect you need a kernel
> patch/update.
>
> Niels
>
> > thanx,
> > deepak
> >
> > On Tue, Jan 20, 2015 at 11:29 AM, chamara samarakoon
> > <chthsa123 at gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > We have set up an OpenStack cloud as below, and "/va/lib/nova/instances"
> > > is a Gluster volume.
> > >
> > > CentOS - 6.5
> > > Kernel - 2.6.32-431.29.2.el6.x86_64
> > > GlusterFS - glusterfs 3.5.2 built on Jul 31 2014 18:47:54
> > > OpenStack - RDO using Packstack
> > >
> > > Recently the Controller node froze with the following error (which
> > > required a hard reboot). As a result, the Gluster volumes on the compute
> > > nodes could not reach the controller, so all the instances on the
> > > compute nodes went read-only and had to be restarted.
> > >
> > > *BUG: scheduling while atomic : glusterfsd/42725/0xffffffff*
> > > *BUG: unable to handle kernel paging request at 0000000038a60d0a8*
> > > *IP: [<fffffffff81058e5d>] task_rq_lock+0x4d/0xa0*
> > > *PGD 1065525067 PUD 0*
> > > *Oops: 0000 [#1] SMP*
> > > *last sysfs file : /sys/device/pci0000:80/0000:80:02.0/0000:86:00.0/host2/port-2:0/end_device-2:0/target2:0:0/2:0:0:1/state*
> > > *CPU 0*
> > > *Modules linked in : xtconntrack iptable_filter ip_tables ipt_REDIRECT
> > > fuse ipv openvswitch vxlan iptable_mangle *
> > >
> > > Please advise on the above incident. Feedback on the OpenStack +
> > > GlusterFS setup is also appreciated.
> > >
> > > Thank You,
> > > Chamara
> > >
> > > _______________________________________________
> > > Gluster-users mailing list
> > > Gluster-users at gluster.org
> > > http://www.gluster.org/mailman/listinfo/gluster-users
chamara samarakoon
2015-Jan-20 12:03 UTC
[Gluster-users] CentOS Freeze with GlusterFS Error
Hi All,

Thank you for the valuable feedback. I will test the suggested solutions
and update the thread.

Regards,
Chamara

On Tue, Jan 20, 2015 at 4:17 PM, Deepak Shetty <dpkshetty at gmail.com> wrote:

> In addition, I would also like to add that I do suspect (just my hunch)
> that it could be related to multipath.
> If you can try without multipath and the problem doesn't reproduce, I think
> that would be a good data point for the kernel/OS vendor to debug further.
>
> My 2 cents again :)
>
> thanx,
> deepak

--
chthsa