thr3ads.net - Gluster users - [Gluster-users] CentOS Freeze with GlusterFS Error [Jan 2015]

chamara samarakoon

2015-Jan-21 16:41 UTC

[Gluster-users] CentOS Freeze with GlusterFS Error

Hi All,


The same error occurred again before I could try anything else, so I took a
screenshot with more details of the incident.

Thank You,
Chamara



On Tue, Jan 20, 2015 at 5:33 PM, chamara samarakoon <chthsa123 at gmail.com>
wrote:
> Hi All,
>
> Thank you for the valuable feedback. I will test the suggested solutions and
> update the thread.
>
> Regards,
> Chamara
>
> On Tue, Jan 20, 2015 at 4:17 PM, Deepak Shetty <dpkshetty at gmail.com>
> wrote:
>
>> In addition, I would also like to add that I do suspect (just my hunch)
>> that it could be related to multipath.
>> If you can try without multipath and the problem does not recur, I think
>> that would be a good data point for the kernel/OS vendor to debug further.
>>
>> my 2 cents again :)
>>
>> thanx,
>> deepak
>>
>>
>> On Tue, Jan 20, 2015 at 2:32 PM, Niels de Vos <ndevos at redhat.com> wrote:
>>
>>> On Tue, Jan 20, 2015 at 11:55:40AM +0530, Deepak Shetty wrote:
>>> > What does "Controller" mean: the OpenStack controller node, or something
>>> > else (like an HBA)?
>>> > Your picture says it's a SAN, but the text says multipath mount. SAN would
>>> > mean block devices, so I am assuming you have redundant block devices on
>>> > the compute host, and you are running mkfs on them and then creating
>>> > bricks for Gluster?
>>> >
>>> >
>>> > The stack trace looks like you hit a kernel bug and glusterfsd happens to
>>> > be running on the CPU at the time... my 2 cents
>>>
>>> That definitely is a kernel issue. You should contact your OS support
>>> vendor about this.
>>>
>>> The bits you copy/pasted are not sufficient to see what caused it. The
>>> glusterfsd process is just a casualty of the kernel issue, and it is not
>>> likely this can be fixed in Gluster. I suspect you need a kernel
>>> patch/update.
>>>
>>> Niels
>>>
>>> >
>>> > thanx,
>>> > deepak
>>> >
>>> > On Tue, Jan 20, 2015 at 11:29 AM, chamara samarakoon <chthsa123 at gmail.com>
>>> > wrote:
>>> >
>>> > > Hi All,
>>> > >
>>> > >
>>> > > We have set up an OpenStack cloud as below, and
>>> > > "/var/lib/nova/instances" is a Gluster volume.
>>> > >
>>> > > CentOS - 6.5
>>> > > Kernel -  2.6.32-431.29.2.el6.x86_64
>>> > > GlusterFS - glusterfs 3.5.2 built on Jul 31 2014 18:47:54
>>> > > OpenStack - RDO using Packstack
>>> > >
>>> > >
>>> > > Recently the controller node has been freezing with the following error
>>> > > (which required a hard reboot). As a result, the Gluster volumes on the
>>> > > compute nodes cannot reach the controller, and all the instances on the
>>> > > compute nodes go into read-only status, which forces us to restart all
>>> > > instances.
>>> > >
>>> > >
>>> > >
>>> > >
>>> > > *BUG: scheduling while atomic : glusterfsd/42725/0xffffffff*
>>> > > *BUG: unable to handle kernel paging request at 0000000038a60d0a8*
>>> > > *IP: [<fffffffff81058e5d>] task_rq_lock+0x4d/0xa0*
>>> > > *PGD 1065525067 PUD 0*
>>> > > *Oops: 0000 [#1] SMP*
>>> > > *last sysfs file : /sys/device/pci0000:80/0000:80:02.0/0000:86:00.0/host2/port-2:0/end_device-2:0/target2:0:0/2:0:0:1/state*
>>> > > *CPU 0*
>>> > > *Modules linked in : xtconntrack iptable_filter ip_tables ipt_REDIRECT fuse ipv openvswitch vxlan iptable_mangle *
>>> > >
>>> > > Please advise on the above incident. Feedback on the OpenStack +
>>> > > GlusterFS setup is also appreciated.
>>> > >
>>> > > Thank You,
>>> > > Chamara
>>> > >
>>> > >
>>> > > _______________________________________________
>>> > > Gluster-users mailing list
>>> > > Gluster-users at gluster.org
>>> > > http://www.gluster.org/mailman/listinfo/gluster-users
>>> > >
>>>
>>>
>>>
>>>
>>>
>>
>
>
> --
> chthsa
>


-- 
chthsa
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glustererr.png
Type: image/png
Size: 31831 bytes
Desc: not available
URL:
<http://www.gluster.org/pipermail/gluster-users/attachments/20150121/dcabd9f5/attachment.png>

Niels de Vos

2015-Jan-21 19:27 UTC

[Gluster-users] CentOS Freeze with GlusterFS Error

On Wed, Jan 21, 2015 at 10:11:20PM +0530, chamara samarakoon wrote:
> HI All,
> 
> 
> Same error encountered again before trying anything else. So I took screen
> shot  with more details of the incident.

This shows an XFS error. So it can be a problem with XFS, or something
that contributes to it in the XFS path. I would guess it is caused by an
issue on the disk(s), because corruption is mentioned.
However, it could also be bad RAM, or another hardware component that
is used to access data from the disks. I suggest you take two
approaches:

1. run hardware tests - if the error is detected, contact your HW vendor
2. open a support case with the vendor of the OS and check for updates

Gluster can stress filesystems in ways that are not very common, and
there have been issues found in XFS due to this. Your OS support vendor
should be able to tell you if the latest and related XFS fixes are
included in your kernel.

HTH,
Niels
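
Before opening a vendor case, it can help to sort the console messages by
kind, so it is clear whether the output points at the scheduler, at memory
access, or at XFS. The following is only a rough sketch and was not posted in
the thread: the function name triage, the category names, and the patterns are
all assumptions made for illustration. The first three patterns match messages
quoted in the original report; the XFS pattern is a loose guess, because the
screenshot text is not reproduced in the archive.

```python
import re
from collections import defaultdict

# Hypothetical category names mapped to patterns. The first three match
# messages quoted in the original report; the XFS pattern is a loose guess.
PATTERNS = {
    "atomic_scheduling": re.compile(r"BUG: scheduling while atomic"),
    "paging_request": re.compile(r"BUG: unable to handle kernel paging request"),
    "oops": re.compile(r"Oops: [0-9a-fA-F]+ \[#\d+\]"),
    "xfs": re.compile(r"\bXFS\b.*(corrupt|error)", re.IGNORECASE),
}


def triage(lines):
    """Group kernel log lines by the first pattern category they match."""
    hits = defaultdict(list)
    for line in lines:
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits[name].append(line.strip())
                break  # one category per line is enough for a first pass
    return dict(hits)


if __name__ == "__main__":
    # Lines copied from the original report in this thread.
    sample = [
        "BUG: scheduling while atomic : glusterfsd/42725/0xffffffff",
        "BUG: unable to handle kernel paging request at 0000000038a60d0a8",
        "Oops: 0000 [#1] SMP",
        "CPU 0",
    ]
    for category, matched in triage(sample).items():
        print(category, len(matched))
```

Fed the lines from the original report, this groups them under
atomic_scheduling, paging_request and oops. An xfs group only appears if the
log actually contains XFS error or corruption text.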
