Hi,

I hope all of you are in good spirits.

We have four OSS servers, clustered in pairs: OSS1 with OSS2, and OSS3 with OSS4. The cluster was configured six months ago, and from the beginning it has had an issue where one of the nodes fences the other, which then goes into a shutdown state. This happens roughly every two to three weeks. /var/log/messages continuously shows errors like:

    " slow start_page_write 57s due to heavy IO load "

Can anybody help me with this issue?

Thanks & Regards
VIJESH E K
Hi Vijesh,

You are probably facing what is called a "split brain" issue. It can happen due to heartbeat communication problems. What sort of heartbeat are you using?

Some time ago we had a problem where the OSSes were overloaded and the heartbeat became unresponsive. This caused a "false split brain" scenario: both nodes within the HA pair STONITH each other, since there is no answer from the heartbeat device.

I suggest you start monitoring your OSS nodes to understand whether the logged message makes sense (very likely it does). What is the memory configuration of your OSS nodes? What OS? What does your zone_reclaim_mode setting look like?

Regards,
Carlos

--
Carlos Thomaz | HPC Systems Architect
Mobile: +1 (303) 519-0578
cthomaz at ddn.com | Skype ID: carlosthomaz
DataDirect Networks, Inc.
9960 Federal Dr., Ste 100
Colorado Springs, CO 80921
ddn.com | Twitter: @ddn_limitless | 1.800.TERABYTE

From: VIJESH EK <ekvijesh at gmail.com>
Date: Sun, 22 Jan 2012 22:33:20 -0800
To: lustre-discuss at lists.lustre.org
Subject: [Lustre-discuss] OSS Nodes Fencing issue in HPC " slow start_page_write 57s due to heavy IO load "
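For reference, a quick way to gather the information Carlos asks about (the zone reclaim setting, NUMA layout, and memory size) on most Linux systems is something like the following; numactl may need to be installed separately:

    # current zone reclaim setting (0 = disabled)
    cat /proc/sys/vm/zone_reclaim_mode

    # NUMA topology and per-node memory
    numactl --hardware

    # total and used memory in MB
    free -m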
Well, it sounds like an issue with your HA package configuration. Likely one node is not responsive enough to heartbeat/are-you-alive messages, so the other node assumes it has died. This is likely fixed by increasing the deadtime parameter in your HA configuration (try 180 seconds if it is currently smaller than that). It is hard to say for certain, as you omitted any logs and did not say which HA package you are using. You also did not indicate which Lustre version you are running.

One of the likely causes of those messages is the kernel having difficulty allocating memory. On many kernels, if /proc/sys/vm/zone_reclaim_mode is not 0, memory allocations can take a long time, as the kernel keeps looking for the best pages to free until pages in the local NUMA node are available. With the Lustre 1.8.x write cache the memory pressure is substantial (in 1.6.x and earlier the service threads had statically-allocated buffers, but starting with 1.8.x each incoming request allocates new pages and frees them back to the page cache).

Kevin
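A minimal sketch of the two changes suggested above, assuming the Linux-HA Heartbeat package (the config file location and appropriate values may differ on your system):

    # /etc/ha.d/ha.cf -- give a stalled node more time to answer
    # are-you-alive messages before it is declared dead and fenced
    deadtime 180

    # disable NUMA zone reclaim at runtime, so allocations fall back to
    # remote NUMA nodes instead of stalling while reclaiming local pages
    echo 0 > /proc/sys/vm/zone_reclaim_mode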
You have already received great advice from Carlos and Kevin. One more point I would like to add: quite often people configure their HA software to send the heartbeat over a single network, thus creating a single point of failure, and the heartbeat (keep-alive) pings are sent over the same network that carries the main I/O traffic. In my experience it is very important to send the HA pings over at least two networks, or better still over two different communication methods, such as Ethernet and serial.

Regards,
Wojciech
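With the Linux-HA Heartbeat package, for example, redundant heartbeat paths can be declared in ha.cf roughly as follows (the interface name and serial device are placeholders for your own hardware):

    # /etc/ha.d/ha.cf -- two independent heartbeat paths
    bcast eth1           # dedicated heartbeat Ethernet, separate from the I/O network
    serial /dev/ttyS0    # null-modem serial cable as a second path
    baud 19200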
Dear Sir,

I have attached the /var/log/messages files from the OSS node. Please go through the logs and kindly give me a solution for this issue.

Thanks & Regards
VIJESH E K
HCL Infosystems Ltd.
Chennai-6
Mob: +91 99400 96543

[Attachments scrubbed by the list archive: messages, messages.1, messages.2, messages.3]
As I replied earlier, those "slow" messages are often the result of memory allocations taking a long time. Since zone_reclaim shows up in many of the stack traces, that still appears to be a good candidate. Did you check /proc/sys/vm/zone_reclaim_mode, and was it 0? Did you change it to 0 and still have problems?

The same situation that causes the Lustre threads to be slow can also stall the heartbeat processes. Did you increase the heartbeat deadtime timeout value?

Kevin
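For reference, a quick way to confirm this pattern in logs like the ones attached above is something along these lines:

    # how often the slow-IO warning appears
    grep -c 'slow start_page_write' /var/log/messages

    # stack traces that mention zone reclaim
    grep -A 10 'zone_reclaim' /var/log/messages | less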
Dear Sir,

I have checked the file /proc/sys/vm/zone_reclaim_mode and found that its value is 1 on all four OSS servers (OSS1 to OSS4). Should I change it to 0 on all nodes? I would like to know one thing: how will this resolve the current issue? Can you please explain? What is the main function of this file?

Have you looked at the log files I sent earlier? If I change the value to 0, will it affect currently running processes or jobs?

I am waiting for your reply.

Thanks & Regards
VIJESH E K
Yes, change it to 0. This will make it easier to allocate memory. Although the kernel will sometimes allocate memory attached to the wrong CPU, it shouldn't get stuck for long periods in the memory allocator. Because of the Lustre OSS cache (introduced in 1.8.0), service threads have to allocate new memory for every request, and your Lustre server threads are getting stuck allocating memory. I expect you will see far fewer "slow" messages on the servers after making that change.

Kevin
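A sketch of applying the change on each OSS; the second step assumes a sysctl.conf-based setup so the setting survives reboots:

    # apply immediately to the running kernel
    echo 0 > /proc/sys/vm/zone_reclaim_mode

    # persist across reboots
    echo 'vm.zone_reclaim_mode = 0' >> /etc/sysctl.conf
    sysctl -p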