Gavin Jones
2013-Jul-15 20:33 UTC
[Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mixed read+write workloads
Hello,

We have a 16-node OCFS2 cluster used for web serving duties. Each node mounts (the same) 6 OCFS2 volumes. Shared data includes client files, application files for our webapp, log files, and configuration files. Storage is provided by 2x EqualLogic PS400E iSCSI SANs, each having 12 drives in a RAID50; the units are in a 'Group'.

The problem we are having is that periodically, maybe once a week or so, several Apache processes on a handful of nodes get stuck in D state and are unable to recover. This greatly increases server load, causes more Apache processes to back up, OCFS2 starts complaining about unresponsive nodes, and before you know it, the cluster is down.

This seems to occur most often when we are doing writes + reads; if it is just reads, the cluster hums along. However, when we need to update many files or remove lots of files (think temporary images) in addition to normal read activity, we have the above-mentioned problem.

We have done some searching and found
http://www.mail-archive.com/ocfs2-users at oss.oracle.com/msg05525.html
which describes a similar problem with write activity. In that case, the problem was allocating contiguous space on a fragmented filesystem, and the solution was to adjust the mount option 'localalloc'. We are wondering if we are in a similar position.

Below is the output from the stat_sysdir_analyze.sh script mentioned in the link above, which analyzes stat_sysdir.sh output; I've included the two volumes that seem to be our 'problem' volumes.

Volume 1:
bash stat_sysdir_analyze.sh sde1-client-20130715.txt

Number of clusters | Contiguous cluster size
--------------------------------------------
              4549 | 510 and smaller
              1825 | 511

Volume 2:
bash stat_sysdir_analyze.sh sdd1-data-20130715.txt

Number of clusters | Contiguous cluster size
--------------------------------------------
               175 | 510 and smaller
                23 | 511

Any evidence here of excessive fragmentation that tuning localalloc would help with?

Also regarding localalloc, I notice it is different for the above two volumes on many of the nodes; I find this interesting, as the cluster is supposed to make an educated guess at this value.
For instance:

/dev/sda1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
/dev/sde1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)

/dev/sdd1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=9,coherency=full,user_xattr,noacl)
/dev/sdb1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)

/dev/sda1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=11,coherency=full,user_xattr,noacl)
/dev/sdc1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)

/dev/sda1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
/dev/sdc1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=7,coherency=full,user_xattr,noacl)

I'm not sure why the cluster would pick different values depending on the node.

Anyway, any opinions, advice, or tuning suggestions are greatly appreciated. This business of the cluster hanging is turning into quite a problem.

I'll provide any other information upon request.

Thanks,

Gavin W. Jones
Where 2 Get It, Inc.

--
"There has grown up in the minds of certain groups in this country the notion that because a man or corporation has made a profit out of the public for a number of years, the government and the courts are charged with the duty of guaranteeing such profit in the future, even in the face of changing circumstances and contrary to public interest. This strange doctrine is not supported by statute nor common law."

~Robert Heinlein
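For reference, the free-space picture behind the numbers above can also be pulled directly with debugfs.ocfs2, and the local alloc window can be pinned at mount time rather than left to the kernel's guess. A minimal sketch, assuming /dev/sde1 stands in for one of the volumes and 16 MB is only a placeholder value, not a recommendation:

# Global bitmap, i.e. the shared free-space pool; its chain records give a
# rough picture of how much free space each block group still holds
debugfs.ocfs2 -R "stat //global_bitmap" /dev/sde1

# Override the per-node local alloc window explicitly at mount time
# (value is in MB; 16 is a placeholder)
mount -t ocfs2 -o _netdev,localalloc=16 /dev/sde1 /u/data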
Srinivas Eeda
2013-Jul-16 00:32 UTC
[Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mixed read+write workloads
I am not entirely sure about the significant slowdown and cluster outage, but from your description and the information you provided, you are seeing fragmentation-related issues.

What is the ocfs2/kernel version, and what are the cluster size and block size of these volumes?
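For reference, those details can be collected along these lines (the device name is an example, and the field labels are from memory and may differ by tools version):

uname -r                          # running kernel version
modinfo ocfs2                     # ocfs2 module details; look for the version/srcversion lines
debugfs.ocfs2 -R "stats" /dev/sde1 | grep -i bits    # Block Size Bits / Cluster Size Bits (sizes are 2^bits)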