Gavin Jones
2013-Jul-15 20:33 UTC
[Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mixed read+write workloads
Hello,

We have a 16-node OCFS2 cluster used for web serving duties. Each node mounts (the same) 6 OCFS2 volumes. Shared data includes client files, application files for our webapp, log files, and configuration files. Storage is provided by 2x EqualLogic PS400E iSCSI SANs, each having 12 drives in a RAID50; the units are in a 'Group'.

The problem we are having is that periodically, maybe once a week or so, several Apache processes on a handful of nodes get stuck in D state and are unable to recover. This greatly increases server load, causes more Apache processes to back up, OCFS2 starts complaining about unresponsive nodes, and before you know it, the cluster is down.

This seems to occur most often when we are doing writes + reads; if it is just reads, the cluster hums along. However, when we need to update many files or remove lots of files (think temporary images) in addition to normal read activity, we have the above-mentioned problem.

We have done some searching and found
http://www.mail-archive.com/ocfs2-users at oss.oracle.com/msg05525.html
which describes a similar problem with write activity. In that case, the problem was allocating contiguous space on a fragmented filesystem, and the solution was to adjust the mount option 'localalloc'. We are wondering if we are in a similar position.

Below is the output from the stat_sysdir_analyze.sh script mentioned in the link above, which analyzes stat_sysdir.sh output; I've included the two volumes that seem to be our 'problem' volumes.

Volume 1:
bash stat_sysdir_analyze.sh sde1-client-20130715.txt

Number of clusters | Contiguous cluster size
--------------------------------------------
              4549 | 510 and smaller
              1825 | 511

Volume 2:
bash stat_sysdir_analyze.sh sdd1-data-20130715.txt

Number of clusters | Contiguous cluster size
--------------------------------------------
               175 | 510 and smaller
                23 | 511

Any evidence here of excessive fragmentation that tuning localalloc would help with?

Also regarding localalloc, I notice it is different for the above two volumes on many of the nodes; I find this interesting, as the cluster is supposed to make an educated guess at this value.
For instance:

/dev/sda1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
/dev/sde1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)

/dev/sdd1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=9,coherency=full,user_xattr,noacl)
/dev/sdb1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)

/dev/sda1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=11,coherency=full,user_xattr,noacl)
/dev/sdc1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=5,coherency=full,user_xattr,noacl)

/dev/sda1 on /u/client type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=6,coherency=full,user_xattr,noacl)
/dev/sdc1 on /u/data type ocfs2 (rw,relatime,_netdev,heartbeat=local,nointr,data=ordered,errors=remount-ro,atime_quantum=60,localalloc=7,coherency=full,user_xattr,noacl)

I'm not sure why the cluster would pick different values depending on the node.

Anyway, any opinions, advice, or tuning suggestions are greatly appreciated. This business of the cluster hanging is turning into quite a problem.

I'll provide any other information upon request.

Thanks,

Gavin W. Jones
Where 2 Get It, Inc.

--
"There has grown up in the minds of certain groups in this country the notion that because a man or corporation has made a profit out of the public for a number of years, the government and the courts are charged with the duty of guaranteeing such profit in the future, even in the face of changing circumstances and contrary to public interest. This strange doctrine is not supported by statute nor common law."

~Robert Heinlein
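For reference, the free-space picture behind the numbers above can also be pulled directly with debugfs.ocfs2, and the local alloc window can be pinned at mount time rather than left to the kernel's guess. A minimal sketch, assuming /dev/sde1 stands in for one of the volumes and 16 MB is only a placeholder value, not a recommendation:

# Global bitmap, i.e. the shared free-space pool; its chain records give a
# rough picture of how much free space each block group still holds
debugfs.ocfs2 -R "stat //global_bitmap" /dev/sde1

# Override the per-node local alloc window explicitly at mount time
# (value is in MB; 16 is a placeholder)
mount -t ocfs2 -o _netdev,localalloc=16 /dev/sde1 /u/data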
Srinivas Eeda
2013-Jul-16 00:32 UTC
[Ocfs2-users] OCFS2 tuning, fragmentation and localalloc option. Cluster hanging during mixed read+write workloads
I am not entirely sure about the significant slowdown and cluster outage, but from your description and the information you provided, you are seeing fragmentation-related issues.

What is the ocfs2/kernel version, and what are the cluster size and block size of these volumes?
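For reference, those details can be collected along these lines (the device name is an example, and the field labels are from memory and may differ by tools version):

uname -r                          # running kernel version
modinfo ocfs2                     # ocfs2 module details; look for the version/srcversion lines
debugfs.ocfs2 -R "stats" /dev/sde1 | grep -i bits    # Block Size Bits / Cluster Size Bits (sizes are 2^bits)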