Hello!

I have searched for similar problems but have not found any relevant or helpful information.

I have a server with gluster-3.7.4 installed. It is currently configured as 3 pairs of 2 replicas on 6 local hard drives. In the current setup it writes log files from several network sources into a directory structure like process/server/month/hour.log. Periodically another process traverses the tree and compresses relatively old files.

The problem is that the server hangs at irregular intervals: it can work for 2 weeks without a problem, or it can hang several days in a row. It looks like an I/O hang, because the kernel still responds to pings and on the ssh port, but I cannot log in or do anything else.

I do not know how to debug this properly. Can somebody help?

Originally the bricks were located on an ext4 filesystem. I tried changing it to xfs, but that did not help. I have set up netconsole logging from the kernel; the file is attached.

Here is some additional information:

# gluster --version
glusterfs 3.7.4 built on Sep 19 2015 11:44:12
Repository revision: git://git.gluster.com/glusterfs.git
Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
GlusterFS comes with ABSOLUTELY NO WARRANTY.
You may redistribute copies of GlusterFS under the terms of the GNU General Public License.

# gluster volume info gv0
Volume Name: gv0
Type: Distributed-Replicate
Volume ID: b3167dd1-dbc1-48dd-8c8e-ca56a37f78a8
Status: Started
Number of Bricks: 3 x 2 = 6
Transport-type: tcp
Bricks:
Brick1: log0:/data/brick/b/gv0
Brick2: log0:/data/brick/a/gv0
Brick3: log0:/data/brick/d/gv0
Brick4: log0:/data/brick/c/gv0
Brick5: log0:/data/brick/e/gv0
Brick6: log0:/data/brick/f/gv0
Options Reconfigured:
performance.readdir-ahead: on
cluster.self-heal-daemon: enable

# gluster volume status gv0
Status of volume: gv0
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick log0:/data/brick/b/gv0                49163     0          Y       4385
Brick log0:/data/brick/a/gv0                49166     0          Y       4407
Brick log0:/data/brick/d/gv0                49164     0          Y       4397
Brick log0:/data/brick/c/gv0                49167     0          Y       4418
Brick log0:/data/brick/e/gv0                49165     0          Y       4379
Brick log0:/data/brick/f/gv0                49168     0          Y       4391
NFS Server on localhost                     N/A       N/A        N       N/A
Self-heal Daemon on localhost               N/A       N/A        Y       4358

Task Status of Volume gv0
------------------------------------------------------------------------------
There are no active volume tasks

# uname -a
Linux log0 4.1.4-hardened #1 SMP Fri Aug 14 10:32:50 MSK 2015 x86_64 Intel(R) Core(TM) i7-3770 CPU @ 3.40GHz GenuineIntel GNU/Linux

(It is a gentoo-hardened kernel.)

# cat /proc/mounts | grep /data/brick
/dev/sda4 /data/brick/a xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/sdb4 /data/brick/b xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/sdc4 /data/brick/c xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/sdd4 /data/brick/d xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/sde4 /data/brick/e xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/sdf4 /data/brick/f xfs rw,noatime,attr2,inode64,noquota 0 0
/dev/sdg4 /data/brick/g xfs rw,noatime,attr2,inode64,noquota 0 0

I can provide additional information if needed.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: logserver.log
Type: text/x-log
Size: 59545 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20151031/30c2c82d/attachment.bin>