Markus Fröhlich
2011-Jun-06 15:03 UTC
[Gluster-users] uninterruptible processes writing to glusterfs share
hi!
sometimes we have hanging uninterruptible processes on some client
servers ("ps aux" shows them in state "D"), and on one of them the
CPU I/O wait grows to 100% within a few minutes.
such processes cannot be killed - even "kill -9" doesn't work - and
when you attach to such a process via "strace", you won't see anything
and you cannot detach again.
there are only two ways out:
killing the glusterfs client process (unmounting the GFS share) or
rebooting the server.
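to illustrate, this is how such stuck processes and their kernel wait
point can be found (the PID is an example; reading /proc/<pid>/stack
needs root, and "echo w > /proc/sysrq-trigger" assumes sysrq is
enabled - it dumps all blocked tasks to dmesg):

  ps -eo pid,stat,wchan:30,cmd | awk '$2 ~ /^D/'
  cat /proc/4711/stack
  echo w > /proc/sysrq-trigger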
the only log entry I found was on one client - just a single line:
[2011-06-06 10:44:18.593211] I
[afr-common.c:581:afr_lookup_collect_xattr] 0-office-data-replicate-0:
data self-heal is pending for
/pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db.
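since afr keeps its pending self-heal flags in extended attributes on
the brick, the state of that file can also be inspected directly on the
gfs-server (brick path from the config below; the exact attribute
layout is version dependent, so take this as a sketch):

  getfattr -m trusted.afr -d -e hex \
      /GFS/office-data02/pc-partnerbet-public/Promotionaktionen/Mailakquise_2009/Webmaster_2010/HTML/bilder/Thumbs.db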
one of the client servers is a Samba server, the other a backup server
based on rsync with millions of small files.
gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0
and here are the configs from server and client:
server config
"/etc/glusterd/vols/office-data/office-data.gfs-01-01.GFS-office-data02.vol":
volume office-data-posix
    type storage/posix
    option directory /GFS/office-data02
end-volume

volume office-data-access-control
    type features/access-control
    subvolumes office-data-posix
end-volume

volume office-data-locks
    type features/locks
    subvolumes office-data-access-control
end-volume

volume office-data-io-threads
    type performance/io-threads
    subvolumes office-data-locks
end-volume

volume office-data-marker
    type features/marker
    option volume-uuid 3c6e633d-a0bb-4c52-8f05-a2db9bc9c659
    option timestamp-file /etc/glusterd/vols/office-data/marker.tstamp
    option xtime off
    option quota off
    subvolumes office-data-io-threads
end-volume

volume /GFS/office-data02
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes office-data-marker
end-volume

volume office-data-server
    type protocol/server
    option transport-type tcp
    option auth.addr./GFS/office-data02.allow *
    subvolumes /GFS/office-data02
end-volume
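these volfiles are generated by glusterd - the same setup can be
cross-checked from the CLI, assuming glusterd is running:

  gluster volume info office-data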
--------------
client config "/etc/glusterd/vols/office-data/office-data-fuse.vol":
volume office-data-client-0
    type protocol/client
    option remote-host gfs-01-01
    option remote-subvolume /GFS/office-data02
    option transport-type tcp
end-volume

volume office-data-replicate-0
    type cluster/replicate
    subvolumes office-data-client-0
end-volume

volume office-data-write-behind
    type performance/write-behind
    subvolumes office-data-replicate-0
end-volume

volume office-data-read-ahead
    type performance/read-ahead
    subvolumes office-data-write-behind
end-volume

volume office-data-io-cache
    type performance/io-cache
    subvolumes office-data-read-ahead
end-volume

volume office-data-quick-read
    type performance/quick-read
    subvolumes office-data-io-cache
end-volume

volume office-data-stat-prefetch
    type performance/stat-prefetch
    subvolumes office-data-quick-read
end-volume

volume office-data
    type debug/io-stats
    option latency-measurement off
    option count-fop-hits off
    subvolumes office-data-stat-prefetch
end-volume
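the clients mount the share the usual way, e.g. (the mountpoint is an
example):

  mount -t glusterfs gfs-01-01:/office-data /mnt/office-data

and if one of the performance translators is suspected, each can be
disabled in turn and the share remounted - the option names below are
the 3.2 CLI ones as far as I know, so treat them as an assumption:

  gluster volume set office-data performance.quick-read off
  gluster volume set office-data performance.stat-prefetch off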
--
Kind regards

Markus Fröhlich
Technician
________________________________________________________
Xidras GmbH
Stockern 47
3744 Stockern
Austria
Tel: +43 (0) 2983 201 30503
Fax: +43 (0) 2983 201 305039
Email: markus.froehlich at xidras.com
Web: http://www.xidras.com
FN 317036 f | Landesgericht Krems | ATU64485024
________________________________________________________________________________
CONFIDENTIAL!
This email contains confidential information and is intended for the
authorised recipient only. If you are not an authorised recipient,
please return the email to us and then delete it from your computer
and mail-server. You may neither use nor edit any such emails including
attachments, nor make them accessible to third parties in any manner
whatsoever.
Thank you for your cooperation.
________________________________________________________________________________
Mohit Anchlia
2011-Jun-06 19:13 UTC
[Gluster-users] uninterruptible processes writing to glusterfs share
Is there anything in the server logs? Does it follow any particular
pattern before going into this mode? Did you upgrade Gluster, or is
this a new install?
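For reference, the logs I'd look at first are typically these (the
exact file names are derived from the mountpoint and the brick path,
and the mountpoint here is a guess, so treat the paths as an
assumption):

  # on the client (assuming a mount at /mnt/office-data)
  tail -n 100 /var/log/glusterfs/mnt-office-data.log
  # on the server
  tail -n 100 /var/log/glusterfs/bricks/GFS-office-data02.log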
Tomasz Chmielewski
2011-Jun-07 07:55 UTC
[Gluster-users] uninterruptible processes writing to glusterfs share
On 06.06.2011 16:03, Markus Fröhlich wrote:
> sometimes we have hanging uninterruptible processes on some client
> servers ("ps aux" shows them in state "D"), and on one of them the
> CPU I/O wait grows to 100% within a few minutes.
>
> gfs-servers + gfs-clients: SLES11 x86_64, glusterfs V 3.2.0

I've seen such "hangs" as well with 3.2.0; 3.1.4 works fine.

--
Tomasz Chmielewski
http://wpkg.org