Gerald Brandt
2012-Apr-27 15:09 UTC
[Gluster-users] Gluster-users Digest, Vol 48, Issue 43
Hi,

You are having the exact same problem I am. So far, no response from anyone
at Gluster/Red Hat as to what is happening or whether this is a known issue.

Gerald

----- Original Message -----
> From: "anthony garnier" <sokar6012 at hotmail.com>
> To: gluster-users at gluster.org, gbr at majentis.com
> Sent: Friday, April 27, 2012 9:56:31 AM
> Subject: RE: Gluster-users Digest, Vol 48, Issue 43
>
> Gerald,
>
> I'm currently using 3.2.5 under SLES 11. Yes, my client is using NFS
> to connect to the server.
>
> But once again the FD seems to stay open on both replicas:
>
> glusterfs 7002 root 22u REG 253,13 13499473920 106 /users3/poolsave/yval9000/test/tmp/27-04-16-00_test_glusterfs_save.tar (deleted)
>
> I checked in my script log:
>
> ....
> ..
> .
> /users2/splunk/var/
> /users2/splunk/var/log/
> /users2/splunk/var/log/splunk/
> Creation of file /users98/test/tmp/27-04-16-00_test_glusterfs_save.tar
> moving previous file in /users98/test/gluster  <= seems to be something related to when the file is moved
>
> And then:
>
> # cat netback_27-04-16-30.log
> Simulation netback, tar in /dev/null
> tar: Removing leading `/' from member names
> /users98/test/gluster/27-04-16-00_test_glusterfs_save.tar
> deleting archive 27-04-16-00_test_glusterfs_save.tar
>
> Maybe we should open a bug if we don't get an answer?
>
> Anthony
>
> > Message: 1
> > Date: Fri, 27 Apr 2012 08:07:33 -0500 (CDT)
> > From: Gerald Brandt <gbr at majentis.com>
> > Subject: Re: [Gluster-users] remote operation failed: No space left on device
> > To: anthony garnier <sokar6012 at hotmail.com>
> > Cc: gluster-users at gluster.org
> > Message-ID: <2490900.366.1335532031778.JavaMail.gbr at thinkpad>
> > Content-Type: text/plain; charset=utf-8
> >
> > Anthony,
> >
> > I have the exact same issue with GlusterFS 3.2.5 under Ubuntu 10.04.
> > I haven't got an answer yet on what is happening.
> >
> > Are you using the NFS server in GlusterFS?
> >
> > Gerald
> >
> > ----- Original Message -----
> > > From: "anthony garnier" <sokar6012 at hotmail.com>
> > > To: gluster-users at gluster.org
> > > Sent: Friday, April 27, 2012 7:41:16 AM
> > > Subject: Re: [Gluster-users] remote operation failed: No space left on device
> > >
> > > After some research, I found what's going on: it seems that the
> > > glusterfs process has a lot of FDs open on each brick:
> > >
> > > lsof -n -P | grep deleted
> > > ksh 644 cdui 2w REG 253,10 952 190 /users/mmr00/log/2704mmr.log (deleted)
> > > glusterfs 2357 root 10u REG 253,6 0 28 /tmp/tmpfnMwo2I (deleted)
> > > glusterfs 2361 root 10u REG 253,6 0 35 /tmp/tmpfnBlK2I (deleted)
> > > glusterfs 2365 root 10u REG 253,6 0 41 /tmp/tmpfHZG51I (deleted)
> > > glusterfs 2365 root 12u REG 253,6 1011 13 /tmp/tmpfPGJjje (deleted)
> > > glusterfs 2365 root 13u REG 253,6 1013 20 /tmp/tmpf4ITi6m (deleted)
> > > glusterfs 2365 root 17u REG 253,6 1012 25 /tmp/tmpfBwwE1h (deleted)
> > > glusterfs 2365 root 18u REG 253,6 1011 43 /tmp/tmpfsoNSmV (deleted)
> > > glusterfs 2365 root 19u REG 253,6 1011 19 /tmp/tmpfDmMruu (deleted)
> > > glusterfs 2365 root 21u REG 253,6 1012 47 /tmp/tmpfE4SpVM (deleted)
> > > glusterfs 2365 root 22u REG 253,6 1012 48 /tmp/tmpfHjjdXw (deleted)
> > > glusterfs 2365 root 24u REG 253,6 1011 49 /tmp/tmpfwOoX6F (deleted)
> > > glusterfs 2365 root 26u REG 253,13 13509969920 1829 /users3/poolsave/yval9000/test/tmp/23-04-18-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 27u REG 253,13 13538048000 1842 /users3/poolsave/yval9000/test/tmp/24-04-07-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 28u REG 253,13 13607956480 1737 /users3/poolsave/yval9000/test/tmp/22-04-01-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 30u REG 253,13 13519441920 1337 /users3/poolsave/yval9000/test/tmp/16-04-14-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 31u REG 253,13 13530081280 1342 /users3/poolsave/yval9000/test/tmp/16-04-16-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 34u REG 253,13 13559777280 1347 /users3/poolsave/yval9000/test/tmp/16-04-20-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 35u REG 253,13 13581772800 1352 /users3/poolsave/yval9000/test/tmp/16-04-23-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 36u REG 253,13 13513922560 1357 /users3/poolsave/yval9000/test/tmp/17-04-04-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 37u REG 253,13 13513338880 1362 /users3/poolsave/yval9000/test/tmp/17-04-07-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 38u REG 253,13 13520199680 1367 /users3/poolsave/yval9000/test/tmp/17-04-08-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 39u REG 253,13 13576509440 1372 /users3/poolsave/yval9000/test/tmp/18-04-00-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 40u REG 253,13 13591869440 1377 /users3/poolsave/yval9000/test/tmp/18-04-02-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 41u REG 253,13 13592371200 1382 /users3/poolsave/yval9000/test/tmp/18-04-04-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 42u REG 253,13 13512202240 1387 /users3/poolsave/yval9000/test/tmp/18-04-11-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 43u REG 253,13 13528012800 1392 /users3/poolsave/yval9000/test/tmp/18-04-19-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 44u REG 253,13 13547735040 1397 /users3/poolsave/yval9000/test/tmp/18-04-22-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 45u REG 253,13 13574236160 1402 /users3/poolsave/yval9000/test/tmp/19-04-03-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 46u REG 253,13 13553295360 1407 /users3/poolsave/yval9000/test/tmp/19-04-05-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 47u REG 253,13 13542963200 1416 /users3/poolsave/yval9000/test/tmp/19-04-13-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 48u REG 253,13 13556346880 1421 /users3/poolsave/yval9000/test/tmp/19-04-15-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 49u REG 253,13 13517445120 1426 /users3/poolsave/yval9000/test/tmp/19-04-16-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 50u REG 253,13 13515581440 1443 /users3/poolsave/yval9000/test/tmp/19-04-17-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 51u REG 253,13 13575065600 1448 /users3/poolsave/yval9000/test/tmp/20-04-05-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 52u REG 253,13 13578659840 1453 /users3/poolsave/yval9000/test/tmp/20-04-06-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 53u REG 253,13 13575741440 1458 /users3/poolsave/yval9000/test/tmp/20-04-10-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 54u REG 253,13 13580052480 1463 /users3/poolsave/yval9000/test/tmp/20-04-11-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 55u REG 253,13 13578004480 1468 /users3/poolsave/yval9000/test/tmp/20-04-15-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 56u REG 253,13 13597194240 1473 /users3/poolsave/yval9000/test/tmp/20-04-18-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 57u REG 253,13 13558159360 1478 /users3/poolsave/yval9000/test/tmp/20-04-20-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 58u REG 253,13 13583390720 1483 /users3/poolsave/yval9000/test/tmp/21-04-01-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 59u REG 253,13 13542072320 1504 /users3/poolsave/yval9000/test/tmp/21-04-10-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 60u REG 253,13 13563269120 1509 /users3/poolsave/yval9000/test/tmp/21-04-15-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 61u REG 253,13 13565573120 1514 /users3/poolsave/yval9000/test/tmp/21-04-16-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 62u REG 253,13 13570037760 1519 /users3/poolsave/yval9000/test/tmp/21-04-17-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 63u REG 253,13 13576038400 1711 /users3/poolsave/yval9000/test/tmp/21-04-18-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 64u REG 253,13 13599846400 1724 /users3/poolsave/yval9000/test/tmp/22-04-00-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 65u REG 253,13 13554862080 1746 /users3/poolsave/yval9000/test/tmp/22-04-03-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 67u REG 253,13 13537761280 1759 /users3/poolsave/yval9000/test/tmp/22-04-16-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 68u REG 253,13 13539768320 1772 /users3/poolsave/yval9000/test/tmp/22-04-18-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 69u REG 253,13 13539614720 1781 /users3/poolsave/yval9000/test/tmp/23-04-02-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 70u REG 253,13 13544693760 1794 /users3/poolsave/yval9000/test/tmp/23-04-05-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 71u REG 253,13 13488281600 1807 /users3/poolsave/yval9000/test/tmp/23-04-10-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 72u REG 253,13 13504133120 1816 /users3/poolsave/yval9000/test/tmp/23-04-11-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 73u REG 253,13 13488353280 1851 /users3/poolsave/yval9000/test/tmp/24-04-14-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 74u REG 253,13 13647964160 1864 /users3/poolsave/yval9000/test/tmp/24-04-15-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 75u REG 253,13 14598379520 1877 /users3/poolsave/yval9000/test/tmp/24-04-16-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 76u REG 253,13 13529159680 1886 /users3/poolsave/yval9000/test/tmp/25-04-02-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 77u REG 253,13 13544314880 1899 /users3/poolsave/yval9000/test/tmp/25-04-14-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 78u REG 253,13 13547100160 1912 /users3/poolsave/yval9000/test/tmp/25-04-15-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 79u REG 253,13 13549813760 1921 /users3/poolsave/yval9000/test/tmp/25-04-17-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 80u REG 253,13 13538140160 1934 /users3/poolsave/yval9000/test/tmp/25-04-23-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 81u REG 253,13 13538641920 1947 /users3/poolsave/yval9000/test/tmp/26-04-16-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 82u REG 253,13 13543915520 1956 /users3/poolsave/yval9000/test/tmp/26-04-17-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 83u REG 253,13 13551063040 1969 /users3/poolsave/yval9000/test/tmp/26-04-18-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 84u REG 253,13 13561364480 1982 /users3/poolsave/yval9000/test/tmp/26-04-20-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 85u REG 253,13 13486448640 1991 /users3/poolsave/yval9000/test/tmp/27-04-07-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 86u REG 253,13 13492623360 2004 /users3/poolsave/yval9000/test/tmp/27-04-08-00_test_glusterfs_save.tar (deleted)
> > > glusterfs 2365 root 87u REG 253,13 8968806400 2017 /users3/poolsave/yval9000/test/tmp/27-04-09-00_test_glusterfs_save.tar (deleted)
> > > ksh 18908 u347750 3u REG 253,6 28 45 /tmp/ast3c.bh3 (deleted)
> > >
> > > Each tar archive is around 15 GB.
> > >
> > > Restarting the daemon seems to solve the problem:
> > >
> > > ylal3510:/users3 # du -s -h .
> > > 129G .
> > > ylal3510:/users3 # df -h .
> > > Filesystem                   Size  Used Avail Use% Mounted on
> > > /dev/mapper/users-users3vol  858G  130G  729G  16% /users3
> > >
> > > But my main question is: is this normal behavior for GlusterFS?
> > >
> > > From: sokar6012 at hotmail.com
> > > To: gluster-users-request at gluster.org
> > > Subject: remote operation failed: No space left on device
> > > Date: Fri, 27 Apr 2012 11:41:01 +0000
> > >
> > > Hi all,
> > >
> > > I've got an issue: it seems that the size reported by df -h grows
> > > indefinitely. Any help would be appreciated.
> > >
> > > Some details. On the client:
> > >
> > > yval9000:/users98 # df -h .
> > > Filesystem                   Size  Used Avail Use% Mounted on
> > > ylal3510:/poolsave/yval9000  1.7T  1.7T   25G  99% /users98
> > >
> > > yval9000:/users98 # du -ch .
> > > 5.1G /users98
> > >
> > > My logs are full of:
> > >
> > > [2012-04-27 12:14:32.402972] I [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-1: remote operation failed: No space left on device
> > > [2012-04-27 12:14:32.426964] I [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-1: remote operation failed: No space left on device
> > > [2012-04-27 12:14:32.439424] I [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-1: remote operation failed: No space left on device
> > > [2012-04-27 12:14:32.441505] I [client3_1-fops.c:683:client3_1_writev_cbk] 0-poolsave-client-0: remote operation failed: No space left on device
> > >
> > > This is my volume config:
> > >
> > > Volume Name: poolsave
> > > Type: Distributed-Replicate
> > > Status: Started
> > > Number of Bricks: 2 x 2 = 4
> > > Transport-type: tcp
> > > Bricks:
> > > Brick1: ylal3510:/users3/poolsave
> > > Brick2: ylal3530:/users3/poolsave
> > > Brick3: ylal3520:/users3/poolsave
> > > Brick4: ylal3540:/users3/poolsave
> > > Options Reconfigured:
> > > nfs.enable-ino32: off
> > > features.quota-timeout: 30
> > > features.quota: off
> > > performance.cache-size: 6GB
> > > network.ping-timeout: 60
> > > performance.cache-min-file-size: 1KB
> > > performance.cache-max-file-size: 4GB
> > > performance.cache-refresh-timeout: 2
> > > nfs.port: 2049
> > > performance.io-thread-count: 64
> > > diagnostics.latency-measurement: on
> > > diagnostics.count-fop-hits: on
> > >
> > > Space left on the servers:
> > >
> > > ylal3510:/users3 # df -h .
> > > Filesystem                   Size  Used Avail Use% Mounted on
> > > /dev/mapper/users-users3vol  858G  857G  1.1G 100% /users3
> > > ylal3510:/users3 # du -ch /users3 | grep total
> > > 129G total
> > > ---
> > > ylal3530:/users3 # df -h .
> > > Filesystem                   Size  Used Avail Use% Mounted on
> > > /dev/mapper/users-users3vol  858G  857G  1.1G 100% /users3
> > > ylal3530:/users3 # du -ch /users3 | grep total
> > > 129G total
> > > ---
> > > ylal3520:/users3 # df -h .
> > > Filesystem                   Size  Used Avail Use% Mounted on
> > > /dev/mapper/users-users3vol  858G  835G   24G  98% /users3
> > > ylal3520:/users3 # du -ch /users3 | grep total
> > > 182G total
> > > ---
> > > ylal3540:/users3 # df -h .
> > > Filesystem                   Size  Used Avail Use% Mounted on
> > > /dev/mapper/users-users3vol  858G  833G   25G  98% /users3
> > > ylal3540:/users3 # du -ch /users3 | grep total
> > > 181G total
> > >
> > > This issue appeared after the following two scripts had been running
> > > for two weeks. test_save.sh runs every hour: it tars up a batch of
> > > data (into REP_SAVE_TEMP) and then moves the archive into a folder
> > > (REP_SAVE) that the netback.sh script scans every 30 minutes.
> > >
> > > #!/usr/bin/ksh
> > > # ______________________________________________________________________
> > > #              |
> > > # Name         | test_save.sh
> > > # _____________|________________________________________________________
> > > #              |
> > > # Description  | GlusterFS test
> > > # _____________|________________________________________________________
> > >
> > > UNIXSAVE=/users98/test
> > > REP_SAVE_TEMP=${UNIXSAVE}/tmp
> > > REP_SAVE=${UNIXSAVE}/gluster
> > > LOG=/users/glusterfs_test
> > >
> > > f_tar_mv()
> > > {
> > >     echo "\n"
> > >     ARCHNAME=${REP_SAVE_TEMP}/`date +%d-%m-%H-%M`_${SUBNAME}.tar
> > >
> > >     tar -cpvf ${ARCHNAME} ${REPERTOIRE}
> > >     echo "creation of ${ARCHNAME}"
> > >
> > >     # mv ${REP_SAVE_TEMP}/*_${SUBNAME}.tar ${REP_SAVE}
> > >     mv ${REP_SAVE_TEMP}/* ${REP_SAVE}
> > >     echo "Moving archive into ${REP_SAVE}"
> > >     echo "\n"
> > >
> > >     return $?
> > > }
> > >
> > > REPERTOIRE="/users2/"
> > > SUBNAME="test_glusterfs_save"
> > > f_tar_mv >$LOG/save_`date +%d-%m-%Y-%H-%M`.log 2>&1
> > >
> > > #!/usr/bin/ksh
> > > # ______________________________________________________________________
> > > #              |
> > > # Name         | netback.sh
> > > # _____________|________________________________________________________
> > > #              |
> > > # Description  | GlusterFS backup test
> > > # _____________|________________________________________________________
> > >
> > > UNIXSAVE=/users98/test
> > > REP_SAVE_TEMP=${UNIXSAVE}/tmp
> > > REP_SAVE=${UNIXSAVE}/gluster
> > > LOG=/users/glusterfs_test
> > >
> > > f_net_back()
> > > {
> > >     if [[ `find ${REP_SAVE} -type f | wc -l` -eq 0 ]]
> > >     then
> > >         echo "nothing to save"
> > >     else
> > >         echo "Simulation netbackup, tar in /dev/null"
> > >         tar -cpvf /dev/null ${REP_SAVE}/*
> > >         echo "deletion archive"
> > >         rm ${REP_SAVE}/*
> > >     fi
> > >     return $?
> > > }
> > >
> > > f_net_back >${LOG}/netback_`date +%d-%m-%H-%M`.log 2>&1
> > > *********************************************
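For anyone hitting this: a quick way to confirm that the "missing" space is
pinned by deleted-but-still-open files is to total the sizes lsof reports on
each brick server. A rough sketch, assuming a POSIX awk and lsof's default
column layout (COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME, so field 7
is the file size in bytes for regular files):

# Sum the space held by files glusterfs still has open after unlink.
lsof -n -P | awk '/^glusterfs/ && /\(deleted\)/ { sum += $7 }
    END { printf("%.1f GB pinned by deleted files\n", sum / (1024 * 1024 * 1024)) }'

If that total comes close to the gap between df and du on the brick (on
ylal3510 above, roughly 857G used per df minus 129G per du), the leaked
descriptors account for the discrepancy.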
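The trigger also reduces to a small create/move/delete cycle over the NFS
mount, so it is easy to check whether a given build leaks the descriptor.
A minimal reproduction sketch, with hypothetical paths (MNT stands for the
NFS-mounted volume, ylal3510 for one brick server, and dd stands in for the
tar step since only the write/move/delete pattern matters):

#!/usr/bin/ksh
# Reproduction sketch (hypothetical paths): write a large file over the
# NFS mount, move it within the mount, then delete it.
MNT=/users98/test
dd if=/dev/zero of=${MNT}/tmp/leaktest.tar bs=1024k count=1024
mv ${MNT}/tmp/leaktest.tar ${MNT}/gluster/leaktest.tar
rm ${MNT}/gluster/leaktest.tar
# Then, on a brick server, a "(deleted)" entry means the fd leaked:
# ssh ylal3510 "lsof -n -P | grep 'leaktest.tar (deleted)'"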
Amar Tumballi
2012-Apr-30 16:14 UTC
[Gluster-users] Gluster-users Digest, Vol 48, Issue 43
> You are having the exact same problem I am. So far, no response from
> anyone at Gluster/Red Hat as to what is happening or whether this is a
> known issue.

Hi Gerald/Anthony,

This issue is not easy to handle in the 3.2.x version of Gluster's NFS
server. It is being addressed in the 3.3.x branch (i.e., the current master
branch); please try 3.3.0beta3 or later, or qa36 or later, to test the
behavior.

This happens because the NFS process works on FHs (file handles), and for
that we needed to keep an fd reference for as long as the NFS client held a
reference to the file handle. With 3.3.0 we changed some of the internal
handling of NFS FHs, so this problem should not occur in the 3.3.0 release.

Regards,
Amar
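Until an upgrade to 3.3.0 is possible, the workaround both reporters found,
restarting the Gluster NFS server so the kernel closes the leaked
descriptors, can be scripted. A sketch for 3.2.x: the pid-file path below is
an assumption (verify it under your glusterd working directory), glusterd is
assumed to respawn the NFS server on restart, and NFS clients will see a
short interruption:

#!/usr/bin/ksh
# Workaround sketch for 3.2.x: restart the Gluster NFS server process so
# its leaked fds are closed and the pinned space is reclaimed.
# ASSUMPTION: pid-file location varies by install; check your glusterd dir.
NFS_PIDFILE=/etc/glusterd/nfs/run/nfs.pid
if [[ -f ${NFS_PIDFILE} ]]
then
    kill `cat ${NFS_PIDFILE}`
fi
# glusterd is expected to respawn the NFS server when it restarts
# (use your distribution's service command if this init script differs)
/etc/init.d/glusterd restart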