Strahil Nikolov
2020-Mar-11 00:18 UTC
[Gluster-users] Erroneous "No space left on device." messages
On March 10, 2020 9:47:49 PM GMT+02:00, Pat Haley <phaley at mit.edu> wrote:
>Hi,
>
>If I understand this, to remove the "No space left on device" error I
>either have to clear up 10% space on each brick, or clean up a lesser
>amount and reset cluster.min-free. Is this correct?
>
>I have found the following command for resetting the cluster.min-free
>
>  * gluster volume set <volume> cluster.min-free-disk <value>
>
>Can this be done while the volume is live? Does the <value> need to be
>an integer?
>
>Thanks
>
>Pat
>
>On 3/10/20 2:45 PM, Pat Haley wrote:
>> Hi,
>>
>> I get the following
>>
>> [root at mseas-data2 bricks]# gluster volume get data-volume all | grep cluster.min-free
>> cluster.min-free-disk                   10%
>> cluster.min-free-inodes                 5%
>>
>> On 3/10/20 2:34 PM, Strahil Nikolov wrote:
>>> On March 10, 2020 8:14:41 PM GMT+02:00, Pat Haley <phaley at mit.edu> wrote:
>>>> Hi,
>>>>
>>>> After some more poking around in the logs (specifically the brick logs)
>>>>
>>>>  * brick1 & brick2 have both been recording "No space left on device"
>>>>    messages today (as recently as 15 minutes ago)
>>>>  * brick3 last recorded a "No space left on device" message last night
>>>>    around 10:30pm
>>>>  * brick4 has no such messages in its log file
>>>>
>>>> Note brick1 & brick2 are on one server, brick3 and brick4 are on the
>>>> second server.
>>>>
>>>> Pat
>>>>
>>>> On 3/10/20 11:51 AM, Pat Haley wrote:
>>>>> Hi,
>>>>>
>>>>> We have developed a problem with Gluster reporting "No space left on
>>>>> device." even though "df" of both the gluster filesystem and the
>>>>> underlying bricks show space available (details below). Our inode
>>>>> usage is between 1-3%. We are running gluster 3.7.11 in a distributed
>>>>> volume across 2 servers (2 bricks each). We have followed the thread
>>>>> https://lists.gluster.org/pipermail/gluster-users/2020-March/037821.html
>>>>> but haven't found a solution yet.
>>>>>
>>>>> Last night we ran a rebalance which appeared successful (and have
>>>>> since cleared up some more space which seems to have mainly been on
>>>>> one brick). There were intermittent erroneous "No space..." messages
>>>>> last night, but they have become much more frequent today.
>>>>>
>>>>> Any help would be greatly appreciated.
>>>>>
>>>>> Thanks
>>>>>
>>>>> ---------------------------
>>>>> [root at mseas-data2 ~]# df -h
>>>>> ---------------------------
>>>>> Filesystem      Size  Used Avail Use% Mounted on
>>>>> /dev/sdb        164T  164T  324G 100% /mnt/brick2
>>>>> /dev/sda        164T  164T  323G 100% /mnt/brick1
>>>>> ---------------------------
>>>>> [root at mseas-data2 ~]# df -i
>>>>> ---------------------------
>>>>> Filesystem         Inodes    IUsed      IFree IUse% Mounted on
>>>>> /dev/sdb       1375470800 31207165 1344263635    3% /mnt/brick2
>>>>> /dev/sda       1384781520 28706614 1356074906    3% /mnt/brick1
>>>>>
>>>>> ---------------------------
>>>>> [root at mseas-data3 ~]# df -h
>>>>> ---------------------------
>>>>> /dev/sda                       91T  91T  323G 100% /export/sda/brick3
>>>>> /dev/mapper/vg_Data4-lv_Data4  91T  88T  3.4T  97% /export/sdc/brick4
>>>>> ---------------------------
>>>>> [root at mseas-data3 ~]# df -i
>>>>> ---------------------------
>>>>> /dev/sda                       679323496  9822199  669501297   2% /export/sda/brick3
>>>>> /dev/mapper/vg_Data4-lv_Data4 3906272768 11467484 3894805284   1% /export/sdc/brick4
>>>>>
>>>>> ---------------------------------------
>>>>> [root at mseas-data2 ~]# gluster --version
>>>>> ---------------------------------------
>>>>> glusterfs 3.7.11 built on Apr 27 2016 14:09:22
>>>>> Repository revision: git://git.gluster.com/glusterfs.git
>>>>> Copyright (c) 2006-2011 Gluster Inc. <http://www.gluster.com>
>>>>> GlusterFS comes with ABSOLUTELY NO WARRANTY.
>>>>> You may redistribute copies of GlusterFS under the terms of the GNU
>>>>> General Public License.
>>>>>
>>>>> -----------------------------------------
>>>>> [root at mseas-data2 ~]# gluster volume info
>>>>> -----------------------------------------
>>>>> Volume Name: data-volume
>>>>> Type: Distribute
>>>>> Volume ID: c162161e-2a2d-4dac-b015-f31fd89ceb18
>>>>> Status: Started
>>>>> Number of Bricks: 4
>>>>> Transport-type: tcp
>>>>> Bricks:
>>>>> Brick1: mseas-data2:/mnt/brick1
>>>>> Brick2: mseas-data2:/mnt/brick2
>>>>> Brick3: mseas-data3:/export/sda/brick3
>>>>> Brick4: mseas-data3:/export/sdc/brick4
>>>>> Options Reconfigured:
>>>>> nfs.export-volumes: off
>>>>> nfs.disable: on
>>>>> performance.readdir-ahead: on
>>>>> diagnostics.brick-sys-log-level: WARNING
>>>>> nfs.exports-auth-enable: on
>>>>> server.allow-insecure: on
>>>>> auth.allow: *
>>>>> disperse.eager-lock: off
>>>>> performance.open-behind: off
>>>>> performance.md-cache-timeout: 60
>>>>> network.inode-lru-limit: 50000
>>>>> diagnostics.client-log-level: ERROR
>>>>>
>>>>> --------------------------------------------------------------
>>>>> [root at mseas-data2 ~]# gluster volume status data-volume detail
>>>>> --------------------------------------------------------------
>>>>> Status of volume: data-volume
>>>>> ------------------------------------------------------------------------------
>>>>> Brick                : Brick mseas-data2:/mnt/brick1
>>>>> TCP Port             : 49154
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 4601
>>>>> File System          : xfs
>>>>> Device               : /dev/sda
>>>>> Mount Options        : rw
>>>>> Inode Size           : 256
>>>>> Disk Space Free      : 318.8GB
>>>>> Total Disk Space     : 163.7TB
>>>>> Inode Count          : 1365878288
>>>>> Free Inodes          : 1337173596
>>>>> ------------------------------------------------------------------------------
>>>>> Brick                : Brick mseas-data2:/mnt/brick2
>>>>> TCP Port             : 49155
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 7949
>>>>> File System          : xfs
>>>>> Device               : /dev/sdb
>>>>> Mount Options        : rw
>>>>> Inode Size           : 256
>>>>> Disk Space Free      : 319.8GB
>>>>> Total Disk Space     : 163.7TB
>>>>> Inode Count          : 1372421408
>>>>> Free Inodes          : 1341219039
>>>>> ------------------------------------------------------------------------------
>>>>> Brick                : Brick mseas-data3:/export/sda/brick3
>>>>> TCP Port             : 49153
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 4650
>>>>> File System          : xfs
>>>>> Device               : /dev/sda
>>>>> Mount Options        : rw
>>>>> Inode Size           : 512
>>>>> Disk Space Free      : 325.3GB
>>>>> Total Disk Space     : 91.0TB
>>>>> Inode Count          : 692001992
>>>>> Free Inodes          : 682188893
>>>>> ------------------------------------------------------------------------------
>>>>> Brick                : Brick mseas-data3:/export/sdc/brick4
>>>>> TCP Port             : 49154
>>>>> RDMA Port            : 0
>>>>> Online               : Y
>>>>> Pid                  : 23772
>>>>> File System          : xfs
>>>>> Device               : /dev/mapper/vg_Data4-lv_Data4
>>>>> Mount Options        : rw
>>>>> Inode Size           : 256
>>>>> Disk Space Free      : 3.4TB
>>>>> Total Disk Space     : 90.9TB
>>>>> Inode Count          : 3906272768
>>>>> Free Inodes          : 3894809903
>>>>>
>>> Hi Pat,
>>>
>>> What is the output of:
>>> gluster volume get data-volume all | grep cluster.min-free
>>>
>>> 1% of 164 T is 1640G, but in your case you have only 324G, which is
>>> way lower.
>>>
>>> Best Regards,
>>> Strahil Nikolov

Hey Pat,

Some users have reported they are using a value of 1% and it seems to be working.

Most probably you will be able to do it live, but I have never had to change that. You can give it a try on a test cluster.

Best Regards,
Strahil Nikolov
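For reference, the threshold behind these errors is cluster.min-free-disk: at its default of 10%, roughly 16.4T of each 164T brick is treated as reserved, so the 323-324G actually free is far below the cutoff and DHT can treat those bricks as full even though df still shows space. A minimal sketch of the commands being discussed, using the data-volume name from this thread and the 1% value mentioned above as an example; gluster volume set applies options to a running volume, but as suggested above it is worth trying on a test cluster first:

  # Check the current reserve threshold
  gluster volume get data-volume cluster.min-free-disk

  # Lower the reserve, e.g. to 1%, on the live volume
  gluster volume set data-volume cluster.min-free-disk 1%

  # Confirm the change took effect
  gluster volume get data-volume cluster.min-free-disk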
Pat Haley
2020-Mar-11 14:27 UTC
[Gluster-users] Erroneous "No space left on device." messages
Hi,

I was able to successfully reset cluster.min-free-disk. That only made the "No space left on device" problem intermittent instead of constant. I then looked at the brick log files again and noticed "No space ..." errors recorded for files that I knew nobody was accessing. gluster volume status was also reporting an ongoing rebalance (but not with the same ID as the one I started on Monday). I stopped that rebalance and I no longer seem to be getting the "No space left on device" messages.

However, I now have a new curious issue. I have at least one file, created after resetting cluster.min-free-disk but before shutting down the rebalance, that does not show up in a plain "ls" but does show up if I explicitly ls that file (example below; the file in question is PeManJob). This semi-missing file is located on brick1 (one of the two bricks that were giving the "No space left on device" messages). How do I fix this new issue?

Thanks

Pat

mseas(DSMccfzR75deg_001b)% ls
at_pe_job                                    pe_nrg.nc
check_times_job                              pe_out.nc
HoldJob                                      pe_PBI.in
oi_3hr.dat                                   PePbiJob
PE_Data_Comparison_glider_all_smalldom.m     pe_PB.in
PE_Data_Comparison_glider_sp011_smalldom.m   pe_PB.log
PE_Data_Comparison_glider_sp064_smalldom.m   pe_PB_short.in
PeManJob.log                                 PlotJob

mseas(DSMccfzR75deg_001b)% ls PeManJob
PeManJob

mseas(DSMccfzR75deg_001b)% ls PeManJob*
PeManJob.log

On 3/10/20 8:18 PM, Strahil Nikolov wrote:
> Hey Pat,
>
> Some users have reported they are using a value of 1% and it seems to be working.
>
> Most probably you will be able to do it live, but I have never had to change that. You can give it a try on a test cluster.
>
> Best Regards,
> Strahil Nikolov

--
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Pat Haley                          Email:  phaley at mit.edu
Center for Ocean Engineering       Phone:  (617) 253-6824
Dept. of Mechanical Engineering    Fax:    (617) 253-8125
MIT, Room 5-213                    http://web.mit.edu/phaley/www/
77 Massachusetts Avenue
Cambridge, MA  02139-4301
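On the semi-missing PeManJob file: in a distribute volume, DHT keeps zero-byte "link-to" pointer files (sticky bit set, carrying the trusted.glusterfs.dht.linkto extended attribute) on the brick a filename hashes to when the data actually lives on another brick, and it hides those pointers from directory listings. A file that a rebalance left in an inconsistent state can therefore be invisible to a plain ls while still resolving when named explicitly. The thread itself does not give a fix; what follows is only a sketch of checks sometimes used to narrow such a problem down, and the brick-side and mount-side paths are placeholders:

  # Confirm no rebalance is still running against the volume (stop it if one is)
  gluster volume rebalance data-volume status
  gluster volume rebalance data-volume stop

  # On the server holding brick1, look at the file as it exists on the brick.
  # A 0-byte entry with mode ---------T and a trusted.glusterfs.dht.linkto
  # attribute is only a DHT pointer, not the real data.
  ls -l /mnt/brick1/<path-to-dir>/PeManJob
  getfattr -d -m . -e hex /mnt/brick1/<path-to-dir>/PeManJob

  # From a FUSE client, force a named lookup of the file and its parent
  # directory so DHT revalidates where the file really lives.
  stat /<gluster-mount>/<path-to-dir>/PeManJob
  stat /<gluster-mount>/<path-to-dir>

If the copy on brick1 turns out to be only a link-to pointer, the data file should exist under the same relative path on one of the other bricks; the value of the linkto attribute names the subvolume that holds it.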