Jose V. Carrión
2018-Mar-01 09:39 UTC
[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
Hi Nithya, Below the output of both volumes: [root at stor1t ~]# gluster volume rebalance volumedisk1 status Node Rebalanced-files size scanned failures skipped status run time in h:m:s --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 703964 16384.0PB 1475983 0 0 completed 64:37:55 stor2data 704610 16384.0PB 1475199 0 0 completed 64:31:30 stor3data 703964 16384.0PB 1475983 0 0 completed 64:37:55 volume rebalance: volumedisk1: success [root at stor1 ~]# gluster volume rebalance volumedisk0 status Node Rebalanced-files size scanned failures skipped status run time in h:m:s --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 411919 1.1GB 718044 0 0 completed 2:28:52 stor2data 435340 16384.0PB 741287 0 0 completed 2:26:01 stor3data 411919 1.1GB 718044 0 0 completed 2:28:52 volume rebalance: volumedisk0: success And volumedisk1 rebalance logs finished saying: [2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance is completed. Time taken is 232675.00 secs [2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, skipped: 0 Checking my logs the new stor3node and the rebalance task was executed on 2018-02-10 . From this date to now I have been storing new files. The sequence of commands to add the node was: gluster peer probe stor3data gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 2018-03-01 6:32 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:> Hi Jose, > > On 28 February 2018 at 22:31, Jose V. Carri?n <jocarbur at gmail.com> wrote: > >> Hi Nithya, >> >> My initial setup was composed of 2 similar nodes: stor1data and >> stor2data. A month ago I expanded both volumes with a new node: stor3data >> (2 bricks per volume). >> Of course, then to add the new peer with the bricks I did the 'balance >> force' operation. This task finished successfully (you can see info below) >> and number of files on the 3 nodes were very similar . >> >> For volumedisk1 I only have files of 500MB and they are continuosly >> written in sequential mode. The filename pattern of written files is: >> >> run.node1.0000.rd >> run.node2.0000.rd >> run.node1.0001.rd >> run.node2.0001.rd >> run.node1.0002.rd >> run.node2.0002.rd >> ........... >> ........... 
>> run.node1.X.rd >> run.node2.X.rd >> >> ( X ranging from 0000 to infinite ) >> >> Curiously stor1data and stor2data maintain similar ratios in bytes: >> >> Filesystem 1K-blocks Used Available >> Use% Mounted on >> /dev/sdc1 52737613824 17079174264 <(707)%20917-4264> >> 35658439560 33% /mnt/glusterfs/vol1 -> stor1data >> /dev/sdc1 52737613824 17118810848 35618802976 33% >> /mnt/glusterfs/vol1 -> stor2data >> >> However the ratio on som3data differs too much (1TB): >> Filesystem 1K-blocks Used Available >> Use% Mounted on >> /dev/sdc1 52737613824 15479191748 37258422076 30% >> /mnt/disk_c/glusterfs/vol1 -> stor3data >> /dev/sdd1 52737613824 15566398604 37171215220 30% >> /mnt/disk_d/glusterfs/vol1 -> stor3data >> >> Thinking in inodes: >> >> Filesystem Inodes IUsed IFree IUse% >> Mounted on >> /dev/sdc1 5273970048 851053 5273118995 1% >> /mnt/glusterfs/vol1 -> stor1data >> /dev/sdc1 5273970048 849388 5273120660 1% >> /mnt/glusterfs/vol1 -> stor2data >> >> /dev/sdc1 5273970048 846877 5273123171 1% >> /mnt/disk_c/glusterfs/vol1 -> stor3data >> /dev/sdd1 5273970048 845250 5273124798 1% >> /mnt/disk_d/glusterfs/vol1 -> stor3data >> >> 851053 (stor1) - 845250 (stor3) = 5803 files of difference ! >> > > The inode numbers are a little misleading here - gluster uses some to > create its own internal files and directory structures. Based on the > average file size, I think this would actually work out to a difference of > around 2000 files. > > >> >> In adition, correct me if I'm wrong, stor3data should have 50% of >> probability to store a new file (even taking into account the algorithm of >> DHT with filename patterns) >> >> Theoretically yes , but again, it depends on the filenames and their hash > distribution. > > Please send us the output of : > gluster volume rebalance <volname> status > > for the volume. > > Regards, > Nithya > > >> Thanks, >> Greetings. >> >> Jose V. 
>> >> Status of volume: volumedisk0 >> Gluster process TCP Port RDMA Port Online >> Pid >> ------------------------------------------------------------ >> ------------------ >> Brick stor1data:/mnt/glusterfs/vol0/bri >> ck1 49152 0 Y >> 13533 >> Brick stor2data:/mnt/glusterfs/vol0/bri >> ck1 49152 0 Y >> 13302 >> Brick stor3data:/mnt/disk_b1/glusterfs/ >> vol0/brick1 49152 0 Y >> 17371 >> Brick stor3data:/mnt/disk_b2/glusterfs/ >> vol0/brick1 49153 0 Y >> 17391 >> NFS Server on localhost N/A N/A N >> N/A >> NFS Server on stor3data N/A N/A N N/A >> NFS Server on stor2data N/A N/A N N/A >> >> Task Status of Volume volumedisk0 >> ------------------------------------------------------------ >> ------------------ >> Task : Rebalance >> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca >> Status : completed >> >> Status of volume: volumedisk1 >> Gluster process TCP Port RDMA Port Online >> Pid >> ------------------------------------------------------------ >> ------------------ >> Brick stor1data:/mnt/glusterfs/vol1/bri >> ck1 49153 0 Y >> 13579 >> Brick stor2data:/mnt/glusterfs/vol1/bri >> ck1 49153 0 Y >> 13344 >> Brick stor3data:/mnt/disk_c/glusterfs/v >> ol1/brick1 49154 0 Y >> 17439 >> Brick stor3data:/mnt/disk_d/glusterfs/v >> ol1/brick1 49155 0 Y >> 17459 >> NFS Server on localhost N/A N/A N >> N/A >> NFS Server on stor3data N/A N/A N N/A >> NFS Server on stor2data N/A N/A N N/A >> >> Task Status of Volume volumedisk1 >> ------------------------------------------------------------ >> ------------------ >> Task : Rebalance >> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3 >> Status : completed >> >> >> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >> >>> Hi Jose, >>> >>> On 28 February 2018 at 18:28, Jose V. Carri?n <jocarbur at gmail.com> >>> wrote: >>> >>>> Hi Nithya, >>>> >>>> I applied the workarround for this bug and now df shows the right size: >>>> >>>> That is good to hear. >>> >>> >>> >>>> [root at stor1 ~]# df -h >>>> Filesystem Size Used Avail Use% Mounted on >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> stor1data:/volumedisk0 >>>> 101T 3,3T 97T 4% /volumedisk0 >>>> stor1data:/volumedisk1 >>>> 197T 61T 136T 31% /volumedisk1 >>>> >>>> >>>> [root at stor2 ~]# df -h >>>> Filesystem Size Used Avail Use% Mounted on >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> stor2data:/volumedisk0 >>>> 101T 3,3T 97T 4% /volumedisk0 >>>> stor2data:/volumedisk1 >>>> 197T 61T 136T 31% /volumedisk1 >>>> >>>> >>>> [root at stor3 ~]# df -h >>>> Filesystem Size Used Avail Use% Mounted on >>>> /dev/sdb1 25T 638G 24T 3% /mnt/disk_b1/glusterfs/vol0 >>>> /dev/sdb2 25T 654G 24T 3% /mnt/disk_b2/glusterfs/vol0 >>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>> stor3data:/volumedisk0 >>>> 101T 3,3T 97T 4% /volumedisk0 >>>> stor3data:/volumedisk1 >>>> 197T 61T 136T 31% /volumedisk1 >>>> >>>> >>>> However I'm concerned because, as you can see, the volumedisk0 on >>>> stor3data is composed by 2 bricks on thesame disk but on different >>>> partitions (/dev/sdb1 and /dev/sdb2). >>>> After to aplly the workarround, the shared-brick-count parameter was >>>> setted to 1 in all the bricks and all the servers (see below). Could be >>>> this an issue ? >>>> >>>> No, this is correct. The shared-brick-count will be > 1 only if >>> multiple bricks share the same partition. 
>>> >>> >>> >>>> Also, I can check that stor3data is now unbalanced respect stor1data >>>> and stor2data. The three nodes have the same size of brick but stor3data >>>> bricks have used 1TB less than stor1data and stor2data: >>>> >>> >>> >>> This does not necessarily indicate a problem. The distribution need not >>> be exactly equal and depends on the filenames. Can you provide more >>> information on the kind of dataset (how many files, sizes etc) on this >>> volume? Did you create the volume with all 4 bricks or add some later? >>> >>> Regards, >>> Nithya >>> >>>> >>>> stor1data: >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> >>>> stor2data bricks: >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> >>>> stor3data bricks: >>>> /dev/sdb1 25T 638G 24T 3% >>>> /mnt/disk_b1/glusterfs/vol0 >>>> /dev/sdb2 25T 654G 24T 3% >>>> /mnt/disk_b2/glusterfs/vol0 >>>> dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>> >>>> >>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> >>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> >>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>> 
/var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> >>>> Thaks for your help, >>>> Greetings. >>>> >>>> Jose V. >>>> >>>> >>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>>> >>>>> Hi Jose, >>>>> >>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you may >>>>> be running into this. >>>>> >>>>> The "shared-brick-count" values seem fine on stor1. Please send us "grep >>>>> -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for the >>>>> other nodes so we can check if they are the cause. >>>>> >>>>> >>>>> Regards, >>>>> Nithya >>>>> >>>>> >>>>> >>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260 >>>>> >>>>> On 28 February 2018 at 03:03, Jose V. Carri?n <jocarbur at gmail.com> >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Some days ago all my glusterfs configuration was working fine. Today >>>>>> I realized that the total size reported by df command was changed and is >>>>>> smaller than the aggregated capacity of all the bricks in the volume. >>>>>> >>>>>> I checked that all the volumes status are fine, all the glusterd >>>>>> daemons are running, there is no error in logs, however df shows a bad >>>>>> total size. 
>>>>>> >>>>>> My configuration for one volume: volumedisk1 >>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail >>>>>> >>>>>> Status of volume: volumedisk1 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor1data:/mnt/glusterfs/vol1/brick1 >>>>>> TCP Port : 49153 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 13579 >>>>>> File System : xfs >>>>>> Device : /dev/sdc1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.0TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273123069 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor2data:/mnt/glusterfs/vol1/brick1 >>>>>> TCP Port : 49153 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 13344 >>>>>> File System : xfs >>>>>> Device : /dev/sdc1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.0TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273124718 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor3data:/mnt/disk_c/glusterf >>>>>> s/vol1/brick1 >>>>>> TCP Port : 49154 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 17439 >>>>>> File System : xfs >>>>>> Device : /dev/sdc1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.7TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273125437 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor3data:/mnt/disk_d/glusterf >>>>>> s/vol1/brick1 >>>>>> TCP Port : 49155 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 17459 >>>>>> File System : xfs >>>>>> Device : /dev/sdd1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.6TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273127036 >>>>>> >>>>>> >>>>>> Then full size for volumedisk1 should be: 49.1TB + 49.1TB + 49.1TB >>>>>> +49.1TB = *196,4 TB *but df shows: >>>>>> >>>>>> [root at stor1 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sda2 48G 21G 25G 46% / >>>>>> tmpfs 32G 80K 32G 1% /dev/shm >>>>>> /dev/sda1 190M 62M 119M 35% /boot >>>>>> /dev/sda4 395G 251G 124G 68% /data >>>>>> /dev/sdb1 26T 601G 25T 3% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 15T 36T 29% /mnt/glusterfs/vol1 >>>>>> stor1data:/volumedisk0 >>>>>> 76T 1,6T 74T 3% /volumedisk0 >>>>>> stor1data:/volumedisk1 >>>>>> *148T* 42T 106T 29% /volumedisk1 >>>>>> >>>>>> Exactly 1 brick minus: 196,4 TB - 49,1TB = 148TB >>>>>> >>>>>> It's a production system so I hope you can help me. >>>>>> >>>>>> Thanks in advance. >>>>>> >>>>>> Jose V. 
>>>>>> >>>>>> >>>>>> Below some other data of my configuration: >>>>>> >>>>>> [root at stor1 ~]# gluster volume info >>>>>> >>>>>> Volume Name: volumedisk0 >>>>>> Type: Distribute >>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10 >>>>>> Status: Started >>>>>> Snapshot Count: 0 >>>>>> Number of Bricks: 4 >>>>>> Transport-type: tcp >>>>>> Bricks: >>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1 >>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1 >>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1 >>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1 >>>>>> Options Reconfigured: >>>>>> performance.cache-size: 4GB >>>>>> cluster.min-free-disk: 1% >>>>>> performance.io-thread-count: 16 >>>>>> performance.readdir-ahead: on >>>>>> >>>>>> Volume Name: volumedisk1 >>>>>> Type: Distribute >>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2 >>>>>> Status: Started >>>>>> Snapshot Count: 0 >>>>>> Number of Bricks: 4 >>>>>> Transport-type: tcp >>>>>> Bricks: >>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1 >>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1 >>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1 >>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1 >>>>>> Options Reconfigured: >>>>>> cluster.min-free-inodes: 6% >>>>>> performance.cache-size: 4GB >>>>>> cluster.min-free-disk: 1% >>>>>> performance.io-thread-count: 16 >>>>>> performance.readdir-ahead: on >>>>>> >>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Gluster-users mailing list >>>>>> Gluster-users at gluster.org >>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>> >>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180301/0b598040/attachment.html>
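For context on the quoted exchange above: the "workaround" Jose applied refers to Bugzilla 1517260, where glusterd in the 3.12.x builds wrote wrong shared-brick-count values into the generated volfiles, so df divided the brick sizes incorrectly and under-reported the volume capacity. The thread never quotes the workaround itself; the one that circulated at the time was, as far as I know, a glusterd volfile filter script along the lines below. The filter directory path depends on the installed glusterfs version and packaging, so treat this as a sketch rather than the exact script used on these nodes.

    #!/bin/bash
    # Sketch of the shared-brick-count filter for BZ 1517260 (not quoted in the
    # thread; the install path below is an assumption). glusterd runs executable
    # files from its filter directory against every volfile it regenerates, so
    # this forces the option back to 1. That is only correct when each brick sits
    # on its own filesystem, as is the case for volumedisk0/volumedisk1 here.
    #
    # Install on every node (directory varies by version/distro), e.g.:
    #   install -m 0755 shared-brick-count-filter.sh /usr/lib64/glusterfs/3.12.6/filter/
    # then force a volfile regeneration with a harmless option set, e.g.:
    #   gluster volume set volumedisk1 cluster.min-free-inodes 6%

    sed -i 's/option shared-brick-count [0-9]*/option shared-brick-count 1/g' "$1"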
Jose V. Carrión
2018-Mar-01 09:55 UTC
[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
I'm sorry for my last incomplete message. Below the output of both volumes:

[root at stor1t ~]# gluster volume rebalance volumedisk1 status
     Node    Rebalanced-files         size     scanned    failures    skipped       status    run time in h:m:s
localhost              703964    16384.0PB     1475983           0          0    completed             64:37:55
stor2data              704610    16384.0PB     1475199           0          0    completed             64:31:30
stor3data              703964    16384.0PB     1475983           0          0    completed             64:37:55
volume rebalance: volumedisk1: success

[root at stor1 ~]# gluster volume rebalance volumedisk0 status
     Node    Rebalanced-files         size     scanned    failures    skipped       status    run time in h:m:s
localhost              411919        1.1GB      718044           0          0    completed              2:28:52
stor2data              435340    16384.0PB      741287           0          0    completed              2:26:01
stor3data              411919        1.1GB      718044           0          0    completed              2:28:52
volume rebalance: volumedisk0: success

And the volumedisk1 rebalance log finished saying:

[2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance is completed. Time taken is 232675.00 secs
[2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, skipped: 0

Checking my logs, the new stor3data node was added and the rebalance task was executed on 2018-02-10. From that date until now I have been storing new files. The exact sequence of commands to add the new node was:

gluster peer probe stor3data
gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0
gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b2/glusterfs/vol0
gluster volume add-brick volumedisk1 stor3data:/mnt/disk_c/glusterfs/vol1
gluster volume add-brick volumedisk1 stor3data:/mnt/disk_d/glusterfs/vol1
gluster volume rebalance volumedisk0 start force
gluster volume rebalance volumedisk1 start force

Could the DHT hash ranges assigned to the stor3data bricks be unbalanced for some reason, i.e. smaller than the ranges on stor1data and stor2data? Is there any way to verify it? And is there any way to modify/rebalance the DHT ranges so that every brick gets an equal share?

Thanks a lot,
Greetings.

Jose V.

2018-03-01 10:39 GMT+01:00 Jose V. Carrión <jocarbur at gmail.com>:

> Hi Nithya,
> Below the output of both volumes:
>
> [root at stor1t ~]# gluster volume rebalance volumedisk1 status
> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
> localhost  703964             16384.0PB   1475983   0          0         completed   64:37:55
> stor2data  704610             16384.0PB   1475199   0          0         completed   64:31:30
> stor3data  703964             16384.0PB   1475983   0          0         completed   64:37:55
> volume rebalance: volumedisk1: success
>
> [root at stor1 ~]# gluster volume rebalance volumedisk0 status
> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
> localhost  411919             1.1GB       718044    0          0         completed   2:28:52
> stor2data  435340             16384.0PB   741287    0          0         completed   2:26:01
> stor3data  411919             1.1GB       718044    0          0         completed   2:28:52
> volume rebalance: volumedisk0: success
>
> And volumedisk1 rebalance logs finished saying:
> [2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get]
> 0-volumedisk1-dht: Rebalance is completed.
Time taken is 232675.00 secs > [2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] > 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: > 1475983, failures: 0, skipped: 0 > > Checking my logs the new stor3node and the rebalance task was executed on > 2018-02-10 . From this date to now I have been storing new files. > The sequence of commands to add the node was: > > gluster peer probe stor3data > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 > > > > > > 2018-03-01 6:32 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: > >> Hi Jose, >> >> On 28 February 2018 at 22:31, Jose V. Carri?n <jocarbur at gmail.com> wrote: >> >>> Hi Nithya, >>> >>> My initial setup was composed of 2 similar nodes: stor1data and >>> stor2data. A month ago I expanded both volumes with a new node: stor3data >>> (2 bricks per volume). >>> Of course, then to add the new peer with the bricks I did the 'balance >>> force' operation. This task finished successfully (you can see info below) >>> and number of files on the 3 nodes were very similar . >>> >>> For volumedisk1 I only have files of 500MB and they are continuosly >>> written in sequential mode. The filename pattern of written files is: >>> >>> run.node1.0000.rd >>> run.node2.0000.rd >>> run.node1.0001.rd >>> run.node2.0001.rd >>> run.node1.0002.rd >>> run.node2.0002.rd >>> ........... >>> ........... >>> run.node1.X.rd >>> run.node2.X.rd >>> >>> ( X ranging from 0000 to infinite ) >>> >>> Curiously stor1data and stor2data maintain similar ratios in bytes: >>> >>> Filesystem 1K-blocks Used Available >>> Use% Mounted on >>> /dev/sdc1 52737613824 17079174264 <(707)%20917-4264> >>> 35658439560 33% /mnt/glusterfs/vol1 -> stor1data >>> /dev/sdc1 52737613824 17118810848 35618802976 33% >>> /mnt/glusterfs/vol1 -> stor2data >>> >>> However the ratio on som3data differs too much (1TB): >>> Filesystem 1K-blocks Used Available >>> Use% Mounted on >>> /dev/sdc1 52737613824 15479191748 37258422076 30% >>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>> /dev/sdd1 52737613824 15566398604 37171215220 30% >>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>> >>> Thinking in inodes: >>> >>> Filesystem Inodes IUsed IFree IUse% >>> Mounted on >>> /dev/sdc1 5273970048 851053 5273118995 1% >>> /mnt/glusterfs/vol1 -> stor1data >>> /dev/sdc1 5273970048 849388 5273120660 1% >>> /mnt/glusterfs/vol1 -> stor2data >>> >>> /dev/sdc1 5273970048 846877 5273123171 1% >>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>> /dev/sdd1 5273970048 845250 5273124798 1% >>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>> >>> 851053 (stor1) - 845250 (stor3) = 5803 files of difference ! >>> >> >> The inode numbers are a little misleading here - gluster uses some to >> create its own internal files and directory structures. Based on the >> average file size, I think this would actually work out to a difference of >> around 2000 files. >> >> >>> >>> In adition, correct me if I'm wrong, stor3data should have 50% of >>> probability to store a new file (even taking into account the algorithm of >>> DHT with filename patterns) >>> >>> Theoretically yes , but again, it depends on the filenames and their >> hash distribution. >> >> Please send us the output of : >> gluster volume rebalance <volname> status >> >> for the volume. >> >> Regards, >> Nithya >> >> >>> Thanks, >>> Greetings. >>> >>> Jose V. 
>>> >>> Status of volume: volumedisk0 >>> Gluster process TCP Port RDMA Port Online >>> Pid >>> ------------------------------------------------------------ >>> ------------------ >>> Brick stor1data:/mnt/glusterfs/vol0/bri >>> ck1 49152 0 Y >>> 13533 >>> Brick stor2data:/mnt/glusterfs/vol0/bri >>> ck1 49152 0 Y >>> 13302 >>> Brick stor3data:/mnt/disk_b1/glusterfs/ >>> vol0/brick1 49152 0 Y >>> 17371 >>> Brick stor3data:/mnt/disk_b2/glusterfs/ >>> vol0/brick1 49153 0 Y >>> 17391 >>> NFS Server on localhost N/A N/A N >>> N/A >>> NFS Server on stor3data N/A N/A N N/A >>> >>> NFS Server on stor2data N/A N/A N N/A >>> >>> >>> Task Status of Volume volumedisk0 >>> ------------------------------------------------------------ >>> ------------------ >>> Task : Rebalance >>> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca >>> Status : completed >>> >>> Status of volume: volumedisk1 >>> Gluster process TCP Port RDMA Port Online >>> Pid >>> ------------------------------------------------------------ >>> ------------------ >>> Brick stor1data:/mnt/glusterfs/vol1/bri >>> ck1 49153 0 Y >>> 13579 >>> Brick stor2data:/mnt/glusterfs/vol1/bri >>> ck1 49153 0 Y >>> 13344 >>> Brick stor3data:/mnt/disk_c/glusterfs/v >>> ol1/brick1 49154 0 Y >>> 17439 >>> Brick stor3data:/mnt/disk_d/glusterfs/v >>> ol1/brick1 49155 0 Y >>> 17459 >>> NFS Server on localhost N/A N/A N >>> N/A >>> NFS Server on stor3data N/A N/A N N/A >>> >>> NFS Server on stor2data N/A N/A N N/A >>> >>> >>> Task Status of Volume volumedisk1 >>> ------------------------------------------------------------ >>> ------------------ >>> Task : Rebalance >>> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3 >>> Status : completed >>> >>> >>> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>> >>>> Hi Jose, >>>> >>>> On 28 February 2018 at 18:28, Jose V. Carri?n <jocarbur at gmail.com> >>>> wrote: >>>> >>>>> Hi Nithya, >>>>> >>>>> I applied the workarround for this bug and now df shows the right size: >>>>> >>>>> That is good to hear. >>>> >>>> >>>> >>>>> [root at stor1 ~]# df -h >>>>> Filesystem Size Used Avail Use% Mounted on >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> stor1data:/volumedisk0 >>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>> stor1data:/volumedisk1 >>>>> 197T 61T 136T 31% /volumedisk1 >>>>> >>>>> >>>>> [root at stor2 ~]# df -h >>>>> Filesystem Size Used Avail Use% Mounted on >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> stor2data:/volumedisk0 >>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>> stor2data:/volumedisk1 >>>>> 197T 61T 136T 31% /volumedisk1 >>>>> >>>>> >>>>> [root at stor3 ~]# df -h >>>>> Filesystem Size Used Avail Use% Mounted on >>>>> /dev/sdb1 25T 638G 24T 3% /mnt/disk_b1/glusterfs/vol0 >>>>> /dev/sdb2 25T 654G 24T 3% /mnt/disk_b2/glusterfs/vol0 >>>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>>> stor3data:/volumedisk0 >>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>> stor3data:/volumedisk1 >>>>> 197T 61T 136T 31% /volumedisk1 >>>>> >>>>> >>>>> However I'm concerned because, as you can see, the volumedisk0 on >>>>> stor3data is composed by 2 bricks on thesame disk but on different >>>>> partitions (/dev/sdb1 and /dev/sdb2). >>>>> After to aplly the workarround, the shared-brick-count parameter was >>>>> setted to 1 in all the bricks and all the servers (see below). Could be >>>>> this an issue ? >>>>> >>>>> No, this is correct. 
The shared-brick-count will be > 1 only if >>>> multiple bricks share the same partition. >>>> >>>> >>>> >>>>> Also, I can check that stor3data is now unbalanced respect stor1data >>>>> and stor2data. The three nodes have the same size of brick but stor3data >>>>> bricks have used 1TB less than stor1data and stor2data: >>>>> >>>> >>>> >>>> This does not necessarily indicate a problem. The distribution need not >>>> be exactly equal and depends on the filenames. Can you provide more >>>> information on the kind of dataset (how many files, sizes etc) on this >>>> volume? Did you create the volume with all 4 bricks or add some later? >>>> >>>> Regards, >>>> Nithya >>>> >>>>> >>>>> stor1data: >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> >>>>> stor2data bricks: >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> >>>>> stor3data bricks: >>>>> /dev/sdb1 25T 638G 24T 3% >>>>> /mnt/disk_b1/glusterfs/vol0 >>>>> /dev/sdb2 25T 654G 24T 3% >>>>> /mnt/disk_b2/glusterfs/vol0 >>>>> dev/sdc1 50T 15T 35T 30% >>>>> /mnt/disk_c/glusterfs/vol1 >>>>> /dev/sdd1 50T 15T 35T 30% >>>>> /mnt/disk_d/glusterfs/vol1 >>>>> >>>>> >>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> >>>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt 
>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> >>>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> >>>>> Thaks for your help, >>>>> Greetings. >>>>> >>>>> Jose V. >>>>> >>>>> >>>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>>>> >>>>>> Hi Jose, >>>>>> >>>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you >>>>>> may be running into this. >>>>>> >>>>>> The "shared-brick-count" values seem fine on stor1. Please send us "grep >>>>>> -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for the >>>>>> other nodes so we can check if they are the cause. >>>>>> >>>>>> >>>>>> Regards, >>>>>> Nithya >>>>>> >>>>>> >>>>>> >>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260 >>>>>> >>>>>> On 28 February 2018 at 03:03, Jose V. Carri?n <jocarbur at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Some days ago all my glusterfs configuration was working fine. Today >>>>>>> I realized that the total size reported by df command was changed and is >>>>>>> smaller than the aggregated capacity of all the bricks in the volume. >>>>>>> >>>>>>> I checked that all the volumes status are fine, all the glusterd >>>>>>> daemons are running, there is no error in logs, however df shows a bad >>>>>>> total size. 
>>>>>>> >>>>>>> My configuration for one volume: volumedisk1 >>>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail >>>>>>> >>>>>>> Status of volume: volumedisk1 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor1data:/mnt/glusterfs/vol1/brick1 >>>>>>> TCP Port : 49153 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 13579 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdc1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.0TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273123069 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor2data:/mnt/glusterfs/vol1/brick1 >>>>>>> TCP Port : 49153 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 13344 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdc1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.0TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273124718 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor3data:/mnt/disk_c/glusterf >>>>>>> s/vol1/brick1 >>>>>>> TCP Port : 49154 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 17439 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdc1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.7TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273125437 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor3data:/mnt/disk_d/glusterf >>>>>>> s/vol1/brick1 >>>>>>> TCP Port : 49155 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 17459 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdd1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.6TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273127036 >>>>>>> >>>>>>> >>>>>>> Then full size for volumedisk1 should be: 49.1TB + 49.1TB + 49.1TB >>>>>>> +49.1TB = *196,4 TB *but df shows: >>>>>>> >>>>>>> [root at stor1 ~]# df -h >>>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>>> /dev/sda2 48G 21G 25G 46% / >>>>>>> tmpfs 32G 80K 32G 1% /dev/shm >>>>>>> /dev/sda1 190M 62M 119M 35% /boot >>>>>>> /dev/sda4 395G 251G 124G 68% /data >>>>>>> /dev/sdb1 26T 601G 25T 3% /mnt/glusterfs/vol0 >>>>>>> /dev/sdc1 50T 15T 36T 29% /mnt/glusterfs/vol1 >>>>>>> stor1data:/volumedisk0 >>>>>>> 76T 1,6T 74T 3% /volumedisk0 >>>>>>> stor1data:/volumedisk1 >>>>>>> *148T* 42T 106T 29% /volumedisk1 >>>>>>> >>>>>>> Exactly 1 brick minus: 196,4 TB - 49,1TB = 148TB >>>>>>> >>>>>>> It's a production system so I hope you can help me. >>>>>>> >>>>>>> Thanks in advance. >>>>>>> >>>>>>> Jose V. 
>>>>>>> >>>>>>> >>>>>>> Below some other data of my configuration: >>>>>>> >>>>>>> [root at stor1 ~]# gluster volume info >>>>>>> >>>>>>> Volume Name: volumedisk0 >>>>>>> Type: Distribute >>>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10 >>>>>>> Status: Started >>>>>>> Snapshot Count: 0 >>>>>>> Number of Bricks: 4 >>>>>>> Transport-type: tcp >>>>>>> Bricks: >>>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1 >>>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1 >>>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1 >>>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1 >>>>>>> Options Reconfigured: >>>>>>> performance.cache-size: 4GB >>>>>>> cluster.min-free-disk: 1% >>>>>>> performance.io-thread-count: 16 >>>>>>> performance.readdir-ahead: on >>>>>>> >>>>>>> Volume Name: volumedisk1 >>>>>>> Type: Distribute >>>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2 >>>>>>> Status: Started >>>>>>> Snapshot Count: 0 >>>>>>> Number of Bricks: 4 >>>>>>> Transport-type: tcp >>>>>>> Bricks: >>>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1 >>>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1 >>>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1 >>>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1 >>>>>>> Options Reconfigured: >>>>>>> cluster.min-free-inodes: 6% >>>>>>> performance.cache-size: 4GB >>>>>>> cluster.min-free-disk: 1% >>>>>>> performance.io-thread-count: 16 >>>>>>> performance.readdir-ahead: on >>>>>>> >>>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>>> shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>>> shared-brick-count 0 >>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Gluster-users mailing list >>>>>>> Gluster-users at gluster.org >>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180301/858e3786/attachment.html>
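On the question above about verifying the DHT range assigned to each brick: DHT stores the layout of every directory as a trusted.glusterfs.dht extended attribute on each brick's copy of that directory, and it can be read directly on the servers. Below is a sketch using the brick paths from this thread and a placeholder directory name; my reading of the xattr is that the last two 32-bit words are the start and end of the hash range that brick serves for the directory, so confirm against the DHT documentation for your version before acting on it.

    # Read the layout of the same directory on every brick of volumedisk1.
    # 'some_dir' is a placeholder; use any directory that exists on the volume.
    getfattr -n trusted.glusterfs.dht -e hex /mnt/glusterfs/vol1/brick1/some_dir          # on stor1data
    getfattr -n trusted.glusterfs.dht -e hex /mnt/glusterfs/vol1/brick1/some_dir          # on stor2data
    getfattr -n trusted.glusterfs.dht -e hex /mnt/disk_c/glusterfs/vol1/brick1/some_dir   # on stor3data
    getfattr -n trusted.glusterfs.dht -e hex /mnt/disk_d/glusterfs/vol1/brick1/some_dir   # on stor3data

    # Hypothetical output for one brick:
    #   trusted.glusterfs.dht=0x0000000100000000000000003fffffff
    # The last 16 hex digits split into the start (00000000) and end (3fffffff) of
    # the 32-bit hash space served by that brick for this directory. With 4 bricks
    # and an even layout, each brick covers roughly a quarter of 0x00000000-0xffffffff.
    # If a directory layout turns out to be badly skewed,
    #   gluster volume rebalance volumedisk1 fix-layout start
    # recalculates the layouts without migrating data.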
Nithya Balachandran
2018-Mar-01 11:32 UTC
[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
On 1 March 2018 at 15:25, Jose V. Carri?n <jocarbur at gmail.com> wrote:> I'm sorry for my last incomplete message. > > Below the output of both volumes: > > [root at stor1t ~]# gluster volume rebalance volumedisk1 status > Node Rebalanced-files size > scanned failures skipped status run time in > h:m:s > --------- ----------- ----------- > ----------- ----------- ----------- ------------ > -------------- > localhost 703964 16384.0PB > 1475983 0 0 completed 64:37:55 > stor2data 704610 16384.0PB > 1475199 0 0 completed 64:31:30 > stor3data 703964 16384.0PB > 1475983 0 0 completed 64:37:55 > volume rebalance: volumedisk1: success > > [root at stor1 ~]# gluster volume rebalance volumedisk0 status > Node Rebalanced-files size > scanned failures skipped status run time in > h:m:s > --------- ----------- ----------- > ----------- ----------- ----------- ------------ > -------------- > localhost 411919 1.1GB > 718044 0 0 completed 2:28:52 > stor2data 435340 16384.0PB > 741287 0 0 completed 2:26:01 > stor3data 411919 1.1GB > 718044 0 0 completed 2:28:52 > volume rebalance: volumedisk0: success > > And volumedisk1 rebalance logs finished saying: > [2018-02-13 03:47:48.703311] I [MSGID: 109028] > [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance > is completed. Time taken is 232675.00 secs > [2018-02-13 03:47:48.703351] I [MSGID: 109028] > [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files > migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, > skipped: 0 > > Checking my logs the new stor3node and the rebalance task was executed on > 2018-02-10 . From this date to now I have been storing new files. > The exact sequence of commands to add the new node was: > > gluster peer probe stor3data > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b2/glusterfs/vol0 > > gluster volume add-brick volumedisk1 stor3data:/mnt/disk_c/glusterfs/vol1 > > gluster volume add-brick volumedisk1 stor3data:/mnt/disk_d/glusterfs/vol1 > > gluster volume rebalance volumedisk0 start force > > gluster volume rebalance volumedisk1 start force > > For some reason , could be unbalanced the assigned range of DHT for > stor3data bricks ? Could be minor than stor1data and stor2data ? , > > Any way to verify it ? > > Any way to modify/rebalance the DHT range between bricks in order to > unify the DHT range per brick ?. > > Thanks a lot, > > Greetings. > > Jose V. > > > 2018-03-01 10:39 GMT+01:00 Jose V. 
Carrión <jocarbur at gmail.com>:
>
>> Hi Nithya,
>> Below the output of both volumes:
>>
>> [root at stor1t ~]# gluster volume rebalance volumedisk1 status
>> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
>> localhost  703964             16384.0PB   1475983   0          0         completed   64:37:55
>> stor2data  704610             16384.0PB   1475199   0          0         completed   64:31:30
>> stor3data  703964             16384.0PB   1475983   0          0         completed   64:37:55
>> volume rebalance: volumedisk1: success
>>
>> [root at stor1 ~]# gluster volume rebalance volumedisk0 status
>> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
>> localhost  411919             1.1GB       718044    0          0         completed   2:28:52
>> stor2data  435340             16384.0PB   741287    0          0         completed   2:26:01
>> stor3data  411919             1.1GB       718044    0          0         completed   2:28:52
>> volume rebalance: volumedisk0: success
>>
>> And volumedisk1 rebalance logs finished saying:
>> [2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance is completed. Time taken is 232675.00 secs
>> [2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, skipped: 0
>>
>> Checking my logs the new stor3node and the rebalance task was executed on
>> 2018-02-10 . From this date to now I have been storing new files.
>> The sequence of commands to add the node was:
>>
>> gluster peer probe stor3data
>>
>> gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0
>>
>> gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0
>>
>>

While it is odd that both bricks on the third node show similar usage, I do not see a problem in the steps or the status. Can you keep an eye on this and let us know if this continues to be the case?

>>
>> 2018-03-01 6:32 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>>
>>> Hi Jose,
>>>
>>> On 28 February 2018 at 22:31, Jose V. Carrión <jocarbur at gmail.com> wrote:
>>>
>>>> Hi Nithya,
>>>>
>>>> My initial setup was composed of 2 similar nodes: stor1data and
>>>> stor2data. A month ago I expanded both volumes with a new node: stor3data
>>>> (2 bricks per volume).
>>>> Of course, then to add the new peer with the bricks I did the 'balance
>>>> force' operation. This task finished successfully (you can see info below)
>>>> and number of files on the 3 nodes were very similar.
>>>>
>>>> For volumedisk1 I only have files of 500MB and they are continuosly
>>>> written in sequential mode. The filename pattern of written files is:
>>>>
>>>> run.node1.0000.rd
>>>> run.node2.0000.rd
>>>> run.node1.0001.rd
>>>> run.node2.0001.rd
>>>> run.node1.0002.rd
>>>> run.node2.0002.rd
>>>> ...........
>>>> ...........
>>>> run.node1.X.rd >>>> run.node2.X.rd >>>> >>>> ( X ranging from 0000 to infinite ) >>>> >>>> Curiously stor1data and stor2data maintain similar ratios in bytes: >>>> >>>> Filesystem 1K-blocks Used Available >>>> Use% Mounted on >>>> /dev/sdc1 52737613824 17079174264 <(707)%20917-4264> >>>> 35658439560 33% /mnt/glusterfs/vol1 -> stor1data >>>> /dev/sdc1 52737613824 17118810848 35618802976 33% >>>> /mnt/glusterfs/vol1 -> stor2data >>>> >>>> However the ratio on som3data differs too much (1TB): >>>> Filesystem 1K-blocks Used Available >>>> Use% Mounted on >>>> /dev/sdc1 52737613824 15479191748 37258422076 30% >>>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>>> /dev/sdd1 52737613824 15566398604 37171215220 30% >>>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>>> >>>> Thinking in inodes: >>>> >>>> Filesystem Inodes IUsed IFree IUse% >>>> Mounted on >>>> /dev/sdc1 5273970048 851053 5273118995 1% >>>> /mnt/glusterfs/vol1 -> stor1data >>>> /dev/sdc1 5273970048 849388 5273120660 1% >>>> /mnt/glusterfs/vol1 -> stor2data >>>> >>>> /dev/sdc1 5273970048 846877 5273123171 1% >>>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>>> /dev/sdd1 5273970048 845250 5273124798 1% >>>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>>> >>>> 851053 (stor1) - 845250 (stor3) = 5803 files of difference ! >>>> >>> >>> The inode numbers are a little misleading here - gluster uses some to >>> create its own internal files and directory structures. Based on the >>> average file size, I think this would actually work out to a difference of >>> around 2000 files. >>> >>> >>>> >>>> In adition, correct me if I'm wrong, stor3data should have 50% of >>>> probability to store a new file (even taking into account the algorithm of >>>> DHT with filename patterns) >>>> >>>> Theoretically yes , but again, it depends on the filenames and their >>> hash distribution. >>> >>> Please send us the output of : >>> gluster volume rebalance <volname> status >>> >>> for the volume. >>> >>> Regards, >>> Nithya >>> >>> >>>> Thanks, >>>> Greetings. >>>> >>>> Jose V. 
>>>> >>>> Status of volume: volumedisk0 >>>> Gluster process TCP Port RDMA Port Online >>>> Pid >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Brick stor1data:/mnt/glusterfs/vol0/bri >>>> ck1 49152 0 Y >>>> 13533 >>>> Brick stor2data:/mnt/glusterfs/vol0/bri >>>> ck1 49152 0 Y >>>> 13302 >>>> Brick stor3data:/mnt/disk_b1/glusterfs/ >>>> vol0/brick1 49152 0 Y >>>> 17371 >>>> Brick stor3data:/mnt/disk_b2/glusterfs/ >>>> vol0/brick1 49153 0 Y >>>> 17391 >>>> NFS Server on localhost N/A N/A N >>>> N/A >>>> NFS Server on stor3data N/A N/A N >>>> N/A >>>> NFS Server on stor2data N/A N/A N >>>> N/A >>>> >>>> Task Status of Volume volumedisk0 >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Task : Rebalance >>>> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca >>>> Status : completed >>>> >>>> Status of volume: volumedisk1 >>>> Gluster process TCP Port RDMA Port Online >>>> Pid >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Brick stor1data:/mnt/glusterfs/vol1/bri >>>> ck1 49153 0 Y >>>> 13579 >>>> Brick stor2data:/mnt/glusterfs/vol1/bri >>>> ck1 49153 0 Y >>>> 13344 >>>> Brick stor3data:/mnt/disk_c/glusterfs/v >>>> ol1/brick1 49154 0 Y >>>> 17439 >>>> Brick stor3data:/mnt/disk_d/glusterfs/v >>>> ol1/brick1 49155 0 Y >>>> 17459 >>>> NFS Server on localhost N/A N/A N >>>> N/A >>>> NFS Server on stor3data N/A N/A N >>>> N/A >>>> NFS Server on stor2data N/A N/A N >>>> N/A >>>> >>>> Task Status of Volume volumedisk1 >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Task : Rebalance >>>> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3 >>>> Status : completed >>>> >>>> >>>> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>>> >>>>> Hi Jose, >>>>> >>>>> On 28 February 2018 at 18:28, Jose V. Carri?n <jocarbur at gmail.com> >>>>> wrote: >>>>> >>>>>> Hi Nithya, >>>>>> >>>>>> I applied the workarround for this bug and now df shows the right >>>>>> size: >>>>>> >>>>>> That is good to hear. >>>>> >>>>> >>>>> >>>>>> [root at stor1 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> stor1data:/volumedisk0 >>>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>>> stor1data:/volumedisk1 >>>>>> 197T 61T 136T 31% /volumedisk1 >>>>>> >>>>>> >>>>>> [root at stor2 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> stor2data:/volumedisk0 >>>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>>> stor2data:/volumedisk1 >>>>>> 197T 61T 136T 31% /volumedisk1 >>>>>> >>>>>> >>>>>> [root at stor3 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sdb1 25T 638G 24T 3% >>>>>> /mnt/disk_b1/glusterfs/vol0 >>>>>> /dev/sdb2 25T 654G 24T 3% >>>>>> /mnt/disk_b2/glusterfs/vol0 >>>>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>>>> stor3data:/volumedisk0 >>>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>>> stor3data:/volumedisk1 >>>>>> 197T 61T 136T 31% /volumedisk1 >>>>>> >>>>>> >>>>>> However I'm concerned because, as you can see, the volumedisk0 on >>>>>> stor3data is composed by 2 bricks on thesame disk but on different >>>>>> partitions (/dev/sdb1 and /dev/sdb2). 
>>>>>> After to aplly the workarround, the shared-brick-count parameter was >>>>>> setted to 1 in all the bricks and all the servers (see below). Could be >>>>>> this an issue ? >>>>>> >>>>>> No, this is correct. The shared-brick-count will be > 1 only if >>>>> multiple bricks share the same partition. >>>>> >>>>> >>>>> >>>>>> Also, I can check that stor3data is now unbalanced respect stor1data >>>>>> and stor2data. The three nodes have the same size of brick but stor3data >>>>>> bricks have used 1TB less than stor1data and stor2data: >>>>>> >>>>> >>>>> >>>>> This does not necessarily indicate a problem. The distribution need >>>>> not be exactly equal and depends on the filenames. Can you provide more >>>>> information on the kind of dataset (how many files, sizes etc) on this >>>>> volume? Did you create the volume with all 4 bricks or add some later? >>>>> >>>>> Regards, >>>>> Nithya >>>>> >>>>>> >>>>>> stor1data: >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> >>>>>> stor2data bricks: >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> >>>>>> stor3data bricks: >>>>>> /dev/sdb1 25T 638G 24T 3% >>>>>> /mnt/disk_b1/glusterfs/vol0 >>>>>> /dev/sdb2 25T 654G 24T 3% >>>>>> /mnt/disk_b2/glusterfs/vol0 >>>>>> dev/sdc1 50T 15T 35T 30% >>>>>> /mnt/disk_c/glusterfs/vol1 >>>>>> /dev/sdd1 50T 15T 35T 30% >>>>>> /mnt/disk_d/glusterfs/vol1 >>>>>> >>>>>> >>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> >>>>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> 
>>>>>>
>>>>>> stor1data bricks:
>>>>>> /dev/sdb1   26T  1,1T   25T    4%  /mnt/glusterfs/vol0
>>>>>> /dev/sdc1   50T   16T   34T   33%  /mnt/glusterfs/vol1
>>>>>>
>>>>>> stor2data bricks:
>>>>>> /dev/sdb1   26T  1,1T   25T    4%  /mnt/glusterfs/vol0
>>>>>> /dev/sdc1   50T   16T   34T   33%  /mnt/glusterfs/vol1
>>>>>>
>>>>>> stor3data bricks:
>>>>>> /dev/sdb1   25T  638G   24T    3%  /mnt/disk_b1/glusterfs/vol0
>>>>>> /dev/sdb2   25T  654G   24T    3%  /mnt/disk_b2/glusterfs/vol0
>>>>>> /dev/sdc1   50T   15T   35T   30%  /mnt/disk_c/glusterfs/vol1
>>>>>> /dev/sdd1   50T   15T   35T   30%  /mnt/disk_d/glusterfs/vol1
>>>>>>
>>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>
>>>>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>
>>>>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>
>>>>>> Thanks for your help,
>>>>>> Greetings.
>>>>>>
>>>>>> Jose V.
>>>>>>
>>>>>>
>>>>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>>>>>>
>>>>>>> Hi Jose,
>>>>>>>
>>>>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you
>>>>>>> may be running into this.
>>>>>>>
>>>>>>> The "shared-brick-count" values seem fine on stor1. Please send us
>>>>>>> "grep -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for
>>>>>>> the other nodes so we can check if they are the cause.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nithya
>>>>>>>
>>>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260
>>>>>>>
>>>>>>> On 28 February 2018 at 03:03, Jose V. Carrión <jocarbur at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Some days ago all my glusterfs configuration was working fine.
>>>>>>>> Today I realized that the total size reported by the df command
>>>>>>>> had changed and is smaller than the aggregated capacity of all
>>>>>>>> the bricks in the volume.
>>>>>>>>
>>>>>>>> I checked that all the volume statuses are fine, all the glusterd
>>>>>>>> daemons are running and there are no errors in the logs; however,
>>>>>>>> df shows a wrong total size.
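>>>>>>>>
>>>>>>>> (For reference, the checks I ran were roughly along these lines -
>>>>>>>> only a sketch, not the exact commands:)
>>>>>>>>
>>>>>>>> gluster volume status volumedisk1          # every brick shows Online "Y"
>>>>>>>> gluster peer status                        # all peers connected
>>>>>>>> grep -iE "error|critical" /var/log/glusterfs/*.log | tail -n 20   # nothing relevant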
>>>>>>>>
>>>>>>>> My configuration for one volume: volumedisk1
>>>>>>>>
>>>>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail
>>>>>>>>
>>>>>>>> Status of volume: volumedisk1
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor1data:/mnt/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49153
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 13579
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdc1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.0TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273123069
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor2data:/mnt/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49153
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 13344
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdc1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.0TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273124718
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49154
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 17439
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdc1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.7TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273125437
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49155
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 17459
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdd1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.6TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273127036
>>>>>>>>
>>>>>>>> The full size for volumedisk1 should therefore be: 49.1TB + 49.1TB +
>>>>>>>> 49.1TB + 49.1TB = *196,4 TB*, but df shows:
>>>>>>>>
>>>>>>>> [root at stor1 ~]# df -h
>>>>>>>> Filesystem              Size   Used  Avail  Use%  Mounted on
>>>>>>>> /dev/sda2                48G    21G    25G   46%  /
>>>>>>>> tmpfs                    32G    80K    32G    1%  /dev/shm
>>>>>>>> /dev/sda1               190M    62M   119M   35%  /boot
>>>>>>>> /dev/sda4               395G   251G   124G   68%  /data
>>>>>>>> /dev/sdb1                26T   601G    25T    3%  /mnt/glusterfs/vol0
>>>>>>>> /dev/sdc1                50T    15T    36T   29%  /mnt/glusterfs/vol1
>>>>>>>> stor1data:/volumedisk0   76T   1,6T    74T    3%  /volumedisk0
>>>>>>>> stor1data:/volumedisk1  *148T*  42T   106T   29%  /volumedisk1
>>>>>>>>
>>>>>>>> That is practically one full brick missing: 196,4 TB - 49,1 TB is
>>>>>>>> about 147 TB, and df reports 148T.
>>>>>>>>
>>>>>>>> It's a production system so I hope you can help me.
>>>>>>>>
>>>>>>>> Thanks in advance.
>>>>>>>>
>>>>>>>> Jose V.
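>>>>>>>>
>>>>>>>> (By the way, the expected total can be recomputed directly from
>>>>>>>> the detail output above - a rough sketch, assuming every brick
>>>>>>>> size is reported in TB as shown:)
>>>>>>>>
>>>>>>>> gluster volume status volumedisk1 detail | \
>>>>>>>>   awk -F: '/Total Disk Space/ {gsub(/[ TB]/, "", $2); sum += $2}
>>>>>>>>            END {printf "expected volume size: %.1f TB\n", sum}'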
>>>>>>>>
>>>>>>>> Below some other data of my configuration:
>>>>>>>>
>>>>>>>> [root at stor1 ~]# gluster volume info
>>>>>>>>
>>>>>>>> Volume Name: volumedisk0
>>>>>>>> Type: Distribute
>>>>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10
>>>>>>>> Status: Started
>>>>>>>> Snapshot Count: 0
>>>>>>>> Number of Bricks: 4
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1
>>>>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1
>>>>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1
>>>>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1
>>>>>>>> Options Reconfigured:
>>>>>>>> performance.cache-size: 4GB
>>>>>>>> cluster.min-free-disk: 1%
>>>>>>>> performance.io-thread-count: 16
>>>>>>>> performance.readdir-ahead: on
>>>>>>>>
>>>>>>>> Volume Name: volumedisk1
>>>>>>>> Type: Distribute
>>>>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2
>>>>>>>> Status: Started
>>>>>>>> Snapshot Count: 0
>>>>>>>> Number of Bricks: 4
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1
>>>>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1
>>>>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>>>>> Options Reconfigured:
>>>>>>>> cluster.min-free-inodes: 6%
>>>>>>>> performance.cache-size: 4GB
>>>>>>>> cluster.min-free-disk: 1%
>>>>>>>> performance.io-thread-count: 16
>>>>>>>> performance.readdir-ahead: on
>>>>>>>>
>>>>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Gluster-users mailing list
>>>>>>>> Gluster-users at gluster.org
>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
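
For completeness, this is how I am now collecting the shared-brick-count
values from the three nodes after applying the workaround (only a rough
sketch; it assumes root ssh access from stor1data to the other peers):

for h in stor1data stor2data stor3data; do
    echo "== $h =="
    # shared-brick-count should now be 1 for every brick of this volume
    ssh "$h" 'grep -n "shared-brick-count" /var/lib/glusterd/vols/volumedisk1/*.vol'
done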