Jose V. Carrión
2018-Mar-01 09:39 UTC
[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
Hi Nithya, Below the output of both volumes: [root at stor1t ~]# gluster volume rebalance volumedisk1 status Node Rebalanced-files size scanned failures skipped status run time in h:m:s --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 703964 16384.0PB 1475983 0 0 completed 64:37:55 stor2data 704610 16384.0PB 1475199 0 0 completed 64:31:30 stor3data 703964 16384.0PB 1475983 0 0 completed 64:37:55 volume rebalance: volumedisk1: success [root at stor1 ~]# gluster volume rebalance volumedisk0 status Node Rebalanced-files size scanned failures skipped status run time in h:m:s --------- ----------- ----------- ----------- ----------- ----------- ------------ -------------- localhost 411919 1.1GB 718044 0 0 completed 2:28:52 stor2data 435340 16384.0PB 741287 0 0 completed 2:26:01 stor3data 411919 1.1GB 718044 0 0 completed 2:28:52 volume rebalance: volumedisk0: success And volumedisk1 rebalance logs finished saying: [2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance is completed. Time taken is 232675.00 secs [2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, skipped: 0 Checking my logs the new stor3node and the rebalance task was executed on 2018-02-10 . From this date to now I have been storing new files. The sequence of commands to add the node was: gluster peer probe stor3data gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 2018-03-01 6:32 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:> Hi Jose, > > On 28 February 2018 at 22:31, Jose V. Carri?n <jocarbur at gmail.com> wrote: > >> Hi Nithya, >> >> My initial setup was composed of 2 similar nodes: stor1data and >> stor2data. A month ago I expanded both volumes with a new node: stor3data >> (2 bricks per volume). >> Of course, then to add the new peer with the bricks I did the 'balance >> force' operation. This task finished successfully (you can see info below) >> and number of files on the 3 nodes were very similar . >> >> For volumedisk1 I only have files of 500MB and they are continuosly >> written in sequential mode. The filename pattern of written files is: >> >> run.node1.0000.rd >> run.node2.0000.rd >> run.node1.0001.rd >> run.node2.0001.rd >> run.node1.0002.rd >> run.node2.0002.rd >> ........... >> ........... 
>> run.node1.X.rd >> run.node2.X.rd >> >> ( X ranging from 0000 to infinite ) >> >> Curiously stor1data and stor2data maintain similar ratios in bytes: >> >> Filesystem 1K-blocks Used Available >> Use% Mounted on >> /dev/sdc1 52737613824 17079174264 <(707)%20917-4264> >> 35658439560 33% /mnt/glusterfs/vol1 -> stor1data >> /dev/sdc1 52737613824 17118810848 35618802976 33% >> /mnt/glusterfs/vol1 -> stor2data >> >> However the ratio on som3data differs too much (1TB): >> Filesystem 1K-blocks Used Available >> Use% Mounted on >> /dev/sdc1 52737613824 15479191748 37258422076 30% >> /mnt/disk_c/glusterfs/vol1 -> stor3data >> /dev/sdd1 52737613824 15566398604 37171215220 30% >> /mnt/disk_d/glusterfs/vol1 -> stor3data >> >> Thinking in inodes: >> >> Filesystem Inodes IUsed IFree IUse% >> Mounted on >> /dev/sdc1 5273970048 851053 5273118995 1% >> /mnt/glusterfs/vol1 -> stor1data >> /dev/sdc1 5273970048 849388 5273120660 1% >> /mnt/glusterfs/vol1 -> stor2data >> >> /dev/sdc1 5273970048 846877 5273123171 1% >> /mnt/disk_c/glusterfs/vol1 -> stor3data >> /dev/sdd1 5273970048 845250 5273124798 1% >> /mnt/disk_d/glusterfs/vol1 -> stor3data >> >> 851053 (stor1) - 845250 (stor3) = 5803 files of difference ! >> > > The inode numbers are a little misleading here - gluster uses some to > create its own internal files and directory structures. Based on the > average file size, I think this would actually work out to a difference of > around 2000 files. > > >> >> In adition, correct me if I'm wrong, stor3data should have 50% of >> probability to store a new file (even taking into account the algorithm of >> DHT with filename patterns) >> >> Theoretically yes , but again, it depends on the filenames and their hash > distribution. > > Please send us the output of : > gluster volume rebalance <volname> status > > for the volume. > > Regards, > Nithya > > >> Thanks, >> Greetings. >> >> Jose V. 
>> >> Status of volume: volumedisk0 >> Gluster process TCP Port RDMA Port Online >> Pid >> ------------------------------------------------------------ >> ------------------ >> Brick stor1data:/mnt/glusterfs/vol0/bri >> ck1 49152 0 Y >> 13533 >> Brick stor2data:/mnt/glusterfs/vol0/bri >> ck1 49152 0 Y >> 13302 >> Brick stor3data:/mnt/disk_b1/glusterfs/ >> vol0/brick1 49152 0 Y >> 17371 >> Brick stor3data:/mnt/disk_b2/glusterfs/ >> vol0/brick1 49153 0 Y >> 17391 >> NFS Server on localhost N/A N/A N >> N/A >> NFS Server on stor3data N/A N/A N N/A >> NFS Server on stor2data N/A N/A N N/A >> >> Task Status of Volume volumedisk0 >> ------------------------------------------------------------ >> ------------------ >> Task : Rebalance >> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca >> Status : completed >> >> Status of volume: volumedisk1 >> Gluster process TCP Port RDMA Port Online >> Pid >> ------------------------------------------------------------ >> ------------------ >> Brick stor1data:/mnt/glusterfs/vol1/bri >> ck1 49153 0 Y >> 13579 >> Brick stor2data:/mnt/glusterfs/vol1/bri >> ck1 49153 0 Y >> 13344 >> Brick stor3data:/mnt/disk_c/glusterfs/v >> ol1/brick1 49154 0 Y >> 17439 >> Brick stor3data:/mnt/disk_d/glusterfs/v >> ol1/brick1 49155 0 Y >> 17459 >> NFS Server on localhost N/A N/A N >> N/A >> NFS Server on stor3data N/A N/A N N/A >> NFS Server on stor2data N/A N/A N N/A >> >> Task Status of Volume volumedisk1 >> ------------------------------------------------------------ >> ------------------ >> Task : Rebalance >> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3 >> Status : completed >> >> >> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >> >>> Hi Jose, >>> >>> On 28 February 2018 at 18:28, Jose V. Carri?n <jocarbur at gmail.com> >>> wrote: >>> >>>> Hi Nithya, >>>> >>>> I applied the workarround for this bug and now df shows the right size: >>>> >>>> That is good to hear. >>> >>> >>> >>>> [root at stor1 ~]# df -h >>>> Filesystem Size Used Avail Use% Mounted on >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> stor1data:/volumedisk0 >>>> 101T 3,3T 97T 4% /volumedisk0 >>>> stor1data:/volumedisk1 >>>> 197T 61T 136T 31% /volumedisk1 >>>> >>>> >>>> [root at stor2 ~]# df -h >>>> Filesystem Size Used Avail Use% Mounted on >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> stor2data:/volumedisk0 >>>> 101T 3,3T 97T 4% /volumedisk0 >>>> stor2data:/volumedisk1 >>>> 197T 61T 136T 31% /volumedisk1 >>>> >>>> >>>> [root at stor3 ~]# df -h >>>> Filesystem Size Used Avail Use% Mounted on >>>> /dev/sdb1 25T 638G 24T 3% /mnt/disk_b1/glusterfs/vol0 >>>> /dev/sdb2 25T 654G 24T 3% /mnt/disk_b2/glusterfs/vol0 >>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>> stor3data:/volumedisk0 >>>> 101T 3,3T 97T 4% /volumedisk0 >>>> stor3data:/volumedisk1 >>>> 197T 61T 136T 31% /volumedisk1 >>>> >>>> >>>> However I'm concerned because, as you can see, the volumedisk0 on >>>> stor3data is composed by 2 bricks on thesame disk but on different >>>> partitions (/dev/sdb1 and /dev/sdb2). >>>> After to aplly the workarround, the shared-brick-count parameter was >>>> setted to 1 in all the bricks and all the servers (see below). Could be >>>> this an issue ? >>>> >>>> No, this is correct. The shared-brick-count will be > 1 only if >>> multiple bricks share the same partition. 
>>> >>> >>> >>>> Also, I can check that stor3data is now unbalanced respect stor1data >>>> and stor2data. The three nodes have the same size of brick but stor3data >>>> bricks have used 1TB less than stor1data and stor2data: >>>> >>> >>> >>> This does not necessarily indicate a problem. The distribution need not >>> be exactly equal and depends on the filenames. Can you provide more >>> information on the kind of dataset (how many files, sizes etc) on this >>> volume? Did you create the volume with all 4 bricks or add some later? >>> >>> Regards, >>> Nithya >>> >>>> >>>> stor1data: >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> >>>> stor2data bricks: >>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>> >>>> stor3data bricks: >>>> /dev/sdb1 25T 638G 24T 3% >>>> /mnt/disk_b1/glusterfs/vol0 >>>> /dev/sdb2 25T 654G 24T 3% >>>> /mnt/disk_b2/glusterfs/vol0 >>>> dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>> >>>> >>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> >>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> >>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>> 
/var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>> option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>> shared-brick-count 0 >>>> >>>> Thaks for your help, >>>> Greetings. >>>> >>>> Jose V. >>>> >>>> >>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>>> >>>>> Hi Jose, >>>>> >>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you may >>>>> be running into this. >>>>> >>>>> The "shared-brick-count" values seem fine on stor1. Please send us "grep >>>>> -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for the >>>>> other nodes so we can check if they are the cause. >>>>> >>>>> >>>>> Regards, >>>>> Nithya >>>>> >>>>> >>>>> >>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260 >>>>> >>>>> On 28 February 2018 at 03:03, Jose V. Carri?n <jocarbur at gmail.com> >>>>> wrote: >>>>> >>>>>> Hi, >>>>>> >>>>>> Some days ago all my glusterfs configuration was working fine. Today >>>>>> I realized that the total size reported by df command was changed and is >>>>>> smaller than the aggregated capacity of all the bricks in the volume. >>>>>> >>>>>> I checked that all the volumes status are fine, all the glusterd >>>>>> daemons are running, there is no error in logs, however df shows a bad >>>>>> total size. 
>>>>>> >>>>>> My configuration for one volume: volumedisk1 >>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail >>>>>> >>>>>> Status of volume: volumedisk1 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor1data:/mnt/glusterfs/vol1/brick1 >>>>>> TCP Port : 49153 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 13579 >>>>>> File System : xfs >>>>>> Device : /dev/sdc1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.0TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273123069 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor2data:/mnt/glusterfs/vol1/brick1 >>>>>> TCP Port : 49153 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 13344 >>>>>> File System : xfs >>>>>> Device : /dev/sdc1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.0TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273124718 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor3data:/mnt/disk_c/glusterf >>>>>> s/vol1/brick1 >>>>>> TCP Port : 49154 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 17439 >>>>>> File System : xfs >>>>>> Device : /dev/sdc1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.7TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273125437 >>>>>> ------------------------------------------------------------ >>>>>> ------------------ >>>>>> Brick : Brick stor3data:/mnt/disk_d/glusterf >>>>>> s/vol1/brick1 >>>>>> TCP Port : 49155 >>>>>> RDMA Port : 0 >>>>>> Online : Y >>>>>> Pid : 17459 >>>>>> File System : xfs >>>>>> Device : /dev/sdd1 >>>>>> Mount Options : rw,noatime >>>>>> Inode Size : 512 >>>>>> Disk Space Free : 35.6TB >>>>>> Total Disk Space : 49.1TB >>>>>> Inode Count : 5273970048 >>>>>> Free Inodes : 5273127036 >>>>>> >>>>>> >>>>>> Then full size for volumedisk1 should be: 49.1TB + 49.1TB + 49.1TB >>>>>> +49.1TB = *196,4 TB *but df shows: >>>>>> >>>>>> [root at stor1 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sda2 48G 21G 25G 46% / >>>>>> tmpfs 32G 80K 32G 1% /dev/shm >>>>>> /dev/sda1 190M 62M 119M 35% /boot >>>>>> /dev/sda4 395G 251G 124G 68% /data >>>>>> /dev/sdb1 26T 601G 25T 3% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 15T 36T 29% /mnt/glusterfs/vol1 >>>>>> stor1data:/volumedisk0 >>>>>> 76T 1,6T 74T 3% /volumedisk0 >>>>>> stor1data:/volumedisk1 >>>>>> *148T* 42T 106T 29% /volumedisk1 >>>>>> >>>>>> Exactly 1 brick minus: 196,4 TB - 49,1TB = 148TB >>>>>> >>>>>> It's a production system so I hope you can help me. >>>>>> >>>>>> Thanks in advance. >>>>>> >>>>>> Jose V. 
>>>>>> >>>>>> >>>>>> Below some other data of my configuration: >>>>>> >>>>>> [root at stor1 ~]# gluster volume info >>>>>> >>>>>> Volume Name: volumedisk0 >>>>>> Type: Distribute >>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10 >>>>>> Status: Started >>>>>> Snapshot Count: 0 >>>>>> Number of Bricks: 4 >>>>>> Transport-type: tcp >>>>>> Bricks: >>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1 >>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1 >>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1 >>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1 >>>>>> Options Reconfigured: >>>>>> performance.cache-size: 4GB >>>>>> cluster.min-free-disk: 1% >>>>>> performance.io-thread-count: 16 >>>>>> performance.readdir-ahead: on >>>>>> >>>>>> Volume Name: volumedisk1 >>>>>> Type: Distribute >>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2 >>>>>> Status: Started >>>>>> Snapshot Count: 0 >>>>>> Number of Bricks: 4 >>>>>> Transport-type: tcp >>>>>> Bricks: >>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1 >>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1 >>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1 >>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1 >>>>>> Options Reconfigured: >>>>>> cluster.min-free-inodes: 6% >>>>>> performance.cache-size: 4GB >>>>>> cluster.min-free-disk: 1% >>>>>> performance.io-thread-count: 16 >>>>>> performance.readdir-ahead: on >>>>>> >>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> >>>>>> >>>>>> >>>>>> _______________________________________________ >>>>>> Gluster-users mailing list >>>>>> Gluster-users at gluster.org >>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>> >>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180301/0b598040/attachment.html>
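For context on the quoted exchange above: the "workaround" Jose applied refers to Bugzilla 1517260, where glusterd in the 3.12.x builds wrote wrong shared-brick-count values into the generated volfiles, so df divided the brick sizes incorrectly and under-reported the volume capacity. The thread never quotes the workaround itself; the one that circulated at the time was, as far as I know, a glusterd volfile filter script along the lines below. The filter directory path depends on the installed glusterfs version and packaging, so treat this as a sketch rather than the exact script used on these nodes.

    #!/bin/bash
    # Sketch of the shared-brick-count filter for BZ 1517260 (not quoted in the
    # thread; the install path below is an assumption). glusterd runs executable
    # files from its filter directory against every volfile it regenerates, so
    # this forces the option back to 1. That is only correct when each brick sits
    # on its own filesystem, as is the case for volumedisk0/volumedisk1 here.
    #
    # Install on every node (directory varies by version/distro), e.g.:
    #   install -m 0755 shared-brick-count-filter.sh /usr/lib64/glusterfs/3.12.6/filter/
    # then force a volfile regeneration with a harmless option set, e.g.:
    #   gluster volume set volumedisk1 cluster.min-free-inodes 6%

    sed -i 's/option shared-brick-count [0-9]*/option shared-brick-count 1/g' "$1"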
Jose V. Carrión
2018-Mar-01 09:55 UTC
[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
I'm sorry for my last incomplete message. Below the output of both volumes:

[root at stor1t ~]# gluster volume rebalance volumedisk1 status
     Node    Rebalanced-files         size     scanned    failures    skipped       status    run time in h:m:s
localhost              703964    16384.0PB     1475983           0          0    completed             64:37:55
stor2data              704610    16384.0PB     1475199           0          0    completed             64:31:30
stor3data              703964    16384.0PB     1475983           0          0    completed             64:37:55
volume rebalance: volumedisk1: success

[root at stor1 ~]# gluster volume rebalance volumedisk0 status
     Node    Rebalanced-files         size     scanned    failures    skipped       status    run time in h:m:s
localhost              411919        1.1GB      718044           0          0    completed              2:28:52
stor2data              435340    16384.0PB      741287           0          0    completed              2:26:01
stor3data              411919        1.1GB      718044           0          0    completed              2:28:52
volume rebalance: volumedisk0: success

And the volumedisk1 rebalance log finished saying:

[2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance is completed. Time taken is 232675.00 secs
[2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, skipped: 0

Checking my logs, the new stor3data node was added and the rebalance task was executed on 2018-02-10. From that date until now I have been storing new files. The exact sequence of commands to add the new node was:

gluster peer probe stor3data
gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0
gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b2/glusterfs/vol0
gluster volume add-brick volumedisk1 stor3data:/mnt/disk_c/glusterfs/vol1
gluster volume add-brick volumedisk1 stor3data:/mnt/disk_d/glusterfs/vol1
gluster volume rebalance volumedisk0 start force
gluster volume rebalance volumedisk1 start force

Could the DHT hash ranges assigned to the stor3data bricks be unbalanced for some reason, i.e. smaller than the ranges on stor1data and stor2data? Is there any way to verify it? And is there any way to modify/rebalance the DHT ranges so that every brick gets an equal share?

Thanks a lot,
Greetings.

Jose V.

2018-03-01 10:39 GMT+01:00 Jose V. Carrión <jocarbur at gmail.com>:

> Hi Nithya,
> Below the output of both volumes:
>
> [root at stor1t ~]# gluster volume rebalance volumedisk1 status
> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
> localhost  703964             16384.0PB   1475983   0          0         completed   64:37:55
> stor2data  704610             16384.0PB   1475199   0          0         completed   64:31:30
> stor3data  703964             16384.0PB   1475983   0          0         completed   64:37:55
> volume rebalance: volumedisk1: success
>
> [root at stor1 ~]# gluster volume rebalance volumedisk0 status
> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
> localhost  411919             1.1GB       718044    0          0         completed   2:28:52
> stor2data  435340             16384.0PB   741287    0          0         completed   2:26:01
> stor3data  411919             1.1GB       718044    0          0         completed   2:28:52
> volume rebalance: volumedisk0: success
>
> And volumedisk1 rebalance logs finished saying:
> [2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get]
> 0-volumedisk1-dht: Rebalance is completed.
Time taken is 232675.00 secs > [2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] > 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: > 1475983, failures: 0, skipped: 0 > > Checking my logs the new stor3node and the rebalance task was executed on > 2018-02-10 . From this date to now I have been storing new files. > The sequence of commands to add the node was: > > gluster peer probe stor3data > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 > > > > > > 2018-03-01 6:32 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: > >> Hi Jose, >> >> On 28 February 2018 at 22:31, Jose V. Carri?n <jocarbur at gmail.com> wrote: >> >>> Hi Nithya, >>> >>> My initial setup was composed of 2 similar nodes: stor1data and >>> stor2data. A month ago I expanded both volumes with a new node: stor3data >>> (2 bricks per volume). >>> Of course, then to add the new peer with the bricks I did the 'balance >>> force' operation. This task finished successfully (you can see info below) >>> and number of files on the 3 nodes were very similar . >>> >>> For volumedisk1 I only have files of 500MB and they are continuosly >>> written in sequential mode. The filename pattern of written files is: >>> >>> run.node1.0000.rd >>> run.node2.0000.rd >>> run.node1.0001.rd >>> run.node2.0001.rd >>> run.node1.0002.rd >>> run.node2.0002.rd >>> ........... >>> ........... >>> run.node1.X.rd >>> run.node2.X.rd >>> >>> ( X ranging from 0000 to infinite ) >>> >>> Curiously stor1data and stor2data maintain similar ratios in bytes: >>> >>> Filesystem 1K-blocks Used Available >>> Use% Mounted on >>> /dev/sdc1 52737613824 17079174264 <(707)%20917-4264> >>> 35658439560 33% /mnt/glusterfs/vol1 -> stor1data >>> /dev/sdc1 52737613824 17118810848 35618802976 33% >>> /mnt/glusterfs/vol1 -> stor2data >>> >>> However the ratio on som3data differs too much (1TB): >>> Filesystem 1K-blocks Used Available >>> Use% Mounted on >>> /dev/sdc1 52737613824 15479191748 37258422076 30% >>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>> /dev/sdd1 52737613824 15566398604 37171215220 30% >>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>> >>> Thinking in inodes: >>> >>> Filesystem Inodes IUsed IFree IUse% >>> Mounted on >>> /dev/sdc1 5273970048 851053 5273118995 1% >>> /mnt/glusterfs/vol1 -> stor1data >>> /dev/sdc1 5273970048 849388 5273120660 1% >>> /mnt/glusterfs/vol1 -> stor2data >>> >>> /dev/sdc1 5273970048 846877 5273123171 1% >>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>> /dev/sdd1 5273970048 845250 5273124798 1% >>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>> >>> 851053 (stor1) - 845250 (stor3) = 5803 files of difference ! >>> >> >> The inode numbers are a little misleading here - gluster uses some to >> create its own internal files and directory structures. Based on the >> average file size, I think this would actually work out to a difference of >> around 2000 files. >> >> >>> >>> In adition, correct me if I'm wrong, stor3data should have 50% of >>> probability to store a new file (even taking into account the algorithm of >>> DHT with filename patterns) >>> >>> Theoretically yes , but again, it depends on the filenames and their >> hash distribution. >> >> Please send us the output of : >> gluster volume rebalance <volname> status >> >> for the volume. >> >> Regards, >> Nithya >> >> >>> Thanks, >>> Greetings. >>> >>> Jose V. 
>>> >>> Status of volume: volumedisk0 >>> Gluster process TCP Port RDMA Port Online >>> Pid >>> ------------------------------------------------------------ >>> ------------------ >>> Brick stor1data:/mnt/glusterfs/vol0/bri >>> ck1 49152 0 Y >>> 13533 >>> Brick stor2data:/mnt/glusterfs/vol0/bri >>> ck1 49152 0 Y >>> 13302 >>> Brick stor3data:/mnt/disk_b1/glusterfs/ >>> vol0/brick1 49152 0 Y >>> 17371 >>> Brick stor3data:/mnt/disk_b2/glusterfs/ >>> vol0/brick1 49153 0 Y >>> 17391 >>> NFS Server on localhost N/A N/A N >>> N/A >>> NFS Server on stor3data N/A N/A N N/A >>> >>> NFS Server on stor2data N/A N/A N N/A >>> >>> >>> Task Status of Volume volumedisk0 >>> ------------------------------------------------------------ >>> ------------------ >>> Task : Rebalance >>> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca >>> Status : completed >>> >>> Status of volume: volumedisk1 >>> Gluster process TCP Port RDMA Port Online >>> Pid >>> ------------------------------------------------------------ >>> ------------------ >>> Brick stor1data:/mnt/glusterfs/vol1/bri >>> ck1 49153 0 Y >>> 13579 >>> Brick stor2data:/mnt/glusterfs/vol1/bri >>> ck1 49153 0 Y >>> 13344 >>> Brick stor3data:/mnt/disk_c/glusterfs/v >>> ol1/brick1 49154 0 Y >>> 17439 >>> Brick stor3data:/mnt/disk_d/glusterfs/v >>> ol1/brick1 49155 0 Y >>> 17459 >>> NFS Server on localhost N/A N/A N >>> N/A >>> NFS Server on stor3data N/A N/A N N/A >>> >>> NFS Server on stor2data N/A N/A N N/A >>> >>> >>> Task Status of Volume volumedisk1 >>> ------------------------------------------------------------ >>> ------------------ >>> Task : Rebalance >>> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3 >>> Status : completed >>> >>> >>> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>> >>>> Hi Jose, >>>> >>>> On 28 February 2018 at 18:28, Jose V. Carri?n <jocarbur at gmail.com> >>>> wrote: >>>> >>>>> Hi Nithya, >>>>> >>>>> I applied the workarround for this bug and now df shows the right size: >>>>> >>>>> That is good to hear. >>>> >>>> >>>> >>>>> [root at stor1 ~]# df -h >>>>> Filesystem Size Used Avail Use% Mounted on >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> stor1data:/volumedisk0 >>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>> stor1data:/volumedisk1 >>>>> 197T 61T 136T 31% /volumedisk1 >>>>> >>>>> >>>>> [root at stor2 ~]# df -h >>>>> Filesystem Size Used Avail Use% Mounted on >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> stor2data:/volumedisk0 >>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>> stor2data:/volumedisk1 >>>>> 197T 61T 136T 31% /volumedisk1 >>>>> >>>>> >>>>> [root at stor3 ~]# df -h >>>>> Filesystem Size Used Avail Use% Mounted on >>>>> /dev/sdb1 25T 638G 24T 3% /mnt/disk_b1/glusterfs/vol0 >>>>> /dev/sdb2 25T 654G 24T 3% /mnt/disk_b2/glusterfs/vol0 >>>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>>> stor3data:/volumedisk0 >>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>> stor3data:/volumedisk1 >>>>> 197T 61T 136T 31% /volumedisk1 >>>>> >>>>> >>>>> However I'm concerned because, as you can see, the volumedisk0 on >>>>> stor3data is composed by 2 bricks on thesame disk but on different >>>>> partitions (/dev/sdb1 and /dev/sdb2). >>>>> After to aplly the workarround, the shared-brick-count parameter was >>>>> setted to 1 in all the bricks and all the servers (see below). Could be >>>>> this an issue ? >>>>> >>>>> No, this is correct. 
The shared-brick-count will be > 1 only if >>>> multiple bricks share the same partition. >>>> >>>> >>>> >>>>> Also, I can check that stor3data is now unbalanced respect stor1data >>>>> and stor2data. The three nodes have the same size of brick but stor3data >>>>> bricks have used 1TB less than stor1data and stor2data: >>>>> >>>> >>>> >>>> This does not necessarily indicate a problem. The distribution need not >>>> be exactly equal and depends on the filenames. Can you provide more >>>> information on the kind of dataset (how many files, sizes etc) on this >>>> volume? Did you create the volume with all 4 bricks or add some later? >>>> >>>> Regards, >>>> Nithya >>>> >>>>> >>>>> stor1data: >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> >>>>> stor2data bricks: >>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>> >>>>> stor3data bricks: >>>>> /dev/sdb1 25T 638G 24T 3% >>>>> /mnt/disk_b1/glusterfs/vol0 >>>>> /dev/sdb2 25T 654G 24T 3% >>>>> /mnt/disk_b2/glusterfs/vol0 >>>>> dev/sdc1 50T 15T 35T 30% >>>>> /mnt/disk_c/glusterfs/vol1 >>>>> /dev/sdd1 50T 15T 35T 30% >>>>> /mnt/disk_d/glusterfs/vol1 >>>>> >>>>> >>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> >>>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt 
>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> >>>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>> option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>> shared-brick-count 0 >>>>> >>>>> Thaks for your help, >>>>> Greetings. >>>>> >>>>> Jose V. >>>>> >>>>> >>>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>>>> >>>>>> Hi Jose, >>>>>> >>>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you >>>>>> may be running into this. >>>>>> >>>>>> The "shared-brick-count" values seem fine on stor1. Please send us "grep >>>>>> -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for the >>>>>> other nodes so we can check if they are the cause. >>>>>> >>>>>> >>>>>> Regards, >>>>>> Nithya >>>>>> >>>>>> >>>>>> >>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260 >>>>>> >>>>>> On 28 February 2018 at 03:03, Jose V. Carri?n <jocarbur at gmail.com> >>>>>> wrote: >>>>>> >>>>>>> Hi, >>>>>>> >>>>>>> Some days ago all my glusterfs configuration was working fine. Today >>>>>>> I realized that the total size reported by df command was changed and is >>>>>>> smaller than the aggregated capacity of all the bricks in the volume. >>>>>>> >>>>>>> I checked that all the volumes status are fine, all the glusterd >>>>>>> daemons are running, there is no error in logs, however df shows a bad >>>>>>> total size. 
>>>>>>> >>>>>>> My configuration for one volume: volumedisk1 >>>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail >>>>>>> >>>>>>> Status of volume: volumedisk1 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor1data:/mnt/glusterfs/vol1/brick1 >>>>>>> TCP Port : 49153 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 13579 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdc1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.0TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273123069 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor2data:/mnt/glusterfs/vol1/brick1 >>>>>>> TCP Port : 49153 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 13344 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdc1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.0TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273124718 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor3data:/mnt/disk_c/glusterf >>>>>>> s/vol1/brick1 >>>>>>> TCP Port : 49154 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 17439 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdc1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.7TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273125437 >>>>>>> ------------------------------------------------------------ >>>>>>> ------------------ >>>>>>> Brick : Brick stor3data:/mnt/disk_d/glusterf >>>>>>> s/vol1/brick1 >>>>>>> TCP Port : 49155 >>>>>>> RDMA Port : 0 >>>>>>> Online : Y >>>>>>> Pid : 17459 >>>>>>> File System : xfs >>>>>>> Device : /dev/sdd1 >>>>>>> Mount Options : rw,noatime >>>>>>> Inode Size : 512 >>>>>>> Disk Space Free : 35.6TB >>>>>>> Total Disk Space : 49.1TB >>>>>>> Inode Count : 5273970048 >>>>>>> Free Inodes : 5273127036 >>>>>>> >>>>>>> >>>>>>> Then full size for volumedisk1 should be: 49.1TB + 49.1TB + 49.1TB >>>>>>> +49.1TB = *196,4 TB *but df shows: >>>>>>> >>>>>>> [root at stor1 ~]# df -h >>>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>>> /dev/sda2 48G 21G 25G 46% / >>>>>>> tmpfs 32G 80K 32G 1% /dev/shm >>>>>>> /dev/sda1 190M 62M 119M 35% /boot >>>>>>> /dev/sda4 395G 251G 124G 68% /data >>>>>>> /dev/sdb1 26T 601G 25T 3% /mnt/glusterfs/vol0 >>>>>>> /dev/sdc1 50T 15T 36T 29% /mnt/glusterfs/vol1 >>>>>>> stor1data:/volumedisk0 >>>>>>> 76T 1,6T 74T 3% /volumedisk0 >>>>>>> stor1data:/volumedisk1 >>>>>>> *148T* 42T 106T 29% /volumedisk1 >>>>>>> >>>>>>> Exactly 1 brick minus: 196,4 TB - 49,1TB = 148TB >>>>>>> >>>>>>> It's a production system so I hope you can help me. >>>>>>> >>>>>>> Thanks in advance. >>>>>>> >>>>>>> Jose V. 
>>>>>>> >>>>>>> >>>>>>> Below some other data of my configuration: >>>>>>> >>>>>>> [root at stor1 ~]# gluster volume info >>>>>>> >>>>>>> Volume Name: volumedisk0 >>>>>>> Type: Distribute >>>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10 >>>>>>> Status: Started >>>>>>> Snapshot Count: 0 >>>>>>> Number of Bricks: 4 >>>>>>> Transport-type: tcp >>>>>>> Bricks: >>>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1 >>>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1 >>>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1 >>>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1 >>>>>>> Options Reconfigured: >>>>>>> performance.cache-size: 4GB >>>>>>> cluster.min-free-disk: 1% >>>>>>> performance.io-thread-count: 16 >>>>>>> performance.readdir-ahead: on >>>>>>> >>>>>>> Volume Name: volumedisk1 >>>>>>> Type: Distribute >>>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2 >>>>>>> Status: Started >>>>>>> Snapshot Count: 0 >>>>>>> Number of Bricks: 4 >>>>>>> Transport-type: tcp >>>>>>> Bricks: >>>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1 >>>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1 >>>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1 >>>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1 >>>>>>> Options Reconfigured: >>>>>>> cluster.min-free-inodes: 6% >>>>>>> performance.cache-size: 4GB >>>>>>> cluster.min-free-disk: 1% >>>>>>> performance.io-thread-count: 16 >>>>>>> performance.readdir-ahead: on >>>>>>> >>>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>>> -glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>>> shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0 >>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>>> shared-brick-count 0 >>>>>>> >>>>>>> >>>>>>> >>>>>>> _______________________________________________ >>>>>>> Gluster-users mailing list >>>>>>> Gluster-users at gluster.org >>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users >>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >-------------- next part -------------- An HTML attachment was scrubbed... URL: <http://lists.gluster.org/pipermail/gluster-users/attachments/20180301/858e3786/attachment.html>
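On the question above about verifying the DHT range assigned to each brick: DHT stores the layout of every directory as a trusted.glusterfs.dht extended attribute on each brick's copy of that directory, and it can be read directly on the servers. Below is a sketch using the brick paths from this thread and a placeholder directory name; my reading of the xattr is that the last two 32-bit words are the start and end of the hash range that brick serves for the directory, so confirm against the DHT documentation for your version before acting on it.

    # Read the layout of the same directory on every brick of volumedisk1.
    # 'some_dir' is a placeholder; use any directory that exists on the volume.
    getfattr -n trusted.glusterfs.dht -e hex /mnt/glusterfs/vol1/brick1/some_dir          # on stor1data
    getfattr -n trusted.glusterfs.dht -e hex /mnt/glusterfs/vol1/brick1/some_dir          # on stor2data
    getfattr -n trusted.glusterfs.dht -e hex /mnt/disk_c/glusterfs/vol1/brick1/some_dir   # on stor3data
    getfattr -n trusted.glusterfs.dht -e hex /mnt/disk_d/glusterfs/vol1/brick1/some_dir   # on stor3data

    # Hypothetical output for one brick:
    #   trusted.glusterfs.dht=0x0000000100000000000000003fffffff
    # The last 16 hex digits split into the start (00000000) and end (3fffffff) of
    # the 32-bit hash space served by that brick for this directory. With 4 bricks
    # and an even layout, each brick covers roughly a quarter of 0x00000000-0xffffffff.
    # If a directory layout turns out to be badly skewed,
    #   gluster volume rebalance volumedisk1 fix-layout start
    # recalculates the layouts without migrating data.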
Nithya Balachandran
2018-Mar-01 11:32 UTC
[Gluster-users] df reports wrong full capacity for distributed volumes (Glusterfs 3.12.6-1)
On 1 March 2018 at 15:25, Jose V. Carri?n <jocarbur at gmail.com> wrote:> I'm sorry for my last incomplete message. > > Below the output of both volumes: > > [root at stor1t ~]# gluster volume rebalance volumedisk1 status > Node Rebalanced-files size > scanned failures skipped status run time in > h:m:s > --------- ----------- ----------- > ----------- ----------- ----------- ------------ > -------------- > localhost 703964 16384.0PB > 1475983 0 0 completed 64:37:55 > stor2data 704610 16384.0PB > 1475199 0 0 completed 64:31:30 > stor3data 703964 16384.0PB > 1475983 0 0 completed 64:37:55 > volume rebalance: volumedisk1: success > > [root at stor1 ~]# gluster volume rebalance volumedisk0 status > Node Rebalanced-files size > scanned failures skipped status run time in > h:m:s > --------- ----------- ----------- > ----------- ----------- ----------- ------------ > -------------- > localhost 411919 1.1GB > 718044 0 0 completed 2:28:52 > stor2data 435340 16384.0PB > 741287 0 0 completed 2:26:01 > stor3data 411919 1.1GB > 718044 0 0 completed 2:28:52 > volume rebalance: volumedisk0: success > > And volumedisk1 rebalance logs finished saying: > [2018-02-13 03:47:48.703311] I [MSGID: 109028] > [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance > is completed. Time taken is 232675.00 secs > [2018-02-13 03:47:48.703351] I [MSGID: 109028] > [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files > migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, > skipped: 0 > > Checking my logs the new stor3node and the rebalance task was executed on > 2018-02-10 . From this date to now I have been storing new files. > The exact sequence of commands to add the new node was: > > gluster peer probe stor3data > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0 > > gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b2/glusterfs/vol0 > > gluster volume add-brick volumedisk1 stor3data:/mnt/disk_c/glusterfs/vol1 > > gluster volume add-brick volumedisk1 stor3data:/mnt/disk_d/glusterfs/vol1 > > gluster volume rebalance volumedisk0 start force > > gluster volume rebalance volumedisk1 start force > > For some reason , could be unbalanced the assigned range of DHT for > stor3data bricks ? Could be minor than stor1data and stor2data ? , > > Any way to verify it ? > > Any way to modify/rebalance the DHT range between bricks in order to > unify the DHT range per brick ?. > > Thanks a lot, > > Greetings. > > Jose V. > > > 2018-03-01 10:39 GMT+01:00 Jose V. 
Carrión <jocarbur at gmail.com>:
>
>> Hi Nithya,
>> Below the output of both volumes:
>>
>> [root at stor1t ~]# gluster volume rebalance volumedisk1 status
>> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
>> localhost  703964             16384.0PB   1475983   0          0         completed   64:37:55
>> stor2data  704610             16384.0PB   1475199   0          0         completed   64:31:30
>> stor3data  703964             16384.0PB   1475983   0          0         completed   64:37:55
>> volume rebalance: volumedisk1: success
>>
>> [root at stor1 ~]# gluster volume rebalance volumedisk0 status
>> Node       Rebalanced-files   size        scanned   failures   skipped   status      run time in h:m:s
>> localhost  411919             1.1GB       718044    0          0         completed   2:28:52
>> stor2data  435340             16384.0PB   741287    0          0         completed   2:26:01
>> stor3data  411919             1.1GB       718044    0          0         completed   2:28:52
>> volume rebalance: volumedisk0: success
>>
>> And volumedisk1 rebalance logs finished saying:
>> [2018-02-13 03:47:48.703311] I [MSGID: 109028] [dht-rebalance.c:5053:gf_defrag_status_get] 0-volumedisk1-dht: Rebalance is completed. Time taken is 232675.00 secs
>> [2018-02-13 03:47:48.703351] I [MSGID: 109028] [dht-rebalance.c:5057:gf_defrag_status_get] 0-volumedisk1-dht: Files migrated: 703964, size: 14046969178073, lookups: 1475983, failures: 0, skipped: 0
>>
>> Checking my logs the new stor3node and the rebalance task was executed on
>> 2018-02-10 . From this date to now I have been storing new files.
>> The sequence of commands to add the node was:
>>
>> gluster peer probe stor3data
>>
>> gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0
>>
>> gluster volume add-brick volumedisk0 stor3data:/mnt/disk_b1/glusterfs/vol0
>>
>>

While it is odd that both bricks on the third node show similar usage, I do not see a problem in the steps or the status. Can you keep an eye on this and let us know if this continues to be the case?

>>
>> 2018-03-01 6:32 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>>
>>> Hi Jose,
>>>
>>> On 28 February 2018 at 22:31, Jose V. Carrión <jocarbur at gmail.com> wrote:
>>>
>>>> Hi Nithya,
>>>>
>>>> My initial setup was composed of 2 similar nodes: stor1data and
>>>> stor2data. A month ago I expanded both volumes with a new node: stor3data
>>>> (2 bricks per volume).
>>>> Of course, then to add the new peer with the bricks I did the 'balance
>>>> force' operation. This task finished successfully (you can see info below)
>>>> and number of files on the 3 nodes were very similar.
>>>>
>>>> For volumedisk1 I only have files of 500MB and they are continuosly
>>>> written in sequential mode. The filename pattern of written files is:
>>>>
>>>> run.node1.0000.rd
>>>> run.node2.0000.rd
>>>> run.node1.0001.rd
>>>> run.node2.0001.rd
>>>> run.node1.0002.rd
>>>> run.node2.0002.rd
>>>> ...........
>>>> ...........
>>>> run.node1.X.rd >>>> run.node2.X.rd >>>> >>>> ( X ranging from 0000 to infinite ) >>>> >>>> Curiously stor1data and stor2data maintain similar ratios in bytes: >>>> >>>> Filesystem 1K-blocks Used Available >>>> Use% Mounted on >>>> /dev/sdc1 52737613824 17079174264 <(707)%20917-4264> >>>> 35658439560 33% /mnt/glusterfs/vol1 -> stor1data >>>> /dev/sdc1 52737613824 17118810848 35618802976 33% >>>> /mnt/glusterfs/vol1 -> stor2data >>>> >>>> However the ratio on som3data differs too much (1TB): >>>> Filesystem 1K-blocks Used Available >>>> Use% Mounted on >>>> /dev/sdc1 52737613824 15479191748 37258422076 30% >>>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>>> /dev/sdd1 52737613824 15566398604 37171215220 30% >>>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>>> >>>> Thinking in inodes: >>>> >>>> Filesystem Inodes IUsed IFree IUse% >>>> Mounted on >>>> /dev/sdc1 5273970048 851053 5273118995 1% >>>> /mnt/glusterfs/vol1 -> stor1data >>>> /dev/sdc1 5273970048 849388 5273120660 1% >>>> /mnt/glusterfs/vol1 -> stor2data >>>> >>>> /dev/sdc1 5273970048 846877 5273123171 1% >>>> /mnt/disk_c/glusterfs/vol1 -> stor3data >>>> /dev/sdd1 5273970048 845250 5273124798 1% >>>> /mnt/disk_d/glusterfs/vol1 -> stor3data >>>> >>>> 851053 (stor1) - 845250 (stor3) = 5803 files of difference ! >>>> >>> >>> The inode numbers are a little misleading here - gluster uses some to >>> create its own internal files and directory structures. Based on the >>> average file size, I think this would actually work out to a difference of >>> around 2000 files. >>> >>> >>>> >>>> In adition, correct me if I'm wrong, stor3data should have 50% of >>>> probability to store a new file (even taking into account the algorithm of >>>> DHT with filename patterns) >>>> >>>> Theoretically yes , but again, it depends on the filenames and their >>> hash distribution. >>> >>> Please send us the output of : >>> gluster volume rebalance <volname> status >>> >>> for the volume. >>> >>> Regards, >>> Nithya >>> >>> >>>> Thanks, >>>> Greetings. >>>> >>>> Jose V. 
>>>> >>>> Status of volume: volumedisk0 >>>> Gluster process TCP Port RDMA Port Online >>>> Pid >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Brick stor1data:/mnt/glusterfs/vol0/bri >>>> ck1 49152 0 Y >>>> 13533 >>>> Brick stor2data:/mnt/glusterfs/vol0/bri >>>> ck1 49152 0 Y >>>> 13302 >>>> Brick stor3data:/mnt/disk_b1/glusterfs/ >>>> vol0/brick1 49152 0 Y >>>> 17371 >>>> Brick stor3data:/mnt/disk_b2/glusterfs/ >>>> vol0/brick1 49153 0 Y >>>> 17391 >>>> NFS Server on localhost N/A N/A N >>>> N/A >>>> NFS Server on stor3data N/A N/A N >>>> N/A >>>> NFS Server on stor2data N/A N/A N >>>> N/A >>>> >>>> Task Status of Volume volumedisk0 >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Task : Rebalance >>>> ID : 7f5328cb-ed25-4627-9196-fb3e29e0e4ca >>>> Status : completed >>>> >>>> Status of volume: volumedisk1 >>>> Gluster process TCP Port RDMA Port Online >>>> Pid >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Brick stor1data:/mnt/glusterfs/vol1/bri >>>> ck1 49153 0 Y >>>> 13579 >>>> Brick stor2data:/mnt/glusterfs/vol1/bri >>>> ck1 49153 0 Y >>>> 13344 >>>> Brick stor3data:/mnt/disk_c/glusterfs/v >>>> ol1/brick1 49154 0 Y >>>> 17439 >>>> Brick stor3data:/mnt/disk_d/glusterfs/v >>>> ol1/brick1 49155 0 Y >>>> 17459 >>>> NFS Server on localhost N/A N/A N >>>> N/A >>>> NFS Server on stor3data N/A N/A N >>>> N/A >>>> NFS Server on stor2data N/A N/A N >>>> N/A >>>> >>>> Task Status of Volume volumedisk1 >>>> ------------------------------------------------------------ >>>> ------------------ >>>> Task : Rebalance >>>> ID : d0048704-beeb-4a6a-ae94-7e7916423fd3 >>>> Status : completed >>>> >>>> >>>> 2018-02-28 15:40 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>: >>>> >>>>> Hi Jose, >>>>> >>>>> On 28 February 2018 at 18:28, Jose V. Carri?n <jocarbur at gmail.com> >>>>> wrote: >>>>> >>>>>> Hi Nithya, >>>>>> >>>>>> I applied the workarround for this bug and now df shows the right >>>>>> size: >>>>>> >>>>>> That is good to hear. >>>>> >>>>> >>>>> >>>>>> [root at stor1 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> stor1data:/volumedisk0 >>>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>>> stor1data:/volumedisk1 >>>>>> 197T 61T 136T 31% /volumedisk1 >>>>>> >>>>>> >>>>>> [root at stor2 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> stor2data:/volumedisk0 >>>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>>> stor2data:/volumedisk1 >>>>>> 197T 61T 136T 31% /volumedisk1 >>>>>> >>>>>> >>>>>> [root at stor3 ~]# df -h >>>>>> Filesystem Size Used Avail Use% Mounted on >>>>>> /dev/sdb1 25T 638G 24T 3% >>>>>> /mnt/disk_b1/glusterfs/vol0 >>>>>> /dev/sdb2 25T 654G 24T 3% >>>>>> /mnt/disk_b2/glusterfs/vol0 >>>>>> /dev/sdc1 50T 15T 35T 30% /mnt/disk_c/glusterfs/vol1 >>>>>> /dev/sdd1 50T 15T 35T 30% /mnt/disk_d/glusterfs/vol1 >>>>>> stor3data:/volumedisk0 >>>>>> 101T 3,3T 97T 4% /volumedisk0 >>>>>> stor3data:/volumedisk1 >>>>>> 197T 61T 136T 31% /volumedisk1 >>>>>> >>>>>> >>>>>> However I'm concerned because, as you can see, the volumedisk0 on >>>>>> stor3data is composed by 2 bricks on thesame disk but on different >>>>>> partitions (/dev/sdb1 and /dev/sdb2). 
>>>>>> After to aplly the workarround, the shared-brick-count parameter was >>>>>> setted to 1 in all the bricks and all the servers (see below). Could be >>>>>> this an issue ? >>>>>> >>>>>> No, this is correct. The shared-brick-count will be > 1 only if >>>>> multiple bricks share the same partition. >>>>> >>>>> >>>>> >>>>>> Also, I can check that stor3data is now unbalanced respect stor1data >>>>>> and stor2data. The three nodes have the same size of brick but stor3data >>>>>> bricks have used 1TB less than stor1data and stor2data: >>>>>> >>>>> >>>>> >>>>> This does not necessarily indicate a problem. The distribution need >>>>> not be exactly equal and depends on the filenames. Can you provide more >>>>> information on the kind of dataset (how many files, sizes etc) on this >>>>> volume? Did you create the volume with all 4 bricks or add some later? >>>>> >>>>> Regards, >>>>> Nithya >>>>> >>>>>> >>>>>> stor1data: >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> >>>>>> stor2data bricks: >>>>>> /dev/sdb1 26T 1,1T 25T 4% /mnt/glusterfs/vol0 >>>>>> /dev/sdc1 50T 16T 34T 33% /mnt/glusterfs/vol1 >>>>>> >>>>>> stor3data bricks: >>>>>> /dev/sdb1 25T 638G 24T 3% >>>>>> /mnt/disk_b1/glusterfs/vol0 >>>>>> /dev/sdb2 25T 654G 24T 3% >>>>>> /mnt/disk_b2/glusterfs/vol0 >>>>>> dev/sdc1 50T 15T 35T 30% >>>>>> /mnt/disk_c/glusterfs/vol1 >>>>>> /dev/sdd1 50T 15T 35T 30% >>>>>> /mnt/disk_d/glusterfs/vol1 >>>>>> >>>>>> >>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option >>>>>> shared-brick-count 0 >>>>>> >>>>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/* >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: >>>>>> option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt >>>>>> -glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> -disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1 >>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt >>>>>> 
>>>>>>
>>>>>> stor1data bricks:
>>>>>> /dev/sdb1   26T  1,1T   25T    4%  /mnt/glusterfs/vol0
>>>>>> /dev/sdc1   50T   16T   34T   33%  /mnt/glusterfs/vol1
>>>>>>
>>>>>> stor2data bricks:
>>>>>> /dev/sdb1   26T  1,1T   25T    4%  /mnt/glusterfs/vol0
>>>>>> /dev/sdc1   50T   16T   34T   33%  /mnt/glusterfs/vol1
>>>>>>
>>>>>> stor3data bricks:
>>>>>> /dev/sdb1   25T  638G   24T    3%  /mnt/disk_b1/glusterfs/vol0
>>>>>> /dev/sdb2   25T  654G   24T    3%  /mnt/disk_b2/glusterfs/vol0
>>>>>> /dev/sdc1   50T   15T   35T   30%  /mnt/disk_c/glusterfs/vol1
>>>>>> /dev/sdd1   50T   15T   35T   30%  /mnt/disk_d/glusterfs/vol1
>>>>>>
>>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>
>>>>>> [root at stor2 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>
>>>>>> [root at stor3t ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>
>>>>>> Thanks for your help,
>>>>>> Greetings.
>>>>>>
>>>>>> Jose V.
>>>>>>
>>>>>>
>>>>>> 2018-02-28 5:07 GMT+01:00 Nithya Balachandran <nbalacha at redhat.com>:
>>>>>>
>>>>>>> Hi Jose,
>>>>>>>
>>>>>>> There is a known issue with gluster 3.12.x builds (see [1]) so you
>>>>>>> may be running into this.
>>>>>>>
>>>>>>> The "shared-brick-count" values seem fine on stor1. Please send us
>>>>>>> "grep -n "share" /var/lib/glusterd/vols/volumedisk1/*" results for
>>>>>>> the other nodes so we can check if they are the cause.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nithya
>>>>>>>
>>>>>>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1517260
>>>>>>>
>>>>>>> On 28 February 2018 at 03:03, Jose V. Carrión <jocarbur at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> Some days ago all my glusterfs configuration was working fine.
>>>>>>>> Today I realized that the total size reported by the df command
>>>>>>>> had changed and is smaller than the aggregated capacity of all
>>>>>>>> the bricks in the volume.
>>>>>>>>
>>>>>>>> I checked that all the volume statuses are fine, all the glusterd
>>>>>>>> daemons are running and there are no errors in the logs; however,
>>>>>>>> df shows a wrong total size.
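>>>>>>>>
>>>>>>>> (For reference, the checks I ran were roughly along these lines -
>>>>>>>> only a sketch, not the exact commands:)
>>>>>>>>
>>>>>>>> gluster volume status volumedisk1          # every brick shows Online "Y"
>>>>>>>> gluster peer status                        # all peers connected
>>>>>>>> grep -iE "error|critical" /var/log/glusterfs/*.log | tail -n 20   # nothing relevant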
>>>>>>>>
>>>>>>>> My configuration for one volume: volumedisk1
>>>>>>>>
>>>>>>>> [root at stor1 ~]# gluster volume status volumedisk1 detail
>>>>>>>>
>>>>>>>> Status of volume: volumedisk1
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor1data:/mnt/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49153
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 13579
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdc1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.0TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273123069
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor2data:/mnt/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49153
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 13344
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdc1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.0TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273124718
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49154
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 17439
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdc1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.7TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273125437
>>>>>>>> ------------------------------------------------------------------------------
>>>>>>>> Brick                : Brick stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>>>>> TCP Port             : 49155
>>>>>>>> RDMA Port            : 0
>>>>>>>> Online               : Y
>>>>>>>> Pid                  : 17459
>>>>>>>> File System          : xfs
>>>>>>>> Device               : /dev/sdd1
>>>>>>>> Mount Options        : rw,noatime
>>>>>>>> Inode Size           : 512
>>>>>>>> Disk Space Free      : 35.6TB
>>>>>>>> Total Disk Space     : 49.1TB
>>>>>>>> Inode Count          : 5273970048
>>>>>>>> Free Inodes          : 5273127036
>>>>>>>>
>>>>>>>> The full size for volumedisk1 should therefore be: 49.1TB + 49.1TB +
>>>>>>>> 49.1TB + 49.1TB = *196,4 TB*, but df shows:
>>>>>>>>
>>>>>>>> [root at stor1 ~]# df -h
>>>>>>>> Filesystem              Size   Used  Avail  Use%  Mounted on
>>>>>>>> /dev/sda2                48G    21G    25G   46%  /
>>>>>>>> tmpfs                    32G    80K    32G    1%  /dev/shm
>>>>>>>> /dev/sda1               190M    62M   119M   35%  /boot
>>>>>>>> /dev/sda4               395G   251G   124G   68%  /data
>>>>>>>> /dev/sdb1                26T   601G    25T    3%  /mnt/glusterfs/vol0
>>>>>>>> /dev/sdc1                50T    15T    36T   29%  /mnt/glusterfs/vol1
>>>>>>>> stor1data:/volumedisk0   76T   1,6T    74T    3%  /volumedisk0
>>>>>>>> stor1data:/volumedisk1  *148T*  42T   106T   29%  /volumedisk1
>>>>>>>>
>>>>>>>> That is practically one full brick missing: 196,4 TB - 49,1 TB is
>>>>>>>> about 147 TB, and df reports 148T.
>>>>>>>>
>>>>>>>> It's a production system so I hope you can help me.
>>>>>>>>
>>>>>>>> Thanks in advance.
>>>>>>>>
>>>>>>>> Jose V.
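>>>>>>>>
>>>>>>>> (By the way, the expected total can be recomputed directly from
>>>>>>>> the detail output above - a rough sketch, assuming every brick
>>>>>>>> size is reported in TB as shown:)
>>>>>>>>
>>>>>>>> gluster volume status volumedisk1 detail | \
>>>>>>>>   awk -F: '/Total Disk Space/ {gsub(/[ TB]/, "", $2); sum += $2}
>>>>>>>>            END {printf "expected volume size: %.1f TB\n", sum}'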
>>>>>>>>
>>>>>>>> Below some other data of my configuration:
>>>>>>>>
>>>>>>>> [root at stor1 ~]# gluster volume info
>>>>>>>>
>>>>>>>> Volume Name: volumedisk0
>>>>>>>> Type: Distribute
>>>>>>>> Volume ID: 0ee52d94-1131-4061-bcef-bd8cf898da10
>>>>>>>> Status: Started
>>>>>>>> Snapshot Count: 0
>>>>>>>> Number of Bricks: 4
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: stor1data:/mnt/glusterfs/vol0/brick1
>>>>>>>> Brick2: stor2data:/mnt/glusterfs/vol0/brick1
>>>>>>>> Brick3: stor3data:/mnt/disk_b1/glusterfs/vol0/brick1
>>>>>>>> Brick4: stor3data:/mnt/disk_b2/glusterfs/vol0/brick1
>>>>>>>> Options Reconfigured:
>>>>>>>> performance.cache-size: 4GB
>>>>>>>> cluster.min-free-disk: 1%
>>>>>>>> performance.io-thread-count: 16
>>>>>>>> performance.readdir-ahead: on
>>>>>>>>
>>>>>>>> Volume Name: volumedisk1
>>>>>>>> Type: Distribute
>>>>>>>> Volume ID: 591b7098-800e-4954-82a9-6b6d81c9e0a2
>>>>>>>> Status: Started
>>>>>>>> Snapshot Count: 0
>>>>>>>> Number of Bricks: 4
>>>>>>>> Transport-type: tcp
>>>>>>>> Bricks:
>>>>>>>> Brick1: stor1data:/mnt/glusterfs/vol1/brick1
>>>>>>>> Brick2: stor2data:/mnt/glusterfs/vol1/brick1
>>>>>>>> Brick3: stor3data:/mnt/disk_c/glusterfs/vol1/brick1
>>>>>>>> Brick4: stor3data:/mnt/disk_d/glusterfs/vol1/brick1
>>>>>>>> Options Reconfigured:
>>>>>>>> cluster.min-free-inodes: 6%
>>>>>>>> performance.cache-size: 4GB
>>>>>>>> cluster.min-free-disk: 1%
>>>>>>>> performance.io-thread-count: 16
>>>>>>>> performance.readdir-ahead: on
>>>>>>>>
>>>>>>>> [root at stor1 ~]# grep -n "share" /var/lib/glusterd/vols/volumedisk1/*
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 1
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor1data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 1
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor2data.mnt-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_c-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol:3: option shared-brick-count 0
>>>>>>>> /var/lib/glusterd/vols/volumedisk1/volumedisk1.stor3data.mnt-disk_d-glusterfs-vol1-brick1.vol.rpmsave:3: option shared-brick-count 0
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> Gluster-users mailing list
>>>>>>>> Gluster-users at gluster.org
>>>>>>>> http://lists.gluster.org/mailman/listinfo/gluster-users
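
For completeness, this is how I am now collecting the shared-brick-count
values from the three nodes after applying the workaround (only a rough
sketch; it assumes root ssh access from stor1data to the other peers):

for h in stor1data stor2data stor3data; do
    echo "== $h =="
    # shared-brick-count should now be 1 for every brick of this volume
    ssh "$h" 'grep -n "shared-brick-count" /var/lib/glusterd/vols/volumedisk1/*.vol'
done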