thr3ads.net - Gluster users - [Gluster-users] Rebalance taking


Rusty Bower

2018-Jul-23 08:19 UTC

[Gluster-users] Rebalance taking > 2 months

datanode03 is the newest brick

the bricks had gotten pretty full, which I think might be part of the issue:
- datanode01 /dev/sda1                 51T   48T  3.3T  94% /mnt/data
- datanode02 /dev/sda1                 51T   48T  3.4T  94% /mnt/data
- datanode03 /dev/md0                 128T  4.6T  123T   4% /mnt/data

each of the bricks is on a completely separate disk from the OS

I'll shoot you the log files offline :)

Thanks!
Rusty

On Mon, Jul 23, 2018 at 3:12 AM, Nithya Balachandran <nbalacha at redhat.com> wrote:
> Hi Rusty,
>
> Sorry I took so long to get back to you.
>
> Which is the newly added brick? I see datanode02 has not picked up any
> files for migration which is odd.
> How full are the individual bricks (df -h output)?
> Is each of your bricks in a separate partition?
> Can you send me the rebalance logs from all 3 nodes (offline if you
> prefer)?
>
> We can try using scripts to speed up the rebalance if you prefer.
>
> Regards,
> Nithya
>
>
>
> On 16 July 2018 at 22:06, Rusty Bower <rusty at rustybower.com> wrote:
>
>> Thanks for the reply Nithya.
>>
>> 1. glusterfs 4.1.1
>>
>> 2. Volume Name: data
>> Type: Distribute
>> Volume ID: 294d95ce-0ff3-4df9-bd8c-a52fc50442ba
>> Status: Started
>> Snapshot Count: 0
>> Number of Bricks: 3
>> Transport-type: tcp
>> Bricks:
>> Brick1: datanode01:/mnt/data/bricks/data
>> Brick2: datanode02:/mnt/data/bricks/data
>> Brick3: datanode03:/mnt/data/bricks/data
>> Options Reconfigured:
>> performance.readdir-ahead: on
>>
>> 3.
>> Node         Rebalanced-files      size   scanned   failures   skipped        status   run time in h:m:s
>> ----------   ----------------   -------   -------   --------   -------   -----------   -----------------
>> localhost               36822    11.3GB     50715          0         0   in progress            26:46:17
>> datanode02                  0    0Bytes      2852          0         0   in progress            26:46:16
>> datanode03               3128   513.7MB     11442          0      3128   in progress            26:46:17
>> Estimated time left for rebalance to complete : > 2 months. Please try again later.
>> volume rebalance: data: success
>>
>> 4. Directory structure is basically an rsync backup of some old systems
>> as well as all of my personal media. I can elaborate more, but it's a
>> pretty standard filesystem.
>>
>> 5. In some folders there might be up to like 12-15 levels of directories
>> (especially the backups)
>>
>> 6. I'm honestly not sure, I can try to scrounge this number up
>>
>> 7. My guess would be > 100k
>>
>> 8. Most files are pretty large (media files), but there's a lot of small
>> files (metadata and configuration files) as well
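Editor's note: the rebalance status table in item 3 gives enough to estimate how fast each node is scanning. The following is a minimal sketch, not part of the original thread; the helper names `hms_to_seconds` and `files_per_hour` are made up for illustration, and the input values are copied from the "scanned" and "run time" columns quoted above.

```python
# Rough scan-rate estimate from the rebalance status table quoted above.
# Helper names are hypothetical; the numbers come from the "scanned" and
# "run time in h:m:s" columns of the status output.

def hms_to_seconds(hms: str) -> int:
    """Convert an 'h:m:s' run time such as '26:46:17' to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def files_per_hour(scanned: int, run_time: str) -> float:
    """Average number of files scanned per hour over the whole run."""
    return scanned / hms_to_seconds(run_time) * 3600


# (node, scanned, run time) as reported in the thread
status = [
    ("localhost", 50715, "26:46:17"),
    ("datanode02", 2852, "26:46:16"),
    ("datanode03", 11442, "26:46:17"),
]

for node, scanned, run_time in status:
    rate = files_per_hour(scanned, run_time)
    print(f"{node:>10}: {rate:8.1f} files scanned per hour")
```

On these figures localhost averages roughly 1,900 scanned files per hour, datanode03 roughly 430, and datanode02 roughly 110. These are averages over the whole run, so they say nothing about how the rate varied during it.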
>>
>> I've also appended a (moderately sanitized) snippet of the rebalance log
>> (let me know if you need more)
>>
>> [2018-07-16 17:37:59.979003] I [MSGID: 0] [dht-rebalance.c:1799:dht_migrate_file] 0-data-dht: destination for file - /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2040036.img.xml is changed to - data-client-2
>> [2018-07-16 17:38:00.004262] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2112002.img.xml from subvolume data-client-0 to data-client-2
>> [2018-07-16 17:38:00.725582] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108305980 tmp_cnt 55419279917056,rate_processed=446597.869797, elapsed = 96526.000000
>> [2018-07-16 17:38:00.725641] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124092127 seconds, seconds left = 123995601
>> [2018-07-16 17:38:00.725709] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96526.00 secs
>> [2018-07-16 17:38:00.725738] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36876, size: 12270259289, lookups: 50715, failures: 0, skipped: 0
>> [2018-07-16 17:38:02.769121] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108305980 tmp_cnt 55419279917056,rate_processed=446588.616567, elapsed = 96528.000000
>> [2018-07-16 17:38:02.769207] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124094698 seconds, seconds left = 123998170
>> [2018-07-16 17:38:02.769263] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96528.00 secs
>> [2018-07-16 17:38:02.769286] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36876, size: 12270259289, lookups: 50715, failures: 0, skipped: 0
>> [2018-07-16 17:38:03.410469] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201002.img.xml: attempting to move from data-client-0 to data-client-2
>> [2018-07-16 17:38:03.416127] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2040036.img.xml from subvolume data-client-0 to data-client-2
>> [2018-07-16 17:38:04.738885] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9110012.img.xml: attempting to move from data-client-0 to data-client-2
>> [2018-07-16 17:38:04.745722] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201002.img.xml from subvolume data-client-0 to data-client-2
>> [2018-07-16 17:38:04.812368] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108308134 tmp_cnt 55419279917056,rate_processed=446579.386035, elapsed = 96530.000000
>> [2018-07-16 17:38:04.812417] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124097263 seconds, seconds left = 124000733
>> [2018-07-16 17:38:04.812465] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96530.00 secs
>> [2018-07-16 17:38:04.812489] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36877, size: 12270261443, lookups: 50715, failures: 0, skipped: 0
>> [2018-07-16 17:38:04.992413] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2050000.img.xml: attempting to move from data-client-0 to data-client-2
>> [2018-07-16 17:38:04.994122] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9110012.img.xml from subvolume data-client-0 to data-client-2
>> [2018-07-16 17:38:06.855618] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108318798 tmp_cnt 55419279917056,rate_processed=446570.244043, elapsed = 96532.000000
>> [2018-07-16 17:38:06.855719] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124099804 seconds, seconds left = 124003272
>> [2018-07-16 17:38:06.855770] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96532.00 secs
>> [2018-07-16 17:38:06.855793] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36879, size: 12270266602, lookups: 50715, failures: 0, skipped: 0
>> [2018-07-16 17:38:08.511064] I [dht-rebalance.c:1645:dht_migrate_file] 0-data-dht: /this/is/a/file/path/that/exists/wz/wz/Npc.wz/9201055.img.xml: attempting to move from data-client-0 to data-client-2
>> [2018-07-16 17:38:08.533029] I [MSGID: 109022] [dht-rebalance.c:2274:dht_migrate_file] 0-data-dht: completed migration of /this/is/a/file/path/that/exists/wz/wz/Npc.wz/2050000.img.xml from subvolume data-client-0 to data-client-2
>> [2018-07-16 17:38:08.899708] I [dht-rebalance.c:4982:gf_defrag_get_estimates_based_on_size] 0-glusterfs: TIME: (size) total_processed=43108318798 tmp_cnt 55419279917056,rate_processed=446560.991961, elapsed = 96534.000000
>> [2018-07-16 17:38:08.899791] I [dht-rebalance.c:5130:gf_defrag_status_get] 0-glusterfs: TIME: Estimated total time to complete (size)= 124102375 seconds, seconds left = 124005841
>> [2018-07-16 17:38:08.899842] I [MSGID: 109028] [dht-rebalance.c:5210:gf_defrag_status_get] 0-glusterfs: Rebalance is in progress. Time taken is 96534.00 secs
>> [2018-07-16 17:38:08.899865] I [MSGID: 109028] [dht-rebalance.c:5214:gf_defrag_status_get] 0-glusterfs: Files migrated: 36879, size: 12270266602, lookups: 50715, failures: 0, skipped: 0
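Editor's note: the "TIME:" entries in the log above carry enough numbers to see where the multi-year estimate comes from. The sketch below is an inference from the logged values only, not a reading of the GlusterFS source: it assumes the rate is `total_processed / elapsed` and that the estimated total time is the value logged as `tmp_cnt` divided by that rate. Under that assumption it reproduces the figures in the first estimate entry (17:38:00).

```python
# Reproduce the size-based ETA from the first "TIME:" log entry above.
# Assumption (inferred from the logged numbers, not from the source code):
#   rate            = total_processed / elapsed
#   estimated_total = tmp_cnt / rate
#   seconds_left    = estimated_total - elapsed

total_processed = 43108305980   # as logged
tmp_cnt = 55419279917056        # as logged; about 50.4 TiB
elapsed = 96526.0               # seconds, as logged

rate = total_processed / elapsed
estimated_total = int(tmp_cnt / rate)
seconds_left = estimated_total - int(elapsed)

print(f"rate_processed  = {rate:.6f} per second")
print(f"estimated total = {estimated_total} seconds")
print(f"seconds left    = {seconds_left}")
print(f"in days         = {estimated_total / 86400:.0f}")
```

The computed values match the logged `rate_processed=446597.869797`, `124092127` seconds and `seconds left = 123995601`. That total is about 1,436 days, or just under four years, which is why the status command reports "> 2 months". Because the estimate is the remaining size divided by the average rate so far, it only improves if that rate goes up.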
>>
>>
>> On Mon, Jul 16, 2018 at 7:37 AM, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>
>>> If possible, please send the rebalance logs as well.
>>>
>>>
>>> On 16 July 2018 at 10:14, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>>
>>>> Hi Rusty,
>>>>
>>>> We need the following information:
>>>>
>>>>    1. The exact gluster version you are running
>>>>    2. gluster volume info <volname>
>>>>    3. gluster rebalance status
>>>>    4. Information on the directory structure and file locations on
>>>>    your volume.
>>>>    5. How many levels of directories
>>>>    6. How many files and directories in each level
>>>>    7. How many directories and files in total (a rough estimate)
>>>>    8. Average file size
>>>>
>>>> Please note that having a rebalance running in the background should
>>>> not affect your volume access in any way. However, I would like to know
>>>> why only 6000 files have been scanned in 6 hours.
>>>>
>>>> Regards,
>>>> Nithya
>>>>
>>>>
>>>> On 16 July 2018 at 06:13, Rusty Bower <rusty at rustybower.com> wrote:
>>>>
>>>>> Hey folks,
>>>>>
>>>>> I just added a new brick to my existing gluster volume, but *gluster
>>>>> volume rebalance data status* is telling me the following: Estimated
>>>>> time left for rebalance to complete : > 2 months. Please try again later.
>>>>>
>>>>> I already did a fix-mapping, but this thing is absolutely crawling
>>>>> trying to rebalance everything (last estimate was ~40 years)
>>>>>
>>>>> Any thoughts on if this is a bug, or ways to speed this up? It's
>>>>> taking ~6 hours to scan 6000 files, which seems unreasonably slow.
>>>>>
>>>>> Thanks
>>>>> Rusty
>>>>>
>>>>> _______________________________________________
>>>>> Gluster-users mailing list
>>>>> Gluster-users at gluster.org
>>>>> https://lists.gluster.org/mailman/listinfo/gluster-users
>>>>>
>>>>
>>>>
>>>
>>

Rusty Bower

2018-Jul-28 01:41 UTC


[Gluster-users] Rebalance taking > 2 months

Just wanted to ping this to see if you guys had any thoughts, or other
scripts I can run for this stuff. It's still predicting another 90 days to
rebalance this, and performance is basically garbage while it rebalances.

Rusty

