Hi,
Does rebalance fix-layout trigger automatically by any chance?
My cluster is currently showing a rebalance in progress: running the
"rebalance status" command shows "fix-layout in progress" on the nodes
recently added to the cluster and "fix-layout completed" on the old nodes.
Checking the rebalance log on the new nodes shows it was started on 12th April.
Strange; what could have triggered the rebalance process?
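
For reference, this is roughly how I am checking it; the volume name below is
a placeholder for mine:

  gluster volume rebalance gfs-vol status

and the log I am reading on the new nodes is the usual rebalance log, something
like /var/log/glusterfs/gfs-vol-rebalance.log.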
regards
Amudhan
On Thu, Apr 13, 2017 at 12:51 PM, Amudhan P <amudhan83 at gmail.com> wrote:
> I have another issue now: after expanding the cluster, folder listing time
> has increased to 400%.
>
> I also tried enabling readdir-ahead & parallel-readdir, but it showed no
> improvement in folder listing and instead introduced a new problem: random
> folders disappeared from listings and data reads returned IO errors.
>
> I tried disabling cluster.readdir-optimize and remounting the fuse client,
> but the issue continued. So I disabled readdir-ahead & parallel-readdir and
> re-enabled cluster.readdir-optimize, and everything works fine again.
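>
> For reference, the sequence of option changes was roughly the following;
> the volume name is a placeholder for mine:
>
>   gluster volume set gfs-vol performance.readdir-ahead on
>   gluster volume set gfs-vol performance.parallel-readdir on
>   gluster volume set gfs-vol cluster.readdir-optimize off   # issue continued
>   gluster volume set gfs-vol performance.readdir-ahead off
>   gluster volume set gfs-vol performance.parallel-readdir off
>   gluster volume set gfs-vol cluster.readdir-optimize on    # listing works again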
>
> How do I bring down folder listing time?
>
>
> Below is my volume configuration:
> Options Reconfigured:
> nfs.disable: yes
> cluster.disperse-self-heal-daemon: enable
> cluster.weighted-rebalance: off
> cluster.rebal-throttle: aggressive
> performance.readdir-ahead: off
> cluster.min-free-disk: 10%
> features.default-soft-limit: 80%
> performance.force-readdirp: no
> dht.force-readdirp: off
> cluster.readdir-optimize: on
> cluster.heal-timeout: 43200
> cluster.data-self-heal: on
>
> On Fri, Apr 7, 2017 at 7:35 PM, Amudhan P <amudhan83 at gmail.com> wrote:
>
>> Volume type:
>> Disperse Volume 8+2 = 1080 bricks
>>
>> The first time I added 3 sets of 8+2 and it started giving issues listing
>> folders, so I remounted the mount point and it worked fine again.
>>
>> The second time I added 13 sets of 8+2 and it hit the same issue.
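>>
>> For context, each expansion was done with the usual add-brick in multiples
>> of 10 bricks to match the 8+2 config; the host and path names below are
>> placeholders, showing one set:
>>
>>   gluster volume add-brick gfs-vol node{01..10}:/media/disk34/brick34
>>
>> and this was repeated once per new set.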
>>
>> When listing a folder, it either returned an empty folder or did not show
>> all the folders.
>>
>> When an ongoing write was interrupted, it threw an error that the
>> destination folder was not available.
>>
>> Adding a few more lines from the log; let me know if you need the full
>> log file.
>>
>> [2017-04-05 13:40:03.702624] I [glusterfsd-mgmt.c:52:mgmt_cbk_spec] 0-mgmt: Volume file changed
>> [2017-04-05 13:40:04.970055] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-123: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.971194] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-122: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.972144] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-121: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.973131] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-120: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.974072] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-119: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.975005] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-118: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.975936] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-117: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.976905] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-116: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.977825] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-115: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.978755] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-114: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.979689] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-113: Using 'sse' CPU extensions
>> [2017-04-05 13:40:04.980626] I [MSGID: 122067] [ec-code.c:1046:ec_code_detect] 2-gfs-vol-disperse-112: Using 'sse' CPU extensions
>> [2017-04-05 13:40:07.270412] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-736: changing port to 49153 (from 0)
>> [2017-04-05 13:40:07.271902] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-746: changing port to 49154 (from 0)
>> [2017-04-05 13:40:07.272076] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-756: changing port to 49155 (from 0)
>> [2017-04-05 13:40:07.273154] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-766: changing port to 49156 (from 0)
>> [2017-04-05 13:40:07.273193] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-776: changing port to 49157 (from 0)
>> [2017-04-05 13:40:07.273371] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-579:
>> Connected to gfs-vol-client-579, attached to remote volume
>> '/media/disk22/brick22'.
>> [2017-04-05 13:40:07.273388] I [MSGID: 114047]
>> [client-handshake.c:1227:client_setvolume_cbk] 2-gfs-vol-client-579:
>> Server and Client lk-version numbers are not same, reopening the fds
>> [2017-04-05 13:40:07.273435] I [MSGID: 114035] [client-handshake.c:202:client_set_lk_version_cbk] 2-gfs-vol-client-433: Server lk version = 1
>> [2017-04-05 13:40:07.275632] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-786: changing port to 49158 (from 0)
>> [2017-04-05 13:40:07.275685] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-589:
>> Connected to gfs-vol-client-589, attached to remote volume
>> '/media/disk23/brick23'.
>> [2017-04-05 13:40:07.275707] I [MSGID: 114047]
>> [client-handshake.c:1227:client_setvolume_cbk] 2-gfs-vol-client-589:
>> Server and Client lk-version numbers are not same, reopening the fds
>> [2017-04-05 13:40:07.087011] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-811: changing port to 49161 (from 0)
>> [2017-04-05 13:40:07.087031] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-420: changing port to 49158 (from 0)
>> [2017-04-05 13:40:07.087045] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-521: changing port to 49168 (from 0)
>> [2017-04-05 13:40:07.087060] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-430: changing port to 49159 (from 0)
>> [2017-04-05 13:40:07.087074] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-531: changing port to 49169 (from 0)
>> [2017-04-05 13:40:07.087098] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-440: changing port to 49160 (from 0)
>> [2017-04-05 13:40:07.087105] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-821: changing port to 49162 (from 0)
>> [2017-04-05 13:40:07.087117] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-450: changing port to 49161 (from 0)
>> [2017-04-05 13:40:07.087131] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-831: changing port to 49163 (from 0)
>> [2017-04-05 13:40:07.087134] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-460: changing port to 49162 (from 0)
>> [2017-04-05 13:40:07.087157] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-841: changing port to 49164 (from 0)
>> [2017-04-05 13:40:07.087181] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-541: changing port to 49170 (from 0)
>> [2017-04-05 13:40:07.087185] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-470: changing port to 49163 (from 0)
>> [2017-04-05 13:40:07.087202] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-851: changing port to 49165 (from 0)
>> [2017-04-05 13:40:07.087241] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-480: changing port to 49164 (from 0)
>> [2017-04-05 13:40:07.087240] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-551: changing port to 49171 (from 0)
>> [2017-04-05 13:40:07.087263] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-861: changing port to 49166 (from 0)
>> [2017-04-05 13:40:07.087281] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-571: changing port to 49173 (from 0)
>> [2017-04-05 13:40:07.087284] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-561: changing port to 49172 (from 0)
>> [2017-04-05 13:40:07.087318] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-581: changing port to 49174 (from 0)
>> [2017-04-05 13:40:07.087318] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-490: changing port to 49165 (from 0)
>> [2017-04-05 13:40:07.087344] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-500: changing port to 49166 (from 0)
>> [2017-04-05 13:40:07.087352] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-871: changing port to 49167 (from 0)
>> [2017-04-05 13:40:07.087372] I [rpc-clnt.c:2000:rpc_clnt_reconfig]
>> 2-gfs-vol-client-591: changing port to 49175 (from 0)
>>
>> [2017-04-05 13:40:07.681293] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-755:
>> Connected to gfs-vol-client-755, attached to remote volume
>> '/media/disk4/brick4'.
>> [2017-04-05 13:40:07.681312] I [MSGID: 114047]
>> [client-handshake.c:1227:client_setvolume_cbk] 2-gfs-vol-client-755:
>> Server and Client lk-version numbers are not same, reopening the fds
>> [2017-04-05 13:40:07.681317] I [MSGID: 122061] [ec.c:340:ec_up]
>> 2-gfs-vol-disperse-74: Going UP
>> [2017-04-05 13:40:07.681428] I [MSGID: 122061] [ec.c:340:ec_up]
>> 2-gfs-vol-disperse-75: Going UP
>> [2017-04-05 13:40:07.681454] I [MSGID: 114046]
>> [client-handshake.c:1216:client_setvolume_cbk] 2-gfs-vol-client-1049:
>> Connected to gfs-vol-client-1049, attached to remote volume
>> '/media/disk33/brick33'.
>> [2017-04-05 13:45:10.689344] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-71: disconnected from gfs-vol-client-71. Client process will keep trying to connect to glusterd until brick's port is available
>> [2017-04-05 13:45:10.689376] I [MSGID: 114021] [client.c:2361:notify] 0-gfs-vol-client-73: current graph is no longer active, destroying rpc_client
>> [2017-04-05 13:45:10.689380] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-72: disconnected from gfs-vol-client-72. Client process will keep trying to connect to glusterd until brick's port is available
>> [2017-04-05 13:45:10.689389] I [MSGID: 114021] [client.c:2361:notify] 0-gfs-vol-client-74: current graph is no longer active, destroying rpc_client
>> [2017-04-05 13:45:10.689394] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-73: disconnected from gfs-vol-client-73. Client process will keep trying to connect to glusterd until brick's port is available
>> [2017-04-05 13:45:10.689390] I [MSGID: 122062] [ec.c:354:ec_down] 0-gfs-vol-disperse-7: Going DOWN
>> [2017-04-05 13:45:10.689428] I [MSGID: 114021] [client.c:2361:notify] 0-gfs-vol-client-75: current graph is no longer active, destroying rpc_client
>> [2017-04-05 13:45:10.689443] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-74: disconnected from gfs-vol-client-74. Client process will keep trying to connect to glusterd until brick's port is available
>>
>> On Fri, Apr 7, 2017 at 11:05 AM, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>
>>>
>>>
>>> On 6 April 2017 at 14:56, Amudhan P <amudhan83 at gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I was able to add bricks to the volume successfully.
>>>> Client was reading, writing and listing data from mount point.
>>>> But after adding bricks, I had issues with folder listing (not listing
>>>> all folders, or returning an empty folder list) and writes were
>>>> interrupted.
>>>>
>>>
>>> This is strange. The issue with listing folders you referred to earlier
>>> was because of the rebalance, but this seems new.
>>>
>>> How many bricks did you add and what is your volume config? What errors
>>> did you see while writing or listing folders?
>>>
>>>> Remounting the volume has solved the issue and it is now working fine.
>>>>
>>>> I was under the impression that running rebalance would cause the folder
>>>> listing issue, but now adding bricks itself created a problem. Whether
>>>> the client is busy or idle is irrelevant; I need to remount to resolve
>>>> the issue.
>>>>
>>>> Also, I would like to know whether using bricks in a volume without
>>>> fix-layout causes folder listing slowness.
>>>>
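>>>> For reference, if I do end up running it, I believe the layout-only pass
>>>> would be roughly this (volume name is a placeholder for mine):
>>>>
>>>>   gluster volume rebalance gfs-vol fix-layout start
>>>>   gluster volume rebalance gfs-vol status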
>>>>
>>>> Below is a snippet of the client log from when this happened. Let me know
>>>> if you need any additional info.
>>>>
>>>> Client and servers are both 3.10.1, and the volume is mounted through FUSE.
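>>>>
>>>> The mount itself is a plain FUSE mount, roughly like this; server name
>>>> and mount point are placeholders:
>>>>
>>>>   mount -t glusterfs server1:/gfs-vol /mnt/gfs-vol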
>>>>
>>>> Machine busy downloading & uploading
>>>>
>>>> [2017-04-05 13:39:33.487176] I [MSGID: 114021] [client.c:2361:notify] 0-gfs-vol-client-1107: current graph is no longer active, destroying rpc_client
>>>> [2017-04-05 13:39:33.487196] I [MSGID: 114021] [client.c:2361:notify] 0-gfs-vol-client-1108: current graph is no longer active, destroying rpc_client
>>>> [2017-04-05 13:39:33.487201] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-1107: disconnected from gfs-vol-client-1107. Client process will keep trying to connect to glusterd until brick's port is available
>>>> [2017-04-05 13:39:33.487212] I [MSGID: 114021] [client.c:2361:notify] 0-gfs-vol-client-1109: current graph is no longer active, destroying rpc_client
>>>> [2017-04-05 13:39:33.487217] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-1108: disconnected from gfs-vol-client-1108. Client process will keep trying to connect to glusterd until brick's port is available
>>>> [2017-04-05 13:39:33.487232] I [MSGID: 114018] [client.c:2276:client_rpc_notify] 0-gfs-vol-client-1109: disconnected from gfs-vol-client-1109. Client process will keep trying to connect to glusterd until brick's port is available
>>>>
>>>>
>>>> Idle system
>>>>
>>>> [2017-04-05 13:40:07.692336] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-1065: Server lk version = 1
>>>> [2017-04-05 13:40:07.692383] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-995: Server lk version = 1
>>>> [2017-04-05 13:40:07.692430] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-965: Server lk version = 1
>>>> [2017-04-05 13:40:07.692485] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-1075: Server lk version = 1
>>>> [2017-04-05 13:40:07.692532] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-1025: Server lk version = 1
>>>> [2017-04-05 13:40:07.692569] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-1055: Server lk version = 1
>>>> [2017-04-05 13:40:07.692620] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-955: Server lk version = 1
>>>> [2017-04-05 13:40:07.692681] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-1035: Server lk version = 1
>>>> [2017-04-05 13:40:07.692870] I [MSGID: 114035]
>>>> [client-handshake.c:202:client_set_lk_version_cbk]
>>>> 2-gfs-vol-client-1045: Server lk version = 1
>>>>
>>>>
>>>> Regards,
>>>> Amudhan
>>>>
>>>> On Tue, Apr 4, 2017 at 4:31 PM, Amudhan P <amudhan83 at gmail.com> wrote:
>>>>
>>>>> I mean the time it takes to list folders and files, because "rebalance
>>>>> fix-layout" was not done.
>>>>>
>>>>>
>>>>> On Tue, Apr 4, 2017 at 1:51 PM, Amudhan P <amudhan83 at gmail.com> wrote:
>>>>>
>>>>>> Ok, good to hear.
>>>>>>
>>>>>> Will there be any impact on listing folders and files?
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 4, 2017 at 1:43 PM, Nithya Balachandran <nbalacha at redhat.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 4 April 2017 at 12:33, Amudhan P <amudhan83 at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I have a query on rebalancing.
>>>>>>>>
>>>>>>>> let's consider the following as my folder hierarchy:
>>>>>>>>
>>>>>>>> parent1-fol (parent folder)
>>>>>>>>   |_ class-fol-1 (1st level subfolder)
>>>>>>>>        |_ A (2nd level subfolder)
>>>>>>>>             |_ childfol-1 (child folder created every time before writing files)
>>>>>>>>
>>>>>>>>
>>>>>>>> Now, I have a running cluster on 3.10.1 with a disperse volume, and I
>>>>>>>> am planning to expand the cluster by adding bricks.
>>>>>>>>
>>>>>>>> Will there be a problem using the newly added bricks without doing a
>>>>>>>> "rebalance fix-layout", other than that existing files cannot be
>>>>>>>> rebalanced to the new bricks and that files created under existing
>>>>>>>> folders will not go to the new bricks?
>>>>>>>>
>>>>>>>> I tested the above case in my test setup and observed that files
>>>>>>>> created under a new folder go to the new bricks, and I don't see any
>>>>>>>> issue listing files and folders.
>>>>>>>>
>>>>>>>> So, in my case, we create a child folder every time before creating
>>>>>>>> files.
>>>>>>>>
>>>>>>>> The reason to avoid rebalance is that I have more than 10000 folders
>>>>>>>> across 1080 bricks, so triggering a rebalance will take a long time,
>>>>>>>> and during my previous expansion on 3.7 I was not able to access some
>>>>>>>> folders randomly until fix-layout completed.
>>>>>>>>
>>>>>>>>
>>>>>>> It sounds like you will not need to run a rebalance or fix-layout for
>>>>>>> this. It should work fine.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Nithya
>>>>>>>
>>>>>>>>
>>>>>>>> regards
>>>>>>>> Amudhan
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>