Hi Xin,

I haven't got a chance to look into it yet. The delete-stale-volume function is in place to take care of wiping off volume configuration data which has been deleted from the cluster. However, we need to revisit this code to see whether this function is still needed, given we recently added a validation to fail a delete request if one of the glusterds is down. I'll get back to you on this.

On Mon, 21 Nov 2016 at 07:24, songxin <songxin_1980 at 126.com> wrote:

> Hi Atin,
> Thank you for your support.
>
> Any conclusions about this issue?
>
> Thanks,
> Xin
>
> At 2016-11-16 20:59:05, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Tue, Nov 15, 2016 at 1:53 PM, songxin <songxin_1980 at 126.com> wrote:
>
> ok, thank you.
>
> At 2016-11-15 16:12:34, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Tue, Nov 15, 2016 at 12:47 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> I think the root cause is in the function glusterd_import_friend_volume, as below.
>
> int32_t
> glusterd_import_friend_volume (dict_t *peer_data, size_t count)
> {
>         ...
>         ret = glusterd_volinfo_find (new_volinfo->volname, &old_volinfo);
>         if (0 == ret) {
>                 (void) gd_check_and_update_rebalance_info (old_volinfo, new_volinfo);
>                 (void) glusterd_delete_stale_volume (old_volinfo, new_volinfo);
>         }
>         ...
>         ret = glusterd_store_volinfo (new_volinfo, GLUSTERD_VOLINFO_VER_AC_NONE);
>         if (ret) {
>                 gf_msg (this->name, GF_LOG_ERROR, 0, GD_MSG_VOLINFO_STORE_FAIL,
>                         "Failed to store volinfo for volume %s", new_volinfo->volname);
>                 goto out;
>         }
>         ...
> }
>
> glusterd_delete_stale_volume will remove the info file and bricks/*, and glusterd_store_volinfo will create the new ones.
> But if glusterd is killed before the rename, the info file will be empty.
> And glusterd will fail to start, because the info file is empty, the next time you start glusterd.
>
> Any idea, Atin?
>
> Give me some time, I will check it out, but reading this analysis it looks very well possible: a volume is changed while glusterd is down on node A, then when that node comes up we update the volinfo during the peer handshake, and during that time glusterd goes down once again. I'll confirm it by tomorrow.
>
> I checked the code and it does look like you have got the right RCA for the issue which you simulated through those two scripts. However, this can happen even when you try to create a fresh volume: if glusterd goes down while writing the content into the store, before renaming the info.tmp file, you get into the same situation.
>
> I'd really need to think through whether this can be fixed. Suggestions are always appreciated.
>
> BTW, excellent work Xin!
>
> Thanks,
> Xin
>
> At 2016-11-15 12:07:05, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Tue, Nov 15, 2016 at 8:58 AM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> I have some clues about this issue.
> I could reproduce this issue using the script mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1308487 .
>
> I really appreciate your help in trying to nail down this issue. While I am at your email and going through the code to figure out the possible cause, unfortunately I don't see any script in the attachment of the bug. Could you please cross-check?
>
> After I added some debug prints, like below, in glusterd-store.c, I found that /var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/* are removed.
> But other files in /var/lib/glusterd/vols/xxx/ are not removed.
>
> int32_t
> glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac)
> {
>         int32_t ret = -1;
>
>         GF_ASSERT (volinfo)
>
>         ret = access("/var/lib/glusterd/vols/gv0/info", F_OK);
>         if (ret < 0) {
>                 gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "info does not exist (%d)", errno);
>         } else {
>                 ret = stat("/var/lib/glusterd/vols/gv0/info", &buf);
>                 if (ret < 0) {
>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "stat info error");
>                 } else {
>                         gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "info size is %lu, inode num is %lu",
>                                 buf.st_size, buf.st_ino);
>                 }
>         }
>
>         glusterd_perform_volinfo_version_action (volinfo, ac);
>         ret = glusterd_store_create_volume_dir (volinfo);
>         if (ret)
>                 goto out;
>
>         ...
> }
>
> So it is easy to understand why the info file or 10.32.1.144.-opt-lvmdir-c2-brick is sometimes empty.
> It is because the info file does not exist, and it is created by "fd = open (path, O_RDWR | O_CREAT | O_APPEND, 0600);" in the function gf_store_handle_new.
> And the info file is empty before the rename.
> So the info file is empty if glusterd shuts down before the rename.
>
> My questions are the following.
> 1. I did not find the point where the info file is removed. Could you tell me where the info file and bricks/* are removed?
> 2. Why are the info file and bricks/* removed, but other files in /var/lib/glusterd/vols/xxx/ are not removed?
>
> AFAIK, we never delete the info file, and hence this file is opened with the O_APPEND flag. As I said, I will go back and cross-check the code once again.
>
> Thanks,
> Xin
>
> At 2016-11-11 20:34:05, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Fri, Nov 11, 2016 at 4:00 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> Thank you for your support.
> I sincerely await your reply.
>
> By the way, could you confirm that the issue (the info file being empty) is caused by the rename being interrupted in the kernel?
>
> As per my RCA on that bug, it looked to be.
>
> Thanks,
> Xin
>
> At 2016-11-11 15:49:02, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Fri, Nov 11, 2016 at 1:15 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> Thank you for your reply.
> Actually it is very difficult to reproduce because I don't know when there was an ongoing commit happening; it was just a coincidence.
> But I want to be sure of the root cause.
>
> I'll give it another try and see if this situation can be simulated/reproduced, and will keep you posted.
>
> So I would be grateful if you could answer my questions below.
>
> You said in the comment: "This issue is hit at part of the negative testing where while gluster volume set was executed at the same point of time glusterd in another instance was brought down. In the faulty node we could see /var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp file has the correct contents."
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while glusterd was being brought down?
> 2. Can you be certain that this issue is caused by the rename being interrupted in the kernel?
>
> In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, that are both empty.
> But in my view only one rename can be running at a time because of the big lock.
> Why are both files empty?
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be running in two threads?
>
> Thanks,
> Xin
>
> At 2016-11-11 15:27:03, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Fri, Nov 11, 2016 at 12:38 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> Thank you for your reply.
>
> As you said, the info file can only be changed in glusterd_store_volinfo(), sequentially, because of the big lock.
>
> I have found the similar issue you mentioned, below.
> https://bugzilla.redhat.com/show_bug.cgi?id=1308487
>
> Great, so this is what I was actually trying to refer to in my first email when I said I saw a similar issue. Have you got a chance to look at https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your case, did you try to bring down glusterd when there was an ongoing commit happening?
>
> You said in the comment: "This issue is hit at part of the negative testing where while gluster volume set was executed at the same point of time glusterd in another instance was brought down. In the faulty node we could see /var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp file has the correct contents."
>
> I have two questions for you.
>
> 1. Could you reproduce this issue by running gluster volume set while glusterd was being brought down?
> 2. Can you be certain that this issue is caused by the rename being interrupted in the kernel?
>
> In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, that are both empty.
> But in my view only one rename can be running at a time because of the big lock.
> Why are both files empty?
>
> Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be running in two threads?
>
> Thanks,
> Xin
>
> At 2016-11-11 14:36:40, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> On Fri, Nov 11, 2016 at 8:33 AM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi Atin,
> Thank you for your reply.
> I have two questions for you.
>
> 1. Are the two files info and info.tmp only created or changed in the function glusterd_store_volinfo()? I did not find any other point at which the two files are changed.
>
> If we are talking about the volume's info file then yes, the mentioned function actually takes care of it.
>
> 2. I found that glusterd_store_volinfo() is called at many points by glusterd. Is there a problem of thread synchronization? If so, one thread may open the same info.tmp file with the O_TRUNC flag while another thread is writing info.tmp. Could this case happen?
>
> In glusterd, threads are protected by the big lock and I don't see a possibility (theoretically) of two glusterd_store_volinfo () calls at a given point of time.
>
> Thanks,
> Xin
>
> At 2016-11-10 21:41:06, "Atin Mukherjee" <amukherj at redhat.com> wrote:
>
> Did you run out of disk space by any chance? AFAIK, the code writes the new content to a .tmp file and renames it back to the original file. In case of a disk space issue I would expect both files to be of non-zero size. But having said that, I vaguely remember a similar issue (in the form of a bug or an email) landing up once which we couldn't reproduce, so my guess is that something is wrong with the atomic update here. I'll be glad if you have a reproducer for the same, and then we can dig into it further.
>
> On Thu, Nov 10, 2016 at 1:32 PM, songxin <songxin_1980 at 126.com> wrote:
>
> Hi,
> When I start glusterd, some errors happen.
> The log is as follows.
> [2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO)
> [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536
> [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /system/glusterd as working directory
> [2016-11-08 07:58:35.024508] I [MSGID: 106514] [glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade detected. Setting op-version to minimum : 1
> [2016-11-08 07:58:35.025356] E [MSGID: 106206] [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to get next store iter
> [2016-11-08 07:58:35.025401] E [MSGID: 106207] [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to update volinfo for c_glusterfs volume
> [2016-11-08 07:58:35.025463] E [MSGID: 106201] [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: c_glusterfs
> [2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again
> [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed
> [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed
> [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718] -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8] -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received signum (0), shutting down
>
> And then I found that the size of vols/volume_name/info is 0, which causes glusterd to shut down.
> But I found that vols/volume_name/info.tmp is not 0.
> And I found that there is a brick file vols/volume_name/bricks/xxxx.brick that is 0, but vols/volume_name/bricks/xxxx.brick.tmp is not 0.
>
> I read the code of the function glusterd_store_volinfo () in glusterd-store.c .
> I know that info.tmp will be renamed to info in the function glusterd_store_volume_atomic_update().
>
> But my question is: why is the info file 0 bytes while info.tmp is not 0?
>
> Thanks,
> Xin
>
> _______________________________________________
> Gluster-users mailing list
> Gluster-users at gluster.org
> http://www.gluster.org/mailman/listinfo/gluster-users
>
> --
> ~ Atin (atinm)

--
--Atin
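To make the failure window that this thread converges on easier to see, here is a minimal sketch of the "write to a .tmp file, then rename() it over the real file" pattern being discussed. This is not the actual glusterd code; the helper name store_write_atomic, the buffer sizes, and the exact flag combinations are assumptions made only for illustration. It shows how a kill between re-creating the destination file and the final rename() can leave an empty info next to a correct info.tmp, which is the on-disk state reported above.

    /* Minimal sketch (not glusterd's code) of the .tmp-then-rename store update. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* Hypothetical helper: atomically replace 'path' with 'contents'. */
    static int store_write_atomic (const char *path, const char *contents)
    {
            char tmp[4096];
            int  fd;

            snprintf (tmp, sizeof (tmp), "%s.tmp", path);

            /* 1. If 'path' was deleted earlier (e.g. by a stale-volume cleanup),
             *    re-creating the handle here leaves a zero-byte file on disk.   */
            fd = open (path, O_RDWR | O_CREAT | O_APPEND, 0600);
            if (fd < 0)
                    return -1;
            close (fd);

            /* 2. Write the new contents into the temporary file and flush it.   */
            fd = open (tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
            if (fd < 0)
                    return -1;
            if (write (fd, contents, strlen (contents)) < 0 || fsync (fd) < 0) {
                    close (fd);
                    return -1;
            }
            close (fd);

            /* <-- If the process is killed here, 'path' exists but is empty while
             *     'path.tmp' holds the correct contents.                         */

            /* 3. Atomically replace the old file with the new one.               */
            return rename (tmp, path);
    }

Since rename() replaces its target atomically on a POSIX filesystem, a crash during step 3 alone should not produce an empty destination; the empty file can only appear when the destination is (re)created before the data is in place, as in step 1.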
Hi Atin,

Ok. Thank you for your reply.

Thanks,
Xin

At 2016-11-21 10:00:36, "Atin Mukherjee" <atin.mukherjee83 at gmail.com> wrote:

> Hi Xin, I haven't got a chance to look into it yet. The delete-stale-volume function is in place to take care of wiping off volume configuration data which has been deleted from the cluster. However, we need to revisit this code to see whether this function is still needed, given we recently added a validation to fail a delete request if one of the glusterds is down. I'll get back to you on this.
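Since the thread above explicitly asks for suggestions on how this could be fixed, one possible direction is a recovery step at startup: if the real file is empty but a non-empty .tmp file from an interrupted update is present, prefer the .tmp contents. The sketch below is only an illustration of that idea under those assumptions, not glusterd code; store_pick_readable is a hypothetical name.

    #include <stdio.h>
    #include <sys/stat.h>

    /* Hypothetical recovery helper: return the path that should be read for
     * 'path' -- the file itself, or its leftover "<path>.tmp" when the real
     * file is empty and the tmp file is not (the state described above).    */
    static const char *store_pick_readable (const char *path, char *tmpbuf, size_t len)
    {
            struct stat st;
            struct stat st_tmp;

            snprintf (tmpbuf, len, "%s.tmp", path);

            if (stat (path, &st) == 0 && st.st_size > 0)
                    return path;     /* normal case: the real file is usable  */

            if (stat (tmpbuf, &st_tmp) == 0 && st_tmp.st_size > 0)
                    return tmpbuf;   /* interrupted update: fall back to .tmp */

            return path;             /* nothing better to offer               */
    }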
songxin
2016-Nov-21 09:06 UTC
[Gluster-users] question about Transport endpoint is not connected
Hi all, I have a question to consult you about. The GlusterFS version is 3.7.6. I created a replicated volume using two nodes. When I access the mount point, the client log shows "Transport endpoint is not connected". From the log it seems that the glusterfs client and glusterfsd disconnect, but I don't know why the disconnection happens on a local socket. Below are the logs from the client and the server. mnt-c.log (client log): [2016-10-31 04:04:08.023782] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.6 (args: /usr/sbin/glusterfs --acl --volfile-server=10.32.1.144 --volfile-id=c_glusterfs /mnt/c) [2016-10-31 04:04:08.042622] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2016-10-31 04:04:08.052641] I [graph.c:269:gf_add_cmdline_options] 0-c_glusterfs-md-cache: adding option 'cache-posix-acl' for volume 'c_glusterfs-md-cache' with value 'true' [2016-10-31 04:04:08.058854] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2016-10-31 04:04:08.060066] I [MSGID: 114020] [client.c:2118:notify] 0-c_glusterfs-client-9: parent translators are ready, attempting connect on transport [2016-10-31 04:04:08.061169] I [MSGID: 114020] [client.c:2118:notify] 0-c_glusterfs-client-10: parent translators are ready, attempting connect on transport [2016-10-31 04:04:08.061694] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-9: changing port to 49391 (from 0) Final graph: +------------------------------------------------------------------------------+ 1: volume c_glusterfs-client-9 2: type protocol/client 3: option ping-timeout 4 4: option remote-host 10.32.1.144 5: option remote-subvolume /opt/lvmdir/c2/brick 6: option transport-type socket 7: option username 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 8: option password ade893cc-19dd-4d67-ae8e-cb929ef4e770 9: option send-gids true 10: end-volume 11: 12: volume c_glusterfs-client-10 13: type protocol/client 14: option ping-timeout 4 15: option remote-host 10.32.0.48 16: option remote-subvolume /opt/lvmdir/c2/brick 17: option transport-type socket 18: option username 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 19: option password ade893cc-19dd-4d67-ae8e-cb929ef4e770 20: option send-gids true 21: end-volume 22: 23: volume c_glusterfs-replicate-0 24: type cluster/replicate 25: subvolumes c_glusterfs-client-9 c_glusterfs-client-10 26: end-volume 27: 28: volume c_glusterfs-dht 29: type cluster/distribute 30: subvolumes c_glusterfs-replicate-0 31: end-volume 32: 33: volume c_glusterfs-write-behind 34: type performance/write-behind 35: subvolumes c_glusterfs-dht 36: end-volume 37: 38: volume c_glusterfs-read-ahead 39: type performance/read-ahead 40: subvolumes c_glusterfs-write-behind 41: end-volume 42: 43: volume c_glusterfs-readdir-ahead 44: type performance/readdir-ahead 45: subvolumes c_glusterfs-read-ahead 46: end-volume 47: 48: volume c_glusterfs-io-cache 49: type performance/io-cache 50: subvolumes c_glusterfs-readdir-ahead 51: end-volume 52: 53: volume c_glusterfs-quick-read 54: type performance/quick-read 55: subvolumes c_glusterfs-io-cache 56: end-volume 57: 58: volume c_glusterfs-open-behind 59: type performance/open-behind 60: subvolumes c_glusterfs-quick-read 61: end-volume 62: 63: volume c_glusterfs-md-cache 64: type performance/md-cache 65: option cache-posix-acl true 66: subvolumes c_glusterfs-open-behind 67: end-volume 68: 69: volume c_glusterfs 70: type debug/io-stats 71: option latency-measurement off 72: option 
count-fop-hits off 73: subvolumes c_glusterfs-md-cache 74: end-volume 75: 76: volume posix-acl-autoload 77: type system/posix-acl 78: subvolumes c_glusterfs 79: end-volume 80: 81: volume meta-autoload 82: type meta 83: subvolumes posix-acl-autoload 84: end-volume 85: +------------------------------------------------------------------------------+ [2016-10-31 04:04:08.062941] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:04:08.063367] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-10: changing port to 49309 (from 0) [2016-10-31 04:04:08.063543] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-9: Connected to c_glusterfs-client-9, attached to remote volume '/opt/lvmdir/c2/brick'. [2016-10-31 04:04:08.063576] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-9: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:04:08.063716] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-c_glusterfs-replicate-0: Subvolume 'c_glusterfs-client-9' came back up; going online. [2016-10-31 04:04:08.063785] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-9: Server lk version = 1 [2016-10-31 04:04:08.064659] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-10: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:04:08.065344] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-10: Connected to c_glusterfs-client-10, attached to remote volume '/opt/lvmdir/c2/brick'. [2016-10-31 04:04:08.065374] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-10: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:04:08.074629] I [fuse-bridge.c:5137:fuse_graph_setup] 0-fuse: switched to graph 0 [2016-10-31 04:04:08.074884] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-10: Server lk version = 1 [2016-10-31 04:04:08.075006] I [fuse-bridge.c:4030:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22 [2016-10-31 04:04:08.077023] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-c_glusterfs-replicate-0: selecting local read_child c_glusterfs-client-9 [2016-10-31 04:04:15.908933] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.909054] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.913787] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] 
-->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.913885] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.923030] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.923135] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.944227] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.944343] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.949661] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.949766] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.954882] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.954982] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] 
-->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.960768] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] ... [2016-10-31 04:04:16.107262] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 199e1862-7e74-4b13-8c42-c4cea7e9f2fc [2016-10-31 04:04:16.111886] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 199e1862-7e74-4b13-8c42-c4cea7e9f2fc. source=0 sinks=1 [2016-10-31 04:04:16.114518] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.114620] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.122321] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 014e2eca-bb61-4939-9a4a-9a1af4351178 [2016-10-31 04:04:16.126960] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 014e2eca-bb61-4939-9a4a-9a1af4351178. source=0 sinks=1 [2016-10-31 04:04:16.128101] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.128202] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.134165] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 60b5380b-0cea-423d-a1af-d3064968cc54 [2016-10-31 04:04:16.139036] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 60b5380b-0cea-423d-a1af-d3064968cc54. 
source=0 sinks=1 [2016-10-31 04:04:16.141317] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.141432] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.147456] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on accef130-4cf5-4171-be08-2089e7ba1ca5 [2016-10-31 04:04:16.152572] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on accef130-4cf5-4171-be08-2089e7ba1ca5. source=0 sinks=1 [2016-10-31 04:04:16.153633] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.153744] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.159954] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 9c666f31-f3d7-40c6-b17b-0f41768ddea4 [2016-10-31 04:04:16.164773] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 9c666f31-f3d7-40c6-b17b-0f41768ddea4. source=0 sinks=1 [2016-10-31 04:04:16.167219] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.167323] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] ... 
[2016-10-31 04:06:03.627345] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(FINODELK(30)) called at 2016-10-31 04:06:03.626033 (xid=0x7f5e) [2016-10-31 04:06:03.627395] E [MSGID: 114031] [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not connected (priv->connected = 0) [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (c_glusterfs-client-9) [2016-10-31 04:06:03.628466] E [MSGID: 114031] [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.628475] I [MSGID: 108019] [afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0: unable to lock on even one child [2016-10-31 04:06:03.628539] I [MSGID: 108019] [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk] 0-c_glusterfs-replicate-0: Blocking inodelks failed. [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk] 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a) [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-9: changing port to 49391 (from 0) [2016-10-31 04:06:03.629210] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected] [2016-10-31 04:06:03.629266] I [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not connected (priv->connected = 255) [2016-10-31 04:06:03.629277] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (gfid = b0e5a94e-a432-4dce-b86f-a551555780a2). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.629293] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f62 Program: GlusterFS 3.3, ProgVers: 330, Proc: 41) to rpc-transport (c_glusterfs-client-9) [2016-10-31 04:06:03.629333] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: b0e5a94e-a432-4dce-b86f-a551555780a2: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.629363] W [fuse-bridge.c:3385:fuse_getxattr_resume] 0-glusterfs-fuse: 20784: GETXATTR b0e5a94e-a432-4dce-b86f-a551555780a2/70366685109140 (system.posix_acl_access) resolution failed [2016-10-31 04:06:03.629729] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.624568 (xid=0x7f5b) [2016-10-31 04:06:03.629779] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /pmd/2 (fa866805-ab22-4afe-8acb-528d15ce4117) [Transport endpoint is not connected] [2016-10-31 04:06:03.629784] E [MSGID: 114031] [client-rpc-fops.c:2883:client3_3_opendir_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected] [2016-10-31 04:06:03.629836] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/2 (gfid = fa866805-ab22-4afe-8acb-528d15ce4117). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.629857] W [fuse-bridge.c:980:fuse_fd_cbk] 0-glusterfs-fuse: 20794: OPENDIR() /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.629896] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: fa866805-ab22-4afe-8acb-528d15ce4117: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.629990] E [fuse-bridge.c:787:fuse_getattr_resume] 0-glusterfs-fuse: 20785: GETATTR 70366685017568 (fa866805-ab22-4afe-8acb-528d15ce4117) resolution failed [2016-10-31 04:06:03.630164] W [defaults.c:2212:default_releasedir] (-->/usr/lib64/glusterfs/3.7.6/xlator/protocol/client.so(client_local_wipe-0x42d64) [0x3fff86c9bc5c] -->/usr/lib64/libglusterfs.so.0(fd_unref-0x7ba98) [0x3fff8abb7ad0] -->/usr/lib64/libglusterfs.so.0(default_releasedir-0x9d6d8) [0x3fff8ab93c58] ) 0-fuse: xlator does not implement releasedir_cbk [2016-10-31 04:06:03.630367] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.625675 (xid=0x7f5c) [2016-10-31 04:06:03.630368] W [MSGID: 114031] [client-rpc-fops.c:1569:client3_3_fstat_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.630497] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) [Transport endpoint is not connected] [2016-10-31 04:06:03.630871] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:06:03.631488] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-9: Connected to c_glusterfs-client-9, attached to remote volume '/opt/lvmdir/c2/brick'. [2016-10-31 04:06:03.631523] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-9: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:06:03.631555] I [MSGID: 114042] [client-handshake.c:1056:client_post_handshake] 0-c_glusterfs-client-9: 6 fds open - Delaying child_up until they are re-opened [2016-10-31 04:06:03.631475] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. 
Path: /configuration/ethmac.cfg (00000000-0000-0000-0000-000000000000) [Transport endpoint is not connected] [2016-10-31 04:06:03.632256] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1) [2016-10-31 04:06:03.632728] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:257add5a-9b87-4014-8a3a-09dd0a699eec> succeeded (fd = 2) [2016-10-31 04:06:03.632790] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:6fd47fe9-a566-47c7-8df7-e7a5e379101f> succeeded (fd = 3) [2016-10-31 04:06:03.632847] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:fa866805-ab22-4afe-8acb-528d15ce4117> succeeded (fd = 4) [2016-10-31 04:06:03.632905] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:139f5034-5b42-4fb8-9372-6dd7cac1e674> succeeded (fd = 5) [2016-10-31 04:06:03.632935] I [MSGID: 114041] [client-handshake.c:678:client_child_up_reopen_done] 0-c_glusterfs-client-9: last fd open'd/lock-self-heal'd - notifying CHILD-UP [2016-10-31 04:06:03.633028] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-c_glusterfs-replicate-0: Subvolume 'c_glusterfs-client-9' came back up; going online. [2016-10-31 04:06:03.633386] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-9: Server lk version = 1 [2016-10-31 04:06:03.634579] W [fuse-bridge.c:758:fuse_attr_cbk] 0-glusterfs-fuse: 20795: FSTAT() /configuration/oamrdncy.cfg => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.634889] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(XATTROP(33)) called at 2016-10-31 04:06:03.625716 (xid=0x7f5d) [2016-10-31 04:06:03.634950] W [MSGID: 114031] [client-rpc-fops.c:1845:client3_3_xattrop_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) [2016-10-31 04:06:03.635134] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-c_glusterfs-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 200b0bb2-d903-4924-b025-262d805c0f2f. (Possible split-brain) [2016-10-31 04:06:03.635364] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-c_glusterfs-client-9: disconnected from c_glusterfs-client-9. Client process will keep trying to connect to glusterd until brick's port is available [2016-10-31 04:06:03.635420] E [MSGID: 108006] [afr-common.c:3880:afr_notify] 0-c_glusterfs-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. [2016-10-31 04:06:03.635762] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.635812] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/15 (gfid = 52a8bce8-00ac-48a7-8abf-9a3fdf9a498b). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.635858] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: 52a8bce8-00ac-48a7-8abf-9a3fdf9a498b: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.635885] E [fuse-bridge.c:787:fuse_getattr_resume] 0-glusterfs-fuse: 20802: GETATTR 70366685021780 (52a8bce8-00ac-48a7-8abf-9a3fdf9a498b) resolution failed [2016-10-31 04:06:03.638853] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.639020] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-c_glusterfs-replicate-0: Failing FSTAT on gfid 200b0bb2-d903-4924-b025-262d805c0f2f: split-brain observed. [Input/output error] [2016-10-31 04:06:03.640564] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.641160] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20808: LOOKUP() /license => -1 (Transport endpoint is not connected) The message "I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up" repeated 6 times between [2016-10-31 04:06:03.640564] and [2016-10-31 04:06:03.641444] [2016-10-31 04:06:03.641484] E [MSGID: 114031] [client-rpc-fops.c:1621:client3_3_inodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Invalid argument] [2016-10-31 04:06:03.641562] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.641573] E [MSGID: 108010] [afr-lk-common.c:665:afr_unlock_inodelk_cbk] 0-c_glusterfs-replicate-0: path=/configuration/oamrdncy.cfg gfid=200b0bb2-d903-4924-b025-262d805c0f2f: unlock failed on subvolume c_glusterfs-client-9 with lock owner 6cd0c189ff3f0000 [Invalid argument] [2016-10-31 04:06:03.641655] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.641716] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20809: LOOKUP() /lost+found => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.642387] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20813: LOOKUP() /lost+found => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.647166] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20815: LOOKUP() /license => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.647647] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20816: LOOKUP() /tmp => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.648023] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20817: LOOKUP() /lost+found => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.653160] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20819: LOOKUP() /tmp => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.653635] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20820: LOOKUP() /logfiles => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.653992] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20821: LOOKUP() /security => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.654333] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20822: LOOKUP() /logfiles => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.660036] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20824: LOOKUP() /security => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.660675] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20825: LOOKUP() /logfiles => 
-1 (Transport endpoint is not connected) [2016-10-31 04:06:03.661018] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20826: LOOKUP() /up => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.665422] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20828: LOOKUP() /up => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.665886] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20829: LOOKUP() /usr => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.666242] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20830: LOOKUP() /xb => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.666790] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20832: LOOKUP() /xb => -1 (Transport endpoint is not connected) The message "I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up" repeated 61 times between [2016-10-31 04:06:03.641655] and [2016-10-31 04:06:03.670371] [2016-10-31 04:06:03.670392] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/2/000301 (gfid = 7af60568-eace-40f1-a94f-5ba92a3afadf). Holes=1 overlaps=0 [2016-10-31 04:06:03.670499] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: 7af60568-eace-40f1-a94f-5ba92a3afadf: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.670529] W [fuse-bridge.c:2517:fuse_opendir_resume] 0-glusterfs-fuse: 20834: OPENDIR (7af60568-eace-40f1-a94f-5ba92a3afadf) resolution failed [2016-10-31 04:06:03.675300] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.675553] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20835: LOOKUP() /usr => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.675990] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20836: LOOKUP() /xb => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.676348] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20837: LOOKUP() /node_id => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.680652] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20839: LOOKUP() /node_id => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.681150] W [fuse-bridge.c:980:fuse_fd_cbk] 0-glusterfs-fuse: 20842: OPENDIR() /configuration => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.681510] W [defaults.c:2212:default_releasedir] (-->/usr/lib64/glusterfs/3.7.6/xlator/cluster/distribute.so(dht_local_wipe-0x70e58) [0x3fff86b88f58] -->/usr/lib64/libglusterfs.so.0(fd_unref-0x7ba98) [0x3fff8abb7ad0] -->/usr/lib64/libglusterfs.so.0(default_releasedir-0x9d6d8) [0x3fff8ab93c58] ) 0-fuse: xlator does not implement releasedir_cbk opt-lvmdir-c2-brick.log(server log) [2016-10-31 04:03:41.554325] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.6 (args: /usr/sbin/glusterfsd -s 10.32.1.144 --volfile-id c_glusterfs.10.32.1.144.opt-lvmdir-c2-brick -p /system/glusterd/vols/c_glusterfs/run/10.32.1.144-opt-lvmdir-c2-brick.pid -S /var/run/gluster/697c0e4a16ebc734cd06fd9150723005.socket --brick-name /opt/lvmdir/c2/brick -l /var/log/glusterfs/bricks/opt-lvmdir-c2-brick.log --xlator-option *-posix.glusterd-uuid=5818a372-d517-41af-bd2f-0fce648773df --brick-port 49391 --xlator-option c_glusterfs-server.listen-port=49391) [2016-10-31 04:03:41.571442] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1 [2016-10-31 04:03:41.588390] I [graph.c:269:gf_add_cmdline_options] 0-c_glusterfs-server: adding option 'listen-port' for volume 'c_glusterfs-server' with value '49391' [2016-10-31 04:03:41.588440] I [graph.c:269:gf_add_cmdline_options] 0-c_glusterfs-posix: adding option 'glusterd-uuid' for volume 'c_glusterfs-posix' with value '5818a372-d517-41af-bd2f-0fce648773df' [2016-10-31 04:03:41.588815] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2016-10-31 04:03:41.588870] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/opt/lvmdir/c2/brick: skip format check for non-addr auth option auth.login./opt/lvmdir/c2/brick.allow [2016-10-31 04:03:41.588907] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/opt/lvmdir/c2/brick: skip format check for non-addr auth option auth.login.5c5afe17-02ce-4b9b-aef4-e66efa1c6132.password [2016-10-31 04:03:41.590577] I [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64 [2016-10-31 04:03:41.590773] W [MSGID: 101002] [options.c:957:xl_opt_validate] 0-c_glusterfs-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction [2016-10-31 04:03:41.593929] I [trash.c:2366:init] 0-c_glusterfs-trash: no option specified for 'eliminate', using NULL [2016-10-31 04:03:41.595494] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-server: option 'rpc-auth.auth-glusterfs' is not recognized [2016-10-31 04:03:41.595528] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-server: option 'rpc-auth.auth-unix' is not recognized [2016-10-31 04:03:41.595556] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-server: option 'rpc-auth.auth-null' is not recognized [2016-10-31 04:03:41.595619] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-quota: option 'timeout' is not recognized [2016-10-31 04:03:41.595647] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-marker: option 'quota-version' is not recognized [2016-10-31 04:03:41.595677] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-trash: option 'brick-path' is not recognized [2016-10-31 04:03:41.605103] W [MSGID: 113036] [posix.c:2018:posix_rename] 0-c_glusterfs-posix: found directory at /opt/lvmdir/c2/brick/.trashcan/ while expecting ENOENT [File exists] Final graph: +------------------------------------------------------------------------------+ 1: volume c_glusterfs-posix 2: type storage/posix 3: option glusterd-uuid 5818a372-d517-41af-bd2f-0fce648773df 4: option directory /opt/lvmdir/c2/brick 5: option volume-id 0a50e4e5-ecfc-432c-88d9-2083455e4baf 6: end-volume 7: 8: volume c_glusterfs-trash 9: type features/trash 10: option trash-dir .trashcan 11: option brick-path /opt/lvmdir/c2/brick 12: option trash-internal-op off 13: subvolumes c_glusterfs-posix 14: end-volume 15: 16: volume c_glusterfs-changelog 17: type features/changelog 18: option changelog-brick /opt/lvmdir/c2/brick 19: option changelog-dir /opt/lvmdir/c2/brick/.glusterfs/changelogs 20: option changelog-barrier-timeout 120 21: subvolumes c_glusterfs-trash 22: end-volume 23: 24: volume c_glusterfs-bitrot-stub 25: type features/bitrot-stub 26: option export /opt/lvmdir/c2/brick 27: subvolumes c_glusterfs-changelog 28: end-volume 29: 30: volume c_glusterfs-access-control 31: type features/access-control 32: subvolumes c_glusterfs-bitrot-stub 33: end-volume 34: 35: volume c_glusterfs-locks 36: type features/locks 37: subvolumes c_glusterfs-access-control 38: 
end-volume 39: 40: volume c_glusterfs-upcall 41: type features/upcall 42: option cache-invalidation off 43: subvolumes c_glusterfs-locks 44: end-volume 45: 46: volume c_glusterfs-io-threads 47: type performance/io-threads 48: subvolumes c_glusterfs-upcall 49: end-volume 50: 51: volume c_glusterfs-marker 52: type features/marker 53: option volume-uuid 0a50e4e5-ecfc-432c-88d9-2083455e4baf 54: option timestamp-file /system/glusterd/vols/c_glusterfs/marker.tstamp 55: option quota-version 0 56: option xtime off 57: option gsync-force-xtime off 58: option quota off 59: option inode-quota off 60: subvolumes c_glusterfs-io-threads 61: end-volume 62: 63: volume c_glusterfs-barrier 64: type features/barrier 65: option barrier disable 66: option barrier-timeout 120 67: subvolumes c_glusterfs-marker 68: end-volume 69: 70: volume c_glusterfs-index 71: type features/index 72: option index-base /opt/lvmdir/c2/brick/.glusterfs/indices 73: subvolumes c_glusterfs-barrier 74: end-volume 75: 76: volume c_glusterfs-quota 77: type features/quota 78: option volume-uuid c_glusterfs 79: option server-quota off 80: option timeout 0 81: option deem-statfs off 82: subvolumes c_glusterfs-index 83: end-volume 84: 85: volume c_glusterfs-worm 86: type features/worm 87: option worm off 88: subvolumes c_glusterfs-quota 89: end-volume 90: 91: volume c_glusterfs-read-only 92: type features/read-only 93: option read-only off 94: subvolumes c_glusterfs-worm 95: end-volume 96: 97: volume /opt/lvmdir/c2/brick 98: type debug/io-stats 99: option latency-measurement off 100: option count-fop-hits off 101: subvolumes c_glusterfs-read-only 102: end-volume 103: 104: volume c_glusterfs-server 105: type protocol/server 106: option transport.socket.listen-port 49391 107: option rpc-auth.auth-glusterfs on 108: option rpc-auth.auth-unix on 109: option rpc-auth.auth-null on 110: option rpc-auth-allow-insecure on 111: option transport-type tcp 112: option auth.login./opt/lvmdir/c2/brick.allow 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 113: option auth.login.5c5afe17-02ce-4b9b-aef4-e66efa1c6132.password ade893cc-19dd-4d67-ae8e-cb929ef4e770 114: option auth.addr./opt/lvmdir/c2/brick.allow * 115: option transport.tcp-user-timeout 4 116: subvolumes /opt/lvmdir/c2/brick 117: end-volume 118: +------------------------------------------------------------------------------+ [2016-10-31 04:03:43.191850] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:03:43.191927] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:03:45.556522] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:03:45.556581] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-1599-2016/10/31-04:03:41:530232-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:07.831958] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:07.832023] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:04:07.974625] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:07.974688] I [MSGID: 115029] 
[server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:08.063317] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:08.063359] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:10.032591] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 [2016-10-31 04:04:10.032672] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 [2016-10-31 04:05:04.530186] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 [2016-10-31 04:05:04.530267] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-c_glusterfs-server: releasing lock on 60a17763-95fd-4162-8402-9651dfbe9ec4 held by {client=0x3fff7c002990, pid=0 lk-owner=a87b0a7cff3f0000} [2016-10-31 04:05:04.530306] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /logfiles/systemlog/syslog [2016-10-31 04:05:04.530375] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 [2016-10-31 04:05:05.760618] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 [2016-10-31 04:05:05.760693] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 [2016-10-31 04:05:57.341892] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:05:57.341928] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-1604-2016/10/31-04:05:53:637848-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:01.291380] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:01.291441] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:01.403916] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:01.403963] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-3292-2016/10/31-04:05:57:685417-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:03.631296] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:03.631332] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:06:03.634762] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 
002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 [2016-10-31 04:06:03.634945] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-c_glusterfs-server: releasing lock on 200b0bb2-d903-4924-b025-262d805c0f2f held by {client=0x3fff7c026520, pid=6232 lk-owner=6cd0c189ff3f0000} [2016-10-31 04:06:03.634987] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /pmd/2 [2016-10-31 04:06:03.635035] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /configuration/oamrdncy.cfg [2016-10-31 04:06:03.635349] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on / [2016-10-31 04:06:03.635414] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /logfiles/systemlog/syslog [2016-10-31 04:06:03.635468] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl/CXC1725605_P93A001 [2016-10-31 04:06:03.635554] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl [2016-10-31 04:06:03.635889] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl/CXC1725605_P93A001/cello [2016-10-31 04:06:03.640329] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 [2016-10-31 04:06:03.641143] E [inodelk.c:309:__inode_unlock_lock] 0-c_glusterfs-locks: Matching lock not found for unlock 0-9223372036854775807, by 6cd0c189ff3f0000 on 0x3fff840fd9a0 [2016-10-31 04:06:03.641200] E [MSGID: 115053] [server-rpc-fops.c:275:server_inodelk_cbk] 0-c_glusterfs-server: 32623: INODELK /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) ==> (Invalid argument) [Invalid argument] [2016-10-31 04:06:06.270314] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 [2016-10-31 04:06:06.270449] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 Thanks, Xin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161121/e0dad62e/attachment.html>
songxin
2016-Nov-22 09:01 UTC
[Gluster-users] question about Transport endpoint is not connected
Hi, I found the following log entries in the client log.

...
[2016-10-31 04:05:05.762520] W [socket.c:588:__socket_rwv] 0-c_glusterfs-client-10: readv on 10.32.0.48:49309 failed (No data available)
[2016-10-31 04:05:05.762621] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-c_glusterfs-client-10: disconnected from c_glusterfs-client-10. Client process will keep trying to connect to glusterd until brick's port is available
...

It seems that a disconnection happened, but I could not find the code that tries to reconnect to the server. Can anyone tell me where in the code the client keeps trying to reconnect to the server after a disconnection? (A rough, generic sketch of such a retry loop is included after the logs at the end of this message, for illustration only.)

Thanks,
Xin

Hi all,
I have a question to consult you about. The GlusterFS version is 3.7.6. I created a replicate volume using two nodes. When I access the mount point, the client log shows "Transport endpoint is not connected". From the log it seems that glusterfs and glusterfsd disconnected, but I don't know why the disconnection happened on a local socket. Below are the logs from the client and the server.

mnt-c.log (client log)

[2016-10-31 04:04:08.023782] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.6 (args: /usr/sbin/glusterfs --acl --volfile-server=10.32.1.144 --volfile-id=c_glusterfs /mnt/c) [2016-10-31 04:04:08.042622] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1 [2016-10-31 04:04:08.052641] I [graph.c:269:gf_add_cmdline_options] 0-c_glusterfs-md-cache: adding option 'cache-posix-acl' for volume 'c_glusterfs-md-cache' with value 'true' [2016-10-31 04:04:08.058854] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2016-10-31 04:04:08.060066] I [MSGID: 114020] [client.c:2118:notify] 0-c_glusterfs-client-9: parent translators are ready, attempting connect on transport [2016-10-31 04:04:08.061169] I [MSGID: 114020] [client.c:2118:notify] 0-c_glusterfs-client-10: parent translators are ready, attempting connect on transport [2016-10-31 04:04:08.061694] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-9: changing port to 49391 (from 0)

Final graph: +------------------------------------------------------------------------------+ 1: volume c_glusterfs-client-9 2: type protocol/client 3: option ping-timeout 4 4: option remote-host 10.32.1.144 5: option remote-subvolume /opt/lvmdir/c2/brick 6: option transport-type socket 7: option username 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 8: option password ade893cc-19dd-4d67-ae8e-cb929ef4e770 9: option send-gids true 10: end-volume 11: 12: volume c_glusterfs-client-10 13: type protocol/client 14: option ping-timeout 4 15: option remote-host 10.32.0.48 16: option remote-subvolume /opt/lvmdir/c2/brick 17: option transport-type socket 18: option username 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 19: option password ade893cc-19dd-4d67-ae8e-cb929ef4e770 20: option send-gids true 21: end-volume 22: 23: volume c_glusterfs-replicate-0 24: type cluster/replicate 25: subvolumes c_glusterfs-client-9 c_glusterfs-client-10 26: end-volume 27: 28: volume c_glusterfs-dht 29: type cluster/distribute 30: subvolumes c_glusterfs-replicate-0 31: end-volume 32: 33: volume c_glusterfs-write-behind 34: type performance/write-behind 35: subvolumes c_glusterfs-dht 36: end-volume 37: 38: volume c_glusterfs-read-ahead 39: type performance/read-ahead 40: subvolumes c_glusterfs-write-behind 41: end-volume 42: 43: volume c_glusterfs-readdir-ahead 44: type performance/readdir-ahead 45: subvolumes 
c_glusterfs-read-ahead 46: end-volume 47: 48: volume c_glusterfs-io-cache 49: type performance/io-cache 50: subvolumes c_glusterfs-readdir-ahead 51: end-volume 52: 53: volume c_glusterfs-quick-read 54: type performance/quick-read 55: subvolumes c_glusterfs-io-cache 56: end-volume 57: 58: volume c_glusterfs-open-behind 59: type performance/open-behind 60: subvolumes c_glusterfs-quick-read 61: end-volume 62: 63: volume c_glusterfs-md-cache 64: type performance/md-cache 65: option cache-posix-acl true 66: subvolumes c_glusterfs-open-behind 67: end-volume 68: 69: volume c_glusterfs 70: type debug/io-stats 71: option latency-measurement off 72: option count-fop-hits off 73: subvolumes c_glusterfs-md-cache 74: end-volume 75: 76: volume posix-acl-autoload 77: type system/posix-acl 78: subvolumes c_glusterfs 79: end-volume 80: 81: volume meta-autoload 82: type meta 83: subvolumes posix-acl-autoload 84: end-volume 85: +------------------------------------------------------------------------------+ [2016-10-31 04:04:08.062941] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:04:08.063367] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-10: changing port to 49309 (from 0) [2016-10-31 04:04:08.063543] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-9: Connected to c_glusterfs-client-9, attached to remote volume '/opt/lvmdir/c2/brick'. [2016-10-31 04:04:08.063576] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-9: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:04:08.063716] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-c_glusterfs-replicate-0: Subvolume 'c_glusterfs-client-9' came back up; going online. [2016-10-31 04:04:08.063785] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-9: Server lk version = 1 [2016-10-31 04:04:08.064659] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-10: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:04:08.065344] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-10: Connected to c_glusterfs-client-10, attached to remote volume '/opt/lvmdir/c2/brick'. 
[2016-10-31 04:04:08.065374] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-10: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:04:08.074629] I [fuse-bridge.c:5137:fuse_graph_setup] 0-fuse: switched to graph 0 [2016-10-31 04:04:08.074884] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-10: Server lk version = 1 [2016-10-31 04:04:08.075006] I [fuse-bridge.c:4030:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22 [2016-10-31 04:04:08.077023] I [MSGID: 108031] [afr-common.c:1782:afr_local_discovery_cbk] 0-c_glusterfs-replicate-0: selecting local read_child c_glusterfs-client-9 [2016-10-31 04:04:15.908933] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.909054] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.913787] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.913885] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.923030] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.923135] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.944227] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.944343] I [dict.c:473:dict_get] 
(-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.949661] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.949766] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.954882] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:15.954982] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:15.960768] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] ... [2016-10-31 04:04:16.107262] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 199e1862-7e74-4b13-8c42-c4cea7e9f2fc [2016-10-31 04:04:16.111886] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 199e1862-7e74-4b13-8c42-c4cea7e9f2fc. 
source=0 sinks=1 [2016-10-31 04:04:16.114518] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.114620] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.122321] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 014e2eca-bb61-4939-9a4a-9a1af4351178 [2016-10-31 04:04:16.126960] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 014e2eca-bb61-4939-9a4a-9a1af4351178. source=0 sinks=1 [2016-10-31 04:04:16.128101] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.128202] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.134165] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 60b5380b-0cea-423d-a1af-d3064968cc54 [2016-10-31 04:04:16.139036] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 60b5380b-0cea-423d-a1af-d3064968cc54. 
source=0 sinks=1 [2016-10-31 04:04:16.141317] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.141432] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.147456] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on accef130-4cf5-4171-be08-2089e7ba1ca5 [2016-10-31 04:04:16.152572] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on accef130-4cf5-4171-be08-2089e7ba1ca5. source=0 sinks=1 [2016-10-31 04:04:16.153633] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.153744] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] [2016-10-31 04:04:16.159954] I [MSGID: 108026] [afr-self-heal-metadata.c:56:__afr_selfheal_metadata_do] 0-c_glusterfs-replicate-0: performing metadata selfheal on 9c666f31-f3d7-40c6-b17b-0f41768ddea4 [2016-10-31 04:04:16.164773] I [MSGID: 108026] [afr-self-heal-common.c:651:afr_log_selfheal] 0-c_glusterfs-replicate-0: Completed metadata selfheal on 9c666f31-f3d7-40c6-b17b-0f41768ddea4. source=0 sinks=1 [2016-10-31 04:04:16.167219] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15bb8) [0x3fff86a5b8e8] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_access [Invalid argument] [2016-10-31 04:04:16.167323] I [dict.c:473:dict_get] (-->/usr/lib64/glusterfs/3.7.6/xlator/debug/io-stats.so(io_stats_lookup_cbk-0x1d7dc) [0x3fff86a85574] -->/usr/lib64/glusterfs/3.7.6/xlator/system/posix-acl.so(posix_acl_lookup_cbk-0x15b5c) [0x3fff86a5b944] -->/usr/lib64/libglusterfs.so.0(dict_get-0xc10f4) [0x3fff8ab6dc2c] ) 0-dict: !this || key=system.posix_acl_default [Invalid argument] ... 
[2016-10-31 04:06:03.627345] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(FINODELK(30)) called at 2016-10-31 04:06:03.626033 (xid=0x7f5e) [2016-10-31 04:06:03.627395] E [MSGID: 114031] [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not connected (priv->connected = 0) [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (c_glusterfs-client-9) [2016-10-31 04:06:03.628466] E [MSGID: 114031] [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.628475] I [MSGID: 108019] [afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0: unable to lock on even one child [2016-10-31 04:06:03.628539] I [MSGID: 108019] [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk] 0-c_glusterfs-replicate-0: Blocking inodelks failed. [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk] 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a) [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-9: changing port to 49391 (from 0) [2016-10-31 04:06:03.629210] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected] [2016-10-31 04:06:03.629266] I [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not connected (priv->connected = 255) [2016-10-31 04:06:03.629277] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (gfid = b0e5a94e-a432-4dce-b86f-a551555780a2). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.629293] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f62 Program: GlusterFS 3.3, ProgVers: 330, Proc: 41) to rpc-transport (c_glusterfs-client-9) [2016-10-31 04:06:03.629333] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: b0e5a94e-a432-4dce-b86f-a551555780a2: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.629363] W [fuse-bridge.c:3385:fuse_getxattr_resume] 0-glusterfs-fuse: 20784: GETXATTR b0e5a94e-a432-4dce-b86f-a551555780a2/70366685109140 (system.posix_acl_access) resolution failed [2016-10-31 04:06:03.629729] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.624568 (xid=0x7f5b) [2016-10-31 04:06:03.629779] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /pmd/2 (fa866805-ab22-4afe-8acb-528d15ce4117) [Transport endpoint is not connected] [2016-10-31 04:06:03.629784] E [MSGID: 114031] [client-rpc-fops.c:2883:client3_3_opendir_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected] [2016-10-31 04:06:03.629836] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/2 (gfid = fa866805-ab22-4afe-8acb-528d15ce4117). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.629857] W [fuse-bridge.c:980:fuse_fd_cbk] 0-glusterfs-fuse: 20794: OPENDIR() /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.629896] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: fa866805-ab22-4afe-8acb-528d15ce4117: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.629990] E [fuse-bridge.c:787:fuse_getattr_resume] 0-glusterfs-fuse: 20785: GETATTR 70366685017568 (fa866805-ab22-4afe-8acb-528d15ce4117) resolution failed [2016-10-31 04:06:03.630164] W [defaults.c:2212:default_releasedir] (-->/usr/lib64/glusterfs/3.7.6/xlator/protocol/client.so(client_local_wipe-0x42d64) [0x3fff86c9bc5c] -->/usr/lib64/libglusterfs.so.0(fd_unref-0x7ba98) [0x3fff8abb7ad0] -->/usr/lib64/libglusterfs.so.0(default_releasedir-0x9d6d8) [0x3fff8ab93c58] ) 0-fuse: xlator does not implement releasedir_cbk [2016-10-31 04:06:03.630367] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.625675 (xid=0x7f5c) [2016-10-31 04:06:03.630368] W [MSGID: 114031] [client-rpc-fops.c:1569:client3_3_fstat_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.630497] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) [Transport endpoint is not connected] [2016-10-31 04:06:03.630871] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:06:03.631488] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-9: Connected to c_glusterfs-client-9, attached to remote volume '/opt/lvmdir/c2/brick'. [2016-10-31 04:06:03.631523] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-9: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:06:03.631555] I [MSGID: 114042] [client-handshake.c:1056:client_post_handshake] 0-c_glusterfs-client-9: 6 fds open - Delaying child_up until they are re-opened [2016-10-31 04:06:03.631475] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. 
Path: /configuration/ethmac.cfg (00000000-0000-0000-0000-000000000000) [Transport endpoint is not connected] [2016-10-31 04:06:03.632256] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1) [2016-10-31 04:06:03.632728] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:257add5a-9b87-4014-8a3a-09dd0a699eec> succeeded (fd = 2) [2016-10-31 04:06:03.632790] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:6fd47fe9-a566-47c7-8df7-e7a5e379101f> succeeded (fd = 3) [2016-10-31 04:06:03.632847] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:fa866805-ab22-4afe-8acb-528d15ce4117> succeeded (fd = 4) [2016-10-31 04:06:03.632905] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:139f5034-5b42-4fb8-9372-6dd7cac1e674> succeeded (fd = 5) [2016-10-31 04:06:03.632935] I [MSGID: 114041] [client-handshake.c:678:client_child_up_reopen_done] 0-c_glusterfs-client-9: last fd open'd/lock-self-heal'd - notifying CHILD-UP [2016-10-31 04:06:03.633028] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-c_glusterfs-replicate-0: Subvolume 'c_glusterfs-client-9' came back up; going online. [2016-10-31 04:06:03.633386] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-9: Server lk version = 1 [2016-10-31 04:06:03.634579] W [fuse-bridge.c:758:fuse_attr_cbk] 0-glusterfs-fuse: 20795: FSTAT() /configuration/oamrdncy.cfg => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.634889] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(XATTROP(33)) called at 2016-10-31 04:06:03.625716 (xid=0x7f5d) [2016-10-31 04:06:03.634950] W [MSGID: 114031] [client-rpc-fops.c:1845:client3_3_xattrop_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) [2016-10-31 04:06:03.635134] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-c_glusterfs-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 200b0bb2-d903-4924-b025-262d805c0f2f. (Possible split-brain) [2016-10-31 04:06:03.635364] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-c_glusterfs-client-9: disconnected from c_glusterfs-client-9. Client process will keep trying to connect to glusterd until brick's port is available [2016-10-31 04:06:03.635420] E [MSGID: 108006] [afr-common.c:3880:afr_notify] 0-c_glusterfs-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. [2016-10-31 04:06:03.635762] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.635812] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/15 (gfid = 52a8bce8-00ac-48a7-8abf-9a3fdf9a498b). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.635858] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: 52a8bce8-00ac-48a7-8abf-9a3fdf9a498b: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.635885] E [fuse-bridge.c:787:fuse_getattr_resume] 0-glusterfs-fuse: 20802: GETATTR 70366685021780 (52a8bce8-00ac-48a7-8abf-9a3fdf9a498b) resolution failed [2016-10-31 04:06:03.638853] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.639020] E [MSGID: 108008] [afr-read-txn.c:89:afr_read_txn_refresh_done] 0-c_glusterfs-replicate-0: Failing FSTAT on gfid 200b0bb2-d903-4924-b025-262d805c0f2f: split-brain observed. [Input/output error] [2016-10-31 04:06:03.640564] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.641160] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20808: LOOKUP() /license => -1 (Transport endpoint is not connected) The message "I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up" repeated 6 times between [2016-10-31 04:06:03.640564] and [2016-10-31 04:06:03.641444] [2016-10-31 04:06:03.641484] E [MSGID: 114031] [client-rpc-fops.c:1621:client3_3_inodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Invalid argument] [2016-10-31 04:06:03.641562] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.641573] E [MSGID: 108010] [afr-lk-common.c:665:afr_unlock_inodelk_cbk] 0-c_glusterfs-replicate-0: path=/configuration/oamrdncy.cfg gfid=200b0bb2-d903-4924-b025-262d805c0f2f: unlock failed on subvolume c_glusterfs-client-9 with lock owner 6cd0c189ff3f0000 [Invalid argument] [2016-10-31 04:06:03.641655] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.641716] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20809: LOOKUP() /lost+found => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.642387] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20813: LOOKUP() /lost+found => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.647166] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20815: LOOKUP() /license => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.647647] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20816: LOOKUP() /tmp => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.648023] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20817: LOOKUP() /lost+found => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.653160] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20819: LOOKUP() /tmp => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.653635] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20820: LOOKUP() /logfiles => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.653992] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20821: LOOKUP() /security => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.654333] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20822: LOOKUP() /logfiles => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.660036] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20824: LOOKUP() /security => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.660675] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20825: LOOKUP() /logfiles => 
-1 (Transport endpoint is not connected) [2016-10-31 04:06:03.661018] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20826: LOOKUP() /up => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.665422] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20828: LOOKUP() /up => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.665886] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20829: LOOKUP() /usr => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.666242] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20830: LOOKUP() /xb => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.666790] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20832: LOOKUP() /xb => -1 (Transport endpoint is not connected) The message "I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up" repeated 61 times between [2016-10-31 04:06:03.641655] and [2016-10-31 04:06:03.670371] [2016-10-31 04:06:03.670392] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/2/000301 (gfid = 7af60568-eace-40f1-a94f-5ba92a3afadf). Holes=1 overlaps=0 [2016-10-31 04:06:03.670499] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: 7af60568-eace-40f1-a94f-5ba92a3afadf: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.670529] W [fuse-bridge.c:2517:fuse_opendir_resume] 0-glusterfs-fuse: 20834: OPENDIR (7af60568-eace-40f1-a94f-5ba92a3afadf) resolution failed [2016-10-31 04:06:03.675300] I [MSGID: 108006] [afr-common.c:4008:afr_local_init] 0-c_glusterfs-replicate-0: no subvolumes up [2016-10-31 04:06:03.675553] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20835: LOOKUP() /usr => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.675990] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20836: LOOKUP() /xb => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.676348] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20837: LOOKUP() /node_id => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.680652] W [fuse-bridge.c:462:fuse_entry_cbk] 0-glusterfs-fuse: 20839: LOOKUP() /node_id => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.681150] W [fuse-bridge.c:980:fuse_fd_cbk] 0-glusterfs-fuse: 20842: OPENDIR() /configuration => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.681510] W [defaults.c:2212:default_releasedir] (-->/usr/lib64/glusterfs/3.7.6/xlator/cluster/distribute.so(dht_local_wipe-0x70e58) [0x3fff86b88f58] -->/usr/lib64/libglusterfs.so.0(fd_unref-0x7ba98) [0x3fff8abb7ad0] -->/usr/lib64/libglusterfs.so.0(default_releasedir-0x9d6d8) [0x3fff8ab93c58] ) 0-fuse: xlator does not implement releasedir_cbk opt-lvmdir-c2-brick.log(server log) [2016-10-31 04:03:41.554325] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.7.6 (args: /usr/sbin/glusterfsd -s 10.32.1.144 --volfile-id c_glusterfs.10.32.1.144.opt-lvmdir-c2-brick -p /system/glusterd/vols/c_glusterfs/run/10.32.1.144-opt-lvmdir-c2-brick.pid -S /var/run/gluster/697c0e4a16ebc734cd06fd9150723005.socket --brick-name /opt/lvmdir/c2/brick -l /var/log/glusterfs/bricks/opt-lvmdir-c2-brick.log --xlator-option *-posix.glusterd-uuid=5818a372-d517-41af-bd2f-0fce648773df --brick-port 49391 --xlator-option c_glusterfs-server.listen-port=49391) [2016-10-31 04:03:41.571442] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with 
index 1 [2016-10-31 04:03:41.588390] I [graph.c:269:gf_add_cmdline_options] 0-c_glusterfs-server: adding option 'listen-port' for volume 'c_glusterfs-server' with value '49391' [2016-10-31 04:03:41.588440] I [graph.c:269:gf_add_cmdline_options] 0-c_glusterfs-posix: adding option 'glusterd-uuid' for volume 'c_glusterfs-posix' with value '5818a372-d517-41af-bd2f-0fce648773df' [2016-10-31 04:03:41.588815] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 2 [2016-10-31 04:03:41.588870] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/opt/lvmdir/c2/brick: skip format check for non-addr auth option auth.login./opt/lvmdir/c2/brick.allow [2016-10-31 04:03:41.588907] I [MSGID: 115034] [server.c:403:_check_for_auth_option] 0-/opt/lvmdir/c2/brick: skip format check for non-addr auth option auth.login.5c5afe17-02ce-4b9b-aef4-e66efa1c6132.password [2016-10-31 04:03:41.590577] I [rpcsvc.c:2215:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64 [2016-10-31 04:03:41.590773] W [MSGID: 101002] [options.c:957:xl_opt_validate] 0-c_glusterfs-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction [2016-10-31 04:03:41.593929] I [trash.c:2366:init] 0-c_glusterfs-trash: no option specified for 'eliminate', using NULL [2016-10-31 04:03:41.595494] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-server: option 'rpc-auth.auth-glusterfs' is not recognized [2016-10-31 04:03:41.595528] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-server: option 'rpc-auth.auth-unix' is not recognized [2016-10-31 04:03:41.595556] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-server: option 'rpc-auth.auth-null' is not recognized [2016-10-31 04:03:41.595619] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-quota: option 'timeout' is not recognized [2016-10-31 04:03:41.595647] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-marker: option 'quota-version' is not recognized [2016-10-31 04:03:41.595677] W [graph.c:357:_log_if_unknown_option] 0-c_glusterfs-trash: option 'brick-path' is not recognized [2016-10-31 04:03:41.605103] W [MSGID: 113036] [posix.c:2018:posix_rename] 0-c_glusterfs-posix: found directory at /opt/lvmdir/c2/brick/.trashcan/ while expecting ENOENT [File exists] Final graph: +------------------------------------------------------------------------------+ 1: volume c_glusterfs-posix 2: type storage/posix 3: option glusterd-uuid 5818a372-d517-41af-bd2f-0fce648773df 4: option directory /opt/lvmdir/c2/brick 5: option volume-id 0a50e4e5-ecfc-432c-88d9-2083455e4baf 6: end-volume 7: 8: volume c_glusterfs-trash 9: type features/trash 10: option trash-dir .trashcan 11: option brick-path /opt/lvmdir/c2/brick 12: option trash-internal-op off 13: subvolumes c_glusterfs-posix 14: end-volume 15: 16: volume c_glusterfs-changelog 17: type features/changelog 18: option changelog-brick /opt/lvmdir/c2/brick 19: option changelog-dir /opt/lvmdir/c2/brick/.glusterfs/changelogs 20: option changelog-barrier-timeout 120 21: subvolumes c_glusterfs-trash 22: end-volume 23: 24: volume c_glusterfs-bitrot-stub 25: type features/bitrot-stub 26: option export /opt/lvmdir/c2/brick 27: subvolumes c_glusterfs-changelog 28: end-volume 29: 30: volume c_glusterfs-access-control 31: type features/access-control 32: subvolumes c_glusterfs-bitrot-stub 33: end-volume 34: 35: volume c_glusterfs-locks 36: type features/locks 37: subvolumes c_glusterfs-access-control 38: 
end-volume 39: 40: volume c_glusterfs-upcall 41: type features/upcall 42: option cache-invalidation off 43: subvolumes c_glusterfs-locks 44: end-volume 45: 46: volume c_glusterfs-io-threads 47: type performance/io-threads 48: subvolumes c_glusterfs-upcall 49: end-volume 50: 51: volume c_glusterfs-marker 52: type features/marker 53: option volume-uuid 0a50e4e5-ecfc-432c-88d9-2083455e4baf 54: option timestamp-file /system/glusterd/vols/c_glusterfs/marker.tstamp 55: option quota-version 0 56: option xtime off 57: option gsync-force-xtime off 58: option quota off 59: option inode-quota off 60: subvolumes c_glusterfs-io-threads 61: end-volume 62: 63: volume c_glusterfs-barrier 64: type features/barrier 65: option barrier disable 66: option barrier-timeout 120 67: subvolumes c_glusterfs-marker 68: end-volume 69: 70: volume c_glusterfs-index 71: type features/index 72: option index-base /opt/lvmdir/c2/brick/.glusterfs/indices 73: subvolumes c_glusterfs-barrier 74: end-volume 75: 76: volume c_glusterfs-quota 77: type features/quota 78: option volume-uuid c_glusterfs 79: option server-quota off 80: option timeout 0 81: option deem-statfs off 82: subvolumes c_glusterfs-index 83: end-volume 84: 85: volume c_glusterfs-worm 86: type features/worm 87: option worm off 88: subvolumes c_glusterfs-quota 89: end-volume 90: 91: volume c_glusterfs-read-only 92: type features/read-only 93: option read-only off 94: subvolumes c_glusterfs-worm 95: end-volume 96: 97: volume /opt/lvmdir/c2/brick 98: type debug/io-stats 99: option latency-measurement off 100: option count-fop-hits off 101: subvolumes c_glusterfs-read-only 102: end-volume 103: 104: volume c_glusterfs-server 105: type protocol/server 106: option transport.socket.listen-port 49391 107: option rpc-auth.auth-glusterfs on 108: option rpc-auth.auth-unix on 109: option rpc-auth.auth-null on 110: option rpc-auth-allow-insecure on 111: option transport-type tcp 112: option auth.login./opt/lvmdir/c2/brick.allow 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 113: option auth.login.5c5afe17-02ce-4b9b-aef4-e66efa1c6132.password ade893cc-19dd-4d67-ae8e-cb929ef4e770 114: option auth.addr./opt/lvmdir/c2/brick.allow * 115: option transport.tcp-user-timeout 4 116: subvolumes /opt/lvmdir/c2/brick 117: end-volume 118: +------------------------------------------------------------------------------+ [2016-10-31 04:03:43.191850] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:03:43.191927] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:03:45.556522] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:03:45.556581] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-1599-2016/10/31-04:03:41:530232-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:07.831958] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:07.832023] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:04:07.974625] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:07.974688] I [MSGID: 115029] 
[server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:08.063317] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:08.063359] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:10.032591] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 [2016-10-31 04:04:10.032672] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 [2016-10-31 04:05:04.530186] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 [2016-10-31 04:05:04.530267] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-c_glusterfs-server: releasing lock on 60a17763-95fd-4162-8402-9651dfbe9ec4 held by {client=0x3fff7c002990, pid=0 lk-owner=a87b0a7cff3f0000} [2016-10-31 04:05:04.530306] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /logfiles/systemlog/syslog [2016-10-31 04:05:04.530375] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 [2016-10-31 04:05:05.760618] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 [2016-10-31 04:05:05.760693] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 [2016-10-31 04:05:57.341892] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:05:57.341928] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-1604-2016/10/31-04:05:53:637848-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:01.291380] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:01.291441] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:01.403916] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:01.403963] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-3292-2016/10/31-04:05:57:685417-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:03.631296] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:03.631332] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:06:03.634762] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 
002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 [2016-10-31 04:06:03.634945] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-c_glusterfs-server: releasing lock on 200b0bb2-d903-4924-b025-262d805c0f2f held by {client=0x3fff7c026520, pid=6232 lk-owner=6cd0c189ff3f0000} [2016-10-31 04:06:03.634987] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /pmd/2 [2016-10-31 04:06:03.635035] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /configuration/oamrdncy.cfg [2016-10-31 04:06:03.635349] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on / [2016-10-31 04:06:03.635414] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /logfiles/systemlog/syslog [2016-10-31 04:06:03.635468] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl/CXC1725605_P93A001 [2016-10-31 04:06:03.635554] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl [2016-10-31 04:06:03.635889] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl/CXC1725605_P93A001/cello [2016-10-31 04:06:03.640329] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 [2016-10-31 04:06:03.641143] E [inodelk.c:309:__inode_unlock_lock] 0-c_glusterfs-locks: Matching lock not found for unlock 0-9223372036854775807, by 6cd0c189ff3f0000 on 0x3fff840fd9a0 [2016-10-31 04:06:03.641200] E [MSGID: 115053] [server-rpc-fops.c:275:server_inodelk_cbk] 0-c_glusterfs-server: 32623: INODELK /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) ==> (Invalid argument) [Invalid argument] [2016-10-31 04:06:06.270314] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 [2016-10-31 04:06:06.270449] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 Thanks, Xin -------------- next part -------------- An HTML attachment was scrubbed... URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20161122/2a2c908e/attachment.html>
Hi everyone, I created a replicate volume using two nodes, an A board and a B board. A board ip: 10.32.1.144. B board ip: 10.32.0.48. One brick and the mount point are on the A board; the other brick is on the B board. I found that I can't access the mount point because a disconnection happened between the client and the two servers. The client log is below (please note the relevant lines). [2016-10-31 04:04:08.063543] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-9: Connected to c_glusterfs-client-9, attached to remote volume '/opt/lvmdir/c2/brick'. ... [2016-10-31 04:06:03.626047] W [socket.c:588:__socket_rwv] 0-c_glusterfs-client-9: readv on 10.32.1.144:49391 failed (Connection reset by peer) [2016-10-31 04:06:03.627345] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(FINODELK(30)) called at 2016-10-31 04:06:03.626033 (xid=0x7f5e) [2016-10-31 04:06:03.627395] E [MSGID: 114031] [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.628381] I [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not connected (priv->connected = 0) [2016-10-31 04:06:03.628432] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f5f Program: GlusterFS 3.3, ProgVers: 330, Proc: 30) to rpc-transport (c_glusterfs-client-9) [2016-10-31 04:06:03.628466] E [MSGID: 114031] [client-rpc-fops.c:1673:client3_3_finodelk_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.628475] I [MSGID: 108019] [afr-lk-common.c:1086:afr_lock_blocking] 0-c_glusterfs-replicate-0: unable to lock on even one child [2016-10-31 04:06:03.628539] I [MSGID: 108019] [afr-transaction.c:1224:afr_post_blocking_inodelk_cbk] 0-c_glusterfs-replicate-0: Blocking inodelks failed. [2016-10-31 04:06:03.628630] W [fuse-bridge.c:1282:fuse_err_cbk] 0-glusterfs-fuse: 20790: FLUSH() ERR => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.629149] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.624346 (xid=0x7f5a) [2016-10-31 04:06:03.629183] I [rpc-clnt.c:1847:rpc_clnt_reconfig] 0-c_glusterfs-client-9: changing port to 49391 (from 0) [2016-10-31 04:06:03.629210] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. 
Path: /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected] [2016-10-31 04:06:03.629266] I [socket.c:3308:socket_submit_request] 0-c_glusterfs-client-9: not connected (priv->connected = 255) [2016-10-31 04:06:03.629277] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (gfid = b0e5a94e-a432-4dce-b86f-a551555780a2). Holes=1 overlaps=0 [2016-10-31 04:06:03.629293] W [rpc-clnt.c:1586:rpc_clnt_submit] 0-c_glusterfs-client-9: failed to submit rpc-request (XID: 0x7f62 Program: GlusterFS 3.3, ProgVers: 330, Proc: 41) to rpc-transport (c_glusterfs-client-9) [2016-10-31 04:06:03.629333] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: b0e5a94e-a432-4dce-b86f-a551555780a2: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.629363] W [fuse-bridge.c:3385:fuse_getxattr_resume] 0-glusterfs-fuse: 20784: GETXATTR b0e5a94e-a432-4dce-b86f-a551555780a2/70366685109140 (system.posix_acl_access) resolution failed [2016-10-31 04:06:03.629729] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.624568 (xid=0x7f5b) [2016-10-31 04:06:03.629779] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /pmd/2 (fa866805-ab22-4afe-8acb-528d15ce4117) [Transport endpoint is not connected] [2016-10-31 04:06:03.629784] E [MSGID: 114031] [client-rpc-fops.c:2883:client3_3_opendir_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews (b0e5a94e-a432-4dce-b86f-a551555780a2) [Transport endpoint is not connected] [2016-10-31 04:06:03.629836] I [MSGID: 109063] [dht-layout.c:702:dht_layout_normalize] 0-c_glusterfs-dht: Found anomalies in /pmd/2 (gfid = fa866805-ab22-4afe-8acb-528d15ce4117). 
Holes=1 overlaps=0 [2016-10-31 04:06:03.629857] W [fuse-bridge.c:980:fuse_fd_cbk] 0-glusterfs-fuse: 20794: OPENDIR() /loadmodules_norepl/CXC1725605_P93A001/cello/emasviews => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.629896] W [fuse-resolve.c:149:fuse_resolve_gfid_cbk] 0-fuse: fa866805-ab22-4afe-8acb-528d15ce4117: failed to resolve (Transport endpoint is not connected) [2016-10-31 04:06:03.629990] E [fuse-bridge.c:787:fuse_getattr_resume] 0-glusterfs-fuse: 20785: GETATTR 70366685017568 (fa866805-ab22-4afe-8acb-528d15ce4117) resolution failed [2016-10-31 04:06:03.630164] W [defaults.c:2212:default_releasedir] (-->/usr/lib64/glusterfs/3.7.6/xlator/protocol/client.so(client_local_wipe-0x42d64) [0x3fff86c9bc5c] -->/usr/lib64/libglusterfs.so.0(fd_unref-0x7ba98) [0x3fff8abb7ad0] -->/usr/lib64/libglusterfs.so.0(default_releasedir-0x9d6d8) [0x3fff8ab93c58] ) 0-fuse: xlator does not implement releasedir_cbk [2016-10-31 04:06:03.630367] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(LOOKUP(27)) called at 2016-10-31 04:06:03.625675 (xid=0x7f5c) [2016-10-31 04:06:03.630368] W [MSGID: 114031] [client-rpc-fops.c:1569:client3_3_fstat_cbk] 0-c_glusterfs-client-9: remote operation failed [Transport endpoint is not connected] [2016-10-31 04:06:03.630497] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) [Transport endpoint is not connected] [2016-10-31 04:06:03.630871] I [MSGID: 114057] [client-handshake.c:1437:select_server_supported_programs] 0-c_glusterfs-client-9: Using Program GlusterFS 3.3, Num (1298437), Version (330) [2016-10-31 04:06:03.631488] I [MSGID: 114046] [client-handshake.c:1213:client_setvolume_cbk] 0-c_glusterfs-client-9: Connected to c_glusterfs-client-9, attached to remote volume '/opt/lvmdir/c2/brick'. [2016-10-31 04:06:03.631523] I [MSGID: 114047] [client-handshake.c:1224:client_setvolume_cbk] 0-c_glusterfs-client-9: Server and Client lk-version numbers are not same, reopening the fds [2016-10-31 04:06:03.631555] I [MSGID: 114042] [client-handshake.c:1056:client_post_handshake] 0-c_glusterfs-client-9: 6 fds open - Delaying child_up until they are re-opened [2016-10-31 04:06:03.631475] W [MSGID: 114031] [client-rpc-fops.c:2971:client3_3_lookup_cbk] 0-c_glusterfs-client-9: remote operation failed. 
Path: /configuration/ethmac.cfg (00000000-0000-0000-0000-000000000000) [Transport endpoint is not connected] [2016-10-31 04:06:03.632256] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:00000000-0000-0000-0000-000000000001> succeeded (fd = 1) [2016-10-31 04:06:03.632728] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:257add5a-9b87-4014-8a3a-09dd0a699eec> succeeded (fd = 2) [2016-10-31 04:06:03.632790] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:6fd47fe9-a566-47c7-8df7-e7a5e379101f> succeeded (fd = 3) [2016-10-31 04:06:03.632847] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:fa866805-ab22-4afe-8acb-528d15ce4117> succeeded (fd = 4) [2016-10-31 04:06:03.632905] I [MSGID: 114060] [client-handshake.c:820:client3_3_reopendir_cbk] 0-c_glusterfs-client-9: reopendir on <gfid:139f5034-5b42-4fb8-9372-6dd7cac1e674> succeeded (fd = 5) [2016-10-31 04:06:03.632935] I [MSGID: 114041] [client-handshake.c:678:client_child_up_reopen_done] 0-c_glusterfs-client-9: last fd open'd/lock-self-heal'd - notifying CHILD-UP [2016-10-31 04:06:03.633028] I [MSGID: 108005] [afr-common.c:3841:afr_notify] 0-c_glusterfs-replicate-0: Subvolume 'c_glusterfs-client-9' came back up; going online. [2016-10-31 04:06:03.633386] I [MSGID: 114035] [client-handshake.c:193:client_set_lk_version_cbk] 0-c_glusterfs-client-9: Server lk version = 1 [2016-10-31 04:06:03.634579] W [fuse-bridge.c:758:fuse_attr_cbk] 0-glusterfs-fuse: 20795: FSTAT() /configuration/oamrdncy.cfg => -1 (Transport endpoint is not connected) [2016-10-31 04:06:03.634889] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn-0xb5c80)[0x3fff8ab79f58] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_unwind-0x1b7a0)[0x3fff8ab1dc90] ( --> /usr/lib64/libgfrpc.so.0(saved_frames_destroy-0x1b638)[0x3fff8ab1de10] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup-0x19af8)[0x3fff8ab1fb18] ( --> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify-0x18e68)[0x3fff8ab20808] ))))) 0-c_glusterfs-client-9: forced unwinding frame type(GlusterFS 3.3) op(XATTROP(33)) called at 2016-10-31 04:06:03.625716 (xid=0x7f5d) [2016-10-31 04:06:03.634950] W [MSGID: 114031] [client-rpc-fops.c:1845:client3_3_xattrop_cbk] 0-c_glusterfs-client-9: remote operation failed. Path: /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) [2016-10-31 04:06:03.635134] W [MSGID: 108008] [afr-read-txn.c:250:afr_read_txn] 0-c_glusterfs-replicate-0: Unreadable subvolume -1 found with event generation 3 for gfid 200b0bb2-d903-4924-b025-262d805c0f2f. (Possible split-brain) [2016-10-31 04:06:03.635364] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-c_glusterfs-client-9: disconnected from c_glusterfs-client-9. Client process will keep trying to connect to glusterd until brick's port is available [2016-10-31 04:06:03.635420] E [MSGID: 108006] [afr-common.c:3880:afr_notify] 0-c_glusterfs-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up. Analysis: At 04:04:08 the client connected to the server successfully. At 04:06:03.626047 the disconnection happened because of the readv error. At 04:06:03.631488 the client connected successfully again. 
The server log on A board is following.(Please notice the red line) [2016-10-31 04:04:08.063317] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:04:08.063359] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:04:10.032591] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 [2016-10-31 04:04:10.032672] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-8222-2016/10/31-04:04:07:930474-c_glusterfs-client-9-0-0 [2016-10-31 04:05:04.530186] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 [2016-10-31 04:05:04.530267] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-c_glusterfs-server: releasing lock on 60a17763-95fd-4162-8402-9651dfbe9ec4 held by {client=0x3fff7c002990, pid=0 lk-owner=a87b0a7cff3f0000} [2016-10-31 04:05:04.530306] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /logfiles/systemlog/syslog [2016-10-31 04:05:04.530375] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-3365-2016/10/31-04:01:45:605337-c_glusterfs-client-9-0-1 [2016-10-31 04:05:05.760618] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 [2016-10-31 04:05:05.760693] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 000300-1603-2016/10/31-04:01:41:508960-c_glusterfs-client-9-0-1 [2016-10-31 04:05:57.341892] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:05:57.341928] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-1604-2016/10/31-04:05:53:637848-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:01.291380] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:01.291441] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:01.403916] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:01.403963] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 000300-3292-2016/10/31-04:05:57:685417-c_glusterfs-client-9-0-0 (version: 3.7.6) [2016-10-31 04:06:03.631296] I [login.c:81:gf_auth] 0-auth/login: allowed user names: 5c5afe17-02ce-4b9b-aef4-e66efa1c6132 [2016-10-31 04:06:03.631332] I [MSGID: 115029] [server-handshake.c:612:server_setvolume] 0-c_glusterfs-server: accepted client from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-1 (version: 3.7.6) [2016-10-31 04:06:03.634762] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 [2016-10-31 04:06:03.634945] W [inodelk.c:404:pl_inodelk_log_cleanup] 0-c_glusterfs-server: 
releasing lock on 200b0bb2-d903-4924-b025-262d805c0f2f held by {client=0x3fff7c026520, pid=6232 lk-owner=6cd0c189ff3f0000} [2016-10-31 04:06:03.634987] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /pmd/2 [2016-10-31 04:06:03.635035] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /configuration/oamrdncy.cfg [2016-10-31 04:06:03.635349] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on / [2016-10-31 04:06:03.635414] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /logfiles/systemlog/syslog [2016-10-31 04:06:03.635468] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl/CXC1725605_P93A001 [2016-10-31 04:06:03.635554] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl [2016-10-31 04:06:03.635889] I [MSGID: 115013] [server-helpers.c:294:do_fd_cleanup] 0-c_glusterfs-server: fd cleanup on /loadmodules_norepl/CXC1725605_P93A001/cello [2016-10-31 04:06:03.640329] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0 [2016-10-31 04:06:03.641143] E [inodelk.c:309:__inode_unlock_lock] 0-c_glusterfs-locks: Matching lock not found for unlock 0-9223372036854775807, by 6cd0c189ff3f0000 on 0x3fff840fd9a0 [2016-10-31 04:06:03.641200] E [MSGID: 115053] [server-rpc-fops.c:275:server_inodelk_cbk] 0-c_glusterfs-server: 32623: INODELK /configuration/oamrdncy.cfg (200b0bb2-d903-4924-b025-262d805c0f2f) ==> (Invalid argument) [Invalid argument] [2016-10-31 04:06:06.270314] I [MSGID: 115036] [server.c:552:server_rpc_notify] 0-c_glusterfs-server: disconnecting connection from 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 [2016-10-31 04:06:06.270449] I [MSGID: 101055] [client_t.c:419:gf_client_unref] 0-c_glusterfs-server: Shutting down connection 002500-6738-2016/10/31-04:06:01:245058-c_glusterfs-client-9-0-0 Analysis: At 04:04:08 the server accepted the client (002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-0). At 04:06:03 the server accepted the client (002500-4291-2016/10/31-04:04:08:14796-c_glusterfs-client-9-0-1) again. Questions: 1. I don't understand why a new connection happens immediately at 04:06:03.631488 after the disconnect at 04:06:03.626047; I thought there should be a 10-second interval before reconnecting. From the server's point of view the second connection even happens before the disconnection. So why does the connect happen twice before the disconnection? 2. After the disconnection at 04:06:03.626047, why does the connection not succeed again? Thanks, Xin
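A note on question 1: the behaviour is easier to reason about with a minimal sketch of a timer-driven reconnect loop. This is only an illustration, not GlusterFS's actual rpc-clnt code; RETRY_INTERVAL_SEC and the function names here are assumptions. The point it demonstrates is that a retry interval only applies between failed connect attempts, so if the TCP connection can be re-established immediately after a disconnect, the client can re-handshake at once and the server logs a new "accepted client" entry within the same second.

/* Minimal sketch of a timer-driven reconnect loop.
 * Illustrative only: this is NOT GlusterFS's rpc-clnt implementation,
 * and RETRY_INTERVAL_SEC plus the function names are assumptions. */
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

#define RETRY_INTERVAL_SEC 3 /* hypothetical wait between failed attempts */

/* Placeholder for socket()/connect(); pretend it succeeds at once. */
static bool try_connect(const char *host, int port)
{
    (void)host;
    (void)port;
    return true;
}

static void reconnect_loop(const char *host, int port)
{
    for (;;) {
        if (try_connect(host, port)) {
            /* The handshake (SETVOLUME) can start immediately, which is
             * why a new "accepted client" can show up on the server right
             * after the disconnect. */
            printf("connected to %s:%d, starting handshake\n", host, port);
            return;
        }
        /* Only a failed attempt waits before retrying. */
        sleep(RETRY_INTERVAL_SEC);
    }
}

int main(void)
{
    reconnect_loop("10.32.1.144", 49391);
    return 0;
}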
Hi Atin, Actually, the glusterfs is used in my project. And our test team find this issue. So I want to make sure that whether you plan to fix it. if you have plan I will wait you because your method shoud be better than mine. Thanks, Xin ? 2016-11-21 10:00:36?"Atin Mukherjee" <atin.mukherjee83 at gmail.com> ??? Hi Xin, I've not got a chance to look into it yet. delete stale volume function is in place to take care of wiping off volume configuration data which has been deleted from the cluster. However we need to revisit this code to see if this function is anymore needed given we recently added a validation to fail delete request if one of the glusterd is down. I'll get back to you on this. On Mon, 21 Nov 2016 at 07:24, songxin <songxin_1980 at 126.com> wrote: Hi Atin, Thank you for your support. And any conclusions about this issue? Thanks, Xin ? 2016-11-16 20:59:05?"Atin Mukherjee" <amukherj at redhat.com> ??? On Tue, Nov 15, 2016 at 1:53 PM, songxin <songxin_1980 at 126.com> wrote: ok, thank you. ? 2016-11-15 16:12:34?"Atin Mukherjee" <amukherj at redhat.com> ??? On Tue, Nov 15, 2016 at 12:47 PM, songxin <songxin_1980 at 126.com> wrote: Hi Atin, I think the root cause is in the function glusterd_import_friend_volume as below. int32_t glusterd_import_friend_volume (dict_t *peer_data, size_t count) { ... ret = glusterd_volinfo_find (new_volinfo->volname, &old_volinfo); if (0 == ret) { (void) gd_check_and_update_rebalance_info (old_volinfo, new_volinfo); (void) glusterd_delete_stale_volume (old_volinfo, new_volinfo); } ... ret = glusterd_store_volinfo (new_volinfo, GLUSTERD_VOLINFO_VER_AC_NONE); if (ret) { gf_msg (this->name, GF_LOG_ERROR, 0, GD_MSG_VOLINFO_STORE_FAIL, "Failed to store " "volinfo for volume %s", new_volinfo->volname); goto out; } ... } glusterd_delete_stale_volume will remove the info and bricks/* and the glusterd_store_volinfo will create the new one. But if glusterd is killed before rename the info will is empty. And glusterd will start failed because the infois empty in the next time you start the glusterd. Any idea? Atin? Give me some time, will check it out, but reading at this analysis looks very well possible if a volume is changed when the glusterd was done on node a and when the same comes up during peer handshake we update the volinfo and during that time glusterd goes down once again. I'll confirm it by tomorrow. I checked the code and it does look like you have got the right RCA for the issue which you simulated through those two scripts. However this can happen even when you try to create a fresh volume and while glusterd tries to write the content into the store and goes down before renaming the info.tmp file you get into the same situation. I'd really need to think through if this can be fixed. Suggestions are always appreciated. BTW, excellent work Xin! Thanks, Xin ? 2016-11-15 12:07:05?"Atin Mukherjee" <amukherj at redhat.com> ??? On Tue, Nov 15, 2016 at 8:58 AM, songxin <songxin_1980 at 126.com> wrote: Hi Atin, I have some clues about this issue. I could reproduce this issue use the scrip that mentioned in https://bugzilla.redhat.com/show_bug.cgi?id=1308487 . I really appreciate your help in trying to nail down this issue. While I am at your email and going through the code to figure out the possible cause for it, unfortunately I don't see any script in the attachment of the bug. Could you please cross check? 
After I added some debug print,which like below, in glusterd-store.c and I found that the /var/lib/glusterd/vols/xxx/info and /var/lib/glusterd/vols/xxx/bricks/* are removed. But other files in /var/lib/glusterd/vols/xxx/ will not be remove. int32_t glusterd_store_volinfo (glusterd_volinfo_t *volinfo, glusterd_volinfo_ver_ac_t ac) { int32_t ret = -1; GF_ASSERT (volinfo) ret = access("/var/lib/glusterd/vols/gv0/info", F_OK); if(ret < 0) { gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "info is not exit(%d)", errno); } else { ret = stat("/var/lib/glusterd/vols/gv0/info", &buf); if(ret < 0) { gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "stat info error"); } else { gf_msg (THIS->name, GF_LOG_ERROR, 0, 0, "info size is %lu, inode num is %lu", buf.st_size, buf.st_ino); } } glusterd_perform_volinfo_version_action (volinfo, ac); ret = glusterd_store_create_volume_dir (volinfo); if (ret) goto out; ... } So it is easy to understand why the info or 10.32.1.144.-opt-lvmdir-c2-brick sometimes is empty. It is becaue the info file is not exist, and it will be create by ?fd = open (path, O_RDWR | O_CREAT | O_APPEND, 0600);? in function gf_store_handle_new. And the info file is empty before rename. So the info file is empty if glusterd shutdown before rename. My question is following. 1.I did not find the point the info is removed.Could you tell me the point where the info and /bricks/* are removed? 2.why the file info and bricks/* is removed?But other files in var/lib/glusterd/vols/xxx/ are not be removed? AFAIK, we never delete the info file and hence this file is opened with O_APPEND flag. As I said I will go back and cross check the code once again. Thanks, Xin ? 2016-11-11 20:34:05?"Atin Mukherjee" <amukherj at redhat.com> ??? On Fri, Nov 11, 2016 at 4:00 PM, songxin <songxin_1980 at 126.com> wrote: Hi Atin, Thank you for your support. Sincerely wait for your reply. By the way, could you make sure that the issue, file info is empty, cause by rename is interrupted in kernel? As per my RCA on that bug, it looked to be. Thanks, Xin ? 2016-11-11 15:49:02?"Atin Mukherjee" <amukherj at redhat.com> ??? On Fri, Nov 11, 2016 at 1:15 PM, songxin <songxin_1980 at 126.com> wrote: Hi Atin, Thank you for your reply. Actually it is very difficult to reproduce because I don't know when there was an ongoing commit happening.It is just a coincidence. But I want to make sure the root cause. I'll give it a another try and see if this situation can be simulated/reproduced and will keep you posted. So I would be grateful if you could answer my questions below. You said that "This issue is hit at part of the negative testing where while gluster volume set was executed at the same point of time glusterd in another instance was brought down. In the faulty node we could see /var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp file has the correct contents." in comment. I have two questions for you. 1.Could you reproduce this issue by gluster volume set glusterd which was brought down? 2.Could you be certain that this issue is cause by rename is interrupted in kernel? In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both empty. But in my view only one rename can be running at the same time because of the big lock. Why there are two files are empty? Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be running in two thread? Thanks, Xin ? 2016-11-11 15:27:03?"Atin Mukherjee" <amukherj at redhat.com> ??? 
On Fri, Nov 11, 2016 at 12:38 PM, songxin <songxin_1980 at 126.com> wrote: Hi Atin, Thank you for your reply. As you said that the info file can only be changed in the glusterd_store_volinfo() sequentially because of the big lock. I have found the similar issue as below that you mentioned. https://bugzilla.redhat.com/show_bug.cgi?id=1308487 Great, so this is what I was actually trying to refer in my first email that I saw a similar issue. Have you got a chance to look at https://bugzilla.redhat.com/show_bug.cgi?id=1308487#c4 ? But in your case, did you try to bring down glusterd when there was an ongoing commit happening? You said that "This issue is hit at part of the negative testing where while gluster volume set was executed at the same point of time glusterd in another instance was brought down. In the faulty node we could see /var/lib/glusterd/vols/<volname>info file been empty whereas the info.tmp file has the correct contents." in comment. I have two questions for you. 1.Could you reproduce this issue by gluster volume set glusterd which was brought down? 2.Could you be certain that this issue is cause by rename is interrupted in kernel? In my case there are two files, info and 10.32.1.144.-opt-lvmdir-c2-brick, are both empty. But in my view only one rename can be running at the same time because of the big lock. Why there are two files are empty? Could rename("info.tmp", "info") and rename("xxx-brick.tmp", "xxx-brick") be running in two thread? Thanks, Xin ? 2016-11-11 14:36:40?"Atin Mukherjee" <amukherj at redhat.com> ??? On Fri, Nov 11, 2016 at 8:33 AM, songxin <songxin_1980 at 126.com> wrote: Hi Atin, Thank you for your reply. I have two questions for you. 1.Are the two files info and info.tmp are only to be created or changed in function glusterd_store_volinfo()? I did not find other point in which the two file are changed. If we are talking about info file volume then yes, the mentioned function actually takes care of it. 2.I found that glusterd_store_volinfo() will be call in many point by glusterd.Is there a problem of thread synchronization?If so, one thread may open a same file info.tmp using O_TRUNC flag when another thread is writing the info,tmp.Could this case happen? In glusterd threads are big lock protected and I don't see a possibility (theoretically) to have two glusterd_store_volinfo () calls at a given point of time. Thanks, Xin At 2016-11-10 21:41:06, "Atin Mukherjee" <amukherj at redhat.com> wrote: Did you run out of disk space by any chance? AFAIK, the code is like we write new stuffs to .tmp file and rename it back to the original file. In case of a disk space issue I expect both the files to be of non zero size. But having said that I vaguely remember a similar issue (in the form of a bug or an email) landed up once but we couldn't reproduce it, so something is wrong with the atomic update here is what I guess. I'll be glad if you have a reproducer for the same and then we can dig into it further. On Thu, Nov 10, 2016 at 1:32 PM, songxin <songxin_1980 at 126.com> wrote: Hi, When I start the glusterd some error happened. And the log is following. 
[2016-11-08 07:58:34.989365] I [MSGID: 100030] [glusterfsd.c:2318:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.7.6 (args: /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO) [2016-11-08 07:58:34.998356] I [MSGID: 106478] [glusterd.c:1350:init] 0-management: Maximum allowed open file descriptors set to 65536 [2016-11-08 07:58:35.000667] I [MSGID: 106479] [glusterd.c:1399:init] 0-management: Using /system/glusterd as working directory [2016-11-08 07:58:35.024508] I [MSGID: 106514] [glusterd-store.c:2075:glusterd_restore_op_version] 0-management: Upgrade detected. Setting op-version to minimum : 1 [2016-11-08 07:58:35.025356] E [MSGID: 106206] [glusterd-store.c:2562:glusterd_store_update_volinfo] 0-management: Failed to get next store iter [2016-11-08 07:58:35.025401] E [MSGID: 106207] [glusterd-store.c:2844:glusterd_store_retrieve_volume] 0-management: Failed to update volinfo for c_glusterfs volume [2016-11-08 07:58:35.025463] E [MSGID: 106201] [glusterd-store.c:3042:glusterd_store_retrieve_volumes] 0-management: Unable to restore volume: c_glusterfs [2016-11-08 07:58:35.025544] E [MSGID: 101019] [xlator.c:428:xlator_init] 0-management: Initialization of volume 'management' failed, review your volfile again [2016-11-08 07:58:35.025582] E [graph.c:322:glusterfs_graph_init] 0-management: initializing translator failed [2016-11-08 07:58:35.025629] E [graph.c:661:glusterfs_graph_activate] 0-graph: init failed [2016-11-08 07:58:35.026109] W [glusterfsd.c:1236:cleanup_and_exit] (-->/usr/sbin/glusterd(glusterfs_volumes_init-0x1b260) [0x1000a718] -->/usr/sbin/glusterd(glusterfs_process_volfp-0x1b3b8) [0x1000a5a8] -->/usr/sbin/glusterd(cleanup_and_exit-0x1c02c) [0x100098bc] ) 0-: received signum (0), shutting down And then I found that the size of vols/volume_name/info is 0, which causes glusterd to shut down. But I found that vols/volume_name/info.tmp is not 0. And I also found a brick file vols/volume_name/bricks/xxxx.brick whose size is 0, while vols/volume_name/bricks/xxxx.brick.tmp is not 0. I read the code of the function glusterd_store_volinfo() in glusterd-store.c. I know that info.tmp will be renamed to info in the function glusterd_store_volume_atomic_update(). But my question is: why is the info file 0 bytes while info.tmp is not? Thanks, Xin _______________________________________________ Gluster-users mailing list Gluster-users at gluster.org http://www.gluster.org/mailman/listinfo/gluster-users -- ~ Atin (atinm)
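The "write new contents to a .tmp file and rename it over the original" approach discussed in this thread is the standard atomic-update pattern, and the symptom (an empty info alongside a fully written info.tmp) is exactly what can remain on disk if the process dies after the target file has been created or truncated but before rename() completes. Below is a minimal sketch of the pattern under POSIX semantics; it is an illustration only, not the actual glusterd store code, and the helper name store_file_atomically as well as the sample contents are made up for this example.

/* Sketch of an atomic "write to .tmp, then rename over the target" update
 * (illustrative; not the actual glusterd code). rename() atomically
 * replaces "info", so a reader never sees a half-written file -- but only
 * if the writer survives until rename() returns. If the process is killed
 * after the target was created empty (e.g. by an earlier open with
 * O_CREAT) and before this rename(), "info" stays empty on disk while
 * "info.tmp" holds the complete contents, matching the symptom above. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int store_file_atomically(const char *dir, const char *name,
                                 const char *data, size_t len)
{
    char tmp[4096], dst[4096];
    snprintf(tmp, sizeof(tmp), "%s/%s.tmp", dir, name);
    snprintf(dst, sizeof(dst), "%s/%s", dir, name);

    int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* Flush the temp file's data to disk before making it visible. */
    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    close(fd);

    /* The atomic switch-over: either the old or the new contents exist. */
    if (rename(tmp, dst) != 0)
        return -1;

    /* fsync the directory so the rename itself survives a power loss. */
    int dfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dfd >= 0) {
        fsync(dfd);
        close(dfd);
    }
    return 0;
}

int main(void)
{
    const char *data = "type=2\ncount=2\nstatus=1\n"; /* made-up contents */
    return store_file_atomically(".", "info", data, strlen(data)) == 0 ? 0 : 1;
}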