Atin Mukherjee
2015-Mar-20 04:57 UTC
[Gluster-users] Gluster volume brick keeps going offline
I see there is a crash in the brick log:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2015-03-19 06:00:35
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.5.0
/lib/x86_64-linux-gnu/libc.so.6(+0x321e0)[0x7f027c7031e0]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/locks.so(__get_entrylk_count+0x40)[0x7f0277fc5d70]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/locks.so(get_entrylk_count+0x4d)[0x7f0277fc5ddd]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/locks.so(pl_entrylk_xattr_fill+0x19)[0x7f0277fc2df9]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/locks.so(pl_lookup_cbk+0x1d0)[0x7f0277fc3390]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/access-control.so(posix_acl_lookup_cbk+0x12b)[0x7f02781d91fb]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/storage/posix.so(posix_lookup+0x331)[0x7f02788046c1]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(default_lookup+0x70)[0x7f027d6d1270]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/access-control.so(posix_acl_lookup+0x1b5)[0x7f02781d72f5]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/features/locks.so(pl_lookup+0x211)[0x7f0277fbd391]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_lookup_wrapper+0x140)[0x7f0277da82d0]
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0(call_resume+0x126)[0x7f027d6e5f16]
/usr/lib/x86_64-linux-gnu/glusterfs/3.5.0/xlator/performance/io-threads.so(iot_worker+0x13e)[0x7f0277da86be]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6b50)[0x7f027ce5bb50]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f027c7ad70d]

Pranith/Ravi, could you help Kaamesh with this?

Also, on the glusterd side I see some RPC-related failures (probably a corruption). Which gluster version are you using? Are there any surprising logs on the other node?
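In the meantime, a quick way to confirm on gfs2 whether the brick process is actually dying (rather than just losing its connection to glusterd) is something along these lines. The brick log path is an assumption based on GlusterFS's default brick-log naming under /var/log/glusterfs/bricks/, and 13461 is the gfs2 brick PID from the volume status output quoted below:

  # on gfs2: is the brick process for /export/sda/brick still alive?
  sudo gluster volume status gfsvolume
  ps -p 13461 -o pid,etime,cmd

  # look for crash reports in the brick log
  sudo grep -A25 'signal received' /var/log/glusterfs/bricks/export-sda-brick.log

If the grep keeps turning up new "signal received: 11" blocks, the brick is crashing repeatedly rather than being taken down by glusterd.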
~Atin

On 03/19/2015 12:58 PM, Kaamesh Kamalaaharan wrote:
> Sorry, forgot to include the attachment.
>
> Thank You Kindly,
> Kaamesh
> Bioinformatician
> Novocraft Technologies Sdn Bhd
> C-23A-05, 3 Two Square, Section 19, 46300 Petaling Jaya
> Selangor Darul Ehsan
> Malaysia
> Mobile: +60176562635
> Ph: +60379600541
> Fax: +60379600540
>
> On Thu, Mar 19, 2015 at 2:40 PM, Kaamesh Kamalaaharan <kaamesh at novocraft.com> wrote:
>
>> Hi Atin, Thanks for the reply. I'm not sure which logs are relevant, so I'll
>> just attach them all in a gz file.
>>
>> I ran a sudo gluster volume start gfsvolume force at 2015-03-19 05:49.
>> I hope this helps.
>>
>> Thank You Kindly,
>> Kaamesh
>>
>> On Sun, Mar 15, 2015 at 11:41 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
>>
>>> Could you attach the logs for the analysis?
>>>
>>> ~Atin
>>>
>>> On 03/13/2015 03:29 PM, Kaamesh Kamalaaharan wrote:
>>>> Hi guys. I've been using gluster for a while now and, despite a few hiccups,
>>>> I find it a great system to use. One of my more persistent hiccups is an
>>>> issue with one brick going offline.
>>>>
>>>> My setup is a 2-brick, 2-node setup. My main brick is gfs1, which has not
>>>> given me any problems. gfs2, however, keeps going offline. Following
>>>> http://www.gluster.org/pipermail/gluster-users/2014-June/017583.html
>>>> temporarily fixed the error, but the brick goes offline within the hour.
>>>>
>>>> This is what I get from my volume status command:
>>>>
>>>> sudo gluster volume status
>>>>>
>>>>> Status of volume: gfsvolume
>>>>> Gluster process                          Port    Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick gfs1:/export/sda/brick             49153   Y       9760
>>>>> Brick gfs2:/export/sda/brick             N/A     N       13461
>>>>> NFS Server on localhost                  2049    Y       13473
>>>>> Self-heal Daemon on localhost            N/A     Y       13480
>>>>> NFS Server on gfs1                       2049    Y       16166
>>>>> Self-heal Daemon on gfs1                 N/A     Y       16173
>>>>>
>>>>> Task Status of Volume gfsvolume
>>>>> ------------------------------------------------------------------------------
>>>>> There are no active volume tasks
>>>>
>>>> Doing sudo gluster volume start gfsvolume force gives me this:
>>>>
>>>> sudo gluster volume status
>>>>>
>>>>> Status of volume: gfsvolume
>>>>> Gluster process                          Port    Online  Pid
>>>>> ------------------------------------------------------------------------------
>>>>> Brick gfs1:/export/sda/brick             49153   Y       9760
>>>>> Brick gfs2:/export/sda/brick             49153   Y       13461
>>>>> NFS Server on localhost                  2049    Y       13473
>>>>> Self-heal Daemon on localhost            N/A     Y       13480
>>>>> NFS Server on gfs1                       2049    Y       16166
>>>>> Self-heal Daemon on gfs1                 N/A     Y       16173
>>>>>
>>>>> Task Status of Volume gfsvolume
>>>>> ------------------------------------------------------------------------------
>>>>> There are no active volume tasks
>>>>
>>>> Half an hour later and my brick goes down again.
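A side note on the above: since the brick drops again within the hour, it may help to capture exactly when the brick process disappears so that the time can be matched against the brick log (the log timestamps are UTC). A rough watch loop, using the gfs2 brick PID shown in the status output quoted above:

  # run on gfs2; 13461 is the gfs2 brick PID from 'gluster volume status'
  while ps -p 13461 > /dev/null; do sleep 10; done
  date -u   # prints the UTC time at which the brick process went away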
>>>>
>>>> This is my glustershd.log. I snipped it because the rest of the log is a
>>>> repeat of the same error:
>>>>
>>>>> [2015-03-13 02:09:41.951556] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.5.0 (/usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/deac2f873d0ac5b6c3e84b23c4790172.socket --xlator-option *replicate*.node-uuid=adbb7505-3342-4c6d-be3d-75938633612c)
>>>>> [2015-03-13 02:09:41.954173] I [socket.c:3561:socket_init] 0-socket.glusterfsd: SSL support is NOT enabled
>>>>> [2015-03-13 02:09:41.954236] I [socket.c:3576:socket_init] 0-socket.glusterfsd: using system polling thread
>>>>> [2015-03-13 02:09:41.954421] I [socket.c:3561:socket_init] 0-glusterfs: SSL support is NOT enabled
>>>>> [2015-03-13 02:09:41.954443] I [socket.c:3576:socket_init] 0-glusterfs: using system polling thread
>>>>> [2015-03-13 02:09:41.956731] I [graph.c:254:gf_add_cmdline_options] 0-gfsvolume-replicate-0: adding option 'node-uuid' for volume 'gfsvolume-replicate-0' with value 'adbb7505-3342-4c6d-be3d-75938633612c'
>>>>> [2015-03-13 02:09:41.960210] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-gfsvolume-client-1: setting frame-timeout to 90
>>>>> [2015-03-13 02:09:41.960288] I [socket.c:3561:socket_init] 0-gfsvolume-client-1: SSL support is NOT enabled
>>>>> [2015-03-13 02:09:41.960301] I [socket.c:3576:socket_init] 0-gfsvolume-client-1: using system polling thread
>>>>> [2015-03-13 02:09:41.961095] I [rpc-clnt.c:972:rpc_clnt_connection_init] 0-gfsvolume-client-0: setting frame-timeout to 90
>>>>> [2015-03-13 02:09:41.961134] I [socket.c:3561:socket_init] 0-gfsvolume-client-0: SSL support is NOT enabled
>>>>> [2015-03-13 02:09:41.961145] I [socket.c:3576:socket_init] 0-gfsvolume-client-0: using system polling thread
>>>>> [2015-03-13 02:09:41.961173] I [client.c:2273:notify] 0-gfsvolume-client-0: parent translators are ready, attempting connect on transport
>>>>> [2015-03-13 02:09:41.961412] I [client.c:2273:notify] 0-gfsvolume-client-1: parent translators are ready, attempting connect on transport
>>>>> Final graph:
>>>>> +------------------------------------------------------------------------------+
>>>>>   1: volume gfsvolume-client-0
>>>>>   2:     type protocol/client
>>>>>   3:     option remote-host gfs1
>>>>>   4:     option remote-subvolume /export/sda/brick
>>>>>   5:     option transport-type socket
>>>>>   6:     option frame-timeout 90
>>>>>   7:     option ping-timeout 30
>>>>>   8: end-volume
>>>>>   9:
>>>>>  10: volume gfsvolume-client-1
>>>>>  11:     type protocol/client
>>>>>  12:     option remote-host gfs2
>>>>>  13:     option remote-subvolume /export/sda/brick
>>>>>  14:     option transport-type socket
>>>>>  15:     option frame-timeout 90
>>>>>  16:     option ping-timeout 30
>>>>>  17: end-volume
>>>>>  18:
>>>>>  19: volume gfsvolume-replicate-0
>>>>>  20:     type cluster/replicate
>>>>>  21:     option node-uuid adbb7505-3342-4c6d-be3d-75938633612c
>>>>>  22:     option background-self-heal-count 0
>>>>>  23:     option metadata-self-heal on
>>>>>  24:     option data-self-heal on
>>>>>  25:     option entry-self-heal on
>>>>>  26:     option self-heal-daemon on
>>>>>  27:     option data-self-heal-algorithm diff
>>>>>  28:     option quorum-type fixed
>>>>>  29:     option quorum-count 1
>>>>>  30:     option iam-self-heal-daemon yes
>>>>>  31:     subvolumes gfsvolume-client-0 gfsvolume-client-1
>>>>>  32: end-volume
>>>>>  33:
>>>>>  34: volume glustershd
>>>>>  35:     type debug/io-stats
>>>>>  36:     subvolumes gfsvolume-replicate-0
>>>>>  37: end-volume
>>>>> +------------------------------------------------------------------------------+
>>>>> [2015-03-13 02:09:41.961871] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-gfsvolume-client-1: changing port to 49153 (from 0)
>>>>> [2015-03-13 02:09:41.962129] I [client-handshake.c:1659:select_server_supported_programs] 0-gfsvolume-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>> [2015-03-13 02:09:41.962344] I [client-handshake.c:1456:client_setvolume_cbk] 0-gfsvolume-client-1: Connected to 172.20.20.22:49153, attached to remote volume '/export/sda/brick'.
>>>>> [2015-03-13 02:09:41.962363] I [client-handshake.c:1468:client_setvolume_cbk] 0-gfsvolume-client-1: Server and Client lk-version numbers are not same, reopening the fds
>>>>> [2015-03-13 02:09:41.962416] I [afr-common.c:3922:afr_notify] 0-gfsvolume-replicate-0: Subvolume 'gfsvolume-client-1' came back up; going online.
>>>>> [2015-03-13 02:09:41.962487] I [client-handshake.c:450:client_set_lk_version_cbk] 0-gfsvolume-client-1: Server lk version = 1
>>>>> [2015-03-13 02:09:41.963109] E [afr-self-heald.c:1479:afr_find_child_position] 0-gfsvolume-replicate-0: getxattr failed on gfsvolume-client-0 - (Transport endpoint is not connected)
>>>>> [2015-03-13 02:09:41.963502] I [afr-self-heald.c:1687:afr_dir_exclusive_crawl] 0-gfsvolume-replicate-0: Another crawl is in progress for gfsvolume-client-1
>>>>> [2015-03-13 02:09:41.967478] E [afr-self-heal-entry.c:2364:afr_sh_post_nonblocking_entry_cbk] 0-gfsvolume-replicate-0: Non Blocking entrylks failed for <gfid:66af7dc1-a2e6-4919-9ea1-ad75fe2d40b9>.
>>>>> [2015-03-13 02:09:41.968550] E [afr-self-heal-entry.c:2364:afr_sh_post_nonblocking_entry_cbk] 0-gfsvolume-replicate-0: Non Blocking entrylks failed for <gfid:8a7cfa39-9a12-43cd-a9f3-9142b7403d0e>.
>>>>> [2015-03-13 02:09:41.969663] E [afr-self-heal-entry.c:2364:afr_sh_post_nonblocking_entry_cbk] 0-gfsvolume-replicate-0: Non Blocking entrylks failed for <gfid:3762920e-9631-4a52-9a9f-4f04d09e8d84>.
>>>>> [2015-03-13 02:09:41.974345] E [afr-self-heal-entry.c:2364:afr_sh_post_nonblocking_entry_cbk] 0-gfsvolume-replicate-0: Non Blocking entrylks failed for <gfid:66af7dc1-a2e6-4919-9ea1-ad75fe2d40b9>.
>>>>> [2015-03-13 02:09:41.975657] E [afr-self-heal-entry.c:2364:afr_sh_post_nonblocking_entry_cbk] 0-gfsvolume-replicate-0: Non Blocking entrylks failed for <gfid:8a7cfa39-9a12-43cd-a9f3-9142b7403d0e>.
>>>>> [2015-03-13 02:09:41.977020] E [afr-self-heal-entry.c:2364:afr_sh_post_nonblocking_entry_cbk] 0-gfsvolume-replicate-0: Non Blocking entrylks failed for <gfid:3762920e-9631-4a52-9a9f-4f04d09e8d84>.
>>>>> [2015-03-13 02:09:44.307219] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-gfsvolume-client-0: changing port to 49153 (from 0)
>>>>> [2015-03-13 02:09:44.307748] I [client-handshake.c:1659:select_server_supported_programs] 0-gfsvolume-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
>>>>> [2015-03-13 02:09:44.448377] I [client-handshake.c:1456:client_setvolume_cbk] 0-gfsvolume-client-0: Connected to 172.20.20.21:49153, attached to remote volume '/export/sda/brick'.
>>>>> [2015-03-13 02:09:44.448418] I [client-handshake.c:1468:client_setvolume_cbk] 0-gfsvolume-client-0: Server and Client lk-version numbers are not same, reopening the fds
>>>>> [2015-03-13 02:09:44.448713] I [client-handshake.c:450:client_set_lk_version_cbk] 0-gfsvolume-client-0: Server lk version = 1
>>>>> [2015-03-13 02:09:44.515112] I [afr-self-heal-common.c:2859:afr_log_self_heal_completion_status] 0-gfsvolume-replicate-0: foreground data self heal is successfully completed, data self heal from gfsvolume-client-0 to sinks gfsvolume-client-1, with 892928 bytes on gfsvolume-client-0, 892928 bytes on gfsvolume-client-1, data - Pending matrix: [ [ 0 155762 ] [ 0 0 ] ] on <gfid:123536cc-c34b-43d7-b0c6-cf80eefa8322>
>>>>> [2015-03-13 02:09:44.809988] I [afr-self-heal-common.c:2859:afr_log_self_heal_completion_status] 0-gfsvolume-replicate-0: foreground data self heal is successfully completed, data self heal from gfsvolume-client-0 to sinks gfsvolume-client-1, with 15998976 bytes on gfsvolume-client-0, 15998976 bytes on gfsvolume-client-1, data - Pending matrix: [ [ 0 36506 ] [ 0 0 ] ] on <gfid:b6dc0e74-31bf-469a-b629-ee51ab4cf729>
>>>>> [2015-03-13 02:09:44.946050] W [client-rpc-fops.c:574:client3_3_readlink_cbk] 0-gfsvolume-client-0: remote operation failed: Stale NFS file handle
>>>>> [2015-03-13 02:09:44.946097] I [afr-self-heal-entry.c:1538:afr_sh_entry_impunge_readlink_sink_cbk] 0-gfsvolume-replicate-0: readlink of <gfid:66af7dc1-a2e6-4919-9ea1-ad75fe2d40b9>/PB2_corrected.fastq on gfsvolume-client-1 failed (Stale NFS file handle)
>>>>> [2015-03-13 02:09:44.951370] I [afr-self-heal-entry.c:2321:afr_sh_entry_fix] 0-gfsvolume-replicate-0: <gfid:8a7cfa39-9a12-43cd-a9f3-9142b7403d0e>: Performing conservative merge
>>>>> [2015-03-13 02:09:45.149995] W [client-rpc-fops.c:574:client3_3_readlink_cbk] 0-gfsvolume-client-0: remote operation failed: Stale NFS file handle
>>>>> [2015-03-13 02:09:45.150036] I [afr-self-heal-entry.c:1538:afr_sh_entry_impunge_readlink_sink_cbk] 0-gfsvolume-replicate-0: readlink of <gfid:8a7cfa39-9a12-43cd-a9f3-9142b7403d0e>/Rscript on gfsvolume-client-1 failed (Stale NFS file handle)
>>>>> [2015-03-13 02:09:45.214253] W [client-rpc-fops.c:574:client3_3_readlink_cbk] 0-gfsvolume-client-0: remote operation failed: Stale NFS file handle
>>>>> [2015-03-13 02:09:45.214295] I [afr-self-heal-entry.c:1538:afr_sh_entry_impunge_readlink_sink_cbk] 0-gfsvolume-replicate-0: readlink of <gfid:3762920e-9631-4a52-9a9f-4f04d09e8d84>/ananas_d_tmp on gfsvolume-client-1 failed (Stale NFS file handle)
>>>>> [2015-03-13 02:13:27.324856] W [socket.c:522:__socket_rwv] 0-gfsvolume-client-1: readv on 172.20.20.22:49153 failed (No data available)
>>>>> [2015-03-13 02:13:27.324961] I [client.c:2208:client_rpc_notify] 0-gfsvolume-client-1: disconnected from 172.20.20.22:49153. Client process will keep trying to connect to glusterd until brick's port is available
>>>>> [2015-03-13 02:13:37.981531] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-gfsvolume-client-1: changing port to 49153 (from 0)
>>>>> [2015-03-13 02:13:37.981781] E [socket.c:2161:socket_connect_finish] 0-gfsvolume-client-1: connection to 172.20.20.22:49153 failed (Connection refused)
>>>>> [2015-03-13 02:13:41.982125] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-gfsvolume-client-1: changing port to 49153 (from 0)
>>>>> [2015-03-13 02:13:41.982353] E [socket.c:2161:socket_connect_finish] 0-gfsvolume-client-1: connection to 172.20.20.22:49153 failed (Connection refused)
>>>>> [2015-03-13 02:13:45.982693] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-gfsvolume-client-1: changing port to 49153 (from 0)
>>>>> [2015-03-13 02:13:45.982926] E [socket.c:2161:socket_connect_finish] 0-gfsvolume-client-1: connection to 172.20.20.22:49153 failed (Connection refused)
>>>>> [2015-03-13 02:13:49.983309] I [rpc-clnt.c:1685:rpc_clnt_reconfig] 0-gfsvolume-client-1: changing port to 49153 (from 0)
>>>>
>>>> Any help would be greatly appreciated.
>>>> Thank You Kindly,
>>>> Kaamesh
>>>>
>>>> _______________________________________________
>>>> Gluster-users mailing list
>>>> Gluster-users at gluster.org
>>>> http://www.gluster.org/mailman/listinfo/gluster-users

-- 
~Atin
Kaamesh Kamalaaharan
2015-Mar-20 07:37 UTC
[Gluster-users] Gluster volume brick keeps going offline
Hi Atin,

Thank you so much for your continual assistance. I am using gluster 3.6.2 on both servers and on some of the clients. I have attached the gluster1 logs for your reference. The gluster1 log files are empty and the log.1 files are the ones that have data. I couldn't attach all the files as they exceed the 25 MB limit. Please let me know if there are any other files I could attach to help you understand this better.

Thank You Kindly,
Kaamesh
Bioinformatician
Novocraft Technologies Sdn Bhd
C-23A-05, 3 Two Square, Section 19, 46300 Petaling Jaya
Selangor Darul Ehsan
Malaysia
Mobile: +60176562635
Ph: +60379600541
Fax: +60379600540

On Fri, Mar 20, 2015 at 12:57 PM, Atin Mukherjee <amukherj at redhat.com> wrote:
> I see there is a crash in the brick log. [...]
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150320/2b6200ff/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: export-sda-brick.log
Type: text/x-log
Size: 78289 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150320/2b6200ff/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: glustershd.log.1
Type: application/octet-stream
Size: 6014029 bytes
Desc: not available
URL: <http://www.gluster.org/pipermail/gluster-users/attachments/20150320/2b6200ff/attachment-0001.obj>
Atin Mukherjee
2015-Mar-20 08:29 UTC
[Gluster-users] Gluster volume brick keeps going offline
On 03/20/2015 01:07 PM, Kaamesh Kamalaaharan wrote:
> Hi Atin,
> Thank you so much for your continual assistance. I am using gluster 3.6.2
> on both servers and on some of the clients. I have attached the gluster1
> logs for your reference. The gluster1 log files are empty and the log.1
> files are the ones that have data. I couldn't attach all the files as they
> exceed the 25 MB limit. Please let me know if there are any other files I
> could attach to help you understand this better.

The glusterd log indicates that the version is 3.5.0, as per this:

[2015-03-16 01:05:09.829478] I [glusterfsd.c:1959:main] 0-/usr/sbin/glusterd: Started running /usr/sbin/glusterd version 3.5.0 (/usr/sbin/glusterd -p /var/run/glusterd.pid)

Could you re-confirm?
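For example, something along these lines on both gfs1 and gfs2 should show what is actually installed and running (the dpkg query assumes Debian/Ubuntu packaging, which the /usr/lib/x86_64-linux-gnu paths in the backtrace suggest):

  gluster --version | head -n1       # CLI package version
  glusterfs --version | head -n1     # version of the glusterfs/glusterfsd binaries
  dpkg -l | grep -i glusterfs        # installed glusterfs packages

Note that if the packages were upgraded to 3.6.2 without restarting glusterd and the brick processes, the daemons still running would be the old 3.5.0 binaries, which would explain log entries like the one above.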
~Atin

> [...]

-- 
~Atin