Yiping Peng
2015-Aug-31 12:10 UTC
[Gluster-users] Why is it not possible to mount a replicated gluster volume with one Gluster server?
> > One more thing, when I do this on server1, which has been in the pool for
> > a long time:
> > server1:~$ mount server1:/vol1 mountpoint
> > It also fails.
> > The log gave me:

My fault, I used localhost as the endpoint. I re-issued "mount -t glusterfs server01:/speech0 qqq" and the log shows a lot of entries like:

[2015-08-31 12:08:44.801169] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 57, Protocol not available
[2015-08-31 12:08:44.801187] E [socket.c:3019:socket_connect] 0-speech0-client-43: Failed to set keep-alive: Protocol not available
[2015-08-31 12:08:44.801305] W [socket.c:642:__socket_rwv] 0-speech0-client-43: readv on 10.88.153.25:24007 failed (Connection reset by peer)
[2015-08-31 12:08:44.801404] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fcf540db65b] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fcf53ea71b7] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fcf53ea72ce] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fcf53ea739b] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fcf53ea795f] ))))) 0-speech0-client-43: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2015-08-31 12:08:44.801294 (xid=0x17)
[2015-08-31 12:08:44.801423] W [MSGID: 114032] [client-handshake.c:1623:client_dump_version_cbk] 0-speech0-client-43: received RPC status error [Transport endpoint is not connected]
[2015-08-31 12:08:44.801440] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-speech0-client-43: disconnected from speech0-client-43. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-31 12:08:44.804488] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 57, Protocol not available
[2015-08-31 12:08:44.804505] E [socket.c:3019:socket_connect] 0-speech0-client-51: Failed to set keep-alive: Protocol not available
[2015-08-31 12:08:44.804775] W [socket.c:642:__socket_rwv] 0-speech0-client-51: readv on 10.88.146.19:24007 failed (Connection reset by peer)
[2015-08-31 12:08:44.804878] E [rpc-clnt.c:362:saved_frames_unwind] (--> /usr/lib64/libglusterfs.so.0(_gf_log_callingfn+0x1eb)[0x7fcf540db65b] (--> /usr/lib64/libgfrpc.so.0(saved_frames_unwind+0x1e7)[0x7fcf53ea71b7] (--> /usr/lib64/libgfrpc.so.0(saved_frames_destroy+0xe)[0x7fcf53ea72ce] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_connection_cleanup+0xab)[0x7fcf53ea739b] (--> /usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x15f)[0x7fcf53ea795f] ))))) 0-speech0-client-51: forced unwinding frame type(GF-DUMP) op(DUMP(1)) called at 2015-08-31 12:08:44.804693 (xid=0x18)
[2015-08-31 12:08:44.804898] W [MSGID: 114032] [client-handshake.c:1623:client_dump_version_cbk] 0-speech0-client-51: received RPC status error [Transport endpoint is not connected]
[2015-08-31 12:08:44.804917] I [MSGID: 114018] [client.c:2042:client_rpc_notify] 0-speech0-client-51: disconnected from speech0-client-51. Client process will keep trying to connect to glusterd until brick's port is available

2015-08-31 20:06 GMT+08:00 Yiping Peng <barius.cn at gmail.com>:

> > I believe the following events have happened in the cluster resulting
> > into this situation:
> > 1. GlusterD & brick process on node 2 was brought down
> > 2. Node 1 was rebooted.
>
> Strangely enough, glusterfs, glusterd and glusterfsd are running on my
> server. Is glusterfsd the brick process? Also server01 has not been
> rebooted during the whole process.
> glusterfsd has the following arguments:
>
> /usr/sbin/glusterfsd -s server01.local.net --volfile-id speech0.server01.local.net.home-glusterfs-speech0-brick0 -p /var/lib/glusterd/vols/speech0/run/server01.local.net-home-glusterfs-speech0-brick0.pid -S /var/run/gluster/6bf40a98deade9dde8b615226bc57567.socket --brick-name /home/glusterfs/speech0/brick0 -l /var/log/glusterfs/bricks/home-glusterfs-speech0-brick0.log --xlator-option *-posix.glusterd-uuid=1c33ff18-2a6a-44cf-9a04-727fc96e92be --brick-port 49159 --xlator-option speech0-server.listen-port=49159
>
> One more thing, when I do this on server1, which has been in the pool for
> a long time:
> server1:~$ mount server1:/vol1 mountpoint
> It also fails.
> The log gave me:
>
> [2015-08-31 11:56:57.123307] I [MSGID: 100030] [glusterfsd.c:2301:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.7.3 (args: /usr/sbin/glusterfs --volfile-server=localhost --volfile-id=/speech0 qqq)
> [2015-08-31 11:56:57.134642] W [socket.c:923:__socket_keepalive] 0-socket: failed to set TCP_USER_TIMEOUT 0 on socket 9, Protocol not available
> [2015-08-31 11:56:57.134688] E [socket.c:3019:socket_connect] 0-glusterfs: Failed to set keep-alive: Protocol not available
> [2015-08-31 11:56:57.135063] I [MSGID: 101190] [event-epoll.c:632:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
> [2015-08-31 11:56:57.135113] E [socket.c:2332:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Connection reset by peer)
> [2015-08-31 11:56:57.135149] E [glusterfsd-mgmt.c:1819:mgmt_rpc_notify] 0-glusterfsd-mgmt: failed to connect with remote-host: localhost (Transport endpoint is not connected)
> [2015-08-31 11:56:57.135158] I [glusterfsd-mgmt.c:1825:mgmt_rpc_notify] 0-glusterfsd-mgmt: Exhausted all volfile servers
> [2015-08-31 11:56:57.135333] W [glusterfsd.c:1219:cleanup_and_exit] (-->/usr/lib64/libgfrpc.so.0(rpc_clnt_notify+0x1a3) [0x7fb5e1be39a3] -->/usr/sbin/glusterfs() [0x4099c8] -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received signum (1), shutting down
> [2015-08-31 11:56:57.135371] I [fuse-bridge.c:5595:fini] 0-fuse: Unmounting '/home/speech/pengyiping/qqq'.
> [2015-08-31 11:56:57.140640] W [glusterfsd.c:1219:cleanup_and_exit] (-->/lib64/libpthread.so.0() [0x318b207851] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xcd) [0x405e4d] -->/usr/sbin/glusterfs(cleanup_and_exit+0x65) [0x4059b5] ) 0-: received signum (15), shutting down
>
> Any help is much appreciated.
>
> 2015-08-31 19:15 GMT+08:00 Atin Mukherjee <amukherj at redhat.com>:
>
>> I believe the following events have happened in the cluster resulting
>> into this situation:
>> 1. GlusterD & brick process on node 2 was brought down
>> 2. Node 1 was rebooted.
>>
>> In the above case the mount will definitely fail since the brick process
>> was not started, as in a 2-node setup glusterd waits for its peers to come
>> up before it starts the bricks. Could you check whether the brick
>> process is running or not?
>>
>> Thanks,
>> Atin
>>
>> On 08/31/2015 04:17 PM, Yiping Peng wrote:
>> > I've tried both: assuming server1 is already in the pool and server2 is
>> > undergoing peer-probing:
>> >
>> > server2:~$ mount server1:/vol1 mountpoint, fail;
>> > server2:~$ mount server2:/vol1 mountpoint, fail.
>> >
>> > Strangely enough, I *should* be able to mount server1:/vol1 on server2.
>> > But this is not the case :(
>> > Maybe something is broken in the server pool, as I'm seeing disconnected
>> > nodes?
>> >
>> > 2015-08-31 18:02 GMT+08:00 Ravishankar N <ravishankar at redhat.com>:
>> >
>> >> On 08/31/2015 12:53 PM, Merlin Morgenstern wrote:
>> >>
>> >> Trying to mount the brick on the same physical server, with the daemon
>> >> running on this server but not on the other server:
>> >>
>> >> @node2:~$ sudo mount -t glusterfs gs2:/volume1 /data/nfs
>> >> Mount failed. Please check the log file for more details.
>> >>
>> >> For the mount to succeed, glusterd must be up on the node that you
>> >> specify as the volfile-server; gs2 in this case. You can use -o
>> >> backupvolfile-server=gs1 as a fallback.
>> >> -Ravi
>> >>
>> >> _______________________________________________
>> >> Gluster-users mailing list
>> >> Gluster-users at gluster.org
>> >> http://www.gluster.org/mailman/listinfo/gluster-users
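To make the advice above concrete, here is a minimal sketch in shell. The hostnames (gs1, gs2), volume name (volume1) and mount point come from the quoted example and will differ on other clusters; this is an illustration of Ravi's and Atin's suggestions, not verified against this particular setup:

```shell
# Atin's check: confirm glusterd and the per-brick server process
# (glusterfsd) are actually running before attempting the mount.
gluster volume status volume1
pgrep -a glusterfsd

# Ravi's fallback: fetch the volfile from gs2, and fall back to gs1
# if gs2's glusterd (TCP port 24007) is unreachable.
sudo mount -t glusterfs -o backupvolfile-server=gs1 gs2:/volume1 /data/nfs
```

Note that the fallback only helps with fetching the volfile; if no brick process is running anywhere, the mount will still fail.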
Merlin Morgenstern
2015-Aug-31 17:04 UTC
[Gluster-users] Why is it not possible to mount a replicated gluster volume with one Gluster server?
Thank you all for your help. To explain the setup better, here is the goal I am trying to achieve:

- 3 servers running in a cluster, each with a webserver uploading and serving files to visitors from a common glusterfs share.
- Server1 and Server2 have gluster-server installed.
- One brick replicated between Server1 and Server2, with the goal of achieving high availability.
- Server1, Server2 and Server3 mount the volume through FUSE.
- Server1 mounts Gluster-Server1 with Server2 as backup; vice versa for Server2.

Now the following scenarios:

1. Server2 dies. In this case Server1 serves as a failover and serves the files for Server1, 2 and 3 until Server2 comes back up again. This works.
2. Server2 dies, and then Server1 has to reboot. In this case the service stays down: it is impossible to remount the share with only Server1. This is not acceptable for a high-availability system, and I believe it is also not intended, but rather a misconfiguration or a bug.

Thank you again for looking into this.

2015-08-31 14:10 GMT+02:00 Yiping Peng <barius.cn at gmail.com>:

> > One more thing, when I do this on server1, which has been in the pool for
> > a long time:
> > server1:~$ mount server1:/vol1 mountpoint
> > It also fails.
> > The log gave me:
>
> My fault, I used localhost as endpoint.
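For the reboot scenario Merlin describes (the surviving server reboots while its peer is still down), a rough sketch of possible mitigations follows. Hostnames, volume name and mount point follow the thread's examples (gs1, gs2, volume1, /data/nfs) and are assumptions; whether forcing the volume start is appropriate depends on the cluster's quorum settings:

```shell
# Client-side /etc/fstab entry: try gs1 for the volfile, fall back to gs2,
# so a mount can succeed while either server is down:
#
#   gs1:/volume1  /data/nfs  glusterfs  defaults,_netdev,backupvolfile-server=gs2  0 0

# Server-side: after a reboot with the peer still down, glusterd may be
# waiting for its peer before starting bricks. Forcing the volume start
# brings the local brick process (glusterfsd) up:
sudo gluster volume start volume1 force
```

A three-way replica (or a replica-2 volume with an arbiter brick) also avoids this two-node dilemma by design, since a majority of bricks can still be reached when a single node is lost.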