Hi,

you shouldn't do that, as it is intentional - glusterd is just a management layer and you might need to restart it in order to reconfigure a node. You don't want to kill your bricks to introduce a change, right?
For details, you can check https://access.redhat.com/solutions/1313303 (you can obtain a subscription from developers.redhat.com).

In CentOS there is a dedicated service that takes care of shutting down all processes and avoids such a freeze.
Here it is, in case your distro doesn't provide it:

user at system:~/Gluster/usr/lib/systemd/system> cat glusterfsd.service
[Unit]
Description=GlusterFS brick processes (stopping only)
After=network.target glusterd.service

[Service]
Type=oneshot
# glusterd starts the glusterfsd processes on-demand
# /bin/true will mark this service as started, RemainAfterExit keeps it active
ExecStart=/bin/true
RemainAfterExit=yes
# if there are no glusterfsd processes, a stop/reload should not give an error
ExecStop=/bin/sh -c "/bin/killall --wait glusterfsd || /bin/true"
ExecReload=/bin/sh -c "/bin/killall -HUP glusterfsd || /bin/true"

[Install]
WantedBy=multi-user.target

Of course you can also use '/usr/share/glusterfs/scripts/stop-all-gluster-processes.sh' to prevent the freeze, as it will kill all gluster processes (including FUSE mounts on the system) and thus allow the FUSE clients accessing the bricks' processes and the rest of the TSP to act accordingly.

Both the glusterfsd.service and the stop-all-gluster-processes.sh are provided by the glusterfs-server package.

Best Regards,
Strahil Nikolov


On Wednesday, 2 September 2020 at 21:59:45 GMT+3, Ward Poelmans <wpoely86 at gmail.com> wrote:

Hi,

I've been playing with glusterfs on a couple of VMs to get some feeling with it. The setup is 2 bricks with replication with a thin arbiter. I've noticed something 'odd' with the systemd unit file for glusterd. It has

KillMode=process

which means that on a 'systemctl stop glusterd' it will only kill the glusterd daemon and not any of the subprocesses started by glusterd (like glusterfs and glusterfsd).

Does anyone know the reason for this? The git history of the file doesn't help. It was added in 2013 but the commit doesn't mention anything about it.

The reason I'm asking is that I noticed a write hanging when I rebooted one of the brick VMs: a client was doing 'dd if=/dev/zero of=/some/file' on gluster when I did a clean shutdown of one of the brick VMs. This caused the dd to hang for the duration of network.ping-timeout (42 seconds by default). When I changed the kill mode to 'control-group' (which kills all processes started by glusterd too), this didn't happen any more.

I was not expecting any 'hangs' on a proper shutdown of one of the bricks when replication is used. Is this a bug or is something wrong with my setup?

Ward
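For reference, the stop-only unit quoted above can be put in place with standard systemd commands. A rough sketch only, assuming the unit file is saved as /etc/systemd/system/glusterfsd.service (the exact path, and whether your package already ships the unit, will vary per distro):

    # install and enable the stop-only unit so its ExecStop runs on shutdown
    cp glusterfsd.service /etc/systemd/system/glusterfsd.service
    systemctl daemon-reload
    systemctl enable --now glusterfsd.service

    # or, before a planned reboot, run the helper script shipped by glusterfs-server;
    # note that it also kills FUSE client mounts on the node
    sh /usr/share/glusterfs/scripts/stop-all-gluster-processes.sh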
And it seems gdeploy is deprecated in favour of gluster-ansible -> gluster/gluster-ansible (a core library of gluster-specific roles and modules for ansible/ansible tower).

Best Regards,
Strahil Nikolov
On 9/2/20 12:30 PM, Strahil Nikolov wrote:
> In CentOS there is a dedicated service that takes care of shutting down all processes and avoids such a freeze.

If you didn't stop your network interfaces as part of the shutdown, this wouldn't happen either. The final kill will kill the glusterfsd processes, closing the TCP connections properly and preventing the clients from waiting for the server to come back.

The problem you're seeing is that the network is being shut down, preventing the clients from getting the proper TCP termination.
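The behaviour being discussed is easy to inspect on a node with plain systemctl; a quick check (standard systemd commands, nothing gluster-specific):

    # show how glusterd's unit handles 'systemctl stop'
    systemctl show glusterd.service -p KillMode
    # -> KillMode=process : only the management daemon is killed, brick processes keep running

    # the brick (glusterfsd) and self-heal/FUSE (glusterfs) processes stay in the unit's
    # control group and are listed under the CGroup tree in the status output
    systemctl status glusterd.service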
Hi Strahil,

On 2/09/2020 21:30, Strahil Nikolov wrote:
> you shouldn't do that, as it is intentional - glusterd is just a management layer and you might need to restart it in order to reconfigure a node. You don't want to kill your bricks to introduce a change, right?

Starting up daemons in one systemd unit and killing them with another is a bit weird, though. Can't a reconfigure happen through an ExecReload? Or let the management daemon and the actual brick daemons run under different systemd units?

> In CentOS there is a dedicated service that takes care of shutting down all processes and avoids such a freeze.

Thanks, that should fix the issue too.

Ward
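For completeness, the change Ward describes trying in his first message (killing the whole control group on 'systemctl stop glusterd') can be tested without editing the packaged unit, via a systemd drop-in. A rough sketch only - the advice in this thread is that KillMode=process is intentional, so this is for experimentation rather than a recommended fix, and the drop-in path simply follows the usual systemd convention:

    # /etc/systemd/system/glusterd.service.d/override.conf
    [Service]
    KillMode=control-group

    # apply the drop-in
    systemctl daemon-reload
    # after this, 'systemctl stop glusterd' (and a shutdown) will also stop the brick processes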