thr3ads.net - Gluster users - [Gluster-users] Upgrade Gluster 3.7 to 3.12 and add 3rd replica [howto/help] [Sep 2017]

Martin Toth

2017-Sep-22 12:22 UTC

[Gluster-users] Upgrade Gluster 3.7 to 3.12 and add 3rd replica [howto/help]

Hi, 

Thanks for the suggestions. Yes, "gluster peer probe node3" will be the first
command, so that Gluster discovers the 3rd node.
I am running the latest 3.7.x: 3.7.6-1ubuntu1 is installed, and according to
https://packages.ubuntu.com/xenial/glusterfs-server the latest 3.7.x is
3.7.6-1ubuntu1, so this should be OK.
> If you are *not* on the latest 3.7.x, you are unlikely to be able to go
Do you mean the latest package from the Ubuntu repository or the latest package
from the Gluster PPA (3.7.20-ubuntu1~xenial1)?
Currently I am using the Ubuntu repository package, but I want to use the PPA for
the upgrade because the Ubuntu repo only has old Gluster packages.

I do not use sharding because all bricks have the same size, so it will not speed
up healing of VM images during a heal operation. The volume is 3TB; how long will
it take to heal over a 2x1Gbit (Linux bond) connection? Can you give an
approximation?
I want to turn every VM off because that is required by the Gluster upgrade
procedure; that is why I want to add the 3rd brick (3rd replica) at this time
(after the upgrade, while the VMs are offline).

Martin
> On 22 Sep 2017, at 12:20, Diego Remolina <dijuremo at gmail.com> wrote:
> 
> Procedure looks good.
> 
> Remember to back up Gluster config files before update:
> 
> /etc/glusterfs
> /var/lib/glusterd
> 
> If you are *not* on the latest 3.7.x, you are unlikely to be able to go
> back to it, because the PPA only keeps the latest version of each major
> branch, so keep that in mind. With Ubuntu, every time you update, make sure
> to download and keep a manual copy of the .deb files. Otherwise you will have
> to compile the packages yourself in the event you want to go back.
> 
> You might need this before adding the 3rd replica:
> gluster peer probe node3
> 
> When you add the 3rd replica, it should start healing, and there may be an
> issue there if the VMs are running. Your plan to not have VMs up is good here.
> Are you using sharding? If you are not sharding, I/O in running VMs may be
> stopped for too long while a large image is healed. If you were already using
> sharding you should be able to add the 3rd replica when VMs are running
> without much issue.
> 
> Once healing is completed and if you are satisfied with 3.12, then remember
> to bump the Gluster op-version.
> 
> Diego
> 
> 
> On Sep 20, 2017 19:32, "Martin Toth" <snowmailer at gmail.com> wrote:
> Hello all fellow GlusterFriends,
> 
> I would like you to comment on / correct my upgrade procedure steps for a
> replica 2 volume on Gluster 3.7.x.
> Then I would like to change replica 2 to replica 3 in order to fix the
> quorum issue that the infrastructure currently has.
> 
> Infrastructure setup:
> - all clients run on the same nodes as the servers (FUSE mounts)
> - under Gluster there is a ZFS pool running as raidz2 with an SSD SLOG/ZIL cache
> - both hypervisors run as GlusterFS nodes and also as Qemu compute nodes
>   (Ubuntu 16.04 LTS)
> - we run Qemu VMs that access their disks via gfapi (OpenNebula)
> - we currently run: 1x2, Type: Replicate volume
> 
> Current Versions :
> glusterfs-* [package] 3.7.6-1ubuntu1
> qemu-*		[package] 2.5+dfsg-5ubuntu10.2glusterfs3.7.14xenial1
> 
> What we need: (new versions)
> - upgrade GlusterFS to the 3.12 LTM version (the Ubuntu 16.04 LTS packages are
>   EOL, see https://www.gluster.org/community/release-schedule/)
> 	- I want to use
> 	  https://launchpad.net/~gluster/+archive/ubuntu/glusterfs-3.12 as the
> 	  package repository for 3.12
> - upgrade Qemu (with built-in support for libgfapi):
>   https://launchpad.net/~monotek/+archive/ubuntu/qemu-glusterfs-3.12
> 	- (sadly, Ubuntu ships packages built without libgfapi support)
> - add a third node to the replica setup of the volume (this is probably the
>   most dangerous operation)
> 
> Backup Phase
> - back up the "NFS storage" - raw DATA that runs on VMs
> - stop all running VMs
> - back up all running VMs (Qcow2 images) outside of Gluster
> 
> Upgrading Gluster Phase
> - killall glusterfs glusterfsd glusterd (on every server)
> 	(this should stop all Gluster services, server and client, as they run on
> 	the same nodes)
> - install the new Gluster server and client packages from the repository
>   mentioned above (on every server)
> - install the new Monotek Qemu package with gfapi support enabled (on every
>   server)
> - /etc/init.d/glusterfs-server start (on every server)
> - /etc/init.d/glusterfs-server status - verify that everything runs OK (on
>   every server)
> 	- check:
> 		- gluster volume info
> 		- gluster volume status
> 		- check the Gluster FUSE clients, whether mounts work as expected
> - test whether various VMs are able to boot and run as expected (whether
>   libgfapi works in Qemu)
> - reboot all nodes - do a system upgrade of packages
> - test and check again
> 
> Adding third node to replica 2 setup (replica 2 => replica 3)
> (volumes will be mounted and up after the upgrade, and we have tested that VMs
> can be served with libgfapi = Gluster upgrade successfully completed)
> (next we extend replica 2 to replica 3 while the volumes are mounted but no
> data is touched = no running VMs, only GlusterFS servers and clients on the
> nodes)
> - issue the command: gluster volume add-brick volume replica 3
>   node3.san:/tank/gluster/brick1 (on the new single node - node3)
> 	so we change:
> 		Bricks:
> 			Brick1: node1.san:/tank/gluster/brick1
> 			Brick2: node2.san:/tank/gluster/brick1
> 	to:
> 		Bricks:
> 			Brick1: node1.san:/tank/gluster/brick1
> 			Brick2: node2.san:/tank/gluster/brick1
> 			Brick3: node3.san:/tank/gluster/brick1
> - check gluster status
> - (is rebalance / heal required here?)
> - start all VMs and start celebration :)
> 
> My Questions
> - is heal and rebalance necessary in order to upgrade replica 2 to replica 3?
> - is this upgrade procedure OK? What more/else should I do in order to do
>   this upgrade correctly?
> 
> Many thanks to all for your support. I hope my little preparation howto will
> help others solve the same situation.
> 
> Best Regards,
> Martin
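
The replica 2 => replica 3 steps in the quoted procedure can be sketched as an ordered command list. This is a minimal Python sketch, not a tested runbook. It assumes a volume called "vmstore" (a hypothetical name, since the thread never names the volume) and reuses the brick path from the message above. It only builds and prints the commands and never runs them. The heal calls are the usual way to trigger and watch self-heal after raising the replica count. No rebalance step is listed because the distribute count does not change, which is my reading and not something confirmed in the thread. As far as I know, "peer probe" has to be issued from a node that is already in the pool (node1 or node2), not from node3. The op-version 31200 is the value I would expect for 3.12 and is an assumption; check it against "gluster volume get all cluster.max-op-version" first.

```python
import shlex


def replica3_commands(volume, new_host, brick_path, op_version=None):
    """Return the gluster CLI calls, in order, for growing replica 2 to replica 3.

    Nothing is executed here; the caller decides how and where to run them.
    """
    new_brick = f"{new_host}:{brick_path}"
    cmds = [
        # Issue from a node that is already in the trusted pool.
        ["gluster", "peer", "probe", new_host],
        ["gluster", "peer", "status"],
        # Raise the replica count and name the new brick in one call.
        ["gluster", "volume", "add-brick", volume, "replica", "3", new_brick],
        # Ask self-heal to crawl the whole volume, then list pending entries.
        ["gluster", "volume", "heal", volume, "full"],
        ["gluster", "volume", "heal", volume, "info"],
    ]
    if op_version is not None:
        # Only once healing has finished and 3.12 is going to stay.
        cmds.append(
            ["gluster", "volume", "set", "all", "cluster.op-version", str(op_version)]
        )
    return cmds


if __name__ == "__main__":
    # "vmstore" and 31200 are illustrative assumptions, see the text above.
    for cmd in replica3_commands(
        "vmstore", "node3.san", "/tank/gluster/brick1", op_version=31200
    ):
        print(shlex.join(cmd))
```

Keeping the op-version bump optional mirrors the advice above to raise it only after healing is complete and 3.12 has proven itself.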
> 

Diego Remolina

2017-Sep-22 12:50 UTC

Hi Martin,
> Do you mean the latest package from the Ubuntu repository or the latest
> package from the Gluster PPA (3.7.20-ubuntu1~xenial1)?
> Currently I am using the Ubuntu repository package, but I want to use the PPA
> for the upgrade because the Ubuntu repo only has old Gluster packages.
When you switch to PPA, make sure to download and keep a copy of each
set of gluster deb packages, otherwise if you ever want to back out an
upgrade to an older release, you will have to download the source deb
file and build it yourself, because PPAs only keep the latest version
for binaries.
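
Besides keeping the .deb files, the two config directories named earlier in the thread (/etc/glusterfs and /var/lib/glusterd) can be archived before the upgrade. A minimal Python sketch follows. The function name, the archive label and the destination directory in the usage note are hypothetical, and it assumes the script runs with enough privileges to read both directories.

```python
import tarfile
import time
from pathlib import Path


def backup_dirs(sources, dest_dir, label="gluster-config"):
    """Pack every existing directory in `sources` into one .tar.gz under `dest_dir`.

    Returns the path of the archive that was written.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    archive = dest / f"{label}-{time.strftime('%Y%m%d-%H%M%S')}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        for src in map(Path, sources):
            if src.is_dir():
                # Store the path without its leading slash inside the archive.
                tar.add(src, arcname=str(src).lstrip("/"))
            # Missing directories are skipped silently.
    return archive
```

A call such as backup_dirs(["/etc/glusterfs", "/var/lib/glusterd"], "/root/gluster-backup") on each server would leave one timestamped archive per node; the destination path is only an example.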
>
> I do not use sharding because all bricks have the same size, so it will not
> speed up healing of VM images during a heal operation. The volume is 3TB; how
> long will it take to heal over a 2x1Gbit (Linux bond) connection? Can you
> give an approximation?
Sharding is not so much about brick size. Sharding is about preventing
a whole large VM file from being locked while it is being healed. It also
minimizes the amount of data copied, because gluster only heals smaller
pieces instead of a whole VM image.

Say your 100GB IMG needs to be healed. The file is locked while it
gets copied from one server to the other, and the running VM may not be
able to use it while the heal is going, so your VM may in fact stop
working or have I/O errors. With sharding, VM images are cut into, well,
shards (largest shard is 512MB), and the heal process only locks the
shards being healed. So gluster only heals the shards that changed,
which are much smaller and faster to copy, and it does not need to lock the
whole 100GB IMG file, which takes longer to copy, just the shard being
healed.

Do note that if you have never used sharding and you turn it on,
it will *not* convert your older files. Also, you should *never* turn
sharding on and then back off, as that will result in corrupted VM
image files. Once it is on, if you want to turn it off, stop your VMs,
move all VM IMG files elsewhere, turn off sharding, and then copy
the files back to the volume.
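
To put rough numbers on the 100GB example, here is a small Python sketch of the simplified model described above: an unsharded image is treated as one unit that is copied in full, while a sharded image only copies the shards that changed. The 512MB shard size is taken from the message; on a real volume it is whatever features.shard-block-size is set to. The count of three changed shards is purely an illustrative assumption.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2


def heal_cost(image_bytes, shard_bytes, dirty_shards):
    """Compare bytes copied by a heal with and without sharding (simplified)."""
    # Integer ceiling division: a partial last shard still counts as one shard.
    total_shards = -(-image_bytes // shard_bytes)
    dirty = min(dirty_shards, total_shards)
    return {
        "total_shards": total_shards,
        # Model from the message: the whole image is one locked, copied unit.
        "unsharded_copy_bytes": image_bytes,
        # Only changed shards are copied, never more than the image itself.
        "sharded_copy_bytes": min(dirty * shard_bytes, image_bytes),
    }


cost = heal_cost(100 * GIB, 512 * MIB, dirty_shards=3)
print(cost["total_shards"])              # 200
print(cost["sharded_copy_bytes"] / GIB)  # 1.5
```

Note that this only covers healing an image that already exists on the other brick. A brand-new empty brick still has to receive every shard, so there the gain is in lock granularity and not in the amount of data copied.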

As for speed, I really cannot tell you, as it depends on the disks,
network, etc. For example, I have a two node setup plus an arbiter (2
nodes with bricks, one is just the arbiter to keep quorum if one of
the brick servers goes down). I recently replaced the HDDs in one
machine as the drives hit the 5 year age mark. So I took the 12 drives
out, added 24 drives to the machine (we had unused slots),
reconfigured raid 6, left it initializing in the background, and
started the heal of 13.1TB of data. My servers are connected via
10Gbit (I am not seeing reads/writes over 112MB/s), and this process
started last Monday at 7:20PM and is not done yet. About 40GB still
remains to be healed. Now, my servers are used as a file server,
which means lots of small files, which take longer to heal. I would
think your VM images will heal much faster.
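
For a rough lower bound on the 3TB question, the raw transfer time can be computed directly. This Python sketch assumes the full 3TB is actually in use (the thread only gives the volume size), decimal units, no protocol overhead, and that the network is the bottleneck and not the disks. The heal described above had not finished since the previous Monday, so real heals can be far slower than these figures. The 250 MB/s row is the ideal for a 2x1Gbit bond; depending on the bonding mode, a single stream is often limited to one link.

```python
TB = 10 ** 12
MB = 10 ** 6


def transfer_hours(data_bytes, throughput_bytes_per_s):
    """Hours needed to move data_bytes at a constant throughput."""
    return data_bytes / throughput_bytes_per_s / 3600


volume = 3 * TB
rates = [
    ("112 MB/s (rate observed above)", 112 * MB),
    ("1 Gbit/s line rate", 125 * MB),
    ("2 x 1 Gbit/s bond, ideal", 250 * MB),
    ("10 Gbit/s line rate", 1250 * MB),
]
for label, rate in rates:
    print(f"{label}: {transfer_hours(volume, rate):.1f} h")
```

Under these assumptions the range runs from well under an hour at 10Gbit line rate to roughly seven hours at the observed 112MB/s.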
> I want to turn every VM off because that is required by the Gluster upgrade
> procedure; that is why I want to add the 3rd brick (3rd replica) at this time
> (after the upgrade, while the VMs are offline).
>
You could even attempt an online upgrade if you try to add the new
node/brick running 3.12 to the mix before upgrading from 3.7.x on the
other nodes. However, I am not sure how that is going to work. With
such a difference in versions, it may not work well.

If you can afford the downtime to upgrade, that will be the safest option.

Diego

Martin Toth

2017-Oct-01 15:53 UTC

Hi Diego,

I've tried the upgrade and then extending Gluster with a 3rd node in a
VirtualBox test environment, and all went without problems.
Sharding will not help me at this time, so I will consider upgrading from 1G to
10G before running this procedure in production. That should reduce downtime,
i.e. the healing time of VM image files on Gluster.

I hope healing will be as quick as possible on 10G.

Additional info for Gluster/Qemu users:
- Ubuntu does not have Qemu compiled with libgfapi support, so I've created a PPA
for that:
	https://launchpad.net/~snowmanko/+archive/ubuntu/qemu-glusterfs-3.12
(I will try to keep this repo up to date)
	- it's tested against glusterfs 3.12.1 (libgfapi works as expected with
this repo)

- Moreover, related to this problem, there is a MIR:
https://bugs.launchpad.net/ubuntu/+source/glusterfs/+bug/1274247
It is now accepted; I am really excited to see libgfapi compiled by default in
the Ubuntu Qemu packages in the near future.

Thanks for support.

BR,

Martin