So a NIC on one of my nodes died today. Chances are good that a reboot
would bring it back to life, but then I'd be down while 2-3TB of unsharded
VMs healed, which wouldn't be fun. I figured I'll run with 2 nodes while I
get sharding enabled and then bring in the 3rd new node I was going to
replace that one with anyway.

1) I am using oVirt 3.6 with CentOS 7 and, while I am going to confirm, I
believe it still does its communications over the fuse mount. So I am
thinking it would help for me to move from 3.7.11 to 3.7.12, as the
libgfapi issues shouldn't hit me.

2) enable sharding. Do I need to completely move the VM (powered off)
image off the mount and then back on for it to shard, or can I rename VM
images on the fuse mount?

Typical disk dir has 3 files. I'm thinking since only the large image will
shard, that is the only one I would need to move, as the others wouldn't
shard?

-rw-rw----. 1 vdsm kvm  25G Apr 15 14:21 e7818fd2-7e2e-46e8-92b5-bc036850d88b
-rw-rw----. 1 vdsm kvm 1.0M Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.lease
-rw-r--r--. 1 vdsm kvm  320 Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.meta

3) Since I don't plan to re-enable the server with the NIC issues, can I
just rsync /var/lib/glusterd and then give the new server the same IP the
other one used to peer with? Do I need to change the UUID of the new
server? Can I manually update the info in /var/lib/glusterd/vols/*/info so
options match after enabling shards?

Any glaring gotchas I am overlooking?

*David Gossage*
*Carousel Checks Inc. | System Administrator*
*Office* 708.613.2284
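For question 3, the usual approach is to reuse the dead peer's identity on
the replacement rather than probing it in as a fresh node. A minimal
sketch, assuming you have a copy of the old /var/lib/glusterd; the UUID,
peer hostname, and volume name below are placeholders, not from this
thread:

    # On a surviving node, note the dead peer's UUID:
    gluster peer status

    # On the replacement box (same hostname/IP), stop glusterd, restore
    # the old state dir, and make sure glusterd.info carries the old UUID:
    systemctl stop glusterd
    rsync -a /backup/var-lib-glusterd/ /var/lib/glusterd/
    sed -i 's/^UUID=.*/UUID=<dead-node-uuid>/' /var/lib/glusterd/glusterd.info
    systemctl start glusterd

    # Pull current volume definitions (including any newly set shard
    # options) from a good peer instead of hand-editing vols/*/info:
    gluster volume sync <good-peer-hostname> all
    gluster volume heal <VOLNAME> full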
Lindsay Mathieson
2016-Jul-08 03:08 UTC
[Gluster-users] NIC died migration timetable moved up
On 8 July 2016 at 12:51, David Gossage <dgossage at carouselchecks.com> wrote:
> 2) enable sharding. Do I need to completely move the VM (powered off)
> image off the mount then back on for it to shard or can I rename VM
> images on the fuse mount?

The former. I'd be backing up all VMs and then restoring.

Alternatively you could try Gluster 3.8, which has granular self-healing -
the ability to track and heal individual blocks of a file, rather than
just the whole file. Currently testing it myself.

--
Lindsay
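A minimal sketch of that move-off/move-back approach, assuming sharding is
enabled first and the VM stays powered off throughout; the mount point,
backup path, volume name, and 64MB block size are examples, not taken from
this thread:

    # enable sharding on the volume; this only affects files written
    # afterwards, existing images stay whole until rewritten
    gluster volume set GLUSTER1 features.shard on
    gluster volume set GLUSTER1 features.shard-block-size 64MB

    # copy the image off the fuse mount, remove it, and copy it back;
    # the fresh copy is written out as shards
    IMG=e7818fd2-7e2e-46e8-92b5-bc036850d88b
    cp /mnt/glustervol/images/$IMG /backup/$IMG
    rm /mnt/glustervol/images/$IMG
    cp /backup/$IMG /mnt/glustervol/images/$IMG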
Lindsay Mathieson
2016-Jul-08 05:41 UTC
[Gluster-users] NIC died migration timetable moved up
On 8 July 2016 at 12:51, David Gossage <dgossage at carouselchecks.com> wrote:
> 2) enable sharding. Do I need to completely move the VM (powered off)
> image off the mount then back on for it to shard or can I rename VM
> images on the fuse mount?
>
> Typical disk dir has 3 files. I'm thinking since only the large image
> will shard, that is the only one I would need to move, as the others
> wouldn't shard?
>
> -rw-rw----. 1 vdsm kvm  25G Apr 15 14:21 e7818fd2-7e2e-46e8-92b5-bc036850d88b
> -rw-rw----. 1 vdsm kvm 1.0M Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.lease
> -rw-r--r--. 1 vdsm kvm  320 Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.meta

Any file smaller than your shard size will stay the same (not sharded).

--
Lindsay
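One way to confirm what actually sharded after the re-copy; the brick path
here is an example:

    # the configured block size; files at or under this size stay whole
    gluster volume get GLUSTER1 features.shard-block-size

    # on any brick, shards beyond the first block live under .shard,
    # named <gfid>.1, <gfid>.2, ...; the first block keeps the file's
    # original name in its normal directory
    ls -lh /bricks/brick1/.shard | head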
On Thu, Jul 7, 2016 at 9:51 PM, David Gossage <dgossage at carouselchecks.com> wrote:
> So a NIC on one of my nodes died today. Chances are good that a reboot
> would bring it back to life, but then I'd be down while 2-3TB of
> unsharded VMs healed, which wouldn't be fun. I figured I'll run with 2
> nodes while I get sharding enabled and then bring in the 3rd new node I
> was going to replace that one with anyway.
>
> 1) I am using oVirt 3.6 with CentOS 7 and, while I am going to confirm,
> I believe it still does its communications over the fuse mount. So I am
> thinking it would help for me to move from 3.7.11 to 3.7.12, as the
> libgfapi issues shouldn't hit me.

Came in this morning to update to 3.7.12 and noticed that 3.7.13 had been
released, so I shut down the VMs and gluster volumes and updated. The
update process itself went smoothly, but on starting up the oVirt engine
the main gluster storage volume didn't activate. I manually activated it
and it came up, but oVirt wouldn't report how much space was used. The
oVirt nodes did mount it and allowed me to start VMs; however, after a few
minutes it would claim to be inactive again, even though the nodes
themselves still had the volume mounted and the VMs were still running.

Found these errors flooding the gluster logs on the nodes:

[2016-07-09 15:27:46.935694] I [fuse-bridge.c:4083:fuse_init]
0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22
kernel 7.22
[2016-07-09 15:27:49.555466] W [MSGID: 114031]
[client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
operation failed [Operation not permitted]
[2016-07-09 15:27:49.556574] W [MSGID: 114031]
[client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
operation failed [Operation not permitted]
[2016-07-09 15:27:49.556659] W [fuse-bridge.c:2227:fuse_readv_cbk]
0-glusterfs-fuse: 80: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d
fd=0x7f5224002f68 (Operation not permitted)
[2016-07-09 15:27:59.612477] W [MSGID: 114031]
[client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote
operation failed [Operation not permitted]
[2016-07-09 15:27:59.613700] W [MSGID: 114031]
[client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote
operation failed [Operation not permitted]
[2016-07-09 15:27:59.613781] W [fuse-bridge.c:2227:fuse_readv_cbk]
0-glusterfs-fuse: 168: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d
fd=0x7f5224002f68 (Operation not permitted)

Downgrading everything to 3.7.12 still had the same issues. The issues
finally went away once back on 3.7.11, after manually changing the
op-version back to 30710 (sketched at the end of this message).

> 2) enable sharding. Do I need to completely move the VM (powered off)
> image off the mount and then back on for it to shard, or can I rename
> VM images on the fuse mount?
>
> Typical disk dir has 3 files. I'm thinking since only the large image
> will shard, that is the only one I would need to move, as the others
> wouldn't shard?
>
> -rw-rw----. 1 vdsm kvm  25G Apr 15 14:21 e7818fd2-7e2e-46e8-92b5-bc036850d88b
> -rw-rw----. 1 vdsm kvm 1.0M Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.lease
> -rw-r--r--. 1 vdsm kvm  320 Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.meta
>
> 3) Since I don't plan to re-enable the server with the NIC issues, can
> I just rsync /var/lib/glusterd and then give the new server the same IP
> the other one used to peer with? Do I need to change the UUID of the
> new server? Can I manually update the info in
> /var/lib/glusterd/vols/*/info so options match after enabling shards?
>
> Any glaring gotchas I am overlooking?
>
> *David Gossage*
> *Carousel Checks Inc. | System Administrator*
> *Office* 708.613.2284
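For reference, the manual op-version change described above is normally
done by editing glusterd's state file on every node while glusterd is
stopped, since the CLI only allows raising cluster.op-version, not
lowering it. A rough sketch (the target version depends on your cluster):

    # on every node, with the volumes stopped and glusterd down
    systemctl stop glusterd
    sed -i 's/^operating-version=.*/operating-version=30710/' \
        /var/lib/glusterd/glusterd.info
    systemctl start glusterd

    # raising it again later goes through the CLI:
    gluster volume set all cluster.op-version <desired-op-version>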