On Thu, Jul 7, 2016 at 9:51 PM, David Gossage <dgossage at carouselchecks.com> wrote:

> So a NIC on one of my nodes died today. Chances are good that a reboot
> would bring it back to life, but then I'd be down while 2-3TB of unsharded
> VM's healed, which wouldn't be fun. I figured I'd run with 2 nodes while I
> get sharding enabled and then bring in the 3rd new node I was going to
> replace that one with anyway.
>
> 1) I am using oVirt 3.6 with CentOS 7, and while I am going to confirm, I
> believe it does its communications over the fuse mount still. So I am
> thinking it would help for me to move from 3.7.11 to 3.7.12, as the
> libgfapi issues shouldn't hit me.
>

Came in this morning to update to 3.7.12 and noticed that 3.7.13 had been
released, so I shut down the VM's and gluster volumes and updated. The update
process itself went smoothly, but on starting up the oVirt engine the main
gluster storage volume didn't activate. I manually activated it and it came
up, but oVirt wouldn't report how much space was used. The oVirt nodes did
mount it and allowed me to start VM's, but after a few minutes it would claim
to be inactive again, even though the nodes themselves still had the volumes
mounted and the VM's were still running. Found these errors flooding the
gluster logs on the nodes:

[2016-07-09 15:27:46.935694] I [fuse-bridge.c:4083:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.22 kernel 7.22
[2016-07-09 15:27:49.555466] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote operation failed [Operation not permitted]
[2016-07-09 15:27:49.556574] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote operation failed [Operation not permitted]
[2016-07-09 15:27:49.556659] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 80: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not permitted)
[2016-07-09 15:27:59.612477] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-1: remote operation failed [Operation not permitted]
[2016-07-09 15:27:59.613700] W [MSGID: 114031] [client-rpc-fops.c:3050:client3_3_readv_cbk] 0-GLUSTER1-client-0: remote operation failed [Operation not permitted]
[2016-07-09 15:27:59.613781] W [fuse-bridge.c:2227:fuse_readv_cbk] 0-glusterfs-fuse: 168: READ => -1 gfid=deb61291-5176-4b81-8315-3f1cf8e3534d fd=0x7f5224002f68 (Operation not permitted)

Downgrading everything to 3.7.12 still gave the same issues. The issues
finally went away once I was back on 3.7.11 and had manually changed the
op-version back to 30710.

> 2) Enable sharding. Do I need to completely move the VM (powered off)
> image off the mount and then back on for it to shard, or can I rename the
> VM images on the fuse mount?
>
> A typical disk dir has 3 files. I'm thinking that since only the large
> image will shard, that is the only one I would need to move, as the others
> wouldn't shard?
> -rw-rw----. 1 vdsm kvm  25G Apr 15 14:21 e7818fd2-7e2e-46e8-92b5-bc036850d88b
> -rw-rw----. 1 vdsm kvm 1.0M Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.lease
> -rw-r--r--. 1 vdsm kvm  320 Dec  2  2015 e7818fd2-7e2e-46e8-92b5-bc036850d88b.meta
>
> 3) Since I don't plan to re-enable the server with the NIC issues, can I
> just rsync /var/lib/glusterd and then give the new server the same IP the
> other one used to peer with? Do I need to change the UUID of the new
> server? Can I manually update the info in /var/lib/glusterd/vols/*/info so
> options match after enabling shards?
>
> Any glaring gotchas I am overlooking?
>
> *David Gossage*
> *Carousel Checks Inc. | System Administrator*
> *Office* 708.613.2284
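On question 2 above, a minimal sketch of the volume options involved in
enabling sharding. The volume name GLUSTER1 is only taken from the log lines
earlier in the message and may not match the real volume, and the block size
shown is simply the feature's default, not a recommendation. Note that the
option only affects files created after it is set, which is why existing
images are normally copied off the mount and back in rather than renamed in
place.

# Sketch only -- volume name and block size are illustrative.
gluster volume set GLUSTER1 features.shard on

# Optional: shard block size (64MB is the default). Only files written
# after the change are split into shards; existing files stay whole.
gluster volume set GLUSTER1 features.shard-block-size 64MB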
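On the op-version rollback and the rebuild in question 3, the state involved
lives in /var/lib/glusterd/glusterd.info on each node. A rough sketch of that
file follows; the UUID is a placeholder, not a real value. The CLI can only
raise the cluster op-version, which is presumably why the rollback to 30710
described above had to be done by hand, and when a failed peer is rebuilt
under the same IP the usual approach is to reuse the old node's UUID in this
file rather than keeping a freshly generated one.

# Sketch only -- values are illustrative; edit with glusterd stopped.
cat /var/lib/glusterd/glusterd.info
UUID=aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee
operating-version=30710

# Raising the op-version again later goes through the CLI (which refuses to
# lower it); the value assumes 3.7.12 follows the usual numbering scheme.
gluster volume set all cluster.op-version 30712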
Lindsay Mathieson
2016-Jul-10 00:45 UTC
[Gluster-users] NIC died migration timetable moved up
On 10/07/2016 5:17 AM, David Gossage wrote:
> Came in this morning to update to 3.7.12 and noticed that 3.7.13 had
> been released, so I shut down the VM's and gluster volumes and updated.
> The update process itself went smoothly, but on starting up the oVirt
> engine the main gluster storage volume didn't activate. I manually
> activated it and it came up, but oVirt wouldn't report how much space
> was used. The oVirt nodes did mount it and allowed me to start VM's, but
> after a few minutes it would claim to be inactive again, even though the
> nodes themselves still had the volumes mounted and the VM's were still
> running. Found these errors flooding the gluster logs on the nodes.

Hi David, I did a quick test this morning with Proxmox and 3.7.13 and was
able to get it working with the fuse mount *and* libgfapi.

One caveat - you *have* to enable qemu caching, either write-back or
write-through. 3.7.12 and 3.7.13 seem to now disable aio support, and qemu
requires aio when caching is turned off. There are settings for aio in
gluster that I haven't played with yet.

--
Lindsay Mathieson
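To make the caveat concrete, a hedged example of what the cache setting looks
like on a plain qemu command line; the memory size, image path, host and
volume names are placeholders, and in Proxmox the same thing is just the
disk's cache option in the VM configuration.

# Illustrative invocation only -- cache=writeback is the relevant part.
qemu-system-x86_64 -enable-kvm -m 2048 \
    -drive file=/mnt/gluster/vm1.qcow2,format=qcow2,if=virtio,cache=writeback

# The same disk over libgfapi (qemu's native gluster driver) would be, e.g.:
#   -drive file=gluster://ghost1/GLUSTER1/vm1.qcow2,format=qcow2,if=virtio,cache=writeback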
Gandalf Corvotempesta
2016-Jul-10 09:09 UTC
[Gluster-users] NIC died migration timetable moved up
On 09 Jul 2016 21:18, "David Gossage" <dgossage at carouselchecks.com> wrote:

> Came in this morning to update to 3.7.12 and noticed that 3.7.13 had been
> released, so I shut down the VM's and gluster volumes and updated.

Why did you have to shut down the VMs and the gluster volumes? Isn't an
online rolling upgrade available for patch releases as well? Is tearing down
the whole cluster always needed with gluster?
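For comparison, the generic per-node pattern for a rolling update of a
replica volume is sketched below; the node and volume names and the package
glob are placeholders, and whether any particular pair of minor releases can
safely be mixed is a separate question. The idea is simply that one node is
updated at a time and self-heal is allowed to finish before the next.

# Run on one node at a time; GLUSTER1 and the package glob are illustrative.
systemctl stop glusterd
pkill glusterfsd                 # brick processes keep running after glusterd stops
yum update 'glusterfs*'
systemctl start glusterd

# Wait for pending heals to drain before moving to the next node:
gluster volume heal GLUSTER1 info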