Carlos Capriotti
2014-Mar-01 16:35 UTC
[Gluster-users] ESXi cannot access striped gluster volume
Hello all.

Unfortunately this is going to be a long post, so I will not spend too many words on compliments; Gluster is a great solution and I should be writing odes about it, so bravo to all of you.

A bit about me: I've been working with FreeBSD and Linux for over a decade now. I use CentOS these days because of some features that are convenient for my applications.

Now a bit more about my problem. After adding my striped Gluster volume to ESXi via NFS, I try to browse it with vSphere's datastore browser, and it...

a) cannot see any of the folders already there. The operation never times out, and I have a line of dots telling me the system (ESXi) is trying to do something. Other systems CAN access and see content on the same volume.

b) Neither Gluster nor ESXi returns/logs ANY error when I (try to) create folders there, but ESXi does not show them either. The folder IS created.

c) When trying to create a virtual machine, it DOES return an error. The folder and the first file related to the VM ARE created, but I get an error about an "Invalid virtual machine configuration". I am under the impression ESXi returns this when it tries to create the file for the virtual disk.

d) When trying to remove the volume, it DOES return an error, stating the resource is busy. I am forced to reboot the ESXi host in order to successfully delete the NFS datastore.

Now, a bit about my environment.

I am one of those cursed with an Isilon, but with NO service contract and NO license. So, basically, I have a big, fast and resilient NAS. Cool stuff, but with great inconveniences as well. It goes without saying that I, as a free-software guy, would love to build something that can retire the Isilon, or at least move it to a secondary role.

Anyway, looking for an alternative to it, I searched for days and decided Gluster was the way to go. And I am not going back. I will make it work.

All of my VM servers (about 15) are spread across 3 metal boxes, and - please, don't blame me, I inherited this situation - there is no backup solution whatsoever. Gluster will, in its final configuration, run on 4 boxes, providing HA and backup.

So, on ESXi, the Isilon's NFS share works like a charm; a plain NFS share on CentOS works like a charm; the Gluster stripe - using two of my four servers - does not like me. The very same Gluster NFS volume is mounted and works happily on a CentOS client. Actually, on more than one.

I have been reading literally dozens of docs, guides and manuals - VMware, Gluster and Red Hat - for more than a week, and along the way I've even created ESXi VIRTUAL servers *INSIDE* my ESXi physical servers, because I can no longer afford to reboot a production server whenever I need to test yet another change on Gluster.

My software versions:

CentOS 6.5
Gluster 3.4.2
ESXi 5.1, all patches applied
ESXi 5.5

My hardware for the nodes: 2 x Dell PE2950 with RAID5; the bricks are a single volume of about 1.5 TB on each node. One stand-alone PE2900 with a single RAID5 volume of about 2.4 TB, which will be added to the stripe eventually. One PE2950, brick on RAID5, 800 GB, which will also be added eventually.

All of them have one NIC for regular networking and a bonded NIC, made out of 2 physical NICs, for Gluster/NFS.

The ESXi hosts are R710s with lots of RAM and at least one NIC dedicated to NFS. I have one test server running with all four NICs on the NFS network.

The NFS network is 9000 MTU, tuned for iSCSI (in the future).
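Just so the network part is concrete: on CentOS 6 the bonded interface for the Gluster/NFS network is set up along these lines. This is only a rough sketch; the interface names, the bonding mode and the slave NIC shown here are illustrative, not copied from my actual configs (the IP is the first brick's address from below):

# /etc/sysconfig/network-scripts/ifcfg-bond0  (illustrative)
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=10.0.1.21
NETMASK=255.255.255.0
MTU=9000
BONDING_OPTS="mode=balance-alb miimon=100"

# /etc/sysconfig/network-scripts/ifcfg-eth1  (one of the two slave NICs)
DEVICE=eth1
BOOTPROTO=none
ONBOOT=yes
MASTER=bond0
SLAVE=yes
MTU=9000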
Now, trying to make it all work, these are the steps I took.

Regarding tweaks for ESXi: I've changed Gluster's ping timeout, lowering it to 20, to stop the volume from being intermittently inaccessible. On ESXi itself I've set the NFS max queue length to 64. I've chmoded Gluster's share to 777, and you can find my Gluster tweaks for the volume below.

Gluster's NFS and regular NFS are both forcing uid and gid to nfsnobody:nfsnobody. iptables has been disabled, along with SELinux. Of course the regular (kernel) NFS server is disabled.

My gluster settings:

Volume Name: glvol0
Type: Stripe
Volume ID: f76af2ac-6a42-42ea-9887-941bf1600ced
Status: Started
Number of Bricks: 1 x 2 = 2
Transport-type: tcp
Bricks:
Brick1: 10.0.1.21:/export/glusterroot
Brick2: 10.0.1.22:/export/glusterroot
Options Reconfigured:
nfs.ports-insecure: on
nfs.addr-namelookup: off
auth.reject: NONE
nfs.volume-access: read-write
nfs.nlm: off
network.ping-timeout: 20
server.root-squash: on
performance.nfs.write-behind: on
performance.nfs.read-ahead: on
performance.nfs.io-cache: on
performance.nfs.quick-read: on
performance.nfs.stat-prefetch: on
performance.nfs.io-threads: on
storage.owner-uid: 65534
storage.owner-gid: 65534
nfs.disable: off

My regular NFS export, which works just the way I need:

/export/share *(rw,all_squash,anonuid=65534,anongid=65534,no_subtree_check)

Once I get this all to work, I intend to create a nice page with instructions/info on Gluster for ESXi. This field needs some more work out there.

Now the question: is there anything I forgot, overlooked or don't know? Could you help me? Except for a comment from someone saying "stripe does not work with ESXi", nothing REALLY rings a bell. I've used all the pertinent info I had and have run out of moves.

My only option now would be testing a volume that is not a stripe. I'll do that for now, but I don't think it will work.

BTW, can I be added to the list?

Cheers,

Carlos.
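P.S. In case someone wants to reproduce the tweaks above, this is roughly how they get applied. It is only a recap sketch: the mount points and datastore name are placeholders, and the advanced option name on the ESXi side (NFS.MaxQueueDepth) is my interpretation of what I called the "NFS max queue length" above.

# On one of the gluster nodes: volume options (one gluster volume set per option)
gluster volume set glvol0 network.ping-timeout 20
gluster volume set glvol0 storage.owner-uid 65534
gluster volume set glvol0 storage.owner-gid 65534
gluster volume set glvol0 nfs.nlm off

# Permissions on the volume root, via a FUSE mount of the volume
mount -t glusterfs 10.0.1.21:/glvol0 /mnt/glvol0
chmod 777 /mnt/glvol0

# On the ESXi host: raise the NFS queue depth, then mount the gluster NFS export
esxcli system settings advanced set -o /NFS/MaxQueueDepth -i 64
esxcli storage nfs add --host 10.0.1.21 --share /glvol0 --volume-name glvol0-ds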
Bryan Whitehead
2014-Mar-03 03:46 UTC
[Gluster-users] ESXi cannot access striped gluster volume
I don't have ESXi experience but the first thing that jumps out at me is you probably need to mount NFS/tcp. NFS/udp doesn't work on glusterfs (unless this has changed and I've not been paying close enough attention lately).
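For what it's worth, you can sanity-check that from any Linux box with something along these lines (just a sketch; the server IP and volume name are taken from your mail, the mount point is made up):

# see which transports the gluster NFS server registered with the portmapper
rpcinfo -p 10.0.1.21 | grep -E 'nfs|mountd'

# force an NFSv3-over-TCP mount of the volume
mount -t nfs -o vers=3,proto=tcp,mountproto=tcp 10.0.1.21:/glvol0 /mnt/glvol0-nfs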