I'm curious if anyone else has run into this problem, and if so, what solutions they use to get around it.

We are using VMware ESXi servers with an OpenSolaris NFS backend. This lets us leverage all the awesomeness of ZFS, including snapshots and clones. The best feature is that we can create a VMware guest template (CentOS/Ubuntu/Windows/whatever) and use snapshot/cloning to make an instant copy of that machine, and it takes hardly any additional space (initially, that is). Everything is great.

My issue is that VMware ESX only allows 32 NFS mounts, and because of this we can't seem to get more than 32 servers (from NFS). Every single VMware machine is its own zvol. I tried making my NFS mount at a higher zvol level, but I cannot traverse to the sub-zvols from that mount. Another thing I tried was adding another NIC to the VMware server, but you cannot have more than one vmkernel on the same subnet.

Does anyone have any experience with overcoming these limitations?

-- 
HUGE

David Stahl
Systems Administrator
718 233 9164 / F 718 625 5157

www.hugeinc.com <http://www.hugeinc.com>
HUGE | David Stahl wrote:
> My issue is that VMware ESX only allows 32 NFS mounts. And because of
> this we can't seem to get any more than 32 servers (from NFS). [...]
> Does anyone have any experience with overcoming these limitations?

I understand what you are trying to do, but yes, VMware has that 32 mount point limit (64 in vSphere 4.0, I believe). My only suggestion is to put more VMs in a single mountpoint. And VMware does not support NFSv4 mirror mounts, so you can't try mounting them under subdirs.

So in your case of using this for quick clone deployment, create your golden image and then use VMware to clone it another 5-10 times (or whatever; an NFS share can handle a larger number of VMs than an FC/iSCSI VMFS3 datastore) in that same NFS mountpoint. Then, when you snap/clone on the array side, your granularity will be in groups of 5-10 (or more). For example, if you put 10 images in a single "golden" NFS share and clone that 32 times, you get 320 VMs; 20 per share would give you 640.

Not ideal, since you'll have to worry about updating 5-10 images instead of one when you refresh patches/apps, but that's the VMware limitation we are dealing with.

We'd really like to see VMware support NFSv4 mirror-mounts in the future, but they have not commented on their plans there.

-ryan

-- 
Ryan Arneson
Sun Microsystems, Inc.
303-223-6264
ryan.arneson at sun.com
http://blogs.sun.com/rarneson
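To make the array-side snap/clone step concrete, a minimal sketch might look like the following. The pool and dataset names (tank/vmware/golden, tank/vmware/batch01) are placeholders for illustration, not anything from the original setup:

  # golden share already holds 10-20 registered VMs and is exported over NFS
  zfs snapshot tank/vmware/golden@deploy-2009-06-16
  zfs clone tank/vmware/golden@deploy-2009-06-16 tank/vmware/batch01
  zfs set sharenfs=on tank/vmware/batch01
  # add tank/vmware/batch01 as a new NFS datastore in ESX and register
  # the cloned VMs it contains; each clone costs one of the 32 mount slots

Each clone therefore burns one datastore slot but delivers a whole batch of VMs, which is the granularity trade-off Ryan describes.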
Try iSCSI?
That is a very interesting idea, Ryan. Not as ideal as I had hoped, but it does open up a way of maximizing my number of VM guests. Thanks for that suggestion.

Also, if I added another subnet and another vmkernel, would I be allowed another 32 NFS mounts? That is, is it 32 NFS mounts per vmkernel, or 32 NFS mounts period?

-- 
HUGE

David Stahl
Systems Administrator
718 233 9164 / F 718 625 5157

www.hugeinc.com <http://www.hugeinc.com>

> From: Ryan Arneson <Ryan.Arneson at Sun.COM>
> Date: Tue, 16 Jun 2009 15:14:31 -0600
> Subject: Re: [zfs-discuss] ZFS, ESX, and NFS. oh my!
>
> I understand what you are trying to do, but yes, VMware has that 32
> mount point limit (64 in vSphere 4.0, I believe). My only suggestion is
> to put more VMs in a single mountpoint. [...]
HUGE | David Stahl wrote:
> Also, if I added another subnet and another vmkernel, would I be allowed
> another 32 NFS mounts? That is, is it 32 NFS mounts per vmkernel, or 32
> NFS mounts period?

32 mounts period. Of course, you could have a second ESXi host with its own 32 mounts as well.

-ryan

-- 
Ryan Arneson
Sun Microsystems, Inc.
303-223-6264
ryan.arneson at sun.com
http://blogs.sun.com/rarneson
My testing with 2008.11 iSCSI vs NFS was that iSCSI was about 2x faster. I used a pool of three 5-disk raidz stripes (15 1.5 TB SATA disks). I just used the default ZIL, no SSD or similar to make NFS faster.

I think (don't quote me) that ESX can only mount 64 iSCSI targets, so you aren't much better off. But COMSTAR (2009.06) can export a single iSCSI target with multiple LUNs, so that gets around the limitation. I could be all wet on this one, however, so look into it before taking my word.

Obviously iSCSI and NFS are quite different at the storage level, and I actually like NFS for the flexibility over iSCSI (quotas, reservations, etc.)

-Scott
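For what it's worth, a rough COMSTAR sketch along those lines (one target, several zvol-backed LUNs) could look like this. The pool and volume names are made up, and the commands should be checked against the 2009.06 documentation before relying on them:

  # enable the COMSTAR framework and the iSCSI target service
  svcadm enable stmf
  svcadm enable -r svc:/network/iscsi/target:default
  itadm create-target
  # several zvol-backed LUNs behind that single target
  zfs create -V 200G tank/esx/lun0
  zfs create -V 200G tank/esx/lun1
  sbdadm create-lu /dev/zvol/rdsk/tank/esx/lun0
  sbdadm create-lu /dev/zvol/rdsk/tank/esx/lun1
  # expose the LUs (in practice, restrict them with host/target groups)
  stmfadm add-view <GUID-of-lun0>
  stmfadm add-view <GUID-of-lun1>

ESX then sees one target with multiple LUNs, rather than one target per LUN.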
I actually prefer NFS for right now. We had an issue with iSCSI where we lost some data and were unable to recover it, because Solaris cannot read the proprietary VMFS format.

-- 
HUGE

David Stahl
Systems Administrator
718 233 9164 / F 718 625 5157

www.hugeinc.com <http://www.hugeinc.com>

> From: Scott Meilicke <no-reply at opensolaris.org>
> Date: Tue, 16 Jun 2009 14:47:26 PDT
> Subject: Re: [zfs-discuss] ZFS, ESX, and NFS. oh my!
>
> My testing with 2008.11 iSCSI vs NFS was that iSCSI was about 2x faster. [...]
On Jun 16, 2009, at 17:47, Scott Meilicke wrote:
> I think (don't quote me) that ESX can only mount 64 iSCSI targets,
> so you aren't much better off. But, COMSTAR (2009.06) exports a
> single iSCSI target with multiple LUNs, so that gets around the
> limitation. I could be all wet on this one, however, so look into it
> before taking my word.

Version 3.5 of the various VMware software has a one-connection-per-target limit (page 18):

> ESX Server-based iSCSI initiators establish only one connection to
> each target. This means storage systems with a single target
> containing multiple LUNs have all LUN traffic on that one
> connection. With a system that has [say] three targets with one LUN
> each, three connections exist between an ESX Server and the three
> volumes available.

http://www.vmware.com/pdf/vi3_35/esx_3/r35/vi3_35_25_iscsi_san_cfg.pdf

> The iSCSI driver used by ESX Server does not currently use multiple
> connections per session, so there are no such settings that you can
> tune.

http://www.vmware.com/files/pdf/iSCSI_design_deploy.pdf

This may impact how much throughput you can get, as well as any MPIO scenarios you want to use. There's also a maximum of 8 (?) targets that can be connected to:

http://www.vmware.com/pdf/vi3_iscsi_cfg.pdf

There's a good explanation of how VMware's iSCSI system works (at least for 3.x):

http://tinyurl.com/dd4com
http://virtualgeek.typepad.com/virtual_geek/2009/01/a-multivendor-post-to-help-our-mutual-iscsi-customers-using-vmware.html
Scott Meilicke wrote:
> Obviously iSCSI and NFS are quite different at the storage level, and I
> actually like NFS for the flexibility over iSCSI (quotas, reservations,
> etc.)

Another key difference between them is that with iSCSI, the VMFS filesystem (built on the zvol presented as a block device) never frees up unused disk space. Once ESX has written to a block on that zvol, it will always take up space in your zpool, even if you delete the .vmdk file that contains it. The zvol has no idea that the block is not used any more. With NFS, ZFS is aware that the file is deleted and can deallocate those blocks.

This would be less of an issue if we had deduplication on the zpool (have ESX write blocks of all-0 and those would be deduped down to a single block), or if there were some way (like the SSD TRIM command) for the VMFS filesystem to tell the block device that a block is no longer used.

--Joe
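A partial workaround today, sketched below purely as an illustration, relies on ZFS storing all-zero blocks as holes when compression is enabled on the backing dataset; the dataset name and the zero-fill step are assumptions, not a tested procedure:

  # on the OpenSolaris side: enable compression on the zvol backing the VMFS datastore
  zfs set compression=on tank/esx/vmfs0
  # inside a guest: overwrite the guest's free space with zeros, then delete the file;
  # the rewritten zero blocks compress away on the ZFS side and space returns to the pool
  dd if=/dev/zero of=/zerofile bs=1M
  rm /zerofile

This only reclaims blocks that actually get rewritten with zeros, so it is closer to a periodic maintenance step than to real TRIM support.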
So how are folks getting around the NFS speed hit? Using SSD or battery-backed RAM ZILs?

Regarding the limited NFS mounts, underneath a single NFS mount, would it work to (rough sketch after the list):

* Create a new VM
* Remove the VM from inventory
* Create a new ZFS file system underneath the original
* Copy the VM to that file system
* Add it back to inventory

At this point the VM is running underneath its own file system. I don't know if ESX would see this?

To create another VM:

* Snap the original VM
* Create a clone underneath the original NFS FS, alongside the original VM's ZFS file system

Laborious, to be sure.
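On the ZFS side, those steps might look roughly like the sketch below. The dataset and directory names are placeholders, and it carries the same open question Scott raises, since later replies in this thread report that ESX does not traverse into child filesystems under an NFS mount:

  # the datastore ESX has mounted is tank/esx (one NFS mount)
  zfs create tank/esx/vm01                         # child filesystem for this VM
  cp -rp /tank/esx/vm01-staging/. /tank/esx/vm01/  # move the VM files in, re-register in ESX
  # cloning that VM later is then a per-VM snapshot/clone
  zfs snapshot tank/esx/vm01@gold
  zfs clone tank/esx/vm01@gold tank/esx/vm02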
I would think you would run into the same problem I have, where you can't view child zvols from a parent zvol's NFS share.

> From: Scott Meilicke <no-reply at opensolaris.org>
> Date: Fri, 19 Jun 2009 08:29:29 PDT
> Subject: Re: [zfs-discuss] ZFS, ESX, and NFS. oh my!
>
> Regarding the limited NFS mounts, underneath a single NFS mount, would it
> work to create a new ZFS file system underneath the original and copy the
> VM to that file system? [...]
Why the use of zvols? Why not just:

zfs create my_pool/group1
zfs create my_pool/group1/vm1
zfs create my_pool/group1/vm2

and export my_pool/group1.

If you don't want the people in group1 to see vm2 anymore, just zfs rename it to a different group.

I'll admit I am coming into this green, but if you're not doing iscsi, why zvols?

SM.
The real benefit of using a separate zvol for each VM is the instantaneous cloning of a machine; the clone takes almost no additional space initially. In our case we build a template VM and then provision our development machines from it. However, the limit of 32 NFS mounts per ESX machine is kind of a bummer.

-----Original Message-----
From: zfs-discuss-bounces at opensolaris.org on behalf of Steve Madden
Sent: Wed 7/1/2009 8:46 PM
Subject: Re: [zfs-discuss] ZFS, ESX, and NFS. oh my!

> Why the use of zvols? Why not just zfs create the per-VM filesystems and
> export my_pool/group1? If you're not doing iscsi, why zvols? [...]
On Wed, Jul 1, 2009 at 7:29 PM, HUGE | David Stahl <dstahl at hugeinc.com> wrote:
> The real benefit of using a separate zvol for each VM is the
> instantaneous cloning of a machine, and the clone will take almost no
> additional space initially. In our case we build a template VM and then
> provision our development machines from this.
> However the limit of 32 NFS mounts per ESX machine is kind of a bummer. [...]

Is there a supported way to multipath NFS? That's one benefit of iSCSI: your VMware can multipath to a target to get more speed/HA...

-- 
Brent Jones
brent at servuhome.net
On Jul 1, 2009, at 10:58 PM, Brent Jones <brent at servuhome.net> wrote:
> Is there a supported way to multipath NFS? That's one benefit of iSCSI:
> your VMware can multipath to a target to get more speed/HA...

Yes, it's called IPMP on Solaris. Define two interfaces in a common group with non-failover test addresses (used to probe for network failures), then define any number of virtual interfaces on each; if one interface goes down, the virtual interfaces fail over to the other physical interface. It will also do load balancing between them. Alternatively, you can create a LAG, which does redundancy and load balancing.

-Ross
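A rough sketch of the classic probe-based IPMP setup Ross describes is below. The interface names and addresses are placeholders, and the exact ifconfig flags vary slightly between Solaris releases, so treat it as an outline rather than a recipe:

  # test addresses on each NIC, marked deprecated/-failover, in a common IPMP group
  ifconfig e1000g0 plumb 192.168.10.11 netmask 255.255.255.0 group ipmp0 deprecated -failover up
  ifconfig e1000g1 plumb 192.168.10.12 netmask 255.255.255.0 group ipmp0 deprecated -failover up
  # data addresses as virtual interfaces; these float to the surviving NIC on failure
  ifconfig e1000g0 addif 192.168.10.21 netmask 255.255.255.0 up
  ifconfig e1000g1 addif 192.168.10.22 netmask 255.255.255.0 up

Mounting different datastores against the different data addresses is then what spreads NFS traffic across the physical links.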
>>>>> "rw" == Ross Walker <rswwalker at gmail.com> writes:

    rw> you can create a LAG which does redundancy and load balancing.

Be careful: these aggregators are all hash-based, so the question is, of what is the hash taken? The widest scale on which the hash can be taken is L4 (TCP source/dest port numbers), because this type of aggregation only preserves packet order within a single link, and reordering packets is "bad" (not sure why exactly, but I presume it hurts TCP performance), so the way around that problem is to keep each TCP flow nailed to a particular physical link.

It looks like there's a 'dladm -P L4' option, so I imagine L4 hashing is supported on the transmit side *iff* you explicitly ask for it. Though sometimes things like that might be less or more performant depending on the NIC you buy, I can't imagine a convincing story why it would be in this case.

So that handles the TRANSMIT direction. The RECEIVE direction is another story. Application-layer multipath uses a different source IP address for the two sessions, so both sent and received traffic will be automatically spread over the two NICs. With LACP-style aggregation it's entirely the discretion of each end of the link how they'd like to divide up transmitted traffic. Typically switches hash on the L2 MAC only, which is obviously useless; that's meant for switch trunks with many end systems on either side. host->switch is covered by dladm above, but if you want L4 hashing for packets in the switch->host direction you must buy an L3 switch and configure it "appropriately", which seems to be described here for the Cisco 6500:

http://www.cisco.com/en/US/docs/switches/lan/catalyst6500/ios/12.1E/native/configuration/guide/channel.html#wp1020804

I believe it's a layer-violating feature, so it works fine on a port channel in an L2 VLAN. You don't have to configure a /30 router-style non-VLAN two-host-subnet interface on the 6500 to use L4 hashing, I think. However, the Cisco command applies to all port channels on the entire switch(!), including trunks to other switches, so the network team is likely to give lots of push-back when you ask them to turn this knob. IMHO it's not harmful, and they should do it for you, but maybe they will complain about SYN-flood vulnerability and TCAM wastage and "wait, how does it interact with dCEF" and FUDFUDFUD and all the things they usually say whenever you want to actually use any feature of the 6500 instead of just bragging about its theoretical availability.

Finally, *ALL THIS IS COMPLETELY USELESS FOR NFS*, because L4 hashing can only split up separate TCP flows. I checked with a Linux client and a Solaris host, and it puts all the NFSv3 mounts onto a single TCP flow, not one mount per flow. iSCSI seems to do one flow per session, while I bet multiple LUNs (COMSTAR-style) would share the same TCP flow for several LUNs.

So... as elegant as network-layer multipath is, I think you'll need SCSI-layer multipath to squeeze more performance from an aggregated link between two physical hosts. And if you are using network-layer multipath (such as a port-aggregated trunk) carrying iSCSI, it might work better to (a) make sure the equal-cost-multipath hash you're using is L4, not L3 or L2, and (b) use a single LUN per session (multiple flows per target).
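To illustrate the two knobs Miles mentions, a sketch of what the configuration might look like on each side follows; the interface names and aggregation key are placeholders, and dladm syntax differs between older and newer releases:

  # Solaris transmit side: build the aggregation with an L4 (TCP/UDP port) hash policy
  dladm create-aggr -P L4 -d e1000g0 -d e1000g1 1
  ifconfig aggr1 plumb 192.168.10.10 netmask 255.255.255.0 up

  # Catalyst 6500 receive side (global command, affects every port-channel on the switch)
  port-channel load-balance src-dst-port

As the rest of the message explains, this still cannot split a single NFSv3 mount, because all of its traffic rides one TCP flow.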
This might also be better on very recent versions of Solaris (something later than snv_105) which also have 10Gbit network cards, even without any network ECMP, because the TCP stack can supposedly divide TCP flows among the CPUs:

http://www.opensolaris.org/os/project/crossbow/topics/nic/

I'm not sure, though. The data path is getting really advanced, and there are so many optimisations conflicting with each other at this point. Maybe it's better to worry about this optimisation for http clients, forget about it entirely for iSCSI and the like, and instead try to scheme for a NIC that can do SRP or iSER/iWARP.

There's a downside to it, too. Multiple TCP flows will use more switch buffers than a single flow when going from a faster link into a slower or shared link, so if you have a 3560 or some other switch with small output queues, reading a wide RAID stripe could in theory overwhelm the switch when all the targets answer at once. If this happens, you should be able to see dropped-packet counters incrementing in the switch. FC and IB are both lossless and do not have this problem. If you're not using any port-aggregated trunks and don't have 10Gbit/s, the TCP flow control might work better at avoiding this "microbursting" if you multiplex all the LUNs onto a single TCP flow per initiator/target pair, COMSTAR-style (or, well, NFS-style).

(All pretty speculative, though. YMMV.)
According to the link below, VMware will only use a single TCP session for NFS data, which means you're unlikely to get it to travel down more than one interface on the VMware side, even if you can find a way to do it on the Solaris side.

http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.html

T

-----Original Message-----
From: Brent Jones
Sent: Thursday, 2 July 2009 12:58 PM
Subject: Re: [zfs-discuss] ZFS, ESX, and NFS. oh my!

> Is there a supported way to multipath NFS? That's one benefit of iSCSI:
> your VMware can multipath to a target to get more speed/HA... [...]
"Tristan Ball" <Tristan.Ball at leica-microsystems.com> writes:> According to the link bellow, VMWare will only use a single TCP session > for NFS data, which means you''re unlikely to get it to travel down more > than one interface on the VMware side, even if you can find a way to do > it on the solaris side. > > http://virtualgeek.typepad.com/virtual_geek/2009/06/a-multivendor-post-to-help-our-mutual-nfs-customers-using-vmware.htmlPlease read this site carefully. inside is something about LACP/trunking. if you wish to use more then 1Gbit do it as example this way: - solaris host with 4 1Gbit interfaces - aggregate this interfaces into aggr1 - enable a trunk or LACP on your switch for this ports - assign *4 addresses* (or just more then one) to this aggr1 - setup you vmkernel network with as example 4 NICs - check the setting for network balancing in esx - enable a trunk or LACP on your switch for this ports -- important for esxi use a static trunk not LACP because esxi is not capable doing LACP - now mount the datastores this way: - host: ipaddr1 share /share1 - host: ipaddr2 share /share2 - host: ipaddr3 share /share3 - host: ipaddr4 share /share4 voila: - you have 4 TCP-Sessions - you can have 1Gbit per datastore. now before arguing that this ist to slow an you just want to get 4Gbit per /share. The answer is no. this is not possible. Why: read the 802.3ad specifications. LACP uses only one link per MAC. if you need more speed than 1Gbit / host (TCP-Session) you have to switch your network environment to 10Gbit Ethernet. example: 10GBE host <1 10Gbit cable> switchA < 4*1GBit trunk LACP > switchB <110Gbit cable> 10GBE host the maximum possible speed between this 2 host will be 1Gbit.> > T > > -----Original Message----- > From: zfs-discuss-bounces at opensolaris.org > [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Brent Jones > Sent: Thursday, 2 July 2009 12:58 PM > To: HUGE | David Stahl > Cc: Steve Madden; zfs-discuss at opensolaris.org > Subject: Re: [zfs-discuss] ZFS, ESX ,and NFS. oh my! > > On Wed, Jul 1, 2009 at 7:29 PM, HUGE | David Stahl<dstahl at hugeinc.com> > wrote: >> The real benefit of the of using a separate zvol for each vm is the >> instantaneous cloning of a machine, and the clone will take almost no >> additional space initially. In our case we build a template VM and > then >> provision our development machines from this. >> However the limit of 32 nfs mounts per esx machine is kind of a > bummer. >> >> >> -----Original Message----- >> From: zfs-discuss-bounces at opensolaris.org on behalf of Steve Madden >> Sent: Wed 7/1/2009 8:46 PM >> To: zfs-discuss at opensolaris.org >> Subject: Re: [zfs-discuss] ZFS, ESX ,and NFS. oh my! >> >> Why the use of zvols, why not just; >> >> zfs create my_pool/group1 >> zfs create my_pool/group1/vm1 >> zfs create my_pool/group1/vm2 >> >> and export my_pool/group1 >> >> If you don''t want the people in group1 to see vm2 anymore just zfs > rename it >> to a different group. >> >> I''ll admit I am coming into this green - but if you''re not doing > iscsi, why >> zvols? >> >> SM. >> -- >> This message posted from opensolaris.org >> _______________________________________________ >> zfs-discuss mailing list >> zfs-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss> >> >> _______________________________________________ >> zfs-discuss mailing list >> zfs-discuss at opensolaris.org >> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss> >> > > Is there a supported way to multipath NFS? 
-- 
disy Informationssysteme GmbH
Daniel Priem
Netzwerk- und Systemadministrator
Tel: +49 721 1 600 6000, Fax: -605, E-Mail: daniel.priem at disy.net
www.disy.net
Firmensitz: Erbprinzenstr. 4-12, 76133 Karlsruhe
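On the OpenSolaris side, the layout Daniel describes might look roughly like the sketch below; the interface names, addresses and share paths are placeholders:

  # assuming aggr1 already aggregates the four 1Gbit NICs (static trunk on the switch for ESXi),
  # plumb one address per datastore on it:
  ifconfig aggr1 plumb 10.0.0.11 netmask 255.255.255.0 up
  ifconfig aggr1 addif 10.0.0.12 netmask 255.255.255.0 up
  ifconfig aggr1 addif 10.0.0.13 netmask 255.255.255.0 up
  ifconfig aggr1 addif 10.0.0.14 netmask 255.255.255.0 up
  # in ESX, mount each share against a different address so each gets its own TCP session:
  #   10.0.0.11:/tank/esx/share1, 10.0.0.12:/tank/esx/share2,
  #   10.0.0.13:/tank/esx/share3, 10.0.0.14:/tank/esx/share4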
Nils Goroll - 2009-Jul-08 10:51 UTC - [zfs-discuss] NFS load balancing / was: ZFS, ESX, and NFS. oh my!
Hi Miles and all,

This is off-topic, but since the discussion started here:

> Finally, *ALL THIS IS COMPLETELY USELESS FOR NFS* because L4 hashing
> can only split up separate TCP flows.

The reason I have spent some time on http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6817942 is to make NFS load balancing over more than one TCP stream work again. When rpcmod:clnt_max_conns is set to a value > 1, the NFS client will use multiple TCP connections.

The next question is which IP addresses and TCP ports are chosen for these connections; they are not guaranteed to be consecutive, which is what you would want for optimal load distribution with the hashes I've seen in the field. That's a topic I'll probably revisit.

Nils
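For anyone who wants to experiment with that tunable, it is set in /etc/system on the NFS client and takes effect after a reboot; the value of 8 below is only an example:

  * /etc/system on the NFS client
  set rpcmod:clnt_max_conns = 8

Combined with an L4 transmit hash, the extra connections in principle give the aggregation hash something to spread across.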
roland - 2009-Aug-11 22:00 UTC - [zfs-discuss] NFS load balancing / was: ZFS, ESX, and NFS. oh my!
> I tried making my nfs mount to higher zvol level. But I cannot traverse to the
> sub-zvols from this mount.

I really wonder when someone will come up with a little patch that implements a crossmnt option for the Solaris nfsd (like the one that exists for the Linux nfsd). OK, even if it's a hack: if it works, it just works, and using it with ESX/NFS would be a killer feature.
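For comparison, on Linux the option roland refers to is just an export flag; a minimal /etc/exports line (the path and client spec are placeholders) looks like:

  # filesystems mounted below /export become reachable through the parent export
  /export  192.168.10.0/24(rw,crossmnt,no_root_squash)

There is no equivalent switch in the OpenSolaris NFS server today, which is why the thread keeps running into the one-mount-per-dataset behaviour.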
Scott Meilicke - 2009-Aug-12 15:42 UTC - [zfs-discuss] NFS load balancing / was: ZFS, ESX, and NFS. oh my!
Yes! That would be icing on the cake.
> The real benefit of using a separate zvol for each vm is the instantaneous
> cloning of a machine, and the clone will take almost no additional space
> initially. In our case we build a template VM...

You don't have to use ZVOL devices to do that. As mentioned by others:

zfs create my_pool/group1
zfs create my_pool/group1/vm1
zfs create my_pool/group1/vm2

In this case, 'vm1' and 'vm2' are on separate filesystems that will show up in 'zfs list', since 'zfs create' was used to make them. But they are still both under the common mount point '/my_pool/group1'.

Now, you could:

zfs snapshot my_pool/group1/vm2@snap-1-2009-06-12
zfs clone my_pool/group1/vm2@snap-1-2009-06-12 my_pool/group1/vm3
zfs promote my_pool/group1/vm3

And you would then have your clone, also under the common mount point.
In my testing, VMware doesn't see the vm1 and vm2 filesystems. VMware doesn't have an automounter, and doesn't traverse NFSv4 sub-mounts (whatever the formal name for them is). Actually, it doesn't support NFSv4 at all!

Regards,
Tristan.

-----Original Message-----
From: James Hess
Sent: Thursday, 13 August 2009 3:38 PM
Subject: Re: [zfs-discuss] ZFS, ESX, and NFS. oh my!

> You don't have to use ZVOL devices to do that. In this case, 'vm1' and 'vm2'
> are on separate filesystems... but they are still both under the common mount
> point '/my_pool/group1'. [...]
On Aug 13, 2009, at 1:37 AM, James Hess <no-reply at opensolaris.org> wrote:
> In this case, 'vm1' and 'vm2' are on separate filesystems, that
> will show up in 'zfs list', since 'zfs create' was used to make
> them. But they are still both under the common mount point
> '/my_pool/group1'
> [...]
> And you would then have your clone, also under the common mount point.

That was the first mistake I made with ZFS over NFS.

What you are actually seeing remotely is the mountpoints, not the sub-datasets. If you copy something into those mountpoints remotely, you are actually putting it in the parent dataset (underneath the child's mountpoint directory), so it won't be included when the child is snapshotted or cloned.

ZFS datasets act like separate file systems, and therefore behave as such over NFS, and only NFSv4 allows hierarchical exports.

-Ross