Hello,

I've recently joined this list, primarily because of a thread I found from late April ("Is file cloning anywhere on ZFS roadmap") asking about file-level cloning in ZFS. Based on that thread I understand that it's not currently possible to 'clone' files instead of 'copying' them, but the thread didn't answer the original question of whether this is a reasonable feature, or one being considered for future development.

My limited understanding of how ZFS actually stores data makes me think a 'clone' vs. a 'copy' would be very easy now that we have deduplication: some new file/filesystem metadata entries, some manipulation of the deduplication table, and the clone is complete. Is this a gross oversimplification? Would this be much harder to implement than it seems?

VMware vSphere uses NFSv3 and requires a separate mount for each filesystem. It also imposes a maximum number of NFS mounts per host (64) and a maximum number of iSCSI LUNs per host (256). In a cluster environment this means the existing filesystem/zvol cloning techniques only work well in very specific configurations.

Thanks,

-Will
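For context, a minimal sketch of what ZFS can do today: cloning works at dataset/zvol granularity only, via a snapshot plus a copy-on-write clone (the pool and dataset names here are illustrative, not from the thread):

```shell
# Today cloning is per-dataset, not per-file. Snapshot a filesystem
# full of VM images, then clone it -- near-instant, and the clone
# shares unmodified blocks with the origin via copy-on-write:
zfs snapshot tank/vmstore@template
zfs clone tank/vmstore@template tank/vm-new

# A hypothetical file-level clone would give the same block sharing
# for a single .vmdk, without creating a whole new filesystem
# (and therefore a new NFS mount on each ESX host).
```

The question in the thread is whether the per-file variant could piggyback on the dedup machinery rather than requiring a new dataset per clone.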
>>>>> "sw" == Saxon, Will <Will.Saxon at sage.com> writes:

    sw> 'clone' vs. a 'copy' would be very easy since we have
    sw> deduplication now

Dedup doesn't replace the snapshot/clone feature for the NFS-share-full-of-vmdk use case, because there's no equivalent of 'zfs rollback'.

I'm tempted to say "vmware needs to remove their silly limit," but there are takes-three-hours-to-boot problems with thousands of Solaris NFS exports, so maybe their limit is not so silly after all.

What is the scenario you have? Is it something like 40 hosts with live migration among them, and 40 guests on each host, so you need 1600 filesystems mounted even though only 40 are actually in use?

'zfs set sharenfs=absorb <dataset>' would be my favorite answer, but lots of people have asked for such a feature, and the answer is always "wait for mirror mounts" (which, BTW, actually just work for me on very recent Linux, even with plain 'mount host:/fs /fs' without saying 'mount -t nfs4', in spite of my earlier rant complaining they are not real). Of course NFSv4 features are no help to vmware, but hypothetically I guess mirror mounting would work if vmware supported it, so long as they were careful not to provoke the mounting of guests not in use. The "implicit automounter" on which the mirror-mount feature is based would avoid the boot delay of mounting 1600 filesystems.

And BTW, I've not been able to get the Real Automounter in Linux to do what this implicit one already can with subtrees. Why is it so hard to write a working automounter?

The other thing I've never understood is: if you 'zfs rollback' an NFS-exported filesystem, what happens to all the NFS clients? It seems like this would cause much worse corruption than the worry when people give fire-and-brimstone speeches about never disabling zil-writing while using the NFS server. But it seems to mostly work anyway when I do this, so I'm probably confused about something.
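For readers unfamiliar with the rollback operation discussed above, a short sketch (dataset and snapshot names are illustrative):

```shell
# Roll an NFS-exported dataset back to a snapshot, discarding all
# changes written since the snapshot was taken:
zfs rollback tank/vmstore@golden

# -r also destroys any snapshots newer than the rollback target,
# which is required when such snapshots exist:
zfs rollback -r tank/vmstore@last-week
```

The open question raised above is what this sudden server-side reversion does to NFS clients that still hold file handles and cached state from the pre-rollback filesystem.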
> -----Original Message-----
> From: zfs-discuss-bounces at opensolaris.org
> [mailto:zfs-discuss-bounces at opensolaris.org] On Behalf Of Miles Nordin
> Sent: Thursday, July 22, 2010 2:42 PM
> To: zfs-discuss at opensolaris.org
> Subject: Re: [zfs-discuss] File cloning
>
> >>>>> "sw" == Saxon, Will <Will.Saxon at sage.com> writes:
>
> sw> 'clone' vs. a 'copy' would be very easy since we have
> sw> deduplication now
>
> dedup doesn't replace the snapshot/clone feature for the
> NFS-share-full-of-vmdk use case because there's no equivalent of
> 'zfs rollback'
>
> I'm tempted to say "vmware needs to remove their silly limit" but
> there are takes-three-hours-to-boot problems with thousands of Solaris
> NFS exports so maybe their limit is not so silly after all.
>
> What is the scenario you have? Is it something like 40 hosts with
> live migration among them, and 40 guests on each host? so you need
> 1600 filesystems mounted even though only 40 are actually in use?

Well, in my case it's 8 hosts with live migration. We have ~650 VMs right now. They are not all running at once, but we tend to have 200-240 running at peak times of the day, and we have to have all of them registered/available at any time. Our current solution is iSCSI LUNs holding many VMs at once, which is what I suspect most people would do to meet our requirements. I've been working with OpenSolaris to test delivery of VM storage over NFS, and I think it would also work well for us.

Our users are allowed to create/deploy/clone their own VMs, and a common complaint I hear today is that provisioning takes a long time. This will always be the case if the users provision using the vendor-supplied tools, but I think if I were storing VMs and templates on ZFS and serving them via NFS, a file-level clone would let me offer much faster provisioning without requiring additional software like VMware Lab Manager or more expensive storage hardware.

-Will
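The dataset-level workaround for the provisioning workflow described above would look roughly like this (dataset names are illustrative, and assume a one-dataset-per-VM layout):

```shell
# Provision a new VM from a golden template by cloning its dataset
# and exporting the clone over NFS:
zfs snapshot tank/templates/win2008@gold
zfs clone tank/templates/win2008@gold tank/vms/vm0651
zfs set sharenfs=on tank/vms/vm0651
```

This is fast and space-efficient, but every clone is a separate filesystem and therefore a separate NFS mount on each ESX host, which is exactly where the 64-mounts-per-host limit bites. A file-level clone inside one big shared dataset would avoid that.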
On Jul 22, 2010, at 2:41 PM, Miles Nordin <carton at Ivy.NET> wrote:

> >>>>> "sw" == Saxon, Will <Will.Saxon at sage.com> writes:
>
> sw> 'clone' vs. a 'copy' would be very easy since we have
> sw> deduplication now
>
> dedup doesn't replace the snapshot/clone feature for the
> NFS-share-full-of-vmdk use case because there's no equivalent of
> 'zfs rollback'
>
> I'm tempted to say "vmware needs to remove their silly limit" but
> there are takes-three-hours-to-boot problems with thousands of Solaris
> NFS exports so maybe their limit is not so silly after all.
>
> What is the scenario you have? Is it something like 40 hosts with
> live migration among them, and 40 guests on each host? so you need
> 1600 filesystems mounted even though only 40 are actually in use?
>
> 'zfs set sharenfs=absorb <dataset>' would be my favorite answer, but
> lots of people have asked for such a feature, and the answer is always
> "wait for mirror mounts" (which BTW actually just work for me
> on very-recent linux, even with plain 'mount host:/fs /fs', without
> saying 'mount -t nfs4', in spite of my earlier rant complaining they
> are not real). Of course NFSv4 features are no help to vmware, but
> hypothetically I guess mirror-mounting would work if vmware supported
> it, so long as they were careful not to provoke the mounting of guests
> not in use. The "implicit automounter" on which the mirror mount
> feature's based would avoid the boot delay of mounting 1600
> filesystems.
>
> and BTW I've not been able to get the Real Automounter in Linux to do
> what this implicit one already can with subtrees. Why is it so hard
> to write a working automounter?
>
> The other thing I've never understood is, if you 'zfs rollback' an
> NFS-exported filesystem, what happens to all the NFS clients?
> It seems like this would cause much worse corruption than the worry
> when people give fire-and-brimstone speeches about never disabling
> zil-writing while using the NFS server. but it seems to mostly work
> anyway when I do this, so I'm probably confused about something.

To add to Miles' comments: what you are trying to accomplish isn't possible via NFS to ESX, but I believe it could be accomplished with iSCSI zvols. If I understand correctly, you can thin-provision a zvol, clone it as many times as you wish, and present all the clones over iSCSI. I haven't tried it myself, but it would be worth testing.

-Ross
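A sketch of the zvol approach suggested above, using the OpenSolaris-era COMSTAR commands (pool/zvol names and the LU GUID are illustrative placeholders, not tested output):

```shell
# Create a sparse ("thin-provisioned") zvol for the template,
# snapshot it, and clone the snapshot per VM:
zfs create -s -V 40g tank/vols/template     # -s makes the volume sparse
zfs snapshot tank/vols/template@gold
zfs clone tank/vols/template@gold tank/vols/vm0651

# Register the clone's block device as a COMSTAR logical unit and
# expose it to initiators (create-lu prints the LU's GUID):
sbdadm create-lu /dev/zvol/rdsk/tank/vols/vm0651
stmfadm add-view <lu-guid>
```

Each clone then consumes one of ESX's 256 iSCSI LUNs per host, a ceiling four times higher than the 64 NFS mounts, though still a per-VM limit rather than true file-level cloning.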