I have 2 questions regarding using ZFS with disk arrays.

Will there be any recommended practices with allocation of storage for use with ZFS on Hitachi, EMC, etc. arrays (that would be specific for ZFS), or is that even necessary? As a simple example, with our databases we generally make sure the redo logs are placed on separate RAID groups from the table spaces when on Hitachi arrays, or more specifically, try not to mix different access patterns (random r/w, sequential r/w, etc.) together within the same RAID group.

The other question is: will there be any hooks that will allow ZFS pools to be duplicated using utilities such as ShadowImage or TimeFinder? We use those tools to quickly replicate some databases for reporting purposes (so we can have a dedicated database instance for reporting). Since all the database files reside on VxVM volumes, this causes headaches since the tools do not integrate with VxVM (without a rather expensive add-on).
Jason King wrote:
> I have 2 questions regarding using ZFS with disk arrays.
>
> Will there be any recommended practices with allocation of storage for use
> with ZFS on Hitachi, EMC, etc. arrays (that would be specific for zfs), or is
> that even necessary? As a simple example, with our databases, we generally
> make sure the redo logs are placed on separate raid groups from the table
> spaces when on Hitachi arrays, or more specifically, try to not mix different
> access patterns (random r/w, sequential r/w, etc) together within the same
> raid group.

As long as you have two paths to your storage, you should be fine. Now as to your specific question regarding placement -- that's not your problem when you're using ZFS, because the filesystem figures out where to put everything in a highly optimal manner. You can certainly create different pools of LUNs from your array and tell your db to use those... it's up to you. But generally you shouldn't need to worry. Torrey and I are scoping out a blueprint on ZFS best practices -- this should be covered in it.

> The other question is will there be any hooks that will allow zfs pools to be
> duplicated using utilities such as ShadowImage or TimeFinder? We use those
> tools to quickly replicate some databases for reporting purposes (so we can
> have a dedicated database instance for reporting). Since all the database
> files reside on vxvm volumes, this causes headaches since the tools do not
> integrate with vxvm (without a rather expensive addon).

I _think_ that's on a roadmap somewhere, but I really don't know. Jeff/Bill/Eric/team-zfs... is this logged as an RFE already?

best regards,
James C. McPherson
--
Solaris Datapath Engineering
Data Management Group
Sun Microsystems
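[Editor's aside] To make the "different pools of LUNs" suggestion above concrete, here is a minimal sketch; the pool names and LUN device names (c2t1d0 and so on) are hypothetical and stand in for whatever LUNs the array presents to the host:

    # Separate pools built from separate sets of array LUNs, so the redo-log
    # and table-space workloads never share the same back-end devices.
    zpool create redo   c2t1d0 c2t2d0
    zpool create tables c2t3d0 c2t4d0

    # One filesystem per pool for the database to use.
    zfs create redo/logs
    zfs create tables/data

Whether such separation is actually needed on an intelligent array is exactly the open question in this thread; the commands only show how you would express it in ZFS if you chose to.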
Jason King wrote:
> I have 2 questions regarding using ZFS with disk arrays.
>
> Will there be any recommended practices with allocation of storage for use with ZFS on Hitachi, EMC, etc. arrays (that would be specific for zfs), or is that even necessary? As a simple example, with our databases, we generally make sure the redo logs are placed on separate raid groups from the table spaces when on Hitachi arrays, or more specifically, try to not mix different access patterns (random r/w, sequential r/w, etc) together within the same raid group.

I would think you would want to create different pools if you required different performance specs per fs, or perhaps different RAS characteristics (a SATA pool for temp data, an FC pool for ...). However, make sure to read the thread concerning how most random writes will get pushed into a sequential mode. A lot of the previous fs tuning guidelines just went out the door with ZFS ... and that's a good thing for the most part. The bad part is going to be re-education - "No really, please drink this kool-aid. It's made with real sugar!" - and the parts that have actually changed.

> The other question is will there be any hooks that will allow zfs pools to be duplicated using utilities such as ShadowImage or TimeFinder? We use those tools to quickly replicate some databases for reporting purposes (so we can have a dedicated database instance for reporting). Since all the database files reside on vxvm volumes, this causes headaches since the tools do not integrate with vxvm (without a rather expensive addon).

Now that's a good question and one I'm not sure of the answer to, so hopefully someone will jump in here. In the past you've had to lockfs, take your snapshot, and unlock to ensure consistency. Since ZFS is always consistent on disk, you may be able to just take the snap and replicate it without locking the fs, but I think there would be a need to complete any in-transit writes before you did.
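[Editor's aside] One possible sequence for the "snapshot first, then replicate" idea mentioned above is sketched below; the dataset name is hypothetical and the array copy itself is only a placeholder comment, since ZFS has no ShadowImage/TimeFinder hooks:

    # Take a point-in-time snapshot inside the pool, then make sure buffered
    # writes have been pushed out before the array-level copy is triggered.
    zfs snapshot reporting/db@pre-replicate
    sync
    # ... trigger ShadowImage / TimeFinder on the LUNs backing the pool ...

Note that this only addresses filesystem-level consistency; the database would still need its own quiesce or hot-backup procedure to guarantee an application-consistent copy.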
Dear List,

ZFS is taking off and that's good. With that, questions start arising, and I have a few after visiting an Academic Research Institute yesterday. Please provide me with the insights, or better: the sources to go look myself.

Situation: the customer is migrating to an EMC-based storage solution (done deal, nothing we can change on that) and is now looking at ZFS to decide if this provides extra value when used on top of EMC. Here are the questions:

1. How many snapshots can I take AND keep with ZFS?
   The EMC box has a limit of 8 snapshots that can be kept.

2. Can a ZFS snapshot be used read/write while still preserving the current state? I mean (example):
   - At time t the customer takes a snapshot
   - At time (t + 1 hour) the customer decides he wants to go back to the situation from time t
   - He wants:
     * to preserve the status as it is at time (t + 1 hour)
     * to start using the snapshot taken at time t READ/WRITE
   Is this possible?

3. What is the performance penalty when taking a snapshot? Is this xxx % (in CPU time) per snapshot? What is xxx? Or is the penalty independent of the number of snapshots taken?

4. I thought the EMC has a "switch" that tells it that a Sun [Solaris] server is connected for a certain volume; and apparently the EMC box has knowledge about our UFS filesystem in its firmware. I am not sure this is true. But if it is: does this mean that ZFS cannot work with EMC storage (because the EMC has no ZFS knowledge)?

Many thanks for any input!

Regards,
>] Bartm [<
--
Bart Muijzer                           Email: bart.muyzer at sun.com
Solution Architect & OS Ambassador     Tel: +31-33-4515218; Fax: +31-33-4515001
CS/Data Center & Data Mgt Practice     Intranet: http://webhome.holland/bartm
Sun Microsystems Nederland BV          Internet: http://www.muijzer.com/
> Here are the questions:
>
> 1. How many snapshots can I take AND keep with ZFS?
>    The EMC box has a limit of 8 snapshots that can be kept.

No limits (snapshots are cheap and instantaneous).

> 2. Can a ZFS snapshot be used read/write while still preserving the current
>    state? I mean (example):
>    - At time t the customer takes a snapshot
>    - At time (t + 1 hour) the customer decides he wants to go back to the
>      situation from time t
>    - He wants:
>      * to preserve the status as it is at time (t + 1 hour)
>      * to start using the snapshot taken at time t READ/WRITE
>    Is this possible?

I think so, but I'll let the expert answer that one.

> 3. What is the performance penalty when taking a snapshot?
>    Is this xxx % (in CPU time) per snapshot? What is xxx?
>    Or is the penalty independent of the number of snapshots taken?

Free. Because ZFS is copy-on-write, snapshots are actually cheaper than not making snapshots, because no disk space needs to be freed. All a snapshot is, is keeping a reference to the ZFS uberblock at the time of the snapshot.

> 4. I thought the EMC has a "switch" that tells it that a Sun [Solaris]
>    server is connected for a certain volume; and apparently the EMC box
>    has knowledge about our UFS filesystem in its firmware.

Argh.

> I am not sure this is true. But if it is: does this mean that ZFS cannot
> work with EMC storage (because the EMC has no ZFS knowledge)?

I'm sure they have a "dumb storage" switch.

Casper
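[Editor's aside] For illustration, a few commands showing how cheaply snapshots are created and managed; the pool and filesystem names are made up:

    # Take as many snapshots as you like; each is effectively instantaneous.
    zfs snapshot tank/home@monday
    zfs snapshot tank/home@tuesday

    # List them, and destroy one when its blocks are no longer needed.
    zfs list -t snapshot
    zfs destroy tank/home@monday

A snapshot consumes space only as the live filesystem diverges from it, which is why there is no practical limit comparable to the array's 8-snapshot cap.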
On Fri, Nov 18, 2005 at 02:50:04PM +0100, Casper.Dik at sun.com wrote:
> > 2. Can a ZFS snapshot be used read/write while still preserving the current
> >    state? I mean (example):
> >    - At time t the customer takes a snapshot
> >    - At time (t + 1 hour) the customer decides he wants to go back to the
> >      situation from time t
> >    - He wants:
> >      * to preserve the status as it is at time (t + 1 hour)
> >      * to start using the snapshot taken at time t READ/WRITE
> >    Is this possible?
>
> I think so, but I'll let the expert answer that one.

Absolutely. The functionality is called "clones", which are really writable snapshots. Read the info on our external web page and let us know if you have any further questions.

--Bill
On Fri, Nov 18, 2005 at 02:50:04PM +0100, Casper.Dik at sun.com wrote:
> > 2. Can a ZFS snapshot be used read/write while still preserving the current
> >    state? I mean (example):
> >    - At time t the customer takes a snapshot
> >    - At time (t + 1 hour) the customer decides he wants to go back to the
> >      situation from time t
> >    - He wants:
> >      * to preserve the status as it is at time (t + 1 hour)
> >      * to start using the snapshot taken at time t READ/WRITE
> >    Is this possible?
>
> I think so, but I'll let the expert answer that one.

Sort of. 'zfs rollback' will revert a filesystem to a previous snapshot state. However, this will destroy any changes in the live filesystem. You can get the behavior you describe using 'zfs clone', though it will require some renaming. There's an open RFE for 'clone swap', which would take an arbitrary clone, make it the live filesystem, and preserve the live filesystem as a clone of the original (thereby switching their positions).

- Eric

--
Eric Schrock, Solaris Kernel Development       http://blogs.sun.com/eschrock
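[Editor's aside] A short sketch of the two options described above, mapped onto the customer's scenario; the dataset names are hypothetical:

    zfs snapshot tank/data@t              # snapshot taken at time t
    # ... an hour of changes accumulates in tank/data ...

    # Option 1: discard the last hour and return the live fs to time t.
    zfs rollback tank/data@t

    # Option 2: keep the current state AND get a writable copy of time t.
    zfs snapshot tank/data@t-plus-1h      # preserve the present state
    zfs clone tank/data@t tank/data-at-t  # writable clone based on @t

The clone shares its blocks with the snapshot, so it costs almost nothing until it starts to diverge.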
Casper.Dik at sun.com wrote:
>
>> 4. I thought the EMC has a "switch" that tells it that a Sun [Solaris]
>>    server is connected for a certain volume; and apparently the EMC box
>>    has knowledge about our UFS filesystem in its firmware.
>
> Argh.

It, like many storage arrays, has switches that control LUN discovery and registration. I know of nothing in an EMC box - granted, I'm not an expert - that deals with UFS. There may be some cache parameters you can set, but in most cases it's just the VPD information that gets changed around. The 99x0 is the same way.
> Casper.Dik at sun.com wrote:
>>
>>> 4. I thought the EMC has a "switch" that tells it that a Sun [Solaris]
>>>    server is connected for a certain volume; and apparently the EMC box
>>>    has knowledge about our UFS filesystem in its firmware.
>>
>> Argh.
>
> It, like many storage arrays, has switches that control LUN discovery
> and registration. I know of nothing in an EMC box - granted, I'm not an
> expert - that deals with UFS. There may be some cache parameters you can set,
> but in most cases it's just the VPD information that gets changed
> around. The 99x0 is the same way.

Pfew..

Casper
Casper.Dik at Sun.COM wrote:
>> Casper.Dik at sun.com wrote:
>>>
>>>> 4. I thought the EMC has a "switch" that tells it that a Sun [Solaris]
>>>>    server is connected for a certain volume; and apparently the EMC box
>>>>    has knowledge about our UFS filesystem in its firmware.
>>>
>>> Argh.
>>
>> It, like many storage arrays, has switches that control LUN discovery
>> and registration. I know of nothing in an EMC box - granted, I'm not an
>> expert - that deals with UFS. There may be some cache parameters you can set,
>> but in most cases it's just the VPD information that gets changed
>> around. The 99x0 is the same way.
>
> Pfew..

I'd love to get every other OS vendor in the world to support SCSI IEEE Registered Extended LUN discovery and all use the same VPD settings, but there are only so many hours in the day. :-)
As others have mentioned, snapshots are unlimited and efficient, and you can make a read/write "copy" of a snapshot with 'zfs clone' (and the forthcoming 'clone swap' may be useful in your situation as well).

You (or your customer) may find my blog entry on how snapshots are implemented enlightening: http://blogs.sun.com/roller/page/ahrens?entry=is_it_magic