Hello,
I'd like to ask for guidance about using zfs on iscsi storage appliances. Recently I had an unlucky situation where a storage machine froze. Once the storage was up again (rebooted), all the other iscsi clients were happy, while one of them (a Sun Solaris SPARC host running Oracle) would not mount the volume and marked it as corrupted. I had no way to get my zfs data back: I had to destroy the pool and recreate it from backups. So I have some questions about this nice story:

- I remember sysadmins being able to recover data on corrupted ufs filesystems almost every time, by the magic of alternate superblocks. Is there something similar in zfs? Is there really no way to access the data of a corrupted zfs filesystem?

- In this case the storage appliance is a legacy system based on linux, so raids/mirrors are managed on the storage side in its own way. Being an iscsi target, the volume was mounted as a single iscsi disk on the solaris host and prepared as a zfs pool consisting of that single iscsi target. The ZFS best practices tell me that, to be safe in case of corruption, pools should always be mirrors or raidz over 2 or more disks. I considered everything safe because the mirroring and raid were managed by the storage machine, but from the solaris host's point of view the pool was just one disk, and maybe that was the point of failure. What is the correct way to go in this case?

- Finally, looking forward to running new storage appliances using OpenSolaris with ZFS plus iscsitadm and/or COMSTAR, I am a bit confused by the possibility of a double zfs situation: the storage zfs filesystem would be divided into zfs volumes, accessed via iscsi by a solaris host that creates its own zfs pool on top of them (is it too redundant??), and again I would fall into the same case as before (a host zfs pool backed by only one iscsi resource). A rough sketch of what I mean is below.

Any guidance would be really appreciated :)
Thanks a lot
Gabriele.
--
This message posted from opensolaris.org
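Just to make the intended setup concrete, this is roughly what I have in mind on the OpenSolaris appliance side (pool and volume names are made up, and I have not verified every step, so take it only as a sketch):

    # carve a zvol out of the appliance pool
    zfs create -V 200G tank/oracle-lun

    # legacy iscsitgt way of exporting it
    zfs set shareiscsi=on tank/oracle-lun

    # or the COMSTAR way
    sbdadm create-lu /dev/zvol/rdsk/tank/oracle-lun
    stmfadm add-view <GUID reported by sbdadm>
    itadm create-target

The solaris client would then log into that target and build its own single-disk zpool on top of it.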
On Mar 15, 2010, at 10:55 AM, Gabriele Bulfon wrote:

> - In this case the storage appliance is a legacy system based on linux, so raids/mirrors are managed on the storage side in its own way. Being an iscsi target, the volume was mounted as a single iscsi disk on the solaris host and prepared as a zfs pool consisting of that single iscsi target. [...] What is the correct way to go in this case?

I'd guess this could be because the iscsi target wasn't honoring ZFS flush requests.

> - Finally, looking forward to running new storage appliances using OpenSolaris with ZFS plus iscsitadm and/or COMSTAR, I am a bit confused by the possibility of a double zfs situation [...] and again I would fall into the same case as before (a host zfs pool backed by only one iscsi resource).

My experience with this is at the significantly lower end, but I have had iSCSI shares from a ZFS NAS come up as corrupt to the client, and it's fixable if you have snapshots. I've been using iSCSI to provide Time Machine targets to OS X boxes. We had a client crash during a write, and upon reboot it showed the iSCSI volume as corrupt. You can put whatever file system you like on the iSCSI target, obviously. The current OpenSolaris iSCSI implementation, I believe, uses synchronous writes, so hopefully what happened to you wouldn't happen in this case. In my case I was using HFS+ (the OS X client has to), and I couldn't repair the volume. However, with a snapshot I could roll it back. If you plan ahead, this should save you some restoration work (you'll need to be able to roll back all the files that have to be consistent together). A sketch of the snapshot approach is below.

Good luck,
Ware
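For what it's worth, the rollback I'm describing is just ordinary zvol snapshot handling, something like this (dataset and snapshot names are made up, and you'd want the initiator logged out or the LUN idle before rolling back):

    # take periodic snapshots of the zvol backing the iSCSI LUN
    zfs snapshot tank/tm-vol@2010-03-15-0900

    # after the client sees the LUN as corrupt, rewind it to the last good snapshot
    zfs rollback tank/tm-vol@2010-03-15-0900

The client then sees the LUN exactly as it was at snapshot time, so anything written after that point is gone.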
On Mar 15, 2010, at 10:55 AM, Gabriele Bulfon <gbulfon at sonicle.com> wrote:

> Hello,
> I'd like to ask for guidance about using zfs on iscsi storage appliances.
> Recently I had an unlucky situation where a storage machine froze.
> Once the storage was up again (rebooted), all the other iscsi clients were happy, while one of them (a Sun Solaris SPARC host running Oracle) would not mount the volume and marked it as corrupted.
> I had no way to get my zfs data back: I had to destroy the pool and recreate it from backups.
> [...]

What iSCSI target was this?

If it was IET, I hope you were NOT using the write-back option on it, as it caches write data in volatile RAM. IET does support cache flushes, but if you cache in RAM (bad idea), a system lockup or panic will ALWAYS lose data.

-Ross
Well, I actually don't know what implementation is inside this legacy machine.
This machine is an AMI StoreTrends ITX, but maybe it has been built around IET, I don't know.
Well, maybe I should disable write-back on every zfs host connecting over iscsi?
How do I check this?
Thx
Gabriele.
--
This message posted from opensolaris.org
On Mar 15, 2010, at 12:13 PM, Gabriele Bulfon wrote:

> Well, I actually don't know what implementation is inside this legacy machine.
> This machine is an AMI StoreTrends ITX, but maybe it has been built around IET, I don't know.
> Well, maybe I should disable write-back on every zfs host connecting over iscsi?
> How do I check this?

I think this would be a property of the NAS, not the clients.

--Ware
On Mar 15, 2010, at 12:19 PM, Ware Adams <rwalists at washdcmail.com> wrote:

> On Mar 15, 2010, at 12:13 PM, Gabriele Bulfon wrote:
>
>> Well, I actually don't know what implementation is inside this legacy machine.
>> This machine is an AMI StoreTrends ITX, but maybe it has been built around IET, I don't know.
>> Well, maybe I should disable write-back on every zfs host connecting over iscsi?
>> How do I check this?
>
> I think this would be a property of the NAS, not the clients.

Yes, Ware's right, the setting should be on the AMI device. I don't know what target it's using either, but if it has an option to disable write-back caching, then your data should still be safe even if the target doesn't honor flushes. A rough idea of what that looks like on an IET-based target is below.

-Ross
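For illustration only -- if the appliance really is built on IET, the cache mode is normally set per LUN in /etc/ietd.conf on the target, roughly like this (target name and device path are made up, so check the appliance's own documentation before touching anything):

    Target iqn.2010-03.com.example:storage.oracle-lun
        # fileio with write-through caching, instead of IOMode=wb (write-back)
        Lun 0 Path=/dev/vg0/oracle-lun,Type=fileio,IOMode=wt
        # or blockio, which bypasses the target's page cache entirely
        # Lun 0 Path=/dev/vg0/oracle-lun,Type=blockio

Either of those avoids holding acknowledged writes in the target's volatile RAM.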
> Being an iscsi target, the volume was mounted as a single iscsi disk on the solaris host and prepared as a zfs pool consisting of that single iscsi target. The ZFS best practices tell me that, to be safe in case of corruption, pools should always be mirrors or raidz over 2 or more disks. I considered everything safe because the mirroring and raid were managed by the storage machine.

As far as I understand the best practices, redundancy needs to be within zfs in order to provide full protection. So actually, the best practices say that your scenario is rather one to be avoided.

Regards,
Tonmaus
--
This message posted from opensolaris.org
On Mon, Mar 15, 2010 at 9:55 AM, Gabriele Bulfon <gbulfon at sonicle.com> wrote:

> Hello,
> I'd like to ask for guidance about using zfs on iscsi storage appliances.
> [...]
> - I remember sysadmins being able to recover data on corrupted ufs filesystems almost every time, by the magic of alternate superblocks. Is there something similar in zfs? Is there really no way to access the data of a corrupted zfs filesystem?
> [...]

To answer the other portion of your question: yes, you can roll back zfs if you're at the proper pool version. The procedure is documented at the link below; essentially it will try to find the last known good transaction. If that doesn't work, your only remaining option is to restore from backup:

http://docs.sun.com/app/docs/doc/817-2271/gbctt?l=ja&a=view

--Tim
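If I remember right, on a recent enough OpenSolaris build (and pool version) the rewind is driven from zpool import, along these lines -- the pool name is a placeholder, and not every release supports the -F recovery option:

    # dry run: report which recent transactions would be discarded
    zpool import -nF tank

    # rewind to the last consistent transaction group and import the pool
    zpool import -F tank

You lose the last few seconds of writes, but the rest of the pool comes back consistent.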
On Mar 15, 2010, at 7:11 PM, Tonmaus <sequoiamobil at gmx.net> wrote:

>> Being an iscsi target, the volume was mounted as a single iscsi disk on the solaris host and prepared as a zfs pool consisting of that single iscsi target. The ZFS best practices tell me that, to be safe in case of corruption, pools should always be mirrors or raidz over 2 or more disks. I considered everything safe because the mirroring and raid were managed by the storage machine.
>
> As far as I understand the best practices, redundancy needs to be within zfs in order to provide full protection. So actually, the best practices say that your scenario is rather one to be avoided.

There is nothing saying redundancy can't be provided below ZFS; it's just that if you want auto recovery, you need redundancy within ZFS itself as well.

You can have 2 separate raid arrays served up via iSCSI to ZFS, which then makes a mirror out of the storage. A sketch of that layout is below.

-Ross
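As a sketch only (the discovery addresses and cXtYdZ device names are placeholders; the real names depend on what the initiator enumerates):

    # on the Solaris host: discover one LUN from each array
    iscsiadm add discovery-address 192.168.1.10:3260
    iscsiadm add discovery-address 192.168.1.11:3260
    iscsiadm modify discovery --sendtargets enable
    devfsadm -i iscsi

    # then let ZFS own the redundancy by mirroring across the two arrays
    zpool create tank mirror c2t1d0 c3t1d0

With that layout ZFS has a second copy to read from and repair with, even though each array already does its own internal RAID.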
On Mon, Mar 15, 2010 at 9:10 PM, Ross Walker <rswwalker at gmail.com> wrote:

> There is nothing saying redundancy can't be provided below ZFS; it's just that if you want auto recovery, you need redundancy within ZFS itself as well.
>
> You can have 2 separate raid arrays served up via iSCSI to ZFS, which then makes a mirror out of the storage.
>
> -Ross

Perhaps I'm remembering incorrectly, but I didn't think mirroring would auto-heal/recover; I thought that was limited to the raidz* implementations.

--Tim
On Mar 15, 2010, at 11:10 PM, Tim Cook <tim at cook.ms> wrote:

> Perhaps I'm remembering incorrectly, but I didn't think mirroring would auto-heal/recover; I thought that was limited to the raidz* implementations.

Mirroring auto-heals; in fact, copies=2 on a single-disk vdev can auto-heal (as long as it isn't a whole-disk failure).

-Ross
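Purely as an illustration (pool, dataset, and device names are all placeholders):

    # a mirrored pool: checksum errors on one side are repaired from the other
    zpool create tank mirror c1t0d0 c1t1d0

    # on a single-disk vdev, ditto blocks give ZFS a second copy to repair from
    zfs set copies=2 tank/data

In both cases a scrub (zpool scrub tank) walks the pool and fixes any blocks that fail their checksums, as long as a good copy exists somewhere.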