Edward Ned Harvey
2010-Apr-10 15:08 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Due to recent experiences, and discussion on this list, my colleague and I performed some tests:

Using solaris 10, fully upgraded.  (zpool 15 is latest, which does not have log device removal that was introduced in zpool 19)  In any way possible, you lose an unmirrored log device, and the OS will crash, and the whole zpool is permanently gone, even after reboots.

Using opensolaris, upgraded to latest, which includes zpool version 22.  (Or was it 23?  I forget now.)  Anyway, it's >=19 so it has log device removal.

1. Created a pool, with unmirrored log device.
2. Started benchmark of sync writes, verified the log device getting heavily used.
3. Yank out the log device.

Behavior was good.  The pool became "degraded", which is to say, it started using the primary storage for the ZIL, performance presumably degraded, but the system remained operational and error free.  I was able to restore perfect health by "zpool remove" the failed log device, and "zpool add" a new log device.

Next:

1. Created a pool, with unmirrored log device.
2. Started benchmark of sync writes, verified the log device getting heavily used.
3. Yank out both power cords.
4. While the system is down, also remove the log device.

(OOoohhh, that's harsh.)  I created a situation where an unmirrored log device is known to have unplayed records, there is an ungraceful shutdown, *and* the device disappears.  That's the absolute worst case scenario possible, other than the whole building burning down.  Anyway, the system behaved as well as it possibly could.  During boot, the faulted pool did not come up, but the OS came up fine.  My "zpool status" showed this:

# zpool status

  pool: junkpool
 state: FAULTED
status: An intent log record could not be read.
        Waiting for adminstrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
        or ignore the intent log records by running 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-K4
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    FAULTED       0     0     0  bad intent log
          c8t4d0    ONLINE        0     0     0
          c8t5d0    ONLINE        0     0     0
        logs
          c8t3d0    UNAVAIL       0     0     0  cannot open

I know the unplayed log device data is lost forever.  So I clear the error, remove the faulted log device, and acknowledge that I have lost the last few seconds of written data, up to the system crash:

# zpool clear junkpool
# zpool status

  pool: junkpool
 state: DEGRADED
status: One or more devices could not be opened.  Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
   see: http://www.sun.com/msg/ZFS-8000-2Q
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    DEGRADED      0     0     0
          c8t4d0    ONLINE        0     0     0
          c8t5d0    ONLINE        0     0     0
        logs
          c8t3d0    UNAVAIL       0     0     0  cannot open

# zpool remove junkpool c8t3d0
# zpool status junkpool

  pool: junkpool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        junkpool    ONLINE        0     0     0
          c8t4d0    ONLINE        0     0     0
          c8t5d0    ONLINE        0     0     0
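For reference, the test pool was created and repaired with commands along these lines (a sketch - the exact options may have differed, and the replacement device name is made up):

# zpool create junkpool c8t4d0 c8t5d0 log c8t3d0   (two-disk pool, unmirrored log device)
# zpool remove junkpool c8t3d0                     (drop the failed log device - zpool >= 19 only)
# zpool add junkpool log c8t6d0                    (add a new log device)

That "zpool remove" of a log vdev is exactly the feature that arrived in zpool version 19, and it is what makes the recovery above possible.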
Bob Friesenhahn
2010-Apr-10 15:22 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Sat, 10 Apr 2010, Edward Ned Harvey wrote:

> Using solaris 10, fully upgraded.  (zpool 15 is latest, which does not have log device removal that was
> introduced in zpool 19)  In any way possible, you lose an unmirrored log device, and the OS will crash, and
> the whole zpool is permanently gone, even after reboots.

Is anyone willing to share what zfs version will be included with Solaris 10 U9?  Will graceful intent log removal be included?

Bob
--
Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,    http://www.GraphicsMagick.org/
Tim Cook
2010-Apr-10 16:47 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Sat, Apr 10, 2010 at 10:08 AM, Edward Ned Harvey <solaris2 at nedharvey.com> wrote:

> Due to recent experiences, and discussion on this list, my colleague and
> I performed some tests:
>
> [full test report and zpool status output snipped]

Awesome!  Thanks for letting us know the results of your tests Ed, that's extremely helpful.  I was actually interested in grabbing some of the cheaper Intel SSDs for home use, but didn't want to waste my money if it wasn't going to handle the various failure modes gracefully.

--Tim
matthew patton
2010-Apr-10 16:56 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Thanks for the testing.  So FINALLY, with version >= 19, ZFS demonstrates production-ready status in my book.  How long is it going to take Solaris to catch up?
Edward Ned Harvey
2010-Apr-11 12:32 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
> From: Tim Cook [mailto:tim at cook.ms]
>
> Awesome!  Thanks for letting us know the results of your tests Ed,
> that's extremely helpful.  I was actually interested in grabbing some
> of the cheaper Intel SSDs for home use, but didn't want to waste my
> money if it wasn't going to handle the various failure modes
> gracefully.

I can't emphasize this enough:  use zpool >= 19 if you're going to run an unmirrored log device.

The only ways you will see any benefit from the SSD are (a) use it as a 'cache' device for reads, or (b) use it as a 'log' device for synchronous writes, bringing the sync write performance closer to the async write performance.
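For the home-SSD case, the two ways of using the device look something like this (pool and device names are made up):

# zpool add tank cache c1t1d0                (read cache; losing a cache device never endangers the pool)
# zpool add tank log mirror c1t2d0 c1t3d0    (slog for sync writes; mirror it, or be on zpool >= 19)

On zpool < 19 a log device can never be removed once added, which is why the unmirrored-slog question matters so much there.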
Edward Ned Harvey
2010-Apr-11 12:36 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
>
> Thanks for the testing.  So FINALLY, with version >= 19, ZFS
> demonstrates production-ready status in my book.  How long is it going to
> take Solaris to catch up?

Oh, it's been production worthy for some time - just don't use unmirrored log devices for zpool < 19.  Even in zpool >= 19, don't use unmirrored primary storage devices.  ;-)  This log device mirroring thing is just a corner case, where there was ground to be gained below version 19.

One thing I do wish, however:  In the event a pool is faulted, I wish you didn't have to power cycle the machine.  Let all the zfs filesystems that are in that pool simply disappear, and when somebody does "zpool status" you can see why.
Richard Elling
2010-Apr-11 16:41 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Apr 11, 2010, at 5:36 AM, Edward Ned Harvey wrote:

> In the event a pool is faulted, I wish you didn't have to power cycle the
> machine.  Let all the zfs filesystems that are in that pool simply
> disappear, and when somebody does "zpool status" you can see why.

In general, I agree.  How would you propose handling nested mounts?
 -- richard

ZFS storage and performance consulting at http://www.RichardElling.com
ZFS training on deduplication, NexentaStor, and NAS performance
Las Vegas, April 29-30, 2010 http://nexenta-vegas.eventbrite.com
Edward Ned Harvey
2010-Apr-11 23:03 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
> From: Richard Elling [mailto:richard.elling at gmail.com]
>
> On Apr 11, 2010, at 5:36 AM, Edward Ned Harvey wrote:
> >
> > In the event a pool is faulted, I wish you didn't have to power cycle the
> > machine.  Let all the zfs filesystems that are in that pool simply
> > disappear, and when somebody does "zpool status" you can see why.
>
> In general, I agree.  How would you propose handling nested mounts?

Not sure.  What's the present behavior?  I bet at present, if you have good zfs filesystems mounted as subdirs of a pool which fails ... you're still forced to power cycle, and afterward, I bet they're not mounted.

The behavior of a system after a pool is faulted is ok to suck, but hopefully it could suck less than forcing the power cycle.  So if the OS forcibly unmounts all those filesystems after a pool fault, without forcing the power cycle, that's an improvement.  Also, if you have processes running that are binaries from within those unmounted filesystems, or even if they just have open file handles in the faulted areas ... even if you kill all those processes -KILL so they die ungracefully, that's *still* an improvement, because it's better than power cycling.

Heck, even if the faulted pool spontaneously sent the server into an ungraceful reboot, even *that* would be an improvement.  At least root will be able to (a) login and (b) do a "zpool status."  These are two useful things you can't presently do.  Heck, even "reboot" or "init 6" would be nice additions that are not presently possible.

All of the above are not amazingly important as long as you have redundant reliable pools and so forth, because then faulted pools are so rare.  But if you put ZFS onto an external removable disk ... then it's really easy to accidentally have that disk disappear.  Bumped the power cord, or bumped the USB cable, etc.

At present, one of my backup strategies is to "zfs send | zfs receive" onto external disk.  And it's annoyingly common that this flakes out for some various reason, and forces me to power cycle the machine.  Which I can't do remotely, because I can't ssh into the machine (can't even get the login prompt) ... although it responds to ping, and if I happen to have an ssh or vnc session already open, I can type in some commands, as long as I don't do "zpool" or "zfs" or "df" or anything which attempts to access the faulted pool ... but "reboot" and "init" both fail.
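For context, the backup in question is just the ordinary pattern, something like this (pool, dataset and snapshot names here are made up):

# zfs snapshot tank/data@today
# zfs send tank/data@today | zfs receive -F extdisk/data

It's the external pool on the receiving end ("extdisk" above) that keeps flaking out, and when it faults, the whole box ends up needing the power cycle described above.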
Daniel Carosone
2010-Apr-11 23:48 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Sun, Apr 11, 2010 at 07:03:29PM -0400, Edward Ned Harvey wrote:

> Heck, even if the faulted pool spontaneously sent the server into an
> ungraceful reboot, even *that* would be an improvement.

Please look at the pool property "failmode".  Both of the preferences you have expressed are available, as well as the default you seem so unhappy with.

The other part of the issue, when failmode is set to the default "wait", relates to lower-level drivers and subsystems recovering reliably from things like removable disks reappearing after removal.  There's surely room for improvement in some cases there, and perhaps your specific chipset and configuration is one of those cases, making the problem worse for you.

--
Dan.
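For reference, it's a per-pool property with three possible values (pool name illustrative):

# zpool get failmode tank
NAME  PROPERTY  VALUE  SOURCE
tank  failmode  wait   default

# zpool set failmode=continue tank    (return EIO to new I/O instead of blocking)
# zpool set failmode=panic tank       (panic the host - the "ungraceful reboot" option)

"wait", the default, blocks I/O until the device comes back, which is the behavior being complained about.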
Edward Ned Harvey
2010-Apr-12 01:42 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
> From: Daniel Carosone [mailto:dan at geek.com.au]
>
> Please look at the pool property "failmode".  Both of the preferences
> you have expressed are available, as well as the default you seem so
> unhappy with.

I ... did not know that.  :-)  Thank you.
Miles Nordin
2010-Apr-12 19:29 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
>>>>> "re" == Richard Elling <richard.elling at gmail.com> writes:
>>>>> "dc" == Daniel Carosone <dan at geek.com.au> writes:

    re> In general, I agree.  How would you propose handling nested
    re> mounts?

Force-unmount them.  (So that they can be manually mounted elsewhere, if desired, or even in the same place with the middle filesystem missing and empty directories in between.  In the latter case the nfs fsid should stay the same so that hard-mounted clients can continue once a sysadmin forces the remount.  Remember, hard-mounted NFS clients will do this even hours or days later, and this behavior can be extremely useful to a batch cluster that's hard to start, or even just someone who doesn't want to lose his last edit.)  And make force-mounting actually work like it does on Mac OS.

    dc> Please look at the pool property "failmode".

It doesn't work, though.  We've been over this.  Failmode applies after it's decided that the drive is failed, but it can take an arbitrary time---minutes, hours, or forever---for an underlying driver to report that a drive is failed up to ZFS, and until then (a) you get ``wait'' no matter what you picked, and (b) commands like 'zpool status' hang for all pools, where in a resiliently-designed system they would hang for no pools, especially not the pool affected by the unresponsive device.

One might reasonably want a device state like HUNG or SLOW or >30SEC in 'zpool status', along with the ability to 'zpool offline' any device at any time and, when doing so, cancel all outstanding commands to that device so that, in zfs's view, it is as if they'd gotten failures from the driver even though they're still waiting for responses from the driver.  That device state doesn't exist partly because 'zpool status' isn't meant to work well enough to ever return such a state.

'failmode' is not a real or complete answer so long as we agree it's reasonable to expect maintenance commands to work all the time and not freeze up for intervals of 180sec - <several hours> - <forever>.  I understand most Unixes do act this way, not just Solaris, but it's really not good enough.

    dc> The other part of the issue, when failmode is set to the
    dc> default "wait", relates to lower-level drivers and subsystems
    dc> recovering reliably from things like removable disks reappearing
    dc> after removal.  There's surely room for improvement in some
    dc> cases there, and perhaps your specific chipsets

How do you handle the case when a hotplug SATA drive is powered off unexpectedly with data in its write cache?  Do you replay the writes, or do they go down the ZFS hotplug write hole?  I don't think this side of the issue is dependent on ``specific chipsets''.
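For reference, the manual interventions in question already exist as commands today (pool, dataset and device names here are placeholders):

# zfs unmount -f tank/home      (force-unmount a dataset out from under its users)
# zpool offline tank c8t3d0     (administratively take a device out of service)

The complaint above is that when a device goes unresponsive, these can block along with everything else, instead of 'zpool offline' being honored immediately and the outstanding I/O to that device being cancelled.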
Carson Gaspar
2010-Apr-12 19:48 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Miles Nordin wrote:

> How do you handle the case when a hotplug SATA drive is powered off
> unexpectedly with data in its write cache?  Do you replay the writes,
> or do they go down the ZFS hotplug write hole?

If zfs never got a positive response to a cache flush, that data is still in memory and will be re-written.  Unless I greatly misunderstand how ZFS works...

If the drive _lies_ about a cache flush, you're screwed (well, you can probably roll back a few TXGs...).  Don't buy broken drives / bridge chipsets.

--
Carson
Carson Gaspar
2010-Apr-12 20:32 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Carson Gaspar wrote:

> Miles Nordin wrote:
>
>> How do you handle the case when a hotplug SATA drive is powered off
>> unexpectedly with data in its write cache?  Do you replay the writes,
>> or do they go down the ZFS hotplug write hole?
>
> If zfs never got a positive response to a cache flush, that data is
> still in memory and will be re-written.  Unless I greatly misunderstand
> how ZFS works...
>
> If the drive _lies_ about a cache flush, you're screwed (well, you can
> probably roll back a few TXGs...).  Don't buy broken drives / bridge
> chipsets.

Hrm... thinking about this some more, I'm not sure what happens if the drive comes _back_ after a power loss, quickly enough that ZFS is never told about the disappearance (assuming that can happen without a human cfgadm'ing it back online - I don't know).

Does anyone who understands the internals better than I do care to take a stab at what happens if:

- ZFS writes data to /dev/foo
- /dev/foo loses power and the data from the above write is not yet flushed to rust (say a field tech pulls the wrong drive...)
- /dev/foo powers back on (field tech quickly goes whoops and plugs it back in)

In the case of a redundant zpool config, when will ZFS notice the uberblocks are out of sync and repair?  If this is a non-redundant zpool, how does the response differ?

--
Carson
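Not an answer to the internals question, but whatever the automatic behavior turns out to be, the belt-and-braces step after a suspected transient outage like this is a scrub (pool name illustrative):

# zpool scrub tank
# zpool status -v tank     (watch for checksum errors charged to the drive that bounced)

On a redundant pool the scrub re-reads every block and repairs anything stale from the other copies; on a non-redundant pool it at least reports which files were damaged.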
Edward Ned Harvey
2010-Apr-12 23:10 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
> Carson Gaspar wrote:
>
> Does anyone who understands the internals better than I do care to take a
> stab at what happens if:
>
> - ZFS writes data to /dev/foo
> - /dev/foo loses power and the data from the above write is not yet
>   flushed to rust (say a field tech pulls the wrong drive...)
> - /dev/foo powers back on (field tech quickly goes whoops and plugs it
>   back in)

I can't answer as an "internals" guy, but I can say this:  I accidentally knocked the power off my external drive, which contains a pool.  I quickly reconnected it.  A few days later I noticed the machine was essentially nonresponsive, and had to power cycle it.

It is possible that something else happened in the meantime, to put the machine into a bad state, but at least it's highly suspect that this happened after I kicked the power.  I never tested this scientifically.
R. Eulenberg
2010-Jun-14 08:10 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Hello,

I have this problem on my system too.  I lost my backup server when the system disk and the ZIL device crashed.  After setting up a new system (osol 2009.06, updated to the latest osol/dev version with zpool dedup) I tried to import my backup pool, but I can't.  When I try to replace / detach / attach / add any kind of device, the system tells me there isn't any zpool tank1, or it answers this:

# zpool import -f tank1
cannot import 'tank1': one or more devices is currently unavailable
        Destroy and re-create the pool from
        a backup source.

With the options -F, -X, -V, -C and -D, and any combination of them, I get the same reaction from the system.  There are solutions for cases in which the old cachefile is still available or the ZIL device isn't destroyed, but none for my case.  I need a way to import the zpool while ignoring the ZIL device.  I spent a week searching the net, but didn't find anything.

I would be very glad for some help.

regards
Ronny

P.S. I hope you excuse my lousy English.
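What I am after is something along the lines of:

# zpool import -f -m tank1      (import the pool while ignoring the missing log device)

i.e. an import that discards the lost intent log - if any build actually has such an option; I don't know whether the one I'm running does.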
Arne Jansen
2010-Jun-29 08:58 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Edward Ned Harvey wrote:

> Due to recent experiences, and discussion on this list, my colleague and
> I performed some tests:
>
> Using solaris 10, fully upgraded.  (zpool 15 is latest, which does not
> have log device removal that was introduced in zpool 19)  In any way
> possible, you lose an unmirrored log device, and the OS will crash, and
> the whole zpool is permanently gone, even after reboots.

I'm a bit confused.  I tried hard, but haven't been able to reproduce this using Sol10U8.  I have a mirrored slog device.  While putting it under load doing synchronous file creations, we pulled the power cords and unplugged the slog devices.  After powering on, zfs imported the pool, but prompted to acknowledge the missing slog devices with zpool clear.  After that the pool was accessible again.  That's exactly how it should be.  What am I doing wrong here?  The system is on a different pool using different disks.

One peculiarity I noted though:  when pulling both slog devices from the running machine, zpool status reports 1 file error.  In my understanding this should not happen, as the file data is written from memory and not from the contents of the zil.  It seems the reported write error from the slog device somehow led to a corrupted file.

Thanks,
Arne
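For reference, a mirrored slog like the one in this test is set up along these lines (device names here are illustrative, not the actual ones):

# zpool create tank mirror c0t1d0 c0t2d0 log mirror c0t3d0 c0t4d0
# zpool add tank log mirror c0t3d0 c0t4d0     (same thing, added to an existing pool)

With both halves of the slog mirror pulled, the pool has no intent log device left at all, which is why the import asks for the explicit "zpool clear" acknowledgement.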
Edward Ned Harvey
2010-Jun-30 16:47 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
> From: Arne Jansen [mailto:sensille at gmx.net]
>
> I'm a bit confused.  I tried hard, but haven't been able to reproduce this
> using Sol10U8.  I have a mirrored slog device.  While putting it
> under load doing synchronous file creations, we pulled the power cords
> and unplugged the slog devices.  After powering on, zfs imported the pool,
> but prompted to acknowledge the missing slog devices with zpool clear.
> After that the pool was accessible again.  That's exactly how it should be.

Very interesting.  I did this test some months ago, so I may not recall the relevant details, but here are the details I do remember:

I don't recall if I did this test on osol2009.06, or sol10.

In Sol10u6 (and I think Sol10u8) the default zpool version is 10, but if you apply all your patches, then 15 becomes available.  I am sure that I've never upgraded any of my sol10 zpools higher than 10.  So it could be that an older zpool version might exhibit the problem, and you might be using a newer version.

In osol2009.06, IIRC, the default is zpool 14, and if you upgrade fully, you'll get to something around 24.  So again, it's possible the bad behavior went away in zpool 15, or any other number from 11 to 15.

I'll leave it there for now.  If that doesn't shed any light, I'll try to dust out some more of my mental cobwebs.
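To check which versions are actually in play on a given box (pool name made up):

# zpool upgrade          (lists pools formatted with an older version than the software supports)
# zpool upgrade -v       (lists every version the installed software supports and what each added)
# zpool get version tank

Log device removal shows up in that -v listing at version 19.  "zpool upgrade tank" will bring a pool up to the latest supported version, but it's a one-way step.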
Ray Van Dolson
2010-Jun-30 16:54 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Wed, Jun 30, 2010 at 09:47:15AM -0700, Edward Ned Harvey wrote:

> In Sol10u6 (and I think Sol10u8) the default zpool version is 10, but if you
> apply all your patches, then 15 becomes available.  I am sure that I've
> never upgraded any of my sol10 zpools higher than 10.  So it could be that
> an older zpool version might exhibit the problem, and you might be using a
> newer version.
>
> In osol2009.06, IIRC, the default is zpool 14, and if you upgrade fully,
> you'll get to something around 24.  So again, it's possible the bad behavior
> went away in zpool 15, or any other number from 11 to 15.
>
> I'll leave it there for now.  If that doesn't shed any light, I'll try to
> dust out some more of my mental cobwebs.

Anyone else done any testing with zpool version 15 (on Solaris 10 U8)?  Have a new system coming in shortly and will test myself, but knowing this is a recoverable scenario would help me rest easier, as I have an unmirrored slog setup hanging around still.

Ray
Ragnar Sundblad
2010-Jun-30 20:28 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On 12 apr 2010, at 22.32, Carson Gaspar wrote:

> Hrm... thinking about this some more, I'm not sure what happens if the drive
> comes _back_ after a power loss, quickly enough that ZFS is never told about
> the disappearance (assuming that can happen without a human cfgadm'ing it
> back online - I don't know).
>
> Does anyone who understands the internals better than I do care to take a
> stab at what happens if:
>
> - ZFS writes data to /dev/foo
> - /dev/foo loses power and the data from the above write is not yet flushed
>   to rust (say a field tech pulls the wrong drive...)
> - /dev/foo powers back on (field tech quickly goes whoops and plugs it back in)
>
> In the case of a redundant zpool config, when will ZFS notice the uberblocks
> are out of sync and repair?  If this is a non-redundant zpool, how does the
> response differ?

To be safe, the protocol needs to be able to discover that the device (host or disk) has been disconnected and reconnected, or has been reset, and that either part's assumptions about the state of the other have to be invalidated.

I don't know enough about either SAS or SATA to say whether they guarantee that you will be notified.  But if they don't, they aren't safe for cached writes.

/ragge
Garrett D''Amore
2010-Jun-30 20:46 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On Wed, 2010-06-30 at 22:28 +0200, Ragnar Sundblad wrote:

> To be safe, the protocol needs to be able to discover that the device
> (host or disk) has been disconnected and reconnected, or has been reset,
> and that either part's assumptions about the state of the other have to
> be invalidated.
>
> I don't know enough about either SAS or SATA to say whether they guarantee
> that you will be notified.  But if they don't, they aren't safe for cached
> writes.

Generally, ZFS will only notice a removed disk when it is trying to write to it -- or when it probes.  ZFS does not necessarily get notified on hot device removal -- certainly not immediately.  (I've written some code so that it *will* notice, even if no write ever goes there... that's the topic of another message.)

The other thing is that disk writes are generally idempotent.  So, if a drive was removed between the time an IO was finished but before the time the response was returned to the host, it isn't a problem.  When the disk is returned, ZFS should automatically retry the I/O.  (In fact, ZFS automatically retries failed I/O operations several times before finally "failing".)

The nasty race that occurs is if your system crashes or is powered off *after* the log has acknowledged the write, but before the bits get shoved to main pool storage.  This is a data loss situation.

But assuming you don't take a system crash or some other fault, I would guess that removal of a log device and reinsertion would not cause any problems.  (Except for possibly delaying synchronous writes.)  That said, I've not actually *tested* it.

- Garrett
Ragnar Sundblad
2010-Jun-30 22:14 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
On 30 jun 2010, at 22.46, Garrett D'Amore wrote:

> Generally, ZFS will only notice a removed disk when it is trying to
> write to it -- or when it probes.  ZFS does not necessarily get notified
> on hot device removal -- certainly not immediately.

That should be fine, as long as it is informed on the next access.

> (I've written some code so that it *will* notice, even if no write ever
> goes there... that's the topic of another message.)
>
> The other thing is that disk writes are generally idempotent.  So, if a
> drive was removed between the time an IO was finished but before the
> time the response was returned to the host, it isn't a problem.  When
> the disk is returned, ZFS should automatically retry the I/O.  (In fact,
> ZFS automatically retries failed I/O operations several times before
> finally "failing".)

I was referring to the case where zfs has written data to the drive but still hasn't issued a cache flush, and before the cache flush the drive is reset.  If zfs finally issues the cache flush and then isn't informed that the drive has been reset, data is lost.  I hope this is not the case on any SCSI-based protocol or SATA.

> The nasty race that occurs is if your system crashes or is powered off
> *after* the log has acknowledged the write, but before the bits get
> shoved to main pool storage.  This is a data loss situation.

With "log", do you mean the ZIL (with or without a slog device)?  If so, that should not be an issue, and is exactly what the ZIL is for - it will be replayed at the next filesystem attach and the data will be pushed to the main pool storage.  Do I misunderstand you?

/ragge
Carson Gaspar
2010-Jul-01 00:08 UTC
[zfs-discuss] What happens when unmirrored ZIL log device is removed ungracefully
Ragnar Sundblad wrote:

> I was referring to the case where zfs has written data to the drive but
> still hasn't issued a cache flush, and before the cache flush the drive
> is reset.  If zfs finally issues the cache flush and then isn't informed
> that the drive has been reset, data is lost.
>
> I hope this is not the case on any SCSI-based protocol or SATA.
>
>> The nasty race that occurs is if your system crashes or is powered off
>> *after* the log has acknowledged the write, but before the bits get
>> shoved to main pool storage.  This is a data loss situation.
>
> With "log", do you mean the ZIL (with or without a slog device)?
> If so, that should not be an issue, and is exactly what the ZIL
> is for - it will be replayed at the next filesystem attach and the
> data will be pushed to the main pool storage.  Do I misunderstand you?

See your case above - written, ack'd, but not cache flushed.  We're talking about the same thing.

--
Carson