I have a filer running OpenSolaris (snv_111b) and I am presenting an iSCSI share from a RAIDZ pool. I want to run ZFS on the share at the client. Is it necessary to create a mirror or use ditto blocks at the client to ensure ZFS can recover if it detects a failure at the client?

Thanks,
Bruin
On Jan 5, 2011, at 7:49 AM, Bruins wrote:

> I have a filer running OpenSolaris (snv_111b) and I am presenting an iSCSI
> share from a RAIDZ pool. I want to run ZFS on the share at the client. Is it
> necessary to create a mirror or use ditto blocks at the client to ensure ZFS
> can recover if it detects a failure at the client?

The rule is: protect your data first, then worry about the other ways to protect your data.

Also, be aware that b111 has what I call "the iSCSI performance pit of hell." You will not get good iSCSI performance from that release. I'd advise that you move forward or backward a few releases, or look at one of the more modern releases (b111 is almost 2 years old). I am biased towards NexentaStor, but there are several other distributions that can work for you.
 -- richard
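[For context, the server-side setup being discussed — a zvol on a RAIDZ pool exported over iSCSI — looks roughly like this on an OpenSolaris build of that era. This is a sketch: the pool and volume names are assumptions, and `shareiscsi` is the legacy (pre-COMSTAR) target mechanism that snv_111-era builds supported.]

```shell
# Create a 100 GB zvol on the RAIDZ pool (names are hypothetical).
zfs create -V 100g tank/iscsivol

# Export it as an iSCSI target via the legacy shareiscsi property.
zfs set shareiscsi=on tank/iscsivol

# Verify the target was created (legacy iscsitgt administration tool).
iscsitadm list target
```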
> From: zfs-discuss-bounces at opensolaris.org [mailto:zfs-discuss-
> bounces at opensolaris.org] On Behalf Of Bruins
>
> I have a filer running Opensolaris (snv_111b) and I am presenting a
> iSCSI share from a RAIDZ pool. I want to run ZFS on the share at the
> client. Is it necessary to create a mirror or use ditto blocks at the
> client to ensure ZFS can recover if it detects a failure at the client?

You might consider using NFS at the client instead of iSCSI with another layer of ZFS. Performance is generally about the same, but pool management is generally easier. But if you're sure...

As long as you have failmode=wait (which is the default), then if the server disappears for some reason, the client will basically just pause I/O to the pool until the pool reappears, and then continue as if nothing went wrong.

You are asking a very intelligent question, though. At first blush, it would appear to be possible for the client to detect a checksum error and then, due to lack of redundancy, be unable to correct it. Fortunately that's not possible (see below), but if it were, you would have to scrub at the server and then scrub or clear at the client.

But that's precisely why it's an impossible situation. In order for the client to see a checksum error, it must have read some corrupt data from the pool storage, but the server will never allow that to happen. So the short answer is no: you don't need to add redundancy at the client, unless you want the client to continue working (without pausing) in the event the server is unavailable. The only possible way for the client to see data corruption which the server didn't see would be if there were a temporary transport error. Such an error would be transient, and would disappear simply by re-reading, scrubbing, or clearing the pool client-side.
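[The failmode behavior described above is a per-pool property the client can inspect and set. A minimal sketch — the pool name `tank` is a placeholder:]

```shell
# Show the current failure-mode behavior of the client-side pool.
zpool get failmode tank

# "wait" (the default) blocks I/O until the device returns;
# "continue" instead returns EIO to new synchronous writes;
# "panic" crashes the host on catastrophic pool failure.
zpool set failmode=wait tank
```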
On Wed, 5 Jan 2011, Edward Ned Harvey wrote:

> You are asking a very intelligent question though. At first blush, it would
> appear to be possible for the client to detect a checksum error, and then
> due to lack of redundancy, be unable to correct it. Fortunately that's not
> possible (see below) but if it were, you would have to scrub at the server
> and then scrub or clear at the client.
>
> But that's precisely why it's an impossible situation. In order for the
> client to see a checksum error, it must have read some corrupt data from the
> pool storage, but the server will never allow that to happen. So the short
> answer is No. You don't need to add the redundancy at the client, unless
> you want the client to continue working (without pause) in the event the
> server is unavailable.

I don't agree with the above. It is quite possible for the server or network to cause an error. Computers are not error free. Network checksum algorithms are not perfect. ECC memory is not perfect. OS kernels and CPUs are not perfect. The probability of data error goes down quite a lot due to zfs on the server, but the probability is not zero.

If zfs on the server detects an error in its data, then it won't allow the client to read that known bad data. This does not help the client produce good data, except via backups or a LUN on a redundant pool.

However, it should also be said that the client is also not error free, and at some point the returns diminish enough that improving server reliability does not improve client reliability much.

Bob
 -- Bob Friesenhahn
bfriesen at simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
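[The ditto-block option from the original question is the lightest-weight way to give the client self-healing without a mirror. A sketch — the dataset name is a placeholder:]

```shell
# Store two copies of every block in this client-side dataset.
# ZFS can then repair single-copy corruption on its own, at the
# cost of roughly 2x the space for that dataset.
zfs set copies=2 tank/data

# Confirm the setting.
zfs get copies tank/data
```

Note that `copies=2` protects against corrupt blocks, not against losing the whole iSCSI LUN, since both copies live on the same device.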
> From: Bob Friesenhahn [mailto:bfriesen at simple.dallas.tx.us]
>
> > But that's precisely why it's an impossible situation. In order for the
> > client to see a checksum error, it must have read some corrupt data from the
> > pool storage, but the server will never allow that to happen. So the short
> > answer is No. You don't need to add the redundancy at the client, unless
> > you want the client to continue working (without pause) in the event the
> > server is unavailable.
>
> I don't agree with the above. It is quite possible for the server or
> network to cause an error. Computers are not error free. Network

I agree with Bob. When I said "impossible," of course that's unrealistic. But the conclusion remains the same: redundancy is not needed at the client, because any data corruption the client could possibly see from the server would be transient and self-correcting.

Out of curiosity... Let's suppose ZFS reads some corrupt data from a device (in this case an iSCSI target). Does ZFS immediately mark it as a checksum error without retrying? Or does ZFS attempt to re-read the data first? As long as a re-read is attempted, the probability of the client experiencing any checksum error at all would be very low.
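[The "re-read, scrub, or clear" recovery path for a transient transport error looks like this at the client. Sketch; the pool name is a placeholder:]

```shell
# Re-read and verify every allocated block in the client-side pool.
zpool scrub tank

# Review the results and any files flagged with permanent errors.
zpool status -v tank

# Once the errors are confirmed transient (re-reads succeed),
# reset the pool's error counters.
zpool clear tank
```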
On Thu, Jan 6, 2011 at 5:33 AM, Edward Ned Harvey <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:

> But the conclusion remains the same: Redundancy is not needed at the
> client, because any data corruption the client could possibly see from the
> server would be transient and self-correcting.

Weren't you just chastising someone else for not using redundancy over iSCSI? The rules don't really change for zfs-backed iSCSI disks vs. SAN iSCSI. If you don't let the client (initiator) manage redundancy, then there is no way for it to recover if there is a network, memory, or other error.

-B
 -- Brandon High : bhigh at freaks.com
> From: Brandon High [mailto:bhigh at freaks.com]
>
> On Thu, Jan 6, 2011 at 5:33 AM, Edward Ned Harvey
> <opensolarisisdeadlongliveopensolaris at nedharvey.com> wrote:
> > But the conclusion remains the same: Redundancy is not needed at the
> > client, because any data corruption the client could possibly see from the
> > server would be transient and self-correcting.
>
> Weren't you just chastising someone else for not using redundancy over
> iSCSI?

I wouldn't say chastising... But yes. And that was different. The difference is whether or not the iSCSI target is using ZFS.

If the iSCSI target is a typical SAN made of typical hardware RAID, then there is no checksumming happening at the per-disk, per-block level, and the RAID redundancy only protects against hardware-detected complete disk failure. Any data corruption undetected by the hardware is uncorrectable by software in that case.

The situation is much better when your iSCSI target is in fact a ZFS server, because if there's a checksum error on a disk, it's detected and correctable by ZFS. So the iSCSI initiator will not see any corrupt data. The point that I keep emphasizing is: let ZFS manage your RAID. No hardware RAID.

As mentioned, sure, there's always the possibility of an error being introduced in the network between initiator and target, but ultimately the nonvolatile storage is disk, which has good data. So the possibility of transient network errors, at least for me, is much less risky than the possibility of undetected errors on disk.