I currently use a Linux server with 8 disks (approx. 3 TB) as an NFS/Samba/netatalk fileserver. I have a matching Linux server as a backup, with nightly rsync jobs keeping the backup current. I use LVM, so replacing a disk is doable, but kind of a pain. I also really wanted to use encrypted disks, but found that encryption reduced even NFS speeds on my hardware.

The idea I have now is to use my Linux machines to encrypt the disks and export them as iSCSI LUNs, then use a third server running OpenSolaris to create a ZFS pool from those LUNs. Assuming the iSCSI traffic is on a separate network, this seems like it should accomplish my goals: making it easier to add disks, and ensuring everything on disk is always encrypted.

Does this plan make sense? Any recommendations on how best to use the disks I have to ensure a safe backup strategy?
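To be concrete, here is roughly what I have in mind. All device names, addresses, and the IQN below are placeholders, and I'm assuming the iSCSI Enterprise Target (IET) on the Linux side; other targets use a different config syntax.

    # On each Linux storage server: encrypt the raw disk with dm-crypt/LUKS
    # (/dev/sdb and the mapping name "crypt0" are placeholders)
    cryptsetup luksFormat /dev/sdb
    cryptsetup luksOpen /dev/sdb crypt0

    # /etc/ietd.conf entry exporting the decrypted mapping as a LUN
    # (the IQN is made up); then restart ietd to pick it up:
    #   Target iqn.2008-09.lan.example:crypt0
    #       Lun 0 Path=/dev/mapper/crypt0,Type=blockio

    # On the OpenSolaris box: discover the targets and build the pool
    iscsiadm add discovery-address 192.168.10.11:3260
    iscsiadm modify discovery --sendtargets enable
    devfsadm -i iscsi                        # create device nodes for the LUNs
    zpool create tank c2t0d0 c2t1d0 c2t2d0   # actual names come from 'format'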
>>>>> "mp" == Matthew Plumb <solaris at reality-based.com> writes:mp> how best to use the disk I have to ensure I have a safe backup mp> strategy? continue using rsync between a ZFS pool and an LVM2 pool. At the very least, have two ZFS pools. For ZFS over iSCSI, have some zpool-layer redundancy because ZFS seems to be far more vulnerable to corruption if the redundancy is below the iSCSI layer, especially when the iSCSI targets reboot and ZFS does not. I get some strange livelock-ish behavior with heavily-loaded Linux IET targets, so set up something that you can test, but something you can back out of if, after a month or two, you find it''s not stable. let me know how it goes. I want to try dm_crypt under iSCSI, too, as soon as my VIA board finally arrives. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080907/1f937808/attachment.bin>
I was looking into something like that last year, mirroring two iSCSI
drives using ZFS.  The only real problem I found was that ZFS hangs the
pool for 3 minutes if an iSCSI device gets disconnected; unfortunately
ZFS waits for the iSCSI timeout before it realises something has
happened.  After the 3 minutes it did offline the device and carry on
working with the remaining one.  So if you don't mind a 3-minute wait
when something goes wrong, I think it will work fine.

Also, since you can add mirrors at any stage with zpool attach, you can
create the ZFS pool on your backup server, transfer the data over from
your live machine, and once it's working, reformat your live machine as
an iSCSI volume and attach it to the pool.  A rough sketch of that
migration is below.

I think the idea of doing this as separate disks is a good one if you
want to add disks later.  Just bear in mind that you won't be able to
have any kind of RAID on the individual servers; your only protection
will be the mirroring between the devices.

Exporting them as one huge iSCSI volume is good if you're paranoid
about data loss.  You can use RAID5 or 6 on the Linux servers, and then
mirror those large volumes with ZFS.  The downside is that it's much
harder to add storage.  I don't know if iSCSI volumes can be expanded,
so you might have to break the mirror, create a larger iSCSI volume and
resync all your data with that approach.
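Roughly like this (the pool name and the c2t0d0/c3t0d0 device names are
made up):

    # start with a single-device pool backed by the backup server's LUN
    zpool create tank c2t0d0

    # copy the data across from the live machine (paths are placeholders)
    rsync -aH livehost:/export/data/ /tank/data/

    # once the live machine has been reformatted and exported as an
    # iSCSI volume, attach its LUN to turn the pool into a mirror
    zpool attach tank c2t0d0 c3t0d0

    # watch the resilver finish before trusting the mirror
    zpool status tank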
> Exporting them as one huge iSCSI volume is good if you're paranoid
> about data loss.  You can use RAID5 or 6 on the Linux servers, and
> then mirror those large volumes with ZFS.  The downside is that it's
> much harder to add storage.  I don't know if iSCSI volumes can be
> expanded, so you might have to break the mirror, create a larger
> iSCSI volume and resync all your data with that approach.

Just be careful with respect to write barriers.  The software RAID in
Linux does not support them with raid5/raid6, so you lose the
correctness aspect of ZFS that you otherwise get even without hardware
RAID controllers.

(Speaking of this, can someone speak to the general state of affairs
with iSCSI with respect to write barriers?  I assume Solaris does it
correctly; what about the BSD/Linux stuff?  Can one trust that the
iSCSI targets correctly implement cache flushing/write barriers?)

-- 
/ Peter Schuller <peter.schuller at infidyne.com>
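One crude way to see whether a given Linux stack even claims barrier
support is to put a barrier-aware filesystem on it and watch the kernel
log.  A sketch, with placeholder device names; the exact messages
depend on the kernel version:

    # build a raid5 md device from three placeholder disks
    mdadm --create /dev/md0 --level=5 --raid-devices=3 \
        /dev/sdb /dev/sdc /dev/sdd

    # put ext3 on it and ask for barriers explicitly
    mkfs.ext3 /dev/md0
    mkdir -p /mnt/test
    mount -o barrier=1 /dev/md0 /mnt/test

    # force some journal commits, then see whether barriers got dropped;
    # barrier-less stacks typically log something like
    # "barrier-based sync failed ... disabling barriers"
    dd if=/dev/zero of=/mnt/test/junk bs=1M count=100 conv=fsync
    dmesg | grep -i barrier

This only tells you whether the block layer rejects barrier requests
outright; it says nothing about whether an iSCSI target actually
honours the cache flushes it acknowledges, which is the part I am
really asking about.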
>>>>> "ps" == Peter Schuller <peter.schuller at infidyne.com> writes:ps> The software raid in Linux does not support [write barriers] ps> with raid5/raid6, yeah i read this warning also and think it''s a good argument for not using it. http://lwn.net/Articles/283161/ With RAID5 or RAID6 there is of course the write hole. But the way I read it, just making soft partitions or mirrors with LVM2 breaks write barriers too. From the comments: -----8<----- Q. is there any work going on towards making the barriers work on lvm volumes? A. Yes, but only single disk DM targets (e.g. linear), see: http://lkml.org/lkml/2008/2/15/125 Unfortunately, this patch hasn''t been pushed upstream and the DM maintainer (agk) hasn''t really commented on when it might. -----8<----- The downside is that if you do raidz2 above iscsi, I think iSCSI makes one TCP circuit for each target, so the congestion avoidance will work less well. maybe RED on the switch can help, and probably needs lots [more than i have done] performance testing / comparison. ps> iSCSI with respect to write barriers? +1. Does anyone even know of a good way to actually test it? So far it seems the only way to know if your OS is breaking write barriers is to trade gossip and guess. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 304 bytes Desc: not available URL: <http://mail.opensolaris.org/pipermail/zfs-discuss/attachments/20080908/3c336bb8/attachment.bin>
On Mon, Sep 8, 2008 at 8:35 PM, Miles Nordin <carton at ivy.net> wrote:

>     ps> iSCSI with respect to write barriers?
>
> +1.
>
> Does anyone even know of a good way to actually test it?  So far it
> seems the only way to know if your OS is breaking write barriers is
> to trade gossip and guess.

Write a program that writes backwards (every other block, to avoid
write merges) with and without O_DSYNC, and measure the speed.

I think you can also deduce driver and drive cache-flush correctness by
calculating the best theoretical correct speed (which should be really
slow, one write per disc spin).

This has been on my TODO list for ages.. :(
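In the meantime, a rough shell approximation of that test (not the real
program; it only exercises O_DSYNC via dd, and /dev/sdX is a
placeholder for a scratch device you can safely overwrite, e.g. the
iSCSI-backed disk under test):

    #!/bin/bash
    # Write every other 4 KB block backwards over a small region, once
    # with O_DSYNC (dd oflag=dsync) and once without, and compare the
    # timings.  WARNING: this overwrites data on $DEV.
    DEV=/dev/sdX          # placeholder scratch device

    backwards_writes() {
        flags=$1
        blk=2000
        while [ $blk -gt 0 ]; do
            dd if=/dev/zero of=$DEV bs=4k seek=$blk count=1 \
               conv=notrunc $flags 2>/dev/null
            blk=$((blk - 2))   # skip a block each pass to defeat merging
        done
    }

    echo "with O_DSYNC:";    time backwards_writes oflag=dsync
    echo "without O_DSYNC:"; time backwards_writes ""

If the stack is honest, the O_DSYNC run should be limited to roughly
one write per platter revolution (about 120/s on a 7200 rpm disk); if
it finishes nearly as fast as the unsynced run, something between dd
and the platters is probably swallowing the flushes.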
Tuomas Leikola wrote:
> On Mon, Sep 8, 2008 at 8:35 PM, Miles Nordin <carton at ivy.net> wrote:
>>     ps> iSCSI with respect to write barriers?
>>
>> +1.
>>
>> Does anyone even know of a good way to actually test it?  So far it
>> seems the only way to know if your OS is breaking write barriers is
>> to trade gossip and guess.
>
> Write a program that writes backwards (every other block to avoid
> write merges) with and without O_DSYNC, measure speed.
>
> I think you can also deduce driver and drive cache flush correctness
> by calculating the best theoretical correct speed (which should be
> really slow, one write per disc spin)
>
> this has been on my TODO list for ages.. :(

Does the perl script at http://brad.livejournal.com/2116715.html do
what you want?

-- 
James Andrewartha