With the introduction of ZFS in FreeBSD 7.0, a door has opened for more
mirroring options, so I would like to get some opinions on what direction
I should take for the following scenario.

Basically I have two machines that are "clones" of each other (master and
slave), one of which will be serving up Samba shares.  Each server has one
disk to hold the OS (not mirrored) and then three data disks, each of
which will be its own mountpoint and Samba share.  The idea is to mirror
each of these disks onto the slave machine so that, in the event the
master goes down, the slave can pick up serving the Samba shares (I am
using CARP for the Samba server IP address).

My initial thought was to set the slave up as an iSCSI target, have the
master connect to each drive, and then create a gmirror or zpool mirror
using local_data1:iscsi_data1, local_data2:iscsi_data2, and
local_data3:iscsi_data3.  After some feedback (from P. French, for
example) it would appear that iSCSI may not be the way to go here: it
locks up when the target goes down, and even though I may be able to
remove the target from the mirror, that process may fail because the
"disk" remains in "D" state.

So that leaves me with the following options:

1) ggated/ggatec + gmirror
2) ggated/ggatec + ZFS (zpool mirror)
3) ZFS send/recv of incremental snapshots (over ssh)

1) I have been using ggated/ggatec on a set of 6.2-RELEASE boxes and find
that ggated tends to fail after some time, leaving me rebuilding the
mirror periodically (and gmirror resilvering takes quite some time).  Have
ggated/ggatec performance and stability improved in 7.0?  This combination
does work, but it is high maintenance, and automating it is a bit painful
(in terms of re-establishing the gmirror, rebuilding, and making sure the
master machine is the one being read from).

2) Noting the issues with ggated/ggatec in (1), would a zpool be better at
rebuilding the mirror?  I understand that it can determine which drive of
the mirror is out of sync better than gmirror can, so a lot of the
"insert"/"rebuild" manipulations needed with gmirror should not be needed
here.

3) The send/recv feature of ZFS was something I had not even considered
until very recently.  My understanding is that this would work by a)
taking a snapshot of master_data1, b) sending that snapshot with zfs send,
c) receiving it on slave_data1 via a pipe over ssh, and then d) doing
incremental snapshots, sending and receiving them as in (a)-(c).  How
time/CPU intensive is snapshot generation, and just how granular could
this be done?  I would imagine this could be practical for systems with
little traffic/changes, but what about systems that may see a lot of files
added, modified, and deleted on the filesystem(s)?

I would be interested to hear anyone's experience with any (or all) of
these methods and the caveats of each.  I am leaning towards ggate(d/c) +
zpool at the moment, assuming that ZFS can "smartly" rebuild the mirror
after the slave's ggated processes bug out.
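For concreteness, what I have in mind for option (2) is roughly the
following; the addresses, device names and pool names are only
placeholders, and I have not tested this exact sequence:

    # on the slave: export the three data disks via geom_gate
    # (/etc/gg.exports; 192.168.10.1 is the master)
    192.168.10.1 RW /dev/da1
    192.168.10.1 RW /dev/da2
    192.168.10.1 RW /dev/da3

    slave# ggated

    # on the master: attach the slave's disks (ggatec prints the
    # device node it creates, e.g. ggate0, ggate1, ggate2)
    master# ggatec create -o rw 192.168.10.2 /dev/da1
    master# ggatec create -o rw 192.168.10.2 /dev/da2
    master# ggatec create -o rw 192.168.10.2 /dev/da3

    # mirror each local disk against its remote counterpart
    master# zpool create data1 mirror /dev/da1 /dev/ggate0
    master# zpool create data2 mirror /dev/da2 /dev/ggate1
    master# zpool create data3 mirror /dev/da3 /dev/ggate2

The hope is that when ggated bugs out, re-attaching the device (ggatec
rescue, or destroy/create) followed by "zpool online data1 ggate0" is
enough, and that ZFS resilvers only the blocks written while the remote
half was missing rather than re-copying the whole disk.

Sven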
On Tue, Jul 15, 2008 at 10:07:14AM -0400, Sven Willenberger wrote:
> 3) The send/recv feature of ZFS was something I had not even considered
> until very recently.  My understanding is that this would work by a)
> taking a snapshot of master_data1, b) sending that snapshot with zfs
> send, c) receiving it on slave_data1 via a pipe over ssh, and then d)
> doing incremental snapshots, sending and receiving them as in (a)-(c).
> How time/CPU intensive is snapshot generation, and just how granular
> could this be done?  I would imagine this could be practical for systems
> with little traffic/changes, but what about systems that may see a lot
> of files added, modified, and deleted on the filesystem(s)?

I can speak a bit about ZFS snapshots, because I've used them in the past
with good results.  Compared to UFS2 snapshots (e.g. dump -L or
mksnap_ffs), ZFS snapshots are fantastic.  The two main positives for me
were:

1) ZFS snapshots take significantly less time to create; I'm talking
seconds or minutes vs. 30-45 minutes.  I also remember receiving mail from
someone (on -hackers?  I can't remember -- let me know and I can dig
through my mail archives for the specific mail/details) stating something
along the lines of "over time, yes, UFS2 snapshots take longer and longer,
it's a known design problem".

2) ZFS snapshots, when created, do not cause the system to more or less
deadlock until the snapshot is generated; you can continue to use the
system while the snapshot is being taken.  With UFS2, dump -L and
mksnap_ffs will surely disappoint you.

We moved all of our production systems off of using dump/restore solely
because of these aspects.  We didn't move to ZFS though; we went with
rsync, which is great, except for the fact that it modifies file atimes
(hope you use Maildir and not classic mbox/mail spools...).

ZFS's send/recv capability (over a network) is something I didn't have
time to experiment with, but it looked *very* promising.  The method is
documented in the manpage as "Example 12", and is very simple -- as it
should be; there is a rough sketch at the end of this mail.  You don't
have to use SSH either, by the way[1].

One of the "annoyances" of ZFS snapshots, however, was that I had to write
my own script to do snapshot rotations (think incremental dump(8) but
using ZFS snapshots).

> I would be interested to hear anyone's experience with any (or all) of
> these methods and the caveats of each.  I am leaning towards ggate(d/c)
> + zpool at the moment, assuming that ZFS can "smartly" rebuild the
> mirror after the slave's ggated processes bug out.

I don't have any experience with GEOM gate, so I can't comment on it.  But
I would highly recommend you discuss the shortcomings with pjd@, because
he definitely listens.

However, I must ask you this: why are you doing things the way you are?
Why are you using the equivalent of RAID 1, but for entire computers?  Is
there some reason you aren't using a filer (e.g. NetApp) for your data,
thus keeping it centralised?  There has been recent discussion of using
FreeBSD with ZFS in that role over on freebsd-fs; if you want a link to
the thread, I can point you to it.

I'd like to know why you're doing things the way you are.  By knowing why,
possibly myself or others could recommend solving the problem in a
different way -- one that doesn't involve realtime duplication of
filesystems over the network.
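For what it's worth, the manpage example boils down to something like the
following; the pool/filesystem names are made up, and you'd want to script
and test this yourself before trusting it:

    # one-time full replication of the filesystem
    zfs snapshot tank/data1@base
    zfs send tank/data1@base | ssh slave zfs receive tank/data1

    # afterwards, only the changes since the previous snapshot cross
    # the wire
    zfs snapshot tank/data1@2008-07-15
    zfs send -i base tank/data1@2008-07-15 | ssh slave zfs receive tank/data1

One catch: the receiving filesystem must stay unmodified between
incrementals or the receive fails ("zfs receive -F" can force a rollback
on the slave side), which is exactly the sort of thing a rotation script
ends up having to deal with.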
[1]: If you're transferring huge sums of data over a secure link (read: a
dedicated gigE LAN or a separate VLAN), you'll be disappointed to find
that there is no Cipher=none with stock SSH; the closest you'll get is
blowfish-cbc.  You might be saddened by the fact that the only way you'll
get Cipher=none is via the HPN patches, which means you'll be forced to
install ports/security/openssh-portable.  (I am not a fan of the
"overwrite the base system" concept; it's a hack, and I'd rather get rid
of the whole "base system" concept in general -- but that's for another
discussion.)  My point is, your overall network I/O will be limited by
SSH, so if you're pushing lots of data across a LAN, consider something
without encryption.
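On a dedicated link, "something without encryption" can be as simple as
piping the stream through nc(1); the hostname and port here are made up:

    # slave: listen and feed the stream into zfs receive
    slave# nc -l 8023 | zfs receive tank/data1

    # master: send the incremental stream straight over TCP
    master# zfs send -i base tank/data1@2008-07-15 | nc -w 30 slave 8023

No authentication or integrity protection there, obviously, so I'd only do
that on a dedicated link or VLAN.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |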
Sven Willenberger wrote:
> [...]
> 1) I have been using ggated/ggatec on a set of 6.2-RELEASE boxes and
> find that ggated tends to fail after some time, leaving me rebuilding
> the mirror periodically (and gmirror resilvering takes quite some time).
> Have ggated/ggatec performance and stability improved in 7.0?  This
> combination does work, but it is high maintenance, and automating it is
> a bit painful (in terms of re-establishing the gmirror, rebuilding, and
> making sure the master machine is the one being read from).

First, some problems in ggated/ggatec have been fixed between 6.2 and 6.3.
Second, you should tune it a little to improve performance and stability
(see the sketch at the end of this mail).  The following reply in an
earlier thread is interesting:

http://lists.freebsd.org/pipermail/freebsd-stable/2008-January/039722.html

> 2) Noting the issues with ggated/ggatec in (1), would a zpool be better
> at rebuilding the mirror?  I understand that it can determine which
> drive of the mirror is out of sync better than gmirror can, so a lot of
> the "insert"/"rebuild" manipulations needed with gmirror should not be
> needed here.

I don't think there's much of a difference between gmirror and a ZFS
mirror if used with ggated/ggatec.  Of course, ZFS has more advantages,
like checksumming, snapshots etc., but also the disadvantage that it
requires considerably more memory.

Yet another way would be to use DragonFly's "Hammer" file system, which is
part of DragonFly BSD 2.0, to be released in a few days.  It supports
remote mirroring, i.e. the mirror source and mirror target can run on
different machines.  Of course it is still very new and experimental
(however, ZFS is marked experimental, too), so you probably don't want to
use it on critical production machines.  (YMMV, of course.)
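Regarding the tuning: the knobs are mainly the socket buffer sizes, queue
size and timeout that ggated(8) and ggatec(8) accept, plus
kern.ipc.maxsockbuf.  Roughly like this -- the values are only examples,
what works depends on your network:

    # allow larger socket buffers
    sysctl kern.ipc.maxsockbuf=2097152

    # on the slave: run ggated with bigger send/receive buffers
    ggated -S 1048576 -R 1048576

    # on the master: attach with matching buffers, a longer queue and
    # a longer timeout
    ggatec create -o rw -S 1048576 -R 1048576 -q 1024 -t 10 slave /dev/da1

The message linked above goes into more detail.

Best regards
Oliver

-- 
Oliver Fromme, secnetix GmbH & Co. KG, Marktplatz 29, 85567 Grafing b. M.
Handelsregister: Registergericht Muenchen, HRA 74606,  Geschäftsfuehrung:
secnetix Verwaltungsgesellsch. mbH, Handelsregister: Registergericht Mün-
chen, HRB 125758,  Geschäftsführer: Maik Bachmann, Olaf Erb, Ralf Gebhart

FreeBSD-Dienstleistungen, -Produkte und mehr:  http://www.secnetix.de/bsd

PI:  int f[9814],b,c=9814,g,i;long a=1e4,d,e,h;
main(){for(;b=c,c-=14;i=printf("%04d",e+d/a),e=d%a)
while(g=--b*2)d=h*b+a*(i?f[b]:a/5),h=d/--g,f[b]=d%g;}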
On Tue, Jul 15, 2008 at 07:54:26AM -0700, Jeremy Chadwick wrote:
> One of the "annoyances" of ZFS snapshots, however, was that I had to
> write my own script to do snapshot rotations (think incremental dump(8)
> but using ZFS snapshots).

There is a PR[1] to get something like this in the ports tree.  I have no
idea how good it is, but I hope to get it in the tree soon.

[1]: http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/125340

-- 
WXS
Wesley Shields wrote:
> On Tue, Jul 15, 2008 at 07:54:26AM -0700, Jeremy Chadwick wrote:
>> One of the "annoyances" of ZFS snapshots, however, was that I had to
>> write my own script to do snapshot rotations (think incremental dump(8)
>> but using ZFS snapshots).
>
> There is a PR[1] to get something like this in the ports tree.  I have
> no idea how good it is, but I hope to get it in the tree soon.
>
> [1]: http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/125340

There is also sysutils/freebsd-snapshot (pkg-descr is out of date; it
supports ZFS too).

I found it more convenient to just write my own tiny script.
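Something along these lines is really all it takes; the dataset name and
retention count below are just placeholders for illustration:

    #!/bin/sh
    # Take an "auto" snapshot of one dataset and keep the newest $KEEP.
    FS="tank/data1"     # dataset to snapshot (placeholder)
    KEEP=24             # how many snapshots to retain

    zfs snapshot "${FS}@auto-$(date +%Y%m%d-%H%M)"

    # list this dataset's auto snapshots newest-first and destroy
    # everything past the first $KEEP
    for snap in $(zfs list -H -t snapshot -o name | \
                  grep "^${FS}@auto-" | sort -r | tail -n +$((KEEP + 1))); do
        zfs destroy "$snap"
    done

Run it from cron however often you want snapshots taken.

Kris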
On Tue, Jul 15, 2008 at 07:10:05PM +0200, Kris Kennaway wrote:
> Wesley Shields wrote:
>> On Tue, Jul 15, 2008 at 07:54:26AM -0700, Jeremy Chadwick wrote:
>>> One of the "annoyances" of ZFS snapshots, however, was that I had to
>>> write my own script to do snapshot rotations (think incremental
>>> dump(8) but using ZFS snapshots).
>>
>> There is a PR[1] to get something like this in the ports tree.  I have
>> no idea how good it is, but I hope to get it in the tree soon.
>>
>> [1]: http://www.freebsd.org/cgi/query-pr.cgi?pr=ports/125340
>
> There is also sysutils/freebsd-snapshot (pkg-descr is out of date; it
> supports ZFS too).
>
> I found it more convenient to just write my own tiny script.

Thanks for pointing this out -- I had no idea such a port existed!

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |