Hi all,

I have built out an 8TB SAN at home using OpenSolaris + ZFS. I have yet to put it into 'production' as a lot of the issues raised on this mailing list are putting me off trusting my data to the platform right now.

Over the years I have stored my personal data on NetWare and now NT, and that solution has been 100% reliable for the last 12 years. Never a single problem (nor have I had any issues with NTFS on the tens of thousands of spindles I've worked with over the years).

I appreciate that 99% of the time people only comment if they have a problem, which is why I think it'd be nice if some people who have successfully implemented ZFS, including making use of the various features (recovery, replacing disks, etc.), could just reply to this post with a sentence or paragraph detailing how great it is for them. I'm not necessarily interested in very small implementations of one or two disks that haven't changed config since the day they were installed, but rather in setups that are 'organic' and have changed/been administered over time (to show the functionality of the tools, the resilience of the platform, etc.)..

.. Of course, I guess a lot of people who may have never had a problem wouldn't even be signed up to this list! :-)

Thanks!
On Mon, Oct 20, 2008 at 03:10, gm_sjo <saqmaster at gmail.com> wrote:
> I appreciate that 99% of the time people only comment if they have a
> problem, which is why I think it'd be nice if some people who have
> successfully implemented ZFS, including making use of the various
> features (recovery, replacing disks, etc.), could just reply to this
> post with a sentence or paragraph detailing how great it is for them.

My initial test of ZFS was with a few IDE disks which I had found flaky on other platforms (md5 mismatches, that kind of thing). I put them all in a non-redundant pool and loaded some data onto it. Then I let it sit in the corner serving NFS for a couple of weeks, scrubbed the pool every once in a while, and watched the error counters. It confirmed what I'd seen: these disks gave off errors spontaneously. This was a good start: it was the first time I'd seen a storage stack that had the audacity to complain about problems with its hardware.

So I upgraded and put in "known-good" disks. I started with a mirrored pair of 750GB drives, then added another pair, then added a pair of log disks. At each step, things moved smoothly and speed increased.

I've also helped my brother set up a Solaris/ZFS system, on a somewhat larger scale but with a more static configuration. He started with Linux, md raid, and XFS, using RAID 5 on eight 320GB disks and a Supermicro AOC-SAT2-MV8 Marvell controller. Unfortunately, he lost essentially the entire array due to corruption in some layer of the stack, so I suggested ZFS as an alternative. This was around build 67 of Nevada. He put his eight disks in a raidz pool. About a year ago, he bought six 500GB disks and another Marvell controller, made a new raidz vdev (in a new pool) out of them, and added six of the 320GB disks as another vdev. A month or so ago, he bought six 1TB disks, made a new pool out of them, and moved all his data over to it.

At each step of the way, he upgraded to solve a problem. Moving from Linux to Solaris was because Solaris had better drivers for the Marvell-based card. Adding the 500GB disks was because he was out of space, and the reason we didn't just add another vdev to the existing pool is that his case only has room for 13 disks. Finally, the 320GB disks have started returning checksum errors, so he wanted to get them out of the pool. The system as a whole has been very reliable, but due to some ZFS limitations (no vdev removal, no stripe-width changes) a new pool has been needed at each stage.

My experiences with ZFS at home have been very positive, but I also use it at work. I'm concerned about the speed of "zfs send" and about being able to remove vdevs before I will recommend it unreservedly for work purposes, but despite these issues I have a couple of pools in production: one serving mail, one serving user home directories, and one serving data for research groups. We have had no problems with these pools, but I keep an eye on the backup logs for them. I hope that eventually such careful watching will not be necessary.

Will
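For anyone wanting to try the same grow-a-mirrored-pool-and-scrub routine Will describes, it maps onto roughly the following commands; the pool name "tank" and the device names are invented for illustration:

    # start with one mirrored pair, then grow the pool over time
    zpool create tank mirror c1t0d0 c1t1d0      # first pair of 750GB drives
    zpool add tank mirror c1t2d0 c1t3d0         # add a second mirrored pair
    zpool add tank log mirror c2t0d0 c2t1d0     # add a mirrored pair of log (slog) devices

    # periodic scrub, then inspect the per-device error counters
    zpool scrub tank
    zpool status -v tank

The READ/WRITE/CKSUM columns in the "zpool status" output are the error counters referred to above.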
We have 135 TB of capacity with about 75 TB in use on ZFS-based storage. Our ZFS use started about two years ago and has grown from there. This spans 9 SAN appliances with 5 "head nodes", plus 2 more recent servers running ZFS on JBOD with vdevs made up of raidz2.

So far the experience has been very positive. We have never lost a bit of data. We scrub weekly, and I've started sleeping better at night. I have also read the horror stories, but we aren't seeing them here. We did have some performance issues, especially involving the SAN storage on the more heavily used systems, but enabling the cache on the SAN devices without pushing fsync through to disk basically fixed that. Your ZFS layout can profoundly affect performance, which is a downside. It's best to test your setup under an approximately realistic workload, to balance capacity against performance, before deploying.

BTW, most of our ZFS deployment is on Solaris 10 {u4, u5}, but two large servers are on OpenSolaris snv_86. The OpenSolaris servers seem to be considerably faster and more feature-rich, with no reliability issues so far.

Jon

gm_sjo wrote:
> I appreciate that 99% of the time people only comment if they have a
> problem, which is why I think it'd be nice if some people who have
> successfully implemented ZFS, including making use of the various
> features (recovery, replacing disks, etc.), could just reply to this
> post with a sentence or paragraph detailing how great it is for them.

--
Jonathan Loran
IT Manager
Space Sciences Laboratory, UC Berkeley
(510) 643-5146   jloran at ssl.berkeley.edu
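Jon doesn't say how his weekly scrubs are driven, but a cron entry is one simple way to do it, and a JBOD pool built from raidz2 vdevs like the ones he mentions is a one-liner to create; the pool and device names below are placeholders:

    # a pool built from two 6-disk raidz2 vdevs (device names invented)
    zpool create tank \
        raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
        raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0

    # root crontab entry: scrub the pool every Sunday at 03:00
    0 3 * * 0 /usr/sbin/zpool scrub tank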
About two years ago I used to run snv_55b with a raidz on top of five 500GB SATA drives. After 10 months I ran out of space and added a mirror of two 250GB drives to my pool with "zpool add". No problem. I scrubbed it weekly. I only ever saw one CKSUM error (ZFS self-healed itself automatically, of course). I never had any problem with that server.

After running out of space again, I replaced it with a new system running snv_82, configured with a raidz on top of seven 750GB drives. To burn in the machine, I wrote a Python script that read random sectors from the drives. I let it run for 48 hours to subject each disk to 10+ million I/O operations. After it passed this test, I created the pool and ran some more scripts to create/delete files on it continuously. To test disk failures (and SATA hotplug), I disconnected and reconnected a drive at random while the scripts were running. The system was always able to redetect the drive immediately after it was plugged back in (you need "set sata:sata_auto_online=1" for this to work). Depending on how long the drive had been disconnected, I either needed to do a "zpool replace" or nothing at all for the system to re-add the disk to the pool and initiate a resilver. After these tests, I trusted the system enough to move all my data to it, so I rsync'd everything and double-checked it with MD5 sums.

I have another ZFS server, at work, on which one disk one day started acting weirdly (timeouts). I physically replaced it and ran "zpool replace". The resilver completed successfully. On this server, we have seen 2 CKSUM errors over the last 18 months or so. We read about 3 TB of data every day from it (daily rsync), which amounts to about 1.5 PB over 18 months. I guess 2 silent data corruptions while reading that quantity of data is about the expected error rate of modern SATA drives. (Again, ZFS self-healed itself, so this was completely transparent to us.)

-marc
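For reference, the hotplug tunable Marc quotes is an /etc/system setting, and the replace-and-resilver step is a single command; the pool name and device name below are placeholders:

    # /etc/system (takes effect after a reboot): bring hot-plugged SATA
    # drives online automatically
    set sata:sata_auto_online=1

    # after swapping a failed drive, tell ZFS to replace it in place,
    # then watch the resilver progress
    zpool replace tank c1t4d0
    zpool status tank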
Hello Marc,

Tuesday, October 21, 2008, 8:14:17 AM, you wrote:

MB> [...] On this server, we have seen 2 CKSUM errors
MB> over the last 18 months or so. We read about 3 TB of data every day from it
MB> (daily rsync), which amounts to about 1.5 PB over 18 months. I guess 2 silent
MB> data corruptions while reading that quantity of data is about the expected
MB> error rate of modern SATA drives. (Again, ZFS self-healed itself, so this was
MB> completely transparent to us.)

Which means that, thanks to ZFS, you haven't actually experienced silent data corruption. :)

--
Best regards,
Robert                          mailto:milek at task.gda.pl
                                http://milek.blogspot.com