Hi all,

I have built out an 8TB SAN at home using OpenSolaris + ZFS. I have yet to put it into 'production', as a lot of the issues raised on this mailing list are putting me off trusting my data to the platform right now.

Over the years I have stored my personal data on NetWare and now NT, and that solution has been 100% reliable for the last 12 years. Never a single problem (nor have I had any issues with NTFS across the tens of thousands of spindles I've worked with over the years).

I appreciate that 99% of the time people only comment when they have a problem, which is why I think it'd be nice if some people who have successfully implemented ZFS, including making use of its various features (recovery, replacing disks, etc.), could reply to this post with a sentence or paragraph describing how well it has worked for them. I'm not particularly interested in very small implementations of one or two disks that haven't changed config since the day they were installed, but rather in setups that are 'organic' and have changed and been administered over time (to show the functionality of the tools, the resilience of the platform, etc.)..

.. Of course, I guess a lot of people who have never had a problem wouldn't even be signed up to this list! :-)

Thanks!
On Mon, Oct 20, 2008 at 03:10, gm_sjo <saqmaster at gmail.com> wrote:
> I appreciate 99% of the time people only comment if they have a
> problem, which is why I think it'd be nice for some people who have
> successfully implemented ZFS, including making various use of the
> features (recovery, replacing disks, etc), could just reply to this
> post with a sentence or paragraph detailing how great it is for them.

My initial test of ZFS was with a few IDE disks which I had found flaky on other platforms (MD5 mismatches, that kind of thing). I put them all in a non-redundant pool and loaded some data onto it. Then I let it sit in the corner serving NFS for a couple of weeks, scrubbed the pool every once in a while, and watched the error counters. It confirmed what I'd seen: these disks gave off errors spontaneously. This was a good start: it was the first time I'd seen a storage stack that had the audacity to complain about problems with its hardware.

So I upgraded and put in "known-good" disks. I started with a mirrored pair of 750s, then added another pair, then added a pair of log disks. At each step, things moved smoothly and speed increased.

I've also helped my brother set up a Solaris/ZFS system, on a somewhat larger scale but with a more static configuration. He started with Linux, md raid, and XFS, using RAID 5 on eight 320GB disks and a Supermicro AOC-SAT2-MV8 Marvell controller. Unfortunately, he lost basically the entire array to corruption in some layer of the stack, so I suggested ZFS as an alternative. This was around build 67 of Nevada. He put his 8 disks in a raidz pool. About a year ago, he bought six 500GB disks and another Marvell controller, made a new raidz vdev (in a new pool) out of them, and added six of the 320GB disks in another vdev. A month or so ago, he bought six 1TB disks, made a new pool out of them, and moved all his data over to it.

At each step of the way, he upgraded to solve a problem. Moving from Linux to Solaris was because it had better drivers for the Marvell-based card. Adding the 500GB disks was because he was out of space, and the reason we didn't just add another vdev to the existing pool is that his case only has room for 13 disks. Finally, the 320GB disks have started returning checksum errors, so he wanted to get them out of the pool. The system as a whole has been very reliable, but due to some ZFS limitations (no vdev removal, no stripe-width changing) a new pool has been needed at each stage.

My experiences with ZFS at home have been very positive, but I also use it at work. I'm concerned about the speed of "zfs send" and about being able to remove vdevs before I will recommend it unilaterally for work purposes, but despite these issues I have a couple of pools in production: one serving mail, one serving user home directories, and one serving data for research groups. We have had no problems with these pools, but I keep an eye on the backup logs for them. I hope that eventually such careful watching will not be necessary.

Will
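For anyone who wants to see what that incremental growth looks like at the command line, a rough sketch follows (pool and device names here are made up for illustration, not my actual layout):

    # start with a mirrored pair, then grow the pool one vdev at a time
    zpool create tank mirror c1t0d0 c1t1d0
    zpool add tank mirror c1t2d0 c1t3d0
    # add a mirrored pair of dedicated log devices (slog)
    zpool add tank log mirror c2t0d0 c2t1d0
    # scrub periodically and watch the error counters
    zpool scrub tank
    zpool status -v tank

Each "zpool add" stripes another top-level vdev into the pool; since there is currently no way to remove one again, it pays to get the layout right up front, which is exactly why my brother ended up creating a fresh pool at each expansion.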
We have 135 TB of capacity, with about 75 TB in use, on ZFS-based
storage. Our ZFS use started about 2 years ago and has grown from there.
This spans 9 SAN appliances with 5 "head nodes", plus 2 more recent
servers running ZFS on JBOD with vdevs made up of raidz2.
So far, the experience has been very positive. Never lost a bit of
data. We scrub weekly, and I've started sleeping better at night. I
have also read the horror stories, but we aren't seeing them here.
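For what it's worth, the weekly scrub is just a root cron job along
these lines (the pool name "tank" is a placeholder, not our actual pool):

    # run a scrub every Sunday at 03:00; results show up in 'zpool status'
    0 3 * * 0 /usr/sbin/zpool scrub tank

zpool scrub returns immediately and runs in the background, so the
follow-up "zpool status" check is where any checksum errors actually
appear.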
We did have some performance issues, especially involving the SAN
storage on more heavily used systems, but enabling the cache on the SAN
devices without pushing fsync through to disk basically fixed that.
Your ZFS layout can profoundly affect performance, which is a downside.
It's best to test your setup under an approximately realistic workload,
to balance capacity against performance, before deploying.
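As a rough sketch of what that testing looks like (pool and device names
below are placeholders): build a candidate layout, drive it with
something resembling your real workload, and watch per-vdev behaviour,
then tear it down and try the next layout.

    # one candidate layout: a single 6-disk raidz2 vdev
    zpool create testpool raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0
    # ... generate a workload against /testpool here ...
    # watch per-vdev bandwidth and IOPS while it runs
    zpool iostat -v testpool 5
    # destroy and rebuild as mirrors or multiple raidz vdevs, then compare
    zpool destroy testpool

The same disks arranged as mirrors will usually give more random IOPS
but less usable capacity than raidz2, which is the capacity-versus-
performance trade-off I mean.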
BTW, most of our ZFS deployment is on Solaris 10 {u4, u5}, but two large
servers are on OpenSolaris snv_86. The OpenSolaris servers seem to be
considerably faster and more feature-rich, without any reliability
issues so far.
Jon
--
Jonathan Loran  -  IT Manager
Space Sciences Laboratory, UC Berkeley
(510) 643-5146   jloran at ssl.berkeley.edu
AST:7731^29u18e3
About 2 years ago I used to run snv_55b with a raidz on top of 5 500GB SATA drives. After 10 months I ran out of space and added a mirror of 2 250GB drives to my pool with "zpool add". No problems. I scrubbed it weekly. I only ever saw 1 CKSUM error (ZFS self-healed itself automatically, of course). Never had any problems with that server.

After running out of space again, I replaced it with a new system running snv_82, configured with a raidz on top of 7 750GB drives. To burn in the machine, I wrote a Python script that read random sectors from the drives. I let it run for 48 hours to subject each disk to 10+ million I/O operations. After it passed this test, I created the pool and ran some more scripts to create/delete files on it continuously. To test disk failures (and SATA hotplug), I disconnected and reconnected a drive at random while the scripts were running. The system was always able to redetect the drive immediately after it was plugged back in (you need "set sata:sata_auto_online=1" for this to work). Depending on how long the drive had been disconnected, I either needed to do a "zpool replace" or nothing at all for the system to re-add the disk to the pool and initiate a resilver. After these tests, I trusted the system enough to move all my data to it, so I rsync'd everything and double-checked it with MD5 sums.

I have another ZFS server, at work, on which one disk one day started acting weirdly (timeouts). I physically replaced it and ran "zpool replace". The resilver completed successfully. On this server, we have seen 2 CKSUM errors over the last 18 months or so. We read about 3 TB of data from it every day (daily rsync), which amounts to about 1.5 PB over 18 months. I guess 2 silent data corruptions while reading that quantity of data is about the expected error rate for modern SATA drives. (Again, ZFS self-healed itself, so this was completely transparent to us.)

-marc
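In case it saves anyone a search, the relevant bits look roughly like this (device and pool names are placeholders, not my actual setup). The auto-online tunable goes in /etc/system and takes effect after a reboot; a physically swapped disk is brought back with "zpool replace":

    # /etc/system: re-online SATA devices automatically when they reappear
    set sata:sata_auto_online=1

    # after physically replacing the failed disk, tell ZFS to resilver onto it
    zpool replace tank c2t4d0
    # watch resilver progress and the per-device error counters
    zpool status tank

If the same disk was only briefly disconnected and comes back clean, ZFS may simply resync it on its own, which is the "or nothing at all" case I mentioned.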
Hello Marc,
Tuesday, October 21, 2008, 8:14:17 AM, you wrote:
MB> [...] On this server, we have seen 2 CKSUM errors over the last 18 months
MB> or so. We read about 3 TB of data every day from it (daily rsync), that
MB> amounts to about 1.5 PB over 18 months. I guess 2 silent data corruptions
MB> while reading that quantity of data is about the expected error rate of
MB> modern SATA drives. (Again ZFS self-healed itself, so this was completely
MB> transparent to us.)
Which means you haven't experienced silent data corruption, thanks to
ZFS. :)
--
Best regards,
Robert mailto:milek at task.gda.pl
http://milek.blogspot.com