Hi,

I've been pinged by two people already that it's possible that the
problems listed at:

http://wiki.freebsd.org/ZFSKnownProblems

have been fixed in 7-STABLE, and that the page needs to be significantly
revised to sound less scary. Since I unfortunately don't have any ZFS
systems in production any more, I'd like to ask for experiences with ZFS
from other people. Specifically:

* Are the issues on the list still there?
* Are there any new issues?
* Is somebody running ZFS in production (non-trivial loads) with
  success? What architecture / RAM / load / applications used?
* How is your memory load? (does it leave enough memory for other services)

Please also note whether you are using the "new" ZFS port (in 8-CURRENT)
or the "old" one (in 7-STABLE).

If progress has been as great as suggested, the page's text could be
replaced with a note saying so, and maybe the file system could even be
promoted from "experimental".

Suggestions for the following page are also welcome:

http://wiki.freebsd.org/ZFSTuningGuide
Ivan Voras wrote:
> * Are the issues on the list still there?
> * Are there any new issues?
> * Is somebody running ZFS in production (non-trivial loads) with
>   success? What architecture / RAM / load / applications used?
> * How is your memory load? (does it leave enough memory for other services)

also: what configuration (RAIDZ, mirror, etc.?)
My account:

amd64 stable/7 system
4GB RAM
zero tuning
3-way mirrored zpool with individual dev size about 400G
moderate load
sufficient remaining RAM (still plenty)
zero troubles (system age is 2 months)

-- 
Andriy Gapon
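[For reference, a 3-way mirrored pool like the one described above is
created along these lines; the device names are illustrative, not from
Andriy's actual setup:

# zpool create tank mirror ad4 ad6 ad8
# zpool status tank

Listing three devices after "mirror" gives a single 3-way mirror vdev.]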
Ivan Voras wrote:
> Hi,
>
> I've been pinged by two people already that it's possible that the
> problems listed at:
>
> http://wiki.freebsd.org/ZFSKnownProblems
>
> have been fixed in 7-STABLE, and that the page needs to be significantly
> revised to sound less scary. Since I unfortunately don't have any ZFS
> systems in production any more, I'd like to ask for experiences with ZFS
> from other people. Specifically:
>
> * Are the issues on the list still there?

1. After tuning, my i386 system was stable; I know of at least one other
person who still has issues with kernel panics on i386, though.
2-5. I've never run into any of the other issues on my systems.

> * Are there any new issues?

Not that I've seen.

> * Is somebody running ZFS in production (non-trivial loads) with
>   success? What architecture / RAM / load / applications used?
> * How is your memory load? (does it leave enough memory for other services)

Memory load is pretty heavy. IIUC, memory used by ZFS is reported as Wired
in top, and a relatively lightly loaded system with 2GB of RAM has 1225MB
wired here.

> Please also note whether you are using the "new" ZFS port (in 8-CURRENT)
> or the "old" one (in 7-STABLE).

I'm using 7-STABLE.

> If progress has been as great as suggested, the page's text could be
> replaced with a note saying so, and maybe the file system could even be
> promoted from "experimental".

I think a good chunk of the "experimental" tag is due to the lack of
maintainers for ZFS. As far as I know, PJD is still the only one who has
thorough knowledge of ZFS on FreeBSD. At one point he stated he doesn't
want to remove the experimental tag until there is at least one other
person who knows the system well. I've not seen anything on this for a
while, though, so my information could be out of date.

Jonathan
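[A rough way to see the wired figure outside of top -- an illustrative
check, not something from Jonathan's message:

# sysctl vm.stats.vm.v_wire_count hw.pagesize

Multiplying the two values gives the wired total in bytes; 1225MB
corresponds to roughly 313,600 pages of 4096 bytes each.]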
Hi,

I used geli-encrypted ZFS, including root, on my IBM Thinkpad T31 with 1GB
RAM and a 160GB HDD (i386 7-STABLE), with swap on a dedicated slice. Some
ZFS filesystems used compression (/usr/ports and /usr/src, for example).

I encountered several crashes, especially under heavy load, such as
compiling big ports. I tried some of the "workarounds" listed on the page,
which decreased performance to a level where watching HD videos wasn't
possible anymore. Under heavy load the system was unable to perform as
expected, leading to side effects like a stuttering mouse. All this
disappeared once I switched back to classic UFS.

My home server is an amd64 7-STABLE box with 4GB RAM and 4x 400GB HDDs
with geli encryption (AES 256). This works like a charm.

So my take on ZFS is that it is a no-go on i386, but a stable solution on
amd64.

Regards
Christian Walther
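[For context, turning compression on for individual datasets as described
above is a one-liner per filesystem; the dataset names here are
illustrative, not from Christian's setup:

# zfs set compression=on tank/usr/ports
# zfs set compression=on tank/usr/src
# zfs get compression,compressratio tank/usr/ports

The compressratio property shows how much space compression is actually
saving on a given dataset.]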
On April 8, 2009 5:30 am Ivan Voras wrote:
<snip>
> Specifically:
> * Are the issues on the list still there?
> * Are there any new issues?
> * Is somebody running ZFS in production (non-trivial loads) with
>   success? What architecture / RAM / load / applications used?
> * How is your memory load? (does it leave enough memory for other
>   services)
>
> Please also note whether you are using the "new" ZFS port (in 8-CURRENT)
> or the "old" one (in 7-STABLE).

I'm running the following three setups with ZFS:

Home file server
  generic P4 3.0 GHz system with 2 GB RAM
  2 GB USB stick for / and /usr
  3x 120 GB SATA HDs
  onboard Marvell gigabit NIC
  32-bit FreeBSD 7.1-RELEASE
  pool has a single 3-way raidz1 vdev

Work file server 1 & 2
  5U Chenbro case w/ 1350 Watt 4-way redundant PSU
  Tyan h2000M motherboard
  2x dual-core Opteron 2200-series CPUs at 2.8 GHz
  8 GB ECC DDR2-SDRAM
  2x 2 GB CompactFlash using gmirror for / and /usr (server 1)
  2x 2 GB USB sticks using gmirror for / and /usr (server 2)
  3Ware 9550SXU PCI-X RAID controller
  3Ware 9650SE PCIe RAID controller
  24x 500 GB Western Digital SATA HDs
  4-port Intel PRO/1000 gigabit NIC configured using lagg(4)
  64-bit FreeBSD 7.1-RELEASE
  pool on each server has 3 8-way raidz2 vdevs

On my home box, it took a little bit of tuning to get it stable. The
hardest part was finding the right settings for vm.kmem_size_max and
vfs.zfs.arc_max. After about a month of tweaking, twiddling, crashing, and
rebooting, I hit upon 1G for kmem and 256M for the ZFS ARC. Since then,
it's been rock-solid. This box runs KDE 4.2.2 and is used for watching
movies, downloading, office work, and sharing files out via Samba and NFS
to the rest of the house.

On the work servers, it took about 6 weeks to get the right settings in
loader.conf to make them stable. After much trial and error, we are using
1596M for kmem_size_max and 512M for zfs_arc_max. These boxes do remote
backups for ~90 Linux and FreeBSD boxes using rsync. The backup script
runs parallel rsync processes for each remote site, doing sequential
backups of each server at the site. We wait 250s before starting the next
site backup. It takes just under 5 hours to do incremental backups for all
90 sites. We get (according to MRTG) a sustained 80 MBytes/sec read/write
during the backups. It may be more, as we can't get the 64-bit disk
counters to work and have to poll the 32-bit counters every 60 secs.

During the trial-and-error period, we did have a lot of livelocks,
deadlocks, and kernel panics. Things have been very stable on both boxes
for the past two months. We don't run into any out-of-memory issues.

We use swap on a ZVOL for all the systems listed above. So far, that
hasn't been an issue (knock wood). :)

iSCSI support works nicely as well, using the net/iscsi-target port. We've
only done minor desktop-style testing using a Debian Linux initiator.

We haven't had any issues sharing ZFS filesystems via NFS either. We use a
couple of NFS shares for really old SCO boxes that refuse to install
rsync. Even when the full backup run is going and these boxes are copying
files via NFS, we haven't hit any lockups.

We run with vfs.zfs.prefetch_disable=1 and vfs.zfs.zil_disable=0 on all
systems.

We're really looking forward to FreeBSD 8 with the ZFS improvements,
especially the auto-tuning and the much higher kmem_max. We'd like to be
able to give ZFS 3-4 GB for the ARC.

We've also heavily modified /etc/sysctl.conf and upped a bunch of the
network-related sysctls. Doing so increased our SSH throughput from
~30 Mbits/sec across all connections to over 90 Mbits/sec per SSH
connection.

So far, we've been very impressed with ZFS support on FreeBSD. It makes it
really hard to use LVM on our Linux systems. :)

-- 
Freddie
fjwcash@gmail.com
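[For readers looking for a starting point, the work-server settings
described above translate into loader.conf lines roughly like the
following; this is a sketch built from the values quoted in this message,
not Freddie's actual file, and the 2 GB home box used 1G for kmem and 256M
for the ARC instead:

## ZFS tuning (8 GB RAM, FreeBSD 7.1-RELEASE amd64)
vm.kmem_size_max="1596M"
vfs.zfs.arc_max="512M"
vfs.zfs.prefetch_disable="1"
]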
Dear colleagues,

On Wed, 08/04/2009 at 14:30 +0200, Ivan Voras wrote:

I have the "old port" (RELENG_7) ZFS in production on two servers. Both of
them have Serial ATA HDDs with the following storage pool configurations:

1) mpt(4) SATA/SAS controller with three drives as RAIDZ;
2) aac(4) SATA/SAS RAID controller with six drives as two RAIDZs.

Both servers in question are Intel Xeons with FreeBSD/amd64 installed on
them; as such, tuning of kernel memory-related sysctls is unnecessary with
recent RELENG_7.

The servers have a relatively complex workload: they contain two to six
jails with Postfix/DrWeb/Amavis; a MySQL database (with a very intensive
workload, as it is used as a backend for a RADIUS server with quite
intensive authorisation/accounting); a web server (up to three on one of
the servers in question); and an Asterisk SoftPBX. The servers have
NFS-shared home directories between jails via localhost.

Conclusion: no problems detected in the past three months. ;)

-- 
Cheers, Andrew.
On Wed, 8 Apr 2009, Ivan Voras wrote:

IV> > * Are the issues on the list still there?
IV> > * Are there any new issues?
IV> > * Is somebody running ZFS in production (non-trivial loads) with
IV> >   success? What architecture / RAM / load / applications used?
IV> > * How is your memory load? (does it leave enough memory for other services)
IV> 
IV> also: what configuration (RAIDZ, mirror, etc.?)

Well, besides the very strange data corruption problem I reported
recently, I have no problems with ZFS in various combinations: my home
workstation has a mirror, my notebook has ZFS on a single disk, some
machines have raidz, and some have just a disk on hardware RAID
controllers (twa and arcmsr) -- both amd64 (mostly untuned, modulo a
kern.maxvnodes increase) and i386 (desktops; kmem_size/arc_max tuned,
usually to 640m/192m).

All of them are rather fresh RELENG_7.

-- 
Sincerely,
D.Marck                               [DM5020, MCK-RIPE, DM3-RIPN]
[ FreeBSD committer:                          marck@FreeBSD.org ]
------------------------------------------------------------------------
*** Dmitry Morozovsky --- D.Marck --- Wild Woozle --- marck@rinet.ru ***
------------------------------------------------------------------------
My home fileserver:

Started using ZFS around 7.0-BETA2, oct-nov 07.
The system is running 7-STABLE@amd64, updated once a month.
4GB RAM
No kmem tuning.
4x 320GB drives, switched to 5x 750GB during the summer.
No panics after switching from i386 to amd64.
Ivan Voras wrote:
> Ivan Voras wrote:
>
>> * Are the issues on the list still there?
>> * Are there any new issues?
>> * Is somebody running ZFS in production (non-trivial loads) with
>>   success? What architecture / RAM / load / applications used?
>> * How is your memory load? (does it leave enough memory for other services)
>
> also: what configuration (RAIDZ, mirror, etc.?)

I have two production servers with ZFS.

The first is an HP ProLiant ML110 G5 with 4x 1TB SATA Samsung drives in
RAIDZ + 5GB RAM, running FreeBSD 7.1-RELEASE amd64 (GENERIC).
Root is on a USB flash drive with UFS; the main storage is on ZFS.

# zfs list
NAME                              USED   AVAIL  REFER  MOUNTPOINT
tank                              1.34T  1.33T  29.9K  /tank
tank/system                       1.04G  1.33T  28.4K  /tank/system
tank/system/tmp                   52.4K  1.33T  52.4K  /tmp
tank/system/usr                   611M   1.33T  95.0K  /tank/system/usr
tank/system/usr/obj               26.9K  1.33T  26.9K  /usr/obj
tank/system/usr/ports             443M   1.33T  214M   /usr/ports
tank/system/usr/ports/distfiles   105M   1.33T  105M   /usr/ports/distfiles
tank/system/usr/ports/packages    123M   1.33T  123M   /usr/ports/packages
tank/system/usr/src               168M   1.33T  168M   /usr/src
tank/system/var                   459M   1.33T  213K   /var
tank/system/var/db                421M   1.33T  420M   /var/db
tank/system/var/db/pkg            387K   1.33T  387K   /var/db/pkg
tank/system/var/log               37.8M  1.33T  37.8M  /var/log
tank/system/var/run               60.6K  1.33T  60.6K  /var/run
tank/vol0                         1.34T  1.33T  1.34T  /vol0

This server is storage for backups made every night by rsync from 10
FreeBSD machines. It takes about 1 hour every night to do an rsync backup
of all machines. Rsync is used for "snapshots" with --link-dest= (each day
has its own directory, all unchanged files are hardlinked to the previous
day, and I have a history of two months back). Backups are stored on /vol0
with compression enabled. (Compression is enabled on /usr/ports, /usr/src,
/var/db/pkg, and /vol0.)

# df -hi /vol0/
Filesystem   Size   Used   Avail  Capacity  iused     ifree    %iused  Mounted on
tank/vol0    2.7T   1.3T   1.3T     50%     17939375  11172232   62%   /vol0

This backup server has been in service since October 2008 with just one
panic (kmem related). After proper loader.conf tuning it is working well.

# cat /boot/loader.conf
## ZFS tuning
vm.kmem_size="1280M"
vm.kmem_size_max="1280M"
vfs.zfs.prefetch_disable="1"
vfs.zfs.arc_min="16M"
vfs.zfs.arc_max="128M"

up 80+04:52:10  23:33:40
28 processes:  1 running, 27 sleeping
CPU:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 17M Active, 26M Inact, 1541M Wired, 328K Cache, 204M Buf, 3234M Free

The second server with ZFS is used for jail hosting. It is built on a Sun
Fire X2100 M2 with 2x 500GB SATA drives + 4GB RAM, running FreeBSD
7.1-STABLE amd64 from Wed Feb 11 09:56:08 CET 2009 (GENERIC kernel).

The hard drives are split into two slices, where the first slice is used
for the gmirrored system (about 20GB) and the rest is used for a ZFS
mirror.

There are 5 jails running: one with Postfix as backup MX and BIND as slave
DNS for a little webhosting company, and a few jails for web developers
(not heavily loaded), but the last jail has 240GB of audio files for
streaming through Lighttpd at about 30Mbps. There are some issues with
Lighttpd in conjunction with ZFS - after about 30-60 minutes, Lighttpd
drops the speed to less than 7Mbps until it is restarted, then everything
works well again. The server has an uptime of 56 days without any panic or
other stability issues. Another box with Lighttpd + UFS (not in a jail) is
serving the same files at 65Mbps without problems. (The Lighttpd problem
may be related to the jail instead of ZFS - I have not tested it yet.)

All jails are on compressed filesystems and there are some snapshots of
each jail.

# cat /boot/loader.conf
## gmirror RAID1
geom_mirror_load="YES"
## ZFS tuning
vm.kmem_size="1280M"
vm.kmem_size_max="1280M"
kern.maxvnodes="400000"
vfs.zfs.prefetch_disable="1"
vfs.zfs.arc_min="16M"
vfs.zfs.arc_max="128M"

Miroslav Lachman
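[For anyone unfamiliar with the --link-dest technique mentioned above, one
nightly backup step looks roughly like this; paths and the hostname are
illustrative, not taken from Miroslav's script:

# rsync -aH --delete \
    --link-dest=/vol0/backup/host1/2009-04-07 \
    root@host1:/ /vol0/backup/host1/2009-04-08/

Files unchanged since the previous day are hardlinked into the new
directory instead of being copied again, which is what makes each daily
"snapshot" cheap in space but heavy on inodes and the name cache.]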
Hi,

in one production case (1), I haven't seen panics or deadlocks for a long
time, yet on another, much more powerful machine (2), I could not get rid
of "vm_thread_new: kstack allocation failed", ultimately rendering the
machine useless pretty fast. This was at least up to RELENG_7/November
(7.1-PRERELEASE), where I decided to stop the ZFS experiment for now and
went back to UFS. I'm trying to understand now whether 7.2 is worth a new
try, or whether, for that matter, the only reasonable option is to wait
until 8.0.

Perhaps worth noting: the kstack errors still occurred (albeit after more
time) with all zpools exported (and the system rebooted) but zfs.ko still
loaded. Only after rebooting without zfs_load="YES" did the server begin
to work seamlessly for months.

I'm asking myself if/how important the underlying driver/provider (mfi,
mpt, ad, ciss, etc.) is with regard to the remaining/recurring problems
with ZFS, since I've seen such different behaviors on different machines.

(1) Home-built Opteron / 2GB RAM / SATA ad / 7.1-PRERELEASE with the usual
    tuning; one zpool on a SATA mirror for backups via rsync of several
    servers

(2) Dell PE 1950, one quad-core Xeon / 8GB RAM / LSI mpt / 7.1-PRERELEASE
    with many tunings tried; one zpool on a partition on top of a HW
    RAID 1; moderately loaded mailserver box running Courier and MySQL

Regards,
Lorenzo
On Thu, 9 Apr 2009 22:35:06 +0200, Lorenzo Perone
<lopez.on.the.lists@yellowspace.net> wrote:
> I'm trying to understand now whether 7.2 is worth a new try, or whether,
> for that matter, the only reasonable option is to wait until 8.0.

The deadlock issues should be fixed with ZFS v13, which is only available
in CURRENT. AFAIK, RELENG_7 will most probably stick with v6, so the
problems occurring with 7.1 will most likely be the same with 7.2.

HTH
Jesco
On Apr 8, 2009, at 5:59 PM, Miroslav Lachman wrote:
> Rsync is used for "snapshots" with --link-dest= (each day has its own
> directory, all unchanged files are hardlinked to the previous day, and
> I have a history of two months back). Backups are stored on /vol0 with
> compression enabled. (Compression is enabled on /usr/ports, /usr/src,
> /var/db/pkg, and /vol0.)
>
> # df -hi /vol0/
> Filesystem   Size   Used   Avail  Capacity  iused     ifree    %iused  Mounted on
> tank/vol0    2.7T   1.3T   1.3T     50%     17939375  11172232   62%   /vol0
>
> This backup server has been in service since October 2008 with just one
> panic (kmem related). After proper loader.conf tuning it is working
> well.

You might want to look at this commit if you are using rsync --link-dest
and getting kmem panics:

http://svn.freebsd.org/viewvc/base?view=revision&revision=187460

and the MFC:

http://svn.freebsd.org/viewvc/base?view=revision&revision=190837

It seems quite possible to me that kmem panics could be mistakenly
attributed to ZFS when the name cache is eating up memory.

Hope that helps.

- Ben
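[As a rough way to see whether vnodes and the name cache are a factor --
an illustrative check, not something suggested in the message above -- you
can compare the vnode count against its limit and eyeball the related
kernel malloc usage:

# sysctl kern.maxvnodes vfs.numvnodes
# vmstat -m | grep -i cache

If vfs.numvnodes sits at the kern.maxvnodes ceiling during the hardlink-
heavy rsync runs, name-cache pressure rather than ZFS itself may be what
is eating kmem.]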