Hello,

A box of mine running RELENG_7_0 with ZFS on six disks (three mirrors) seems to have gotten stuck. From Ctrl-T:

load: 0.50 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.43 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.10 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.10 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k
load: 0.11 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u 0.04s 0% 3404k

Ctrl-T worked for a while, then that stopped working too (this was over ssh). When trying a local login I only got:

load: 0.09 cmd: login 1611 [zfs] 0.00u 0.00s 0% 208k

I found one earlier post like this (by Xin LI), but nobody seems to have replied.

In my current configuration I think kmem/kmem_max is at 512 MB (not sure, since I edited my file yesterday for the next reboot), with 2 GB of system RAM. Normally I'd run kmem(max) at 1 GB with an arcsize of 512 MB (currently arcsize is at the default), but since I only just got back to 2 GB of total memory after some hardware problems, I've been running at those low values (1 GB total is kind of tight with ZFS).

Well, I just wanted to report this. The box is not totally dead yet, i.e. I can still do Ctrl-T on the console, but that's it. I don't really know what more I can do; I don't have KDB/DDB. I'll wait another hour or so before I hard reboot it, unless it "unlocks" or anyone has any suggestions.

Thanks
-- 
Johan Ström
Stromnet
johan@stromnet.se
http://www.stromnet.se/
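For anyone reading along, the Ctrl-T lines in the report can be picked apart mechanically. Below is a minimal Python sketch, assuming the layout seen in those lines: load average, command name, PID, the wait channel in square brackets, user and system CPU time, CPU percentage, and resident size in kilobytes. The helper names `parse_status` and `looks_stuck` are made up for illustration, and the "stuck" heuristic is only an assumption about what a hang looks like, not a diagnostic used by anyone in this thread.

```python
import re

# Assumed layout of one Ctrl-T (SIGINFO) status line, based on the report:
# load: <loadavg> cmd: <command> <pid> [<wait channel>] <user>u <sys>s <cpu>% <rss>k
STATUS_RE = re.compile(
    r"load:\s+(?P<load>[\d.]+)\s+"
    r"cmd:\s+(?P<cmd>\S+)\s+(?P<pid>\d+)\s+"
    r"\[(?P<wchan>.*)\]\s+"  # greedy, because the wait channel may itself contain brackets
    r"(?P<user>[\d.]+)u\s+(?P<sys>[\d.]+)s\s+"
    r"(?P<cpu>\d+)%\s+(?P<rss_kb>\d+)k"
)


def parse_status(line):
    """Return the fields of one status line as a dict, or None if it does not match."""
    m = STATUS_RE.search(line)
    if m is None:
        return None
    f = m.groupdict()
    return {
        "load": float(f["load"]),
        "cmd": f["cmd"],
        "pid": int(f["pid"]),
        "wchan": f["wchan"],
        "user_s": float(f["user"]),
        "sys_s": float(f["sys"]),
        "cpu_pct": int(f["cpu"]),
        "rss_kb": int(f["rss_kb"]),
    }


def looks_stuck(lines):
    """Hypothetical heuristic: same PID, same wait channel, no CPU time accrued across samples."""
    samples = [s for s in map(parse_status, lines) if s is not None]
    if len(samples) < 2:
        return False
    first = samples[0]
    return all(
        s["pid"] == first["pid"]
        and s["wchan"] == first["wchan"]
        and s["user_s"] == first["user_s"]
        and s["sys_s"] == first["sys_s"]
        for s in samples[1:]
    )
```

Feeding it the five zsh lines one per string would flag them as stuck under this heuristic: the PID and wait channel never change and the process accrues no CPU time while the load average decays.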
On Tue, Apr 08, 2008 at 08:17:38AM +0200, Johan Ström wrote:
> A box of mine running RELENG_7_0 and ZFS over a couple of disks (6 disks, 3
> mirrors) seems to have gotten stuck. From Ctrl-T:
>
> load: 0.50 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock] 0.02u
> 0.04s 0% 3404k
> [...]
> I'll wait another hour or so before I hard reboot it, unless it "unlocks"
> or if anyone have any suggestions.

I don't think there are any suggestions left to give. Many people, including myself, have experienced this kind of problem. It's well documented both on my Common Issues page and on the official FreeBSD ZFS Wiki.

ZFS is still considered highly experimental, so if your data is at all important to you, perform backups or switch to another filesystem.

-- 
| Jeremy Chadwick                                jdc at parodius.com |
| Parodius Networking                       http://www.parodius.com/ |
| UNIX Systems Administrator                  Mountain View, CA, USA |
| Making life hard for others since 1977.              PGP: 4BD6C0CB |

Johan Ström wrote:
> A box of mine running RELENG_7_0 and ZFS over a couple of disks (6
> disks, 3 mirrors) seems to have gotten stuck. From Ctrl-T:
>
> load: 0.50 cmd: zsh 40188 [zfs:&buf_hash_table.ht_locks[i].ht_lock]
> 0.02u 0.04s 0% 3404k
> [...]
> I'll wait another hour or so before I hard reboot it, unless it
> "unlocks" or if anyone have any suggestions.

The key is to increase your kmem and prevent it from being exhausted. I think the more recent OpenSolaris ZFS code has some improvements, but I do not have spare devices at hand to test and debug :(

Maybe pjd@ will do a new import at some point? I have cc'ed him.

Cheers,
-- 
Xin LI <delphij@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!

To answer your question: just rebooting is fine. You may want to tune your ARC size (to be smaller) and your kmem space (to be larger), which would reduce the chance of this happening, or eliminate it, depending on your workload.

The hang itself is not recoverable without a reboot, but you can trust ZFS not to lose any data that has already been synced.

-- 
Xin LI <delphij@delphij.net> http://www.delphij.net/
FreeBSD - The Power to Serve!

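As a rough illustration of that tuning, a /boot/loader.conf fragment might look like the one below. The values are simply the figures Johan mentions in his report (kmem at 1G, arcsize at 512M) and are assumptions for the sake of example, not recommendations; suitable numbers depend on installed RAM, architecture, and workload. The tunable names are the ones commonly used for this on FreeBSD 7, so check them against your own system before relying on them.

```
# /boot/loader.conf (illustrative values only, takes effect at next boot)

# Give the kernel a larger memory map so ZFS is less likely to exhaust it.
vm.kmem_size="1024M"
vm.kmem_size_max="1024M"

# Cap the ZFS ARC well below kmem; lower it further (256M, 128M) if hangs persist.
vfs.zfs.arc_max="512M"
```

After a reboot, `sysctl vm.kmem_size vfs.zfs.arc_max` should show whether the settings took effect.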
So there is no chance of ZFS being stable on FreeBSD 7? I was actually considering Debian over FreeBSD on a dual AMD64, but if there are no settings that will make it stable...

Nevertheless, I'd be willing to help debug ZFS on that machine (a Dell T105) as soon as I receive it in a couple of weeks, as I'm in no rush to get it into production (just tell me what to do ;) ).

----- Original message ----
From: Spike Ilacqua <spike@indra.com>
To: Ender <ender@enderzone.com>
CC: freebsd-fs@freebsd.org; freebsd-stable@freebsd.org; Johan Ström <johan@headweb.com>
Sent: Tuesday, 8 April 2008 18:13:32
Subject: Re: ZFS deadlock

> Depending on your work load you are just buying more time, so
> "reasonable" is a matter of perspective. :( I didn't see if you said
> you are on 32bit or 64bit? Keep in mind the kmem max is 1.5-2G on amd64
> regardless of how much memory you have. If 512M arcsize crashes too soon
> for your tastes you can always lower it down to 256M, or 128M, etc.

I tried for several weeks to get ZFS stable on a 64bit system with a 1.5G kernel. The best uptime I ever got was 72 hours, the worst was 2, the average about 24. Interestingly, most of the hangs were at off hours, when the system was lightly loaded, had lots of free memory, etc. That suggests to me a slow leak of some sort.

Anyway, ZFS is not ready for production. Some people may get lucky, but you can't count on it.

Spike
_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org"