Niki Denev
2006-Jan-20 10:02 UTC
diskio / filesystem related deadlock on SMP 6.0-STABLE machine.
Hello,

I'm experiencing some problems with a 6.0-STABLE machine, cvsupped and rebuilt yesterday. I'm not sure the problem is related to this latest update, because the machine is not heavily loaded at the moment.

The machine has a dual-Opteron motherboard from Supermicro with two Opteron 244s and 4GB of DDR400 ECC Registered memory, an integrated Adaptec U320 SCSI controller (forced to U160 mode), and four Seagate 36G 10K rpm SCSI drives. The kernel config is generic SMP with QUOTA support enabled, accounting_enable=YES is set in rc.conf, and the root fs is a software RAID-10 (two striped mirrors built with geom_mirror and geom_stripe), with a small /boot partition for loading the kernel and the required modules.

Kernel conf, dmesg and loader.conf are available here: http://www.totalterror.net/freebsd/srv/

Yesterday I was able to deadlock the machine twice, doing exactly the same thing: I was rsyncing one rather big IMAP (Maildir) folder, about 270K msgs (files), from another machine to this one, and at the same time I was syncing this folder's contents via the bincimap IMAP server, from a remote machine running KMail. Then I ran "du -sh" on the folder in question... and all my shells to the machine stopped working. I could still ping the machine and connect to listening ports, but got no banners from the daemons. There was also zero HDD activity at that time.

Unfortunately I forgot to enter the debugger and get a trace (but would it show something meaningful, or just the keyboard interrupt handler?).

After the reset, the machine booted and rebuilt the secondary components on both mirrors (maybe this is normal? It seems to happen every time the machine is restarted uncleanly).

This is a big problem for me because this machine will soon go into production and should be able to serve IMAP to a dozen clients. I know that 270K msgs in a single IMAP folder is stupid, but our old IMAP server, running FreeBSD 5.4-STABLE (a quite old STABLE) on an AMD 1800+ with 2G of RAM and an 80G IDE disk, has no problems with it, except being very slow of course.

I hope this info is enough; if not, I will gladly provide more. Any suggestions are welcome.

Thanks.
--niki
Niki Denev
2006-Jan-25 07:54 UTC
diskio / filesystem related deadlock on SMP 6.0-STABLE machine.
Niki Denev wrote:
> [...]

I received an off-list suggestion to try a patch from -CURRENT for vfs_lookup.c. I think this patch has already been committed to -STABLE, so I rebuilt my kernel and world again today, but the problem is still there. The machine deadlocked very hard this time; it even stopped responding to ping requests.

This time I was rsyncing this monstrous IMAP mailbox (about 8G) and ran "du -sh Maildir" several times on the target machine. Everything looked OK, and then I decided to run "iostat", and when I pressed Enter the shell froze.

I will try to set up a debugger on the serial console tomorrow to see if I can get some useful info. I'm open to suggestions on how to debug this :)

Thanks,
Niki
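
For the serial-console debugging mentioned above, a minimal sketch of kernel config additions that are commonly used when chasing a deadlock on FreeBSD 6.x. This fragment is an assumption for illustration and is not taken from the poster's actual config; the options themselves are standard ones, but which of them are needed here is a guess.

```
# Hypothetical additions to a copy of the SMP kernel config (not from the thread)
makeoptions     DEBUG=-g            # build the kernel with debug symbols
options         KDB                 # kernel debugger framework
options         DDB                 # interactive in-kernel debugger
options         BREAK_TO_DEBUGGER   # a BREAK on the serial console enters the debugger
options         INVARIANTS          # extra run-time consistency checks
options         INVARIANT_SUPPORT   # required by INVARIANTS
options         WITNESS             # lock-order verification
options         WITNESS_SKIPSPIN    # skip spin mutexes to reduce WITNESS overhead
```

With a kernel like this, once the machine hangs and the debugger is entered, DDB commands such as "ps", "trace", "show pcpu", "show alllocks" and "show lockedvnods" are the usual starting points; their output, rather than a bare trace of the interrupt handler, is what tends to show which locks the stuck processes are waiting on.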
Kris Kennaway
2006-Jan-26 08:18 UTC
diskio / filesystem related deadlock on SMP 6.0-STABLE machine.
On Thu, Jan 26, 2006 at 05:07:56PM +0200, Niki Denev wrote:
> On Thursday 26 January 2006 10:40, Niki Denev wrote:
> > [...]
>
> After I disabled option QUOTA in both my default kernel config
> and the one I compiled with the debugging options, I was unable
> to reproduce the deadlock again. (I hope it stays that way :) )
> This, together with the report in my previous post, probably points
> to the problem being in the QUOTA support.

Actually, I think this is known.

Kris
Niki Denev
2006-Jan-26 13:10 UTC
diskio / filesystem related deadlock on SMP 6.0-STABLE machine.
Kris Kennaway wrote:
> On Thu, Jan 26, 2006 at 05:07:56PM +0200, Niki Denev wrote:
>> On Thursday 26 January 2006 10:40, Niki Denev wrote:
>> [...]
>>
>> After I disabled option QUOTA in both my default kernel config
>> and the one I compiled with the debugging options, I was unable
>> to reproduce the deadlock again. (I hope it stays that way :) )
>> This, together with the report in my previous post, probably points
>> to the problem being in the QUOTA support.
>
> Actually, I think this is known.
>
> Kris

Yes, sorry for this. I found a thread on -hackers from November 2005 which seems related:
http://lists.freebsd.org/pipermail/freebsd-hackers/2005-November/014339.html

Anyway, the machine in question is still not very actively used and I can use it as a guinea pig if someone has the time to look at this. If not, maybe it should be noted somewhere that at this point "options QUOTA" may cause deadlocks, to avoid similar questions/postings? :)

Niki
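
For readers hitting the same hang, the workaround reported in this thread comes down to one line in the kernel config. The fragment below is a sketch, not the poster's actual file, and the comment text is illustrative.

```
# Workaround sketch: build the SMP kernel without disk quota support
#options        QUOTA               # disabled: deadlocks reported on SMP 6.0-STABLE
```

After the change the kernel has to be rebuilt and reinstalled in the usual way. Note that this removes quota enforcement entirely, so it is only an option on machines that can run without quotas.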
Kris Kennaway
2006-Jan-26 14:28 UTC
diskio / filesystem related deadlock on SMP 6.0-STABLE machine.
On Thu, Jan 26, 2006 at 10:54:04PM +0200, Niki Denev wrote:
> Kris Kennaway wrote:
> > On Thu, Jan 26, 2006 at 05:07:56PM +0200, Niki Denev wrote:
> >> On Thursday 26 January 2006 10:40, Niki Denev wrote:
> >> [...]
> >>
> >> After I disabled option QUOTA in both my default kernel config
> >> and the one I compiled with the debugging options, I was unable
> >> to reproduce the deadlock again. (I hope it stays that way :) )
> >> This, together with the report in my previous post, probably points
> >> to the problem being in the QUOTA support.
> >
> > Actually, I think this is known.
> >
> > Kris
>
> Yes, sorry for this.

No problem. At least your cause is identified, so you know what to look out for.

> I found a thread on -hackers from November 2005 which seems related:
> http://lists.freebsd.org/pipermail/freebsd-hackers/2005-November/014339.html
>
> Anyway, the machine in question is still not very actively used and
> I can use it as a guinea pig if someone has the time to look at this.
> If not, maybe it should be noted somewhere that at this point "options QUOTA"
> may cause deadlocks, to avoid similar questions/postings? :)

I have recommended this be tracked on the 6.1 TODO list; hopefully there will be a solution before the release.

Kris
Mike Jakubik
2006-Jan-26 15:46 UTC
diskio / filesystem related deadlock on SMP 6.0-STABLE machine.
Kris Kennaway wrote:
> On Thu, Jan 26, 2006 at 05:07:56PM +0200, Niki Denev wrote:
>> On Thursday 26 January 2006 10:40, Niki Denev wrote:
>> [...]
>>
>> After I disabled option QUOTA in both my default kernel config
>> and the one I compiled with the debugging options, I was unable
>> to reproduce the deadlock again. (I hope it stays that way :) )
>> This, together with the report in my previous post, probably points
>> to the problem being in the QUOTA support.
>
> Actually, I think this is known.
>
> Kris

Well, that's good to know. I was planning on upgrading a production box from 5 to 6; it's SMP and uses QUOTA. How did 6 get released when QUOTA was known to cause deadlocks?