We have a reproducible problem with FreeBSD-4.7 which is apparently a
deadlock.
The system is undergoing a filesystem stress test.
The machine is pingable, but console and most other features are
unresponsive.
The console debugger can be accessed.
The following information is available from ddb's "ps".
I suspect the "inode" wchan (c34ab600) is what everything is waiting on,
but I'm not sure which process is supposed to do the waking.
db> ps
  pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
  467 e75df000 e76d6000    0   141   141 000104  3   inode c34ab600 sshd
  466 e75df1a0 e76c9000   25   147   147 000104  3   inode c34ab600 sendmail
  465 e75df340 e76c4000    0   144   144 000104  3   inode c34ab600 sendmail
  464 e75df4e0 e76be000   25   147   147 000104  3   inode c34ab600 sendmail
  463 e75df680 e76ba000    0   144   144 000104  3   inode c34ab600 sendmail
  462 e75df820 e76b5000   25   147   147 000104  3   inode c34ab600 sendmail
  461 e75df9c0 e76b0000    0   144   144 000104  3   inode c34ab600 sendmail
  460 e75dfb60 e76ac000   25   147   147 000104  3   inode c34ab600 sendmail
  459 e75dfd00 e76a7000    0   144   144 000104  3   inode c34ab600 sendmail
  458 e75dfea0 e76a3000   25   147   147 000104  3   inode c34ab600 sendmail
  457 e75e0040 e769e000    0   144   144 000104  3   inode c34ab600 sendmail
  456 e75e01e0 e7698000   25   147   147 000104  3   inode c34ab600 sendmail
  455 e75e0380 e7693000    0   144   144 000104  3   inode c34ab600 sendmail
  454 e75e0520 e768f000   25   147   147 000104  3   inode c34ab600 sendmail
  453 e75e06c0 e768b000    0   144   144 000104  3   inode c34ab600 sendmail
  452 e75e0860 e7685000   25   147   147 000104  3   inode c34ab600 sendmail
  451 e75e0a00 e7681000    0   144   144 000104  3   inode c34ab600 sendmail
  450 e75e0ba0 e767d000   25   147   147 000104  3   inode c34ab600 sendmail
  449 e75e0d40 e7678000    0   144   144 000104  3   inode c34ab600 sendmail
  448 e75e0ee0 e7671000   25   147   147 000104  3   inode c34ab600 sendmail
  447 e75e1080 e766d000    0   144   144 000104  3   inode c34ab600 sendmail
  446 e75e1220 e7669000   25   147   147 000104  3   inode c34ab600 sendmail
  445 e75e13c0 e7664000    0   144   144 000104  3   inode c34ab600 sendmail
  444 e75e1560 e7660000   25   147   147 000104  3   inode c34ab600 sendmail
  443 e75e1700 e765b000    0   144   144 000104  3   inode c34ab600 sendmail
  442 e75e18a0 e7656000   25   147   147 000104  3   inode c34ab600 sendmail
  441 e75e1a40 e7652000    0   144   144 000104  3   inode c34ab600 sendmail
  440 e75e1be0 e764c000   25   147   147 000104  3   inode c34ab600 sendmail
  439 e75e1d80 e7647000    0   144   144 000104  3   inode c34ab600 sendmail
  438 e75e1f20 e7642000   25   147   147 000104  3   inode c34ab600 sendmail
  437 e75e20c0 e763e000    0   144   144 000104  3   inode c34ab600 sendmail
  436 e75e2260 e763a000   25   147   147 000104  3   inode c34ab600 sendmail
  435 e75e2400 e7635000    0   144   144 000104  3   inode c34ab600 sendmail
  434 e75e25a0 e7630000   25   147   147 000104  3   inode c34ab600 sendmail
  433 e75e2740 e762c000    0   144   144 000104  3   inode c34ab600 sendmail
  432 e75e28e0 e7626000   25   147   147 000104  3   inode c34ab600 sendmail
  431 e75e2a80 e7621000    0   144   144 000104  3   inode c34ab600 sendmail
  430 e75e2c20 e761c000   25   147   147 000104  3   inode c34ab600 sendmail
  429 e75e2dc0 e7618000    0   144   144 000104  3   inode c34ab600 sendmail
  428 e75e2f60 e7613000   25   147   147 000104  3   inode c34ab600 sendmail
  427 e75e3100 e760c000    0   144   144 000104  3   inode c34ab600 sendmail
  426 e75e32a0 e7608000   25   147   147 000104  3   inode c34ab600 sendmail
  425 e75e3440 e7602000    0   144   144 000104  3   inode c34ab600 sendmail
  424 e75e35e0 e75fc000   25   147   147 000104  3   inode c34ab600 sendmail
  423 e75e3780 e75f8000    0   144   144 000104  3   inode c34ab600 sendmail
  422 e75e3920 e75f4000   25   147   147 000104  3   inode c34ab600 sendmail
  421 e75e3ac0 e75ee000    0   144   144 000104  3   inode c34ab600 sendmail
  420 e75e3c60 e75ea000   25   147   147 000104  3   inode c34ab600 sendmail
  419 e75e3e00 e75e6000    0   144   144 000104  3   inode c34ab600 sendmail
  418 dc358ea0 e75dc000   25   147   147 000104  3   inode c34ab600 sendmail
  417 dc359040 e75d7000    0   144   144 000104  3   inode c34ab600 sendmail
  416 dc3591e0 e75d1000   25   147   147 000104  3   inode c34ab600 sendmail
  415 dc359380 e75cd000    0   144   144 000104  3   inode c34ab600 sendmail
  414 dc359520 e75c8000   25   147   147 000104  3   inode c34ab600 sendmail
  413 dc3596c0 e75c4000    0   144   144 000104  3   inode c34ab600 sendmail
  412 dc359860 e75bf000   25   147   147 000104  3   inode c34ab600 sendmail
  411 dc359a00 e75ba000    0   144   144 000104  3   inode c34ab600 sendmail
  410 dc359ba0 e75b6000   25   147   147 000104  3   inode c34ab600 sendmail
  409 dc359d40 e75b2000    0   144   144 000104  3   inode c34ab600 sendmail
  408 dc359ee0 e75aa000   25   147   147 000104  3   inode c34ab600 sendmail
  407 dc35a080 e75a6000    0   144   144 000104  3   inode c34ab600 sendmail
  406 dc35a220 e75a2000   25   147   147 000104  3   inode c34ab600 sendmail
  405 dc35a3c0 e759d000    0   144   144 000104  3   inode c34ab600 sendmail
  404 dc35a560 e7598000   25   147   147 000104  3   inode c34ab600 sendmail
  403 dc35af20 e03f3000    0   144   144 000104  3   inode c34ab600 sendmail
  402 dc35a700 e2877000    0    99    99 000004  3   inode c34ab600 dhclient
  401 dc35b260 e03f0000    0   203   401 8000006  3   inode c34ab600 bash
  399 dc35aa40 e1366000    0   398   399 000014  3  FFS node c0350140 cron
  398 dc35a8a0 e135b000    0   139   139 000004  3  ppwait dc35a8a0 cron
  302 dc35abe0 e0402000    0   137   302 4004004  3  ffsvgt c03695e8 tclsh83
  277 dc35ad80 e03fe000    0   137   277 4004084  3    poll c037c1a0 tclsh83
  203 dc35b8e0 e03d6000    0   202   203 004086  3    wait dc35b8e0 bash
  202 dc35c440 e036e000    0     1   202 004186  3    wait dc35c440 login
  191 dc35c2a0 e0376000    0     1     7 000086  3  select c037c1a0 snmpd
  173 dc35b0c0 e03e8000    0     1   173 000084  3  nanslp c03646b0 siocontrol
  167 dc35b400 e03e4000    0     1   167 000084  3  nanslp c03646b0 wddt
  147 dc35b5a0 e03df000   25     1   147 2000184  3   pause e03df260 sendmail
  144 dc35b740 e03da000    0     1   144 000184  3  select c037c1a0 sendmail
  141 dc35ba80 e03d2000    0     1   141 000104  3   inode c34ab600 sshd
  139 dc35bc20 e0397000    0     1   139 000004  3   inode c35f4300 cron
  137 dc35bdc0 e0392000    0     1   137 000084  3  select c037c1a0 inetd
  122 dc35bf60 e0382000    0     1   122 000004  3   inode c34ab600 syslogd
   99 dc35c100 e037e000    0     1    99 000084  3    wait dc35c100 dhclient
    6 dc35c5e0 defd1000    0     0     0 000204  3   vlrup dc35c5e0 vnlru
    5 dc35c780 defce000    0     0     0 000204  3  syncer c037c0c8 syncer
    4 dc35c920 defcb000    0     0     0 000204  3  psleep c036487c bufdaemon
    3 dc35cac0 defc8000    0     0     0 000204  3  psleep c0372fc0 vmdaemon
    2 dc35cc60 defc5000    0     0     0 000204  3  psleep c0351e58 pagedaemon
    1 dc35ce00 dc361000    0     0     1 004284  3    wait dc35ce00 init
    0 c037b4a0 c040d000    0     0     0 000204  3   sched c037b4a0 swapper
The hung tasks look like this:
db> t 446
mi_switch(c34ab600,1000040,0,0,ffffffff) at mi_switch+0x1c8
tsleep(c34ab600,8,c031a54a,0,c34ab600) at tsleep+0x1d1
acquire(c34ab600,1000040,600,c34ab600,20002) at acquire+0xbc
lockmgr(c34ab600,1030002,defc4e6c,e75e1220,defc4e00) at lockmgr+0x2cc
vop_stdlock(e766bd28,e766bd38,c01fa02c,e766bd28,defc4e00) at vop_stdlock+0x42
ufs_vnoperate(e766bd28) at ufs_vnoperate+0x15
vn_lock(defc4e00,20002,e75e1220) at vn_lock+0x9c
lookup(e766bed0,0,e766bed0,e766bed0,e75e1220) at lookup+0x81
namei(e766bed0,0,cb9c0a40,e766bed0,e766be18) at namei+0x19d
vn_open(e766bed0,1,1a4,3,e75e1220) at vn_open+0x1ed
open(e75e1220,e766bf80,0,80e3500,0) at open+0xc4
syscall2(2f,2f,2f,0,80e3500) at syscall2+0x20d
Xint0x80_syscall() at Xint0x80_syscall+0x2b
Might the problem be here?  Cron is waiting on memory:
db> t 399
mi_switch(c0350140,c0363440,c0350140,c02d535c,ffffffff) at mi_switch+0x1c8
tsleep(c0350140,2,c031a400,0,c368f228) at tsleep+0x1d1
malloc(100,c0350140,0,c368f228,c35f4300) at malloc+0x1cd
ffs_vget(c35dca00,ec17,e1368cbc,0,defc3900) at ffs_vget+0xa0
ufs_lookup(e1368d20,e1368d34,c01ec562,e1368d20,e036c00a) at ufs_lookup+0xb47
ufs_vnoperate(e1368d20,e036c00a,defc3900,e1368ef8,e1368d20) at ufs_vnoperate+0x15
vfs_cache_lookup(e1368d78,e1368d88,c01efb71,e1368d78,defc4e00) at vfs_cache_lookup+0x2c2
ufs_vnoperate(e1368d78,defc4e00,cb9ded00,e1368ef8,dc35aa40) at ufs_vnoperate+0x15
lookup(e1368ed0,0,e1368ed0,e1368ed0,dc35aa40) at lookup+0x2e1
namei(e1368ed0,0,cb9cc7c0,e1368ed0,c02d298b) at namei+0x19d
vn_open(e1368ed0,1,1a4,3,dc35aa40) at vn_open+0x1ed
open(dc35aa40,e1368f80,68108dec,6811b380,4) at open+0xc4
syscall2(2f,2f,2f,4,6811b380) at syscall2+0x20d
Xint0x80_syscall() at Xint0x80_syscall+0x2b
Can anyone suggest what the bug might be or how to proceed with debugging?
Thanks in advance,
David Dolson (ddolson@sandvine.com, www.sandvine.com)
To follow up, I've discovered that the system has exhausted its "FFS node"
malloc type.

From vmstat on the core file, the "FFS node" malloc type is at its limit:

Memory statistics by type                          Type  Kern
        Type  InUse MemUse HighUse  Limit Requests Limit Limit Size(s)
...
    FFS node 409600 102400K 102400K 102400K  1138048     3     0  256
...

The stress test is recursively creating a directory and then cd'ing into it,
trying to create a chain 1,000,000 deep.  You might say "don't do that", but
this doesn't require any special privilege, so it is a potential DoS attack
by any user.

I'm wondering why the malloc is done with M_WAITOK; it seems like something
which could reasonably fail.  Or, why aren't the cached inodes being purged?

David Dolson (ddolson@sandvine.com, www.sandvine.com)
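For anyone who wants to reproduce this, a minimal C version of the stress
test is sketched below.  It is based only on the description above; the
"dir.%d" naming, the depth and the directory mode are my assumptions, not
the actual test script.

/*
 * Hypothetical reproduction of the stress test: create a directory,
 * chdir into it, repeat until very deep.
 */
#include <sys/stat.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>

int
main(void)
{
	char name[32];
	int i;

	for (i = 0; i < 1000000; i++) {
		snprintf(name, sizeof(name), "dir.%d", i);
		if (mkdir(name, 0755) == -1)
			err(1, "mkdir %s (depth %d)", name, i);
		if (chdir(name) == -1)
			err(1, "chdir %s (depth %d)", name, i);
	}
	return (0);
}

Nothing here needs any privilege beyond write access to one directory, which
is the DoS concern raised above.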
From: Robert Watson [mailto:rwatson@freebsd.org]

> On Tue, 29 Jul 2003, Dave Dolson wrote:
>
> > To follow up, I've discovered that the system has exhausted its "FFS
> > node" malloc type.
> ...
>
> Some problems with this have turned up in -CURRENT on large-memory
> machines where some of the scaling factors have been off.

We currently have kern.maxvnodes=70354 set (automatically scaled).  This is
a 1GB box.

I will try re-running the test with less.

When it hits kern.maxvnodes, what will it do?

--don
> When it hits kern.maxvnodes, what will it do?

So I dropped kern.maxvnodes in half (to 35000).  This box has 1GB of
physical memory in a 2x Xeon (with HTT enabled, so 4 procs).

When it hit the limit, the system stopped switching amongst processes.  My
vmstat blocked in 'vlruwk'.  I merged kern/52425 (kern/vfs_subr.c 1.249.2.30,
kern/vfs_syscalls.c 1.151.2.18, sys/mount.h 1.89.2.7), which is supposed to
address this, but it didn't.  [We're running 4.7.]

After a long time, my ^C to the vmstat came through, and my shell prompt came
back, but then bash stopped in 'inode'.  In this case I'm not short of
memory.

The test is doing this:

    for (i = 0; i < 100000; i++)
        mkdir dir.$i
        cd dir.$i

and while it was running I had:

    while true
    do
        vmstat -m | grep FFS
        sleep 1
    done

running to watch it.

So it seems the problem may not be running out of memory in the malloc pool,
but in the vnode reclamation?

--don
From: Don Bowman [mailto:don@sandvine.com]

> When it hits kern.maxvnodes, what will it do?

After applying the fixes from RELENG_4 for kern/52425, I can still easily
reproduce this hang without low memory.

Further debugging shows that the vnlru process is waiting on vlrup, at the
line shown below; i.e. vnlru_nowhere is being incremented every 3 seconds.

static void
vnlru_proc(void)
{
	...
	s = splbio();
	for (;;) {
		...
		if (done == 0) {
			vnlru_nowhere++;
			tsleep(vnlruproc, PPAUSE, "vlrup", hz * 3);
		}
	}
	splx(s);

The syncer is in a vlruwk wait from getnewvnode().  Lots of other processes
are waiting on ffsvgt.  This implies that vlrureclaim() was unable to free
anything.

I have maxvnodes = 35k.  As soon as I hit this value, my system locked up
[bash on the serial shell non-responsive, serial driver echoes chars, can
drop into ddb].  Processes which don't use the filesystem seem to continue
running OK.  A couple of procs are waiting on inode: env, cron.  These never
come out of waiting for it.

Suggestions?

db> ps
  pid   proc     addr    uid  ppid  pgrp  flag stat wmesg   wchan   cmd
  649 dc35a8a0 e0a32000    0   641   641  004104  3  ffsvgt c03698a8 atrun
  648 dc35a3c0 e0e36000    0   647   648  000014  3  vlruwk c0364c90 cron
  647 dc35b740 e03d4000    0   135   135  000004  3  ppwait dc35b740 cron
  646 dc35b0c0 e03ee000    0   635   101  004004  3   inode c368ee00 env
  645 dc35ad80 e03f1000    0   212   644  004006  3  ffsvgt c03698a8 grep
  644 dc35aa40 e0400000    0   212   644  004006  3  ffsvgt c03698a8 sysctl
  641 dc35a080 e0e4c000    0   640   641  004084  3    wait dc35a080 sh
  640 dc35a220 e0e39000    0   135   135  000084  3  piperd e037c5c0 cron
  635 dc35a560 e0e32000    0   101   101  004084  3  piperd e037cd40 sh
  456 dc35abe0 e03fc000    0   133   456 4004004  3  ffsvgt c03698a8 tclsh83
  212 dc35bdc0 e0392000    0   199   212  004086  3    wait dc35bdc0 bash
  199 dc35c440 e036e000    0     1   199  004186  3    wait dc35c440 login
  187 dc35c2a0 e0376000    0     1     7  000086  3  select c037c460 snmpd
  169 dc35af20 e03e7000    0     1   169  000084  3  nanslp c0364970 siocontrol
  163 dc35b260 e03e2000    0     1   163  000084  3  nanslp c0364970 wddt
  143 dc35b400 e03dd000   25     1   143 2000184  3   pause e03dd260 sendmail
  140 dc35b5a0 e03d9000    0     1   140  000184  3  select c037c460 sendmail
  137 dc35b8e0 e03d0000    0     1   137  000184  3  select c037c460 sshd
  135 dc35ba80 e03c2000    0     1   135  000004  3   inode c35f4400 cron
  133 dc35bc20 e0397000    0     1   133  000084  3  select c037c460 inetd
  124 dc35bf60 e0382000    0     1   124  000084  3  select c037c460 syslogd
  101 dc35c100 e037e000    0     1   101  000084  3    wait dc35c100 dhclient
    6 dc35c5e0 defd1000    0     0     0  000204  3   vlrup dc35c5e0 vnlru
    5 dc35c780 defce000    0     0     0  000204  3  syncer c037c388 syncer
    4 dc35c920 defcb000    0     0     0  000204  3  psleep c0364b3c bufdaemon
    3 dc35cac0 defc8000    0     0     0  000204  3  psleep c0373280 vmdaemon
    2 dc35cc60 defc5000    0     0     0  000204  3  psleep c0352118 pagedaemon
    1 dc35ce00 dc361000    0     0     1  004284  3    wait dc35ce00 init
    0 c037b760 c040e000    0     0     0  000204  3   sched c037b760 swapper
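To restate the interaction in isolation: the sketch below is a userland
analogy only, not the kernel code.  The thread and variable names mirror the
kernel's, but the locking and sleep/wakeup protocol are simplified
assumptions.  The point is that the waiter sleeps until the reclaimer frees
something, and a reclaimer that never manages to free anything never issues
the wakeup.

/*
 * Userland analogy of the stall: "getnewvnode" waits for "vnlru" to free
 * vnodes, but when the reclaim pass finds nothing freeable it just sleeps
 * and retries, never signalling the waiter.  Simplified sketch, not the
 * RELENG_4 code.
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_mutex_t vnode_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t vnodes_freed = PTHREAD_COND_INITIALIZER;
static int numvnodes = 35000;		/* already at the limit */
static int maxvnodes = 35000;

static int
reclaim(void)
{
	/* Every vnode is referenced or has cached names: nothing freeable. */
	return (0);
}

/* Stand-in for the vnlru kernel thread. */
static void *
vnlru_thread(void *arg)
{
	int done;

	(void)arg;
	for (;;) {
		pthread_mutex_lock(&vnode_lock);
		done = reclaim();
		if (done == 0) {
			/* Like tsleep(..., "vlrup", hz * 3): sleep and retry,
			 * never waking the waiter below. */
			pthread_mutex_unlock(&vnode_lock);
			sleep(3);
			continue;
		}
		numvnodes -= done;
		pthread_cond_broadcast(&vnodes_freed);
		pthread_mutex_unlock(&vnode_lock);
	}
	return (NULL);
}

/* Stand-in for a process stuck in getnewvnode() on "vlruwk". */
static void *
getnewvnode_thread(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&vnode_lock);
	while (numvnodes >= maxvnodes) {
		printf("blocked (think 'vlruwk'), waiting for vnlru...\n");
		pthread_cond_wait(&vnodes_freed, &vnode_lock);
	}
	numvnodes++;
	pthread_mutex_unlock(&vnode_lock);
	printf("got a vnode\n");
	return (NULL);
}

int
main(void)
{
	pthread_t reclaimer, waiter;

	pthread_create(&reclaimer, NULL, vnlru_thread, NULL);
	pthread_create(&waiter, NULL, getnewvnode_thread, NULL);
	pthread_join(waiter, NULL);	/* never returns */
	return (0);
}

The ps output above has exactly this shape: vnlru asleep on vlrup, cron
asleep on vlruwk, and everyone else queued up behind them on ffsvgt/inode.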
> On Tue, 29 Jul 2003, Don Bowman wrote:
> > Further debugging shows that the vnlru process is waiting on vlrup ...
> > i.e. vnlru_nowhere is being incremented every 3 seconds.

So what is happening here is that vnlru wakes up, runs through, and there is
nothing to free, so it goes back to sleep having freed nothing.  The caller
doesn't wake up.  There are no vnodes to free, and everything in the system
locks up.

One possible solution is to make vnlru more aggressive, so that before
giving up, it tries to free pages that have many references etc. (which it
currently skips).  Another option is to have it simply bump the
kern.maxvnodes number and wake up the process which called it.

Suggestions?

--don
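The kernel side of the second option (doing the bump inside vnlru and waking
the caller) isn't shown here, but the limit itself can at least be poked from
userland.  The sketch below only raises kern.maxvnodes by an arbitrary 10%;
it is my illustration of the idea, not a tested workaround, and it does
nothing about the missing wakeup.

/*
 * Read kern.maxvnodes and raise it by 10% (needs root).  Illustration only.
 */
#include <sys/types.h>
#include <sys/sysctl.h>
#include <err.h>
#include <stdio.h>

int
main(void)
{
	int maxvnodes;
	size_t len = sizeof(maxvnodes);

	if (sysctlbyname("kern.maxvnodes", &maxvnodes, &len, NULL, 0) == -1)
		err(1, "reading kern.maxvnodes");
	maxvnodes += maxvnodes / 10;
	if (sysctlbyname("kern.maxvnodes", NULL, NULL,
	    &maxvnodes, sizeof(maxvnodes)) == -1)
		err(1, "writing kern.maxvnodes");
	printf("kern.maxvnodes raised to %d\n", maxvnodes);
	return (0);
}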
From: The Hermit Hacker [mailto:scrappy@hub.org]

> > One possible solution is to make vnlru more aggressive, so that before
> > giving up, it tries to free pages that have many references etc. (which
> > it currently skips).  Another option is to have it simply bump the
> > kern.maxvnodes number and wake up the process which called it.
> >
> > Suggestions?
>
> Check out 4.8-STABLE, where Tor Egge (sp?) made modifications to the
> vnlru process that sound exactly like what you are proposing ...

Actually that makes the problem worse in another area, and doesn't fix this
one.  The 'fix' there is to do 10% of the vnodes on a free operation, rather
than 10 at a time.  Now the system will hang up for longer when it's freeing
them.

However the root cause is still that it decides there are no freeable vnodes
in this case, so vnlru goes back to sleep having freed none, the caller stays
asleep, and anyone else wanting a vnode goes to sleep too.

--don