Robin Lee Powell
2004-Apr-16 20:39 UTC
[Samba] smbfs linux 2.4.24; connection loss causes total machine hang
I'm running smbfs on a Debian box with kernel 2.4.24. I would give you more information (such as my samba tools versions) if I could get into the box, but it's hung right now due to this bug.

I'm mounting a filesystem from my desktop Windows box (which is right next to it; there's a D-Link in the way, though) and passing large amounts of data to it (backups).

If the connection to the Windows box drops while this large cp is running, i.e. due to the Windows box rebooting or being shut down, the cp hangs unkillably.

From that point on, many other processes end up in the same state (mutt, man and vi/vim, for example, but not cat) regardless of whether they are going anywhere near the smbfs mount *or* the partition the data was coming from.

umount works on neither the smbfs mount nor the partition the data was coming from; it says they are busy. fuser -k hangs unkillably. apache, mysql, and postgres all hang unkillably, as does spamd. No mail delivery occurs, but the exim processes are killable.

In short, the system becomes basically unusable. This seems pretty intensely bad.

Is this a known bug? If so, is there a workaround? A patch?

-Robin

--
http://www.csclub.uwaterloo.ca/~rlpowell/ BTW, I'm male, honest.
le datni cu djica le nu zifre .iku'i .oi le so'e datni cu to'e te pilno je xlali -- RLP
http://www.lojban.org/
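For anyone trying to narrow down a hang like this, it can help to see which processes are stuck in uninterruptible sleep (state D in ps), since those ignore kill -9. The following is only a rough sketch that reads /proc directly. It assumes the "unkillable" processes described above are in state D, which the post does not confirm, and the function names are made up for illustration.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the first '(' and the last ')'.
    open_paren = line.index("(")
    close_paren = line.rindex(")")
    pid = int(line[:open_paren].strip())
    comm = line[open_paren + 1:close_paren]
    state = line[close_paren + 1:].split()[0]
    return pid, comm, state


def uninterruptible_processes(proc_root="/proc"):
    """List (pid, comm) for every process currently in state D.

    Assumption: state D is what "hangs unkillably" means here.
    """
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs available at this path
    for entry in entries:
        if not entry.isdigit():
            continue  # only numeric directories are processes
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                pid, comm, state = parse_stat_line(handle.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited or the stat line was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

Run from a shell that is still responsive; if cp, fuser and the daemons all show up in the output, they are most likely blocked inside the kernel rather than in user space.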
Robin Lee Powell
2004-Apr-17 01:59 UTC
[Samba] Re: smbfs linux 2.4.24; connection loss causes total machine hang
On Fri, Apr 16, 2004 at 04:38:47PM -0400, wrote:
> If the connection to the windows box drops while this large cp is
> running, i.e. due to the Windows box rebooting or being shut down, the
> cp hangs unkillably.

I figured out what was happening more specifically.

My anti-virus software, PC-Cillin 2002, apparently either shut down the SMB connection or made it so slow as to be unusable, presumably to give it time to inspect the files I was trying to upload.

syslog at that time said:

Apr 16 04:00:26 chain kernel: smb_open: chain/home_noras.zip access denied, access=0, wish=1
Apr 16 04:00:26 chain kernel: smb_open: chain/home_nwn_saves.zip access denied, access=0, wish=1
Apr 16 04:00:26 chain kernel: smb_open: chain/home_phma.zip access denied, access=0, wish=1
Apr 16 04:00:27 chain kernel: smb_open: chain/home_raladue.zip access denied, access=0, wish=1
Apr 16 04:00:28 chain kernel: smb_open: chain/home_rj.zip access denied, access=0, wish=1
Apr 16 04:00:58 chain kernel: SMB server not responding
Apr 16 04:00:58 chain kernel: smb_get_length: recv error = 5
Apr 16 04:00:58 chain kernel: smb_request: result -5, setting invalid

Things got progressively worse, judging from the logs, after that.

Thinking the backups *must* have completed, I turned the machine off the next morning and things went to hell in a handbasket.

At some point, either when the machine went off or at the first time I tried to unmount, I also got this:

Apr 16 10:12:38 chain kernel: smb_request: result -104, setting invalid
Apr 16 10:12:48 chain kernel: smb_retry: caught signal

-Robin
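As a reading aid for the log lines above: the numbers after "result" look like negated Linux errno values, which is the usual kernel convention, though the post does not say so explicitly. Under that assumption, a small sketch like this pulls them out of a log excerpt and names them. The lookup table is hard-coded with the two Linux values seen here (5 is EIO, 104 is ECONNRESET), and the function name is hypothetical.

```python
import re

# Linux errno numbers; only the two values seen in the log are listed.
# Assumption: smbfs reports results as negated errno values.
LINUX_ERRNO = {5: "EIO", 104: "ECONNRESET"}

RESULT_RE = re.compile(r"smb_request: result (-?\d+)")


def decode_results(log_text):
    """Return (raw_result, errno_name) for each smb_request line found."""
    decoded = []
    for match in RESULT_RE.finditer(log_text):
        raw = int(match.group(1))
        decoded.append((raw, LINUX_ERRNO.get(abs(raw), "unknown")))
    return decoded


SAMPLE = """\
Apr 16 04:00:58 chain kernel: SMB server not responding
Apr 16 04:00:58 chain kernel: smb_get_length: recv error = 5
Apr 16 04:00:58 chain kernel: smb_request: result -5, setting invalid
Apr 16 10:12:38 chain kernel: smb_request: result -104, setting invalid
Apr 16 10:12:48 chain kernel: smb_retry: caught signal
"""

if __name__ == "__main__":
    for raw, name in decode_results(SAMPLE):
        print(raw, name)
```

Read this way, the 04:00 entry would be a generic I/O error once the server stopped responding, and the 10:12 entry a connection reset by the peer, which would fit the Windows machine being switched off.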