similar to: How To change server recovery timeout

Displaying 11 results from an estimated 11 matches similar to: "How To change server recovery timeout"

2008 Mar 04
16
Cannot send after transport endpoint shutdown (-108)
This morning both my InfiniBand and TCP Lustre clients hiccuped. They were evicted from the server, presumably as a result of their high load and consequent timeouts. My question is: why don't the clients reconnect? The InfiniBand and TCP clients both give the following message when I type "df": Cannot send after transport endpoint shutdown (-108). I've
2008 Feb 04
32
Luster clients getting evicted
on our cluster that has been running Lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OSTs. Our cluster uses GigE throughout and has about 608 nodes and 1854 cores. We have a lot of jobs that die and/or go into high I/O wait; strace shows processes stuck in fstat(). The big problem (I think, and I would like some feedback on this) is that 209 of these 608 nodes have in dmesg
2008 Feb 12
1
LDISKFS-fs warnings on MDS lustre 1.6.4.2
Hi Folks, We can see these messages on our MDS: Feb 12 12:46:08 mds01 kernel: LDISKFS-fs warning (device dm-0): empty_dir: bad directory (dir #31452569) - no `.' or `..' Feb 12 12:46:08 mds01 kernel: LDISKFS-fs warning (device dm-0): ldiskfs_rmdir: empty directory has too many links (3) It seems to indicate that we have a bad (corrupted) directory. Do you have any idea how to
2007 Mar 20
15
How to bypass failed OST without blocking?
Hi, I want Lustre to behave as follows when an OST fails: if a file has stripe data on the failed OST, any operation on that file returns an I/O error without blocking, while I can still create and read/write new files, or read/write files that have no stripe data on the failed OST, without blocking. What should I do? How do I configure this? Thanks! swin
2013 Apr 29
2
Samba 3 dynamically enable or disable share
Hello, I wonder if it is possible to dynamically enable/disable Samba 3 shares. Here is my problem. On a remote server I have 4 removable hard drives, large capacity. I am not using any RAID/JBOD, so each drive is mounted individually (like /mnt/DISK1, /mnt/DISK2 etc) and each drive is individually shared, something like: [STORAGE01] path = /mnt/DISK1 Guest OK = false ... [STORAGE02]
2007 Nov 06
4
Checksum Algorithm
Hi, We have seen a huge performance drop in 1.6.3, due to the checksum being enabled by default. I looked at the algorithm being used, and it is actually a CRC32, which is a very strong algorithm for detecting all sorts of problems, such as single bit errors, swapped bytes, and missing bytes. I've been experimenting with using a simple XOR algorithm. I've been able to recover
2010 Jul 13
4
Enable async journals
Hi all, we use SLES 11 and Lustre 1.8.1.1 + patches and would like to convert a Lustre FS using external journals to one with async journals enabled. The question is whether the procedure: umount <filesystem> on all clients umount <osts> on all OSSes e2fsck <ost-device> on all OSSes for all OSTs tune2fs -O ^has_journal <ost-device> on all
2013 Apr 16
0
Fine locally but not on LAN
Hi There, I've been using Samba for years and years but I've hit a problem that I just don't understand. We have a storage server that's been running nicely for 18 months. It does NFS for virtual machines and SMB for Windows storage. The machine is domain joined and seems in good order. Recently (and coincidentally) we made some changes to our domain controllers and since then
2008 Feb 12
0
Lustre-discuss Digest, Vol 25, Issue 17
Hi, I just want to know whether there are any alternative file systems for HP SFS. I heard that there is Cluster Gateway from Polyserve. Can anybody please help me find out more about this Cluster Gateway? Thanks and Regards, Ashok Bharat -----Original Message----- From: lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss-request at lists.lustre.org Sent: Tue 2/12/2008 3:18 AM
2011 Oct 19
5
doveadm segfaults on TCP connect - version 2.0.15
Hi list, I just recently installed Dovecot 2.0.15. Unfortunately, doveadm segfaults when I attempt to connect to the local dovecot instance. When this occurs, my logs show: 2011-10-19T12:31:23-07:00 mail02 dovecot: doveadm: Error: doveadm client not compatible with this server (mixed old and new binaries?) I am using the settings listed on the wiki page http://wiki2.dovecot.org/Director [root
2008 Jun 30
4
Rebuild of kernel 2.6.9-67.0.20.EL failure
Hello list. I'm trying to rebuild the 2.6.9-67.0.20.EL kernel, but it fails even without modifications. How did I try it? I created a (non-root) build environment (not mock), installed the kernel src.rpm and ran rpmbuild -ba --target=`uname -m` kernel-2.6.spec 2> prep-err.log | tee prep-out.log The build failed at the end: Processing files: kernel-xenU-devel-2.6.9-67.0.20.EL Checking