similar to: strange lustre errors

Displaying 20 results from an estimated 200 matches similar to: "strange lustre errors"

2013 Apr 29
1
OSTs inactive on one client (only)
Hi everyone, I have seen this question here before, but without a very satisfactory answer. One of our half a dozen clients has lost access to a set of OSTs: > lfs osts OBDS:: 0: lustre-OST0000_UUID ACTIVE 1: lustre-OST0001_UUID ACTIVE 2: lustre-OST0002_UUID INACTIVE 3: lustre-OST0003_UUID INACTIVE 4: lustre-OST0004_UUID INACTIVE 5: lustre-OST0005_UUID ACTIVE 6: lustre-OST0006_UUID ACTIVE
2008 Jan 10
4
1.6.4.1 - active client evicted
Hi! We''ve started to poke and prod at Lustre 1.6.4.1, and it seems to mostly work (we haven''t had it OOPS on us yet like the earlier 1.6-versions did). However, we had this weird incident where an active client (it was copying 4GB files and running ls at the time) got evicted by the MDS and all OST''s. After a while logs indicate that it did recover the connection
2008 Feb 04
32
Luster clients getting evicted
on our cluster that has been running lustre for about 1 month. I have 1 MDT/MGS and 1 OSS with 2 OST''s. Our cluster uses all Gige and has about 608 nodes 1854 cores. We have allot of jobs that die, and/or go into high IO wait, strace shows processes stuck in fstat(). The big problem is (i think) I would like some feedback on it that of these 608 nodes 209 of them have in dmesg
2013 Mar 18
1
lustre showing inactive devices
I installed 1 MDS , 2 OSS/OST and 2 Lustre Client. My MDS shows: [code] [root at MDS ~]# lctl list_nids 10.94.214.185 at tcp [root at MDS ~]# [/code] On Lustre Client1: [code] [root at lustreclient1 lustre]# lfs df -h UUID bytes Used Available Use% Mounted on lustre-MDT0000_UUID 4.5G 274.3M 3.9G 6% /mnt/lustre[MDT:0]
2013 Mar 18
1
OST0006 : inactive device
I installed 1 MDS , 2 OSS/OST and 2 Lustre Client. My MDS shows: [code] [root at MDS ~]# lctl list_nids 10.94.214.185 at tcp [root at MDS ~]# [/code] On Lustre Client1: [code] [root at lustreclient1 lustre]# lfs df -h UUID bytes Used Available Use% Mounted on lustre-MDT0000_UUID 4.5G 274.3M 3.9G 6% /mnt/lustre[MDT:0] lustre-OST0000_UUID
2003 May 11
2
printing issues - lprng
Hi, I am running samba and lprng for 4 hp laserjets. Sometimes a samba print job goes to the wrong printer. Normally everything works just fine. Shown below is the lprng acct log file enries for the concerned print job. Please notice that the print job was supposed to go to hp4050-mail, but ended up in hp8100. Any suggestions are much appreciated. Thanks Balagopal jobstart
2008 Mar 07
2
Multihomed question: want Lustre over IB andEthernet
Chris, Perhaps you need to perform some write_conf like command. I''m not sure if this is needed in 1.6 or not. Shane ----- Original Message ----- From: lustre-discuss-bounces at lists.lustre.org <lustre-discuss-bounces at lists.lustre.org> To: lustre-discuss <lustre-discuss at lists.lustre.org> Sent: Fri Mar 07 12:03:17 2008 Subject: Re: [Lustre-discuss] Multihomed
2010 Jul 08
5
No space left on device on not full filesystem
Hello, We have running lustre 1.8.1 and have met "No space lest on device" error when uploading 500 Gb small files (less then 100 Kb each). The problem seems to depends on the number of files. If we remove one file, we can create one new file, even with Gb size; but if we haven''t remove something we can''t create even very little file, as an example using touch
2007 Nov 07
1
ll_cfg_requeue process timeouts
Hi, Our environment is: 2.6.9-55.0.9.EL_lustre.1.6.3smp I am getting following errors from two OSS''s ... Nov 7 10:39:51 storage09.beowulf.cluster kernel: LustreError: 23045:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at 00000100b410be00 x4190687/t0 o101->MGS at MGC10.143.245.201@tcp_0:26 lens 232/240 ref 1 fl Rpc:/0/0 rc 0/0 Nov 7 10:39:51
2007 Oct 25
1
Error message
I''m seeing this error message on one of my OSS''s but not the other three. Any idea what is causing it? Oct 25 13:58:56 oss2 kernel: LustreError: 3228:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at f6b13200 x18040/t0 o101->MGS at MGC192.168.0.200@tcp_0:26 lens 176/184 ref 1 fl Rpc:/0/0 rc 0/0 Oct 25 13:58:56 oss2 kernel: LustreError:
2007 Nov 07
9
How To change server recovery timeout
Hi, Our lustre environment is: 2.6.9-55.0.9.EL_lustre.1.6.3smp I would like to change recovery timeout from default value 250s to something longer I tried example from manual: set_timeout <secs> Sets the timeout (obd_timeout) for a server to wait before failing recovery. We performed that experiment on our test lustre installation with one OST. storage02 is our OSS [root at
2010 Sep 16
2
Lustre module not getting loaded in MDS
Hello All, I have installed and configured Lustre 1.8.4 on SuSe 11.0 and everything works fine if i run modprobe lustre and when the lustre module is getting loaded. But when the server reboots it is not getting loaded. Kindly help. Lnet is configured in /etc/modprobe.conf.local as below. options lnet networks=tcp0(eth0) accept=all For loading lustre module i tried including lustre module in
2008 Jan 31
2
lustre+samba
Dear All, I try to use our cluster though samba share. Everything work fine, but I think, we should have -o flock at lustre mount time. Great, it''s work. But when I want to save a file on the share, I get this on the logs: Jan 31 10:45:24 opteron-ren-11 kernel: LustreError: 24836:0:(file.c:2309:ll_file_flock()) unknown fcntl lock type: 32 Jan 31 10:45:24 opteron-ren-11 kernel:
2012 Sep 27
4
Bad reporting inodes free
Hello, When I run a "df -i" in my clients I get 95% indes used or 5% inodes free: Filesystem Inodes IUsed IFree IUse% Mounted on lustre-mds-01:lustre-mds-02:/cetafs 22200087 20949839 1250248 95% /mnt/data But if I run lfs df -i i get: UUID Inodes IUsed IFree I
2013 Oct 17
3
Speeding up configuration log regeneration?
Hi, We run four-node Lustre 2.3, and I needed to both change hardware under MGS/MDS and reassign an OSS ip. Just the same, I added a brand new 10GE network to the system, which was the reason for MDS hardware change. I ran tunefs.lustre --writeconf as per chapter 14.4 in Lustre Manual, and everything mounts fine. Log regeneration apparently works, since it seems to do something, but
2008 Apr 15
5
o2ib module prevents shutdown
Hello, Not sure if this is the right forum: I''m encountering difficulties with o2ib which prevents an LNET shutdown from proceeding: Unloading OpenIB kernel modules:NET: Unregistered protocal family 27 Failed to unload rdma_cm Failed to unload rdma_cm Failed to unload ib_cm Failed to unload ib_sa LustreError: 131-3: Received notification of device removal Please shutdown LNET
2008 Feb 12
0
Lustre-discuss Digest, Vol 25, Issue 17
Hi, i just want to know whether there are any alternative file systems for HP SFS. I heard that there is Cluster Gateway from Polyserve. Can anybody plz help me in finding more abt this Cluster Gateway. Thanks and Regards, Ashok Bharat -----Original Message----- From: lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss-request at lists.lustre.org Sent: Tue 2/12/2008 3:18 AM
2007 Dec 11
2
lustre + nfs + alphas
This is the strangest problem I have seen. I have a lustre filesystem mounted on a linux server and its being exported to various alpha systems. The alphas mount it just fine however under heavy load the NFS server stops responding, as does the lustre mount on the export server. The weird thing is that if i mount the nfs export on another nfs server and run the same benchmark (bonnie) everything
2007 Nov 23
2
How to remove OST permanently?
All, I''ve added a new 2.2 TB OST to my cluster easily enough, but this new disk array is meant to replace several smaller OSTs that I used to have of which were only 120 GB, 500 GB, and 700 GB. Adding an OST is easy, but how do I REMOVE the small OSTs that I no longer want to be part of my cluster? Is there a command to tell luster to move all the file stripes off one of the nodes?
2008 Mar 03
1
Quota setup fails because of OST ordering
Hi all, after installing a Lustre test file system consisting of 34 OSTs, I encountered a strange error when trying to set up quotas: lfs quotacheck gave me an "Input/Output error", while in /var/log/kern.log I found a Lustre error LustreError: 20807:0:(quota_check.c:227:lov_quota_check()) lov idx 32 inactive Indeed, in /proc/fs/lustre/lov/.../target_obd all 34 OSTs were listed