thr3ads.net - similar to: "Lost OSTs, remounted, now /proc/fs/lustre/obdfilter/$UUID/ is empty"

Displaying 20 results from an estimated 100 matches similar to: "Lost OSTs, remounted, now /proc/fs/lustre/obdfilter/$UUID/ is empty"

1.6.4.1 - active client evicted

2008 Jan 10

Hi! We've started to poke and prod at Lustre 1.6.4.1, and it seems to mostly work (we haven't had it OOPS on us yet like the earlier 1.6-versions did). However, we had this weird incident where an active client (it was copying 4GB files and running ls at the time) got evicted by the MDS and all OST's. After a while logs indicate that it did recover the connection

obdfilter/datafs-OST0000/recovery_status

2008 Feb 05

I'm evaluating lustre. I'm trying what I think is a basic/simple ethernet config. with MDT and OST on the same node. Can someone tell me if the following (~150 second recovery occurring when small 190 GB OST is re-mounted) is expected behavior or if I'm missing something? I thought I would send this and continue with the eval while awaiting a response. I'm using

Error message

2007 Oct 25

I'm seeing this error message on one of my OSS's but not the other three. Any idea what is causing it? Oct 25 13:58:56 oss2 kernel: LustreError: 3228:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID req at f6b13200 x18040/t0 o101->MGS at MGC192.168.0.200@tcp_0:26 lens 176/184 ref 1 fl Rpc:/0/0 rc 0/0 Oct 25 13:58:56 oss2 kernel: LustreError:
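The console line quoted above has a recognisable prefix: a process ID, a second numeric field, and then the source file, line number, and function name in parentheses. A minimal Python sketch, assuming that prefix layout holds for the lines of interest, pulls those fields out so errors can be grouped by call site. The regular expression and the name `parse_lustre_error` are illustrative and not part of Lustre; the layout is inferred from this one sample only.

```python
import re

# Matches the prefix "LustreError: <pid>:<n>:(<file>:<line>:<func>())".
# Assumption: layout inferred from the single sample line; variants may exist.
_PREFIX = re.compile(
    r"LustreError: (?P<pid>\d+):(?P<ext>\d+):"
    r"\((?P<file>[\w.\-]+):(?P<line>\d+):(?P<func>\w+)\(\)\)"
)


def parse_lustre_error(text):
    """Return one dict per LustreError prefix found in text."""
    results = []
    for m in _PREFIX.finditer(text):
        results.append({
            "pid": int(m.group("pid")),
            "file": m.group("file"),
            "line": int(m.group("line")),
            "func": m.group("func"),
        })
    return results


# Sample taken from the log excerpt in the post.
sample = ("Oct 25 13:58:56 oss2 kernel: LustreError: "
          "3228:0:(client.c:519:ptlrpc_import_delay_req()) @@@ IMP_INVALID")
print(parse_lustre_error(sample))
```

A bare "LustreError:" with nothing after it, as at the truncated end of the excerpt, simply produces no match.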

lustre error

2008 Feb 22

Dear All, Yesterday evening our cluster stopped. Two of our nodes tried to take the resource from each other; they couldn't see the other side, if I saw well. I stopped heartbeat and the resources, started them again, and it came back online and worked fine. This morning I saw this in logs: Feb 22 03:25:07 node4 kernel: Lustre: 7:0:(linux-debug.c:98:libcfs_run_upcall()) Invoked LNET upcall

Lost folders after changing MDS

2013 Feb 12

OK, so our old MDS had hardware issues so I configured a new MGS / MDS on a VM (this is a backup lustre filesystem and I wanted to separate the MGS / MDS from OSS of the previous), and then did this: For example:

    mount -t ldiskfs /dev/old /mnt/ost_old
    mount -t ldiskfs /dev/new /mnt/ost_new
    rsync -aSv /mnt/ost_old/ /mnt/ost_new
    # note trailing slash on ost_old/

If you are unable to connect both

fsck ldiskfs-backed OSTs?

2007 Oct 01

There are references to running fsck on the lustre OSTs after a crash or power failure. However, after downloading the ClusterFS e2fsprogs and building them, e2fsck does not recognize our ldiskfs-based OSTs. Is there a way to fsck the ldiskfs-based OSTs? Thanks, Charlie Taylor UF HPC Center

Multihomed question: want Lustre over IB and Ethernet

2008 Mar 07

Chris, Perhaps you need to perform some write_conf like command. I'm not sure if this is needed in 1.6 or not. Shane ----- Original Message ----- From: lustre-discuss-bounces at lists.lustre.org <lustre-discuss-bounces at lists.lustre.org> To: lustre-discuss <lustre-discuss at lists.lustre.org> Sent: Fri Mar 07 12:03:17 2008 Subject: Re: [Lustre-discuss] Multihomed

Depreciated client still shown on OST exports

2010 Aug 06

Some clients were removed several weeks ago but are still listed in: ls -l /proc/fs/lustre/obdfilter/*/exports/ This was found after tracing mystery tcp packets back to the OSS. Although this is causing no damage, it raises the question of when former clients will be cleared from the OSS. Is there a way to manually remove these exports from the OSS? -- Regards, David
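The `ls -l` check in the post can be turned into a per-target summary. The following is a minimal Python sketch, assuming each client appears as a subdirectory of `exports/` (that layout is an assumption here, and `list_exports` is a hypothetical helper, not a Lustre tool). It only reads the tree; it does not remove anything.

```python
import glob
import os


def list_exports(root="/proc/fs/lustre/obdfilter"):
    """Map each target directory under root to the sorted subdirectory
    names found in its exports/ directory."""
    exports = {}
    for path in sorted(glob.glob(os.path.join(root, "*", "exports"))):
        target = os.path.basename(os.path.dirname(path))
        # Assumption: one subdirectory per client; plain files are skipped.
        entries = [
            name for name in os.listdir(path)
            if os.path.isdir(os.path.join(path, name))
        ]
        exports[target] = sorted(entries)
    return exports


if __name__ == "__main__":
    for target, clients in list_exports().items():
        print(target, len(clients), " ".join(clients))
```

Taking `root` as a parameter keeps the function usable against a copied or mocked directory tree, which is convenient when comparing snapshots taken weeks apart.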

lustre showing inactive devices

2013 Mar 18

I installed 1 MDS, 2 OSS/OST and 2 Lustre clients. My MDS shows:
[code]
[root at MDS ~]# lctl list_nids
10.94.214.185 at tcp
[root at MDS ~]#
[/code]
On Lustre Client1:
[code]
[root at lustreclient1 lustre]# lfs df -h
UUID bytes Used Available Use% Mounted on
lustre-MDT0000_UUID 4.5G 274.3M 3.9G 6% /mnt/lustre[MDT:0]

OST0006 : inactive device

2013 Mar 18

Files written to an OST are corrupted

2013 Sep 19

Hi, everyone, I need some help in figuring out what may have happened here, as newly created files on an OST are being corrupted. I don't know if this applies to all files written to this OST, or just to files of order 2GB size, but files are definitely being corrupted, with no errors reported by the OSS machine. Let me describe the situation. We had been running Lustre 1.8.4 for

OST acting up

2014 Nov 13

whoops, sent from wrong email address, from right address now: Hello, I am using Lustre 2.4.2 and have an OST that doesn't seem to be written to. When I check the MDS with 'lctl dl' I do not see that OST in the list. However when I check the OSS that OST belongs to I can see it is mounted and up;
0 UP osd-zfs l2-OST0003-osd l2-OST0003-osd_UUID 5
3 UP obdfilter l2-OST0003

How to evict a dead client?

2010 Jul 07

Dear everyone, we are stuck with a problem: the OSS keeps trying to connect to a dead client, or to one whose IP address has changed, until we reboot the dead client. From the OSS log messages, we get the following information: Jul 7 14:45:07 com01 kernel: Lustre: 12180:0:(socklnd_cb.cLustre: 12180:0:(socklnd_cb.c:915:ksocknal_launch_packet()) No usable routes to 12345-202.Lustre:

Unusual Block Allocations on OSTs

2008 Mar 11

Hi, I see some unusual block allocations on my OSTs, and I was wondering if someone could explain to me why, and help me to fix a performance problem. In order to track down a performance issue, I ran the following test:
- I reformatted my OSTs (I have 4 OSTs).
- I created a 10G file on each OST.
- I ran dumpe2fs to see if I had some unusual fragmentation going on.
dumpe2fs shows

1.8.4 and write-through cache

2010 Sep 13

Afternoon. I upgraded our oss's from 1.8.3 to 1.8.4 on Saturday (due to https://bugzilla.lustre.org/show_bug.cgi?id=22755) and suffered a great deal of pain. We have 30 oss's of multiple vintages. The basic difference between them is
* md on first 20 nodes
* 3ware 9650SE ML12 on last 10 nodes
After the upgrade to 1.8.4 we were seeing terrible throughput on the nodes with

OSTs inactive on one client (only)

2013 Apr 29

Hi everyone, I have seen this question here before, but without a very satisfactory answer. One of our half a dozen clients has lost access to a set of OSTs:
> lfs osts
OBDS::
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID INACTIVE
3: lustre-OST0003_UUID INACTIVE
4: lustre-OST0004_UUID INACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
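When comparing several clients, it can help to reduce the `lfs osts` listing quoted above to just the inactive targets. A minimal Python sketch, assuming each entry has the `index: UUID STATE` shape seen in the post (the helper name `inactive_osts` is illustrative only):

```python
import re

# One entry looks like "2: lustre-OST0002_UUID INACTIVE".
# Assumption: only the states ACTIVE and INACTIVE appear, as in the excerpt.
_ENTRY = re.compile(r"(\d+): (\S+) (INACTIVE|ACTIVE)\b")


def inactive_osts(lfs_osts_output):
    """Return (index, uuid) pairs for entries reported as INACTIVE."""
    return [
        (int(idx), uuid)
        for idx, uuid, state in _ENTRY.findall(lfs_osts_output)
        if state == "INACTIVE"
    ]


# Sample copied from the excerpt in the post.
sample = """OBDS::
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID INACTIVE
3: lustre-OST0003_UUID INACTIVE
4: lustre-OST0004_UUID INACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
"""

for index, uuid in inactive_osts(sample):
    print(index, uuid)
```

Because the pattern matches entries rather than whole lines, it also works on output that has been flattened onto a single line, as often happens in mail archives.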

Enable async journals

2010 Jul 13

Hi all, we use SLES 11 and Lustre 1.8.1.1 + patches and would like to convert a Lustre FS using external journals to one with async journals enabled. Question is whether the procedure:
umount <filesystem> on all clients
umount <osts> on all OSSes
e2fsck <ost-device> on all OSSes for all OSTs
tune2fs -O ^has_journal <ost-device> on all

ldiskfs-ext4 interoperability question

2010 Sep 30

Our current Lustre servers run version 1.8.1.1 with the regular ldiskfs. We are looking to expand our Lustre file system with new servers/storage and upgrade all the Lustre servers to 1.8.4 at the same time. We would like to make use of ldiskfs-ext4 on the new servers to use larger OSTs. I just want to confirm the following facts: 1. Is it possible to run different versions

How To change server recovery timeout

2007 Nov 07

Hi, Our lustre environment is: 2.6.9-55.0.9.EL_lustre.1.6.3smp. I would like to change the recovery timeout from the default value of 250s to something longer. I tried the example from the manual: set_timeout <secs> Sets the timeout (obd_timeout) for a server to wait before failing recovery. We performed that experiment on our test lustre installation with one OST. storage02 is our OSS [root at

Lustre-discuss Digest, Vol 25, Issue 17

2008 Feb 12

Hi, i just want to know whether there are any alternative file systems for HP SFS. I heard that there is Cluster Gateway from Polyserve. Can anybody plz help me in finding more abt this Cluster Gateway. Thanks and Regards, Ashok Bharat -----Original Message----- From: lustre-discuss-bounces at lists.lustre.org on behalf of lustre-discuss-request at lists.lustre.org Sent: Tue 2/12/2008 3:18 AM
