thr3ads.net - similar to: "How to track down a latency/timing problem"

Displaying 20 results from an estimated 3000 matches similar to: "How to track down a latency/timing problem"

2007 Nov 29

Balancing I/O Load

We are seeing some disturbing (probably due to our ignorance) behavior from lustre 1.6.3 right now. We have 8 OSSs with 3 OSTs per OSS (24 physical LUNs). We just created a brand new lustre file system across this configuration using the default mkfs.lustre formatting options. We have this file system mounted across 400 clients. At the moment, we have 63 IOzone threads running

NFS Performance

2008 Apr 15

Hi, With help from Oleg we got the right patches applied and NFS working well. Maximum performance was about 60 MB/sec. Last week that dropped to about 12.5 MB/sec and I cannot find a reason. Lustre clients all obtain 100+ MB/sec on GigE. Each OST is good for 270 MB/sec. When mounting the client on one of the OSSs I get 230 MB/sec. Seems the speed is there. How can NFS and Lustre be tuned

bad 1.6.3 striped write performance

2007 Nov 26

Hi, I'm seeing what can only be described as dismal striped write performance from lustre 1.6.3 clients :-/ 1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (from cvs a couple of days ago) are also terrible. The below shows that the OS (centos4.5/5) or fabric (gigE/IB) or lustre version on the servers doesn't matter - the problem is with the 1.6.3 and 1.6.4rc3 client kernels

Large Corosync/Pacemaker clusters

2012 Oct 19

Hi, We're setting up fairly large Lustre 2.1.2 filesystems, each with 18 nodes and 159 resources all in one Corosync/Pacemaker cluster as suggested by our vendor. We're getting mixed messages from our vendor and others on how large a Corosync/Pacemaker cluster will work well. 1. Are there Lustre Corosync/Pacemaker clusters out there of this size or larger? 2.

OSTs inactive on one client (only)

2013 Apr 29

Hi everyone, I have seen this question here before, but without a very satisfactory answer. One of our half a dozen clients has lost access to a set of OSTs:

> lfs osts
OBDS::
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID INACTIVE
3: lustre-OST0003_UUID INACTIVE
4: lustre-OST0004_UUID INACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE

lustre-1.8 OSD

2007 Nov 30

lustre-1.8 has OSD structures in place; what do I need to add to make it work with the OSD T10 standard? Could anybody point me to some docs covering lustre internals - OSTs, OSSs, OBDs, and the control flow when a read/write call is invoked by a client? Thanks.

lnet infiniband config

2010 Jun 22

Hi all, I'm getting my feet wet in the infiniband lake and of course I run into some problems. It would seem I got the compilation part of sles11 kernel 2.6.27 + Lustre 1.8.3 + ofed 1.4.2 right, because it allows me to see and use the infiniband fabric, and because ko2iblnd loads without any complaints. In /etc/modprobe.d/lustre (this is a Debian system, hence this subdir of

write RPC & congestion

2010 Aug 17

Hi, thanks for the previous help. I have some questions about Lustre RPC and the sequence of events that occurs during large concurrent write() calls involving many processes and a large data size per process. I understand there is a mechanism of flow control by credits, but I'm a little unclear on how it works in general after reading the "networking & io protocol" white paper. Is

ldiskfs-ext4 interoperability question

2010 Sep 30

Our current Lustre servers run version 1.8.1.1 with the regular ldiskfs. We are looking to expand our Lustre file system with new servers/storage and upgrade all the Lustre servers to 1.8.4 at the same time. We would like to make use of ldiskfs-ext4 on the new servers to use larger OSTs. I just want to confirm the following facts: 1. Is it possible to run different versions

Multihomed question: want Lustre over IB and Ethernet

2008 Mar 07

Chris, Perhaps you need to perform some write_conf-like command. I'm not sure if this is needed in 1.6 or not. Shane

Cannot send after transport endpoint shutdown (-108)

2008 Mar 04

This morning I've had both my infiniband and tcp lustre clients hiccup. They are evicted from the server, presumably as a result of their high load and consequent timeouts. My question is: why don't the clients reconnect? The infiniband and tcp clients both give the following message when I type "df" - Cannot send after transport endpoint shutdown (-108). I've

Enable async journals

2010 Jul 13

Hi all, we use SLES 11 and Lustre 1.8.1.1 + patches and would like to convert a Lustre FS using external journals to one with async journals enabled. The question is whether the procedure:

umount <filesystem> on all clients
umount <osts> on all OSSes
e2fsck <ost-device> on all OSSes for all OSTs
tune2fs -O ^has_journal <ost-device> on all

Adding OST to online Lustre with quota

2010 Jul 05

Hello, we wonder whether it is possible to add OSTs to a Lustre file system with quota support without taking it offline? We tried to do this but all quota information was lost. Despite the fact that the OST was formatted with quota support, we are receiving this error message: Lustre: 3743:0:(lproc_quota.c:447:lprocfs_quota_wr_type()) lustrefs-OST0016: quotaon failed because quota files

No space left on device on not full filesystem

2010 Jul 08

Hello, We are running lustre 1.8.1 and have hit a "No space left on device" error when uploading 500 GB of small files (less than 100 KB each). The problem seems to depend on the number of files. If we remove one file, we can create one new file, even one of GB size; but if we haven't removed anything we can't create even a very small file, for example using touch

Lustre drawback

2007 Dec 13

Hello everybody, at the following pages: http://www.rit.edu/~rc/docs/Survey_of_Clustered_Parallel_File_Systems_004_LANL.ppt http://www.intel.com/cd/ids/developer/asmo-na/eng/dc/tools/threading/238284.htm?page=2 I read: "[...] Currently, one additional drawback to Lustre is that a Lustre client cannot be on a server that is providing OSTs. This solution is being worked on and may be

Quota setup fails because of OST ordering

2008 Mar 03

Hi all, after installing a Lustre test file system consisting of 34 OSTs, I encountered a strange error when trying to set up quotas: lfs quotacheck gave me an "Input/Output error", while in /var/log/kern.log I found a Lustre error LustreError: 20807:0:(quota_check.c:227:lov_quota_check()) lov idx 32 inactive Indeed, in /proc/fs/lustre/lov/.../target_obd all 34 OSTs were listed

Understanding lustre setup ..

2013 Mar 11

Hello, I have been reading http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf for setting up Hadoop over lustre. Generally in a hadoop setup, we have 1 Namenode and a varying number of datanodes. If I want to set up the same with Lustre as the backend, the document mentions that: "...Our experiments run on cluster with 8 nodes in total, one is mds/namenode, the rest are

Bad reporting inodes free

2012 Sep 27

Hello, When I run "df -i" on my clients I get 95% inodes used or 5% inodes free:

Filesystem                            Inodes    IUsed     IFree    IUse%  Mounted on
lustre-mds-01:lustre-mds-02:/cetafs   22200087  20949839  1250248  95%    /mnt/data

But if I run lfs df -i I get:

UUID Inodes IUsed IFree I

Lustre 1.0.2 packages available

2004 Jan 11

Greetings-- Packages for Lustre 1.0.2 are now available in the usual place http://www.clusterfs.com/download.html

This bug-fix release resolves a number of issues, of which a few are user-visible:

- the default debug level is now a more reasonable production value
- zero-copy TCP is now enabled by default, if your hardware supports it
- you should encounter fewer allocation failures
