Displaying 20 results from an estimated 3000 matches similar to: "How to track down a latency/timing problem"
2007 Nov 29
2
Balancing I/O Load
We are seeing some disturbing (probably due to our ignorance)
behavior from lustre 1.6.3 right now. We have 8 OSSs with 3 OSTs
per OSS (24 physical LUNs). We just created a brand new lustre file
system across this configuration using the default mkfs.lustre
formatting options. We have this file system mounted across 400
clients.
At the moment, we have 63 IOzone threads running
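An imbalance like this is often a striping question first; a minimal sketch of checking and widening striping, assuming a hypothetical /mnt/lustre mount point (option spelling varies slightly across 1.6/1.8 releases):

  mkdir -p /mnt/lustre/benchdir
  lfs setstripe -c -1 -s 1m /mnt/lustre/benchdir   # stripe new files across all OSTs, 1 MB stripes
  lfs getstripe /mnt/lustre/benchdir               # confirm the layout before re-running IOzone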
2008 Apr 15
4
NFS Performance
Hi,
With help from Oleg we got the right patches applied and NFS working
well. Maximum performance was about 60 MB/sec. Last week that dropped
to about 12.5 MB/sec and I cannot find a reason. Lustre clients all
obtain 100+ MB/sec on GigE. Each OST is good for 270 MB/sec. When
mounting the client on one of the OSSs I get 230 MB/sec. Seems the
speed is there. How can NFS and Lustre be tuned
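The usual knobs when re-exporting Lustre over NFS are readahead on the re-export node and transfer size on the NFS clients; a minimal sketch with illustrative values, not recommendations:

  # on the Lustre client acting as NFS server: raise client readahead
  lctl set_param llite.*.max_read_ahead_mb=64
  # on the NFS client (size caps vary by kernel)
  mount -o rsize=32768,wsize=32768,tcp nfsserver:/export /mnt/nfs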
2007 Nov 26
15
bad 1.6.3 striped write performance
Hi,
I'm seeing what can only be described as dismal striped write
performance from lustre 1.6.3 clients :-/
1.6.2 and 1.6.1 clients are fine. 1.6.4rc3 clients (from cvs a couple
of days ago) are also terrible.
the below shows that the OS (centos4.5/5) or fabric (gigE/IB) or lustre
version on the servers doesn't matter - the problem is with the 1.6.3
and 1.6.4rc3 client kernels
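A minimal reproduction sketch for pinning this on the client side, assuming a hypothetical /mnt/lustre mount and a fixed stripe count:

  mkdir -p /mnt/lustre/stripetest
  lfs setstripe -c 4 /mnt/lustre/stripetest
  dd if=/dev/zero of=/mnt/lustre/stripetest/f bs=1M count=2048 conv=fsync

Running the identical command from a 1.6.2 client and a 1.6.3 client against the same servers isolates the client regression.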
2012 Oct 19
6
Large Corosync/Pacemaker clusters
Hi,
We're setting up fairly large Lustre 2.1.2 filesystems, each with 18
nodes and 159 resources all in one Corosync/Pacemaker cluster, as
suggested by our vendor. We're getting mixed messages on how large a
Corosync/Pacemaker cluster will work well, between our vendor and others.
1. Are there Lustre Corosync/Pacemaker clusters out there of this
size or larger?
2.
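For rings this large, the totem timeouts usually need raising beyond the defaults; a hedged corosync.conf sketch with illustrative values only:

  totem {
    version: 2
    token: 17000        # longer token timeout for an 18-node ring
    consensus: 20400    # conventionally ~1.2 * token
  }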
2013 Apr 29
1
OSTs inactive on one client (only)
Hi everyone,
I have seen this question here before, but without a very
satisfactory answer. One of our half a dozen clients has
lost access to a set of OSTs:
> lfs osts
OBDS:
0: lustre-OST0000_UUID ACTIVE
1: lustre-OST0001_UUID ACTIVE
2: lustre-OST0002_UUID INACTIVE
3: lustre-OST0003_UUID INACTIVE
4: lustre-OST0004_UUID INACTIVE
5: lustre-OST0005_UUID ACTIVE
6: lustre-OST0006_UUID ACTIVE
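If the OSTs themselves are healthy, this is usually a deactivated import on that one client; a minimal recovery sketch with a hypothetical device number:

  # find the osc device for one inactive OST on this client
  lctl dl | grep OST0002
  # reactivate it by device number
  lctl --device 11 activate

Remounting the client is the blunt fallback if the import will not recover.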
2007 Nov 30
1
lustre-1.8 OSD
lustre-1.8 has OSD structures in place, what do I need to add in to make it
work with OSD T10 standard? could anybody point me to some docs mentioning
lustre internals - OSTs, OSSs, OBDs, and control flow when a read/write call
is invoked by a client. thanks.
2010 Jun 22
7
lnet infiniband config
Hi all,
I'm getting my feet wet in the infiniband lake and of course I run into
some problems.
It would seem I got the compilation part of sles11 kernel 2.6.27 +
Lustre 1.8.3 + ofed 1.4.2 right, because it allows me to see and use the
infiniband fabric, and because ko2iblnd loads without any complaints.
In /etc/modprobe.d/lustre (this is a Debian system, hence this subdir of
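For reference, such a modprobe file typically carries only the LNET network mapping; a minimal sketch assuming the IB interface is ib0:

  # /etc/modprobe.d/lustre (illustrative)
  options lnet networks="o2ib0(ib0),tcp0(eth0)"

LNET reads this only at module load, so changes require unloading and reloading the lnet/lustre modules (or a reboot).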
2010 Aug 17
18
write RPC & congestion
Hi, thanks for previous help.
I have some question about Lustre RPC and the sequence of events that
occur during large concurrent write() involving many processes and large
data size per process. I understand there is a mechanism of flow
control by credits, but I'm a little unclear on how it works in general
after reading the "networking & io protocol" white paper.
Is
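The client-side values that interact with those credits can at least be inspected; a minimal sketch for a 1.8-era client:

  # per-OSC cap on concurrent write RPCs
  lctl get_param osc.*.max_rpcs_in_flight
  # LNET peer credits currently in use
  cat /proc/sys/lnet/peers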
2010 Sep 30
1
ldiskfs-ext4 interoperability question
Our current Lustre servers run version 1.8.1.1 with the regular ldiskfs.
We are looking to expand our Lustre file system with new servers/storage
and to upgrade all the Lustre servers to 1.8.4 at the same time. We
would like to make use of ldiskfs-ext4 on the new servers to allow
larger OSTs.
I just want to confirm the following facts:
1. Is it possible to run different versions
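For reference, formatting a new large OST looks the same with the ext4 backend; a hedged sketch with hypothetical device and NIDs:

  mkfs.lustre --ost --fsname=lustre --mgsnode=mgs@tcp0 /dev/sdb

On 1.8.4 the ext4-based ldiskfs is, as far as we can tell, selected by which server/ldiskfs packages are installed rather than by a format-time flag.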
2008 Mar 07
2
Multihomed question: want Lustre over IB andEthernet
Chris,
Perhaps you need to perform some write_conf like command. I'm not sure if this is needed in 1.6 or not.
Shane
----- Original Message -----
From: lustre-discuss-bounces at lists.lustre.org <lustre-discuss-bounces at lists.lustre.org>
To: lustre-discuss <lustre-discuss at lists.lustre.org>
Sent: Fri Mar 07 12:03:17 2008
Subject: Re: [Lustre-discuss] Multihomed
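The writeconf procedure being referred to is roughly the following; a hedged sketch with hypothetical device paths, run only with the whole filesystem stopped:

  # regenerate configuration logs: MGS/MDT first, then each OST
  tunefs.lustre --writeconf /dev/mdtdev
  tunefs.lustre --writeconf /dev/ostdev
  # remount order: MGS/MDT, then OSTs, then clients

This rewrites the configuration logs so that new NIDs are picked up.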
2008 Mar 04
16
Cannot send after transport endpoint shutdown (-108)
This morning I've had both my infiniband and tcp lustre clients hiccup. They are evicted from the server presumably as a result of their high load and consequent timeouts. My question is - why don't the clients re-connect? The infiniband and tcp clients both give the following message when I type "df" - Cannot send after transport endpoint shutdown (-108). I've
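When evicted clients will not reconnect on their own, the usual blunt workaround is a forced remount; a minimal sketch with a hypothetical mount point:

  umount -f /mnt/lustre
  mount -t lustre mgsnode@tcp0:/lustre /mnt/lustre

Raising the server-side timeout (lctl conf_param <fsname>.sys.timeout=... on the MGS in this era) is the longer-term fix for load-induced evictions.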
2010 Jul 13
4
Enable async journals
Hi all,
we use SLES 11 and Lustre 1.8.1.1 + patches and would like to convert a
Lustre FS using external journals to one with async journals enabled.
The question is whether the procedure:
umount <filesystem> on all clients
umount <osts> on all OSSes
e2fsck <ost-device> on all OSSes for all OSTs
tune2fs -O ^has_journal <ost-device> on all
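The remaining steps are presumably the mirror image on each OST device; a hedged sketch of the full conversion, assuming Lustre-patched e2fsprogs:

  tune2fs -O ^has_journal /dev/ostdev   # detach the external journal
  tune2fs -j /dev/ostdev                # create an internal journal
  e2fsck -f /dev/ostdev                 # verify before remounting

Async journal commit is then toggled at runtime on 1.8 servers, e.g. lctl set_param obdfilter.*.sync_journal=0.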
2010 Jul 05
4
Adding OST to online Lustre with quota
Hello,
we wonder whether it is possible to add OSTs to a Lustre filesystem with
quota support without taking it offline?
We tried to do this, but all quota information was lost. Despite the
fact that the OST was formatted with quota support,
we are receiving this error message:
Lustre: 3743:0:(lproc_quota.c:447:lprocfs_quota_wr_type())
lustrefs-OST0016: quotaon failed because quota files
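After a new OST is added, its quota files have to be rebuilt before quotaon can succeed; a minimal 1.8-era sketch from a client with the filesystem mounted:

  lfs quotacheck -ug /mnt/lustre   # rebuild user and group quota files
  lfs quotaon -ug /mnt/lustre

quotacheck scans every target, so it is slow on a populated filesystem, but in this era it is hard to avoid after adding OSTs.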
2010 Jul 08
5
No space left on device on not full filesystem
Hello,
We are running Lustre 1.8.1 and have hit a "No space left on device"
error when uploading 500 GB of small files (less than 100 KB each).
The problem seems to depend on the number of files. If we remove one
file, we can create one new file, even of GB size; but if we haven't
removed anything, we can't create even a very small file, for example
using touch
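With many small files the first suspect is inode/object exhaustion rather than byte capacity; a minimal check from any client:

  lfs df      # bytes free per MDT/OST
  lfs df -i   # inodes/objects free; a full MDT returns ENOSPC on create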
2007 Dec 13
4
Lustre drawback
Hello everybody,
at the following pages:
http://www.rit.edu/~rc/docs/Survey_of_Clustered_Parallel_File_Systems_004_LANL.ppt
http://www.intel.com/cd/ids/developer/asmo-na/eng/dc/tools/threading/238284.htm?page=2
I read:
"[...] Currently, one additional drawback to Lustre is that a Lustre
client cannot be on a server that is providing OSTs. This solution is
being worked on and may be
2008 Mar 03
1
Quota setup fails because of OST ordering
Hi all,
after installing a Lustre test file system consisting of 34 OSTs, I
encountered a strange error when trying to set up quotas:
lfs quotacheck gave me an "Input/Output error", while in
/var/log/kern.log I found a Lustre error
LustreError: 20807:0:(quota_check.c:227:lov_quota_check()) lov idx 32
inactive
Indeed, in /proc/fs/lustre/lov/.../target_obd all 34 OSTs were listed
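A quick way to cross-check which LOV index is inactive is to compare that listing against the client's device table; a minimal sketch:

  cat /proc/fs/lustre/lov/*/target_obd   # index, UUID and state per OST
  lctl dl                                # local device list with states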
2013 Mar 11
4
Understanding lustre setup ..
Hello,
I have been reading
http://wiki.lustre.org/images/1/1b/Hadoop_wp_v0.4.2.pdf for setting up
Hadoop over lustre.
Generally in hadoop setup, we have 1 Namenode and various number of datanodes.
If I want to setup the same keeping Lustre as backend, in the document
it is mentioned that:
".............Our experiments run on cluster with 8 nodes in total,
one is mds/namenode, the rest are
2012 Sep 27
4
Bad reporting inodes free
Hello,
When I run "df -i" on my clients I get 95% inodes used, i.e. 5% inodes free:

Filesystem                           Inodes    IUsed   IFree IUse% Mounted on
lustre-mds-01:lustre-mds-02:/cetafs 22200087 20949839 1250248  95% /mnt/data

But if I run "lfs df -i" I get:

UUID                 Inodes    IUsed   IFree I
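The two views are not measuring the same thing, so some disagreement is expected; a minimal sketch of reconciling them:

  df -i /mnt/data        # client summary derived from the MDT statistics
  lfs df -i /mnt/data    # per-target breakdown: MDT inodes, OST objects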
2004 Jan 11
3
Lustre 1.0.2 packages available
Greetings--
Packages for Lustre 1.0.2 are now available in the usual place
http://www.clusterfs.com/download.html
This bug-fix release resolves a number of issues, of which a few are
user-visible:
- the default debug level is now a more reasonable production value
- zero-copy TCP is now enabled by default, if your hardware supports it
- you should encounter fewer allocation failures