Ms. Megan Larko
2010-Mar-02 20:45 UTC
[Lustre-discuss] Unbalanced OST--for discussion purposes
Hi,

I have a Lustre array (kernel 2.6.18-53.1.13.el5_lustre.1.6.4.3smp, i.e. Lustre 1.6.4.3) which will soon be decommissioned in favor of newer hardware, so this question is mostly for my personal intellectual curiosity.

I logged directly into the OSS (OSS4) and just ran a df (along with a periodic check of the log files). I last looked about two weeks ago (I know it was after 17 Feb). Anyway, OST0007 is now more full than any of the other OSTs. The default Lustre stripe count (which I believe is set to 1) is in use. Can just one file shift the space used on one OST that significantly? What other reasonable explanation is there for a difference on one OST in comparison with the others? Could this cause a Lustre performance hit at this point?

[root at oss4 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sdb1             6.3T  3.6T  2.5T  60% /srv/lustre/OST/crew8-OST0000
/dev/sdb2             6.3T  4.1T  1.9T  69% /srv/lustre/OST/crew8-OST0001
/dev/sdc1             6.3T  3.3T  2.8T  55% /srv/lustre/OST/crew8-OST0002
/dev/sdc2             6.3T  3.3T  2.7T  56% /srv/lustre/OST/crew8-OST0003
/dev/sdd1             6.3T  3.5T  2.6T  58% /srv/lustre/OST/crew8-OST0004
/dev/sdd2             6.3T  4.1T  1.9T  69% /srv/lustre/OST/crew8-OST0005
/dev/sdi1             6.3T  3.9T  2.2T  65% /srv/lustre/OST/crew8-OST0006
/dev/sdi2             6.3T  5.0T 1015G  84% /srv/lustre/OST/crew8-OST0007  <----
/dev/sdj1             6.3T  3.4T  2.7T  56% /srv/lustre/OST/crew8-OST0008
/dev/sdj2             6.3T  3.3T  2.7T  56% /srv/lustre/OST/crew8-OST0009
/dev/sdk1             6.3T  3.4T  2.7T  56% /srv/lustre/OST/crew8-OST0010
/dev/sdk2             6.3T  3.8T  2.2T  64% /srv/lustre/OST/crew8-OST0011

Still learning....
megan
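For what it's worth, the same per-OST usage is visible from any Lustre client with lfs df, without logging into each OSS in turn. A minimal sketch, assuming a hypothetical client mount point of /mnt/crew8:

$ lfs df -h /mnt/crew8    # reports size/used/available per OST (by UUID) plus a filesystem total

An over-full OST such as crew8-OST0007_UUID stands out there the same way it does in the df output above.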
Brian J. Murrell
2010-Mar-02 21:00 UTC
[Lustre-discuss] Unbalanced OST--for discussion purposes
On Tue, 2010-03-02 at 15:45 -0500, Ms. Megan Larko wrote:
> Hi,

Hi,

> I logged directly into the OSS (OSS4) and just ran a df (along with a
> periodic check of the log files). I last looked about two weeks ago
> (I know it was after 17 Feb).

Is the implication that at this point the OSTs were more or less well balanced?

> Anyway, the OST0007 is more full than
> any of the other OSTs. The default lustre stripe (I believe that is
> set to 1) is used. Can just one file shift the size used of one OST
> that significantly?

Sure. As an example, if one had a 1KiB file on that OST, called, let's say, "1K_file.dat", and one did:

$ dd if=/dev/zero of=1K_file.dat bs=1G count=1024

that would overwrite the 1KiB file on that OST with a 1TiB file. Recognizing of course that that would be 1TiB in a single object on an OST.

> What other reasonable explanation for a
> difference on one OST in comparison with the others?

Any kind of variation on the above.

> Could this cause
> a lustre performance hit at this point?

Not really.

b.
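Whether a particular file's object really lives on the suspect OST can be checked from a client with lfs getstripe. A sketch, reusing the hypothetical file name from the dd example above:

$ lfs getstripe 1K_file.dat    # the obdidx column gives the OST index holding each object

With a stripe count of 1 there is a single obdidx entry; a value of 7 would place the object on crew8-OST0007.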
Andreas Dilger
2010-Mar-03 04:06 UTC
[Lustre-discuss] Unbalanced OST--for discussion purposes
On 2010-03-02, at 13:45, Ms. Megan Larko wrote:
> I logged directly into the OSS (OSS4) and just ran a df (along with a
> periodic check of the log files). I last looked about two weeks ago
> (I know it was after 17 Feb). Anyway, the OST0007 is more full than
> any of the other OSTs. The default lustre stripe (I believe that is
> set to 1) is used. Can just one file shift the size used of one OST
> that significantly?

Sure, this is easy if the size of a single file is a large fraction of the OST size. This is one reason why we recommend people use larger OSTs (up to 16TB in 1.8.2 with RHEL5.4) instead of, e.g., the 1TB or less that is sometimes reported here.

> What other reasonable explanation for a difference on one OST in
> comparison with the others? Could this cause a lustre performance
> hit at this point?

It is possible, if the filesystem is getting very full and it causes more seeking to do IO. At the 84% you report below it is starting to get into that range. I wouldn't recommend running the filesystem beyond 90% full unless you are more concerned with space usage than performance.

You can find the file(s) that are abnormally large on that particular OST by running (preferably on a client mountpoint on the MDS):

lfs find --obd crew8-OST0007_UUID -size +10G /mnt/lustre

> [root at oss4 ~]# df -h
> Filesystem            Size  Used Avail Use% Mounted on
> /dev/sdb1             6.3T  3.6T  2.5T  60% /srv/lustre/OST/crew8-OST0000
> /dev/sdb2             6.3T  4.1T  1.9T  69% /srv/lustre/OST/crew8-OST0001
> /dev/sdc1             6.3T  3.3T  2.8T  55% /srv/lustre/OST/crew8-OST0002
> /dev/sdc2             6.3T  3.3T  2.7T  56% /srv/lustre/OST/crew8-OST0003
> /dev/sdd1             6.3T  3.5T  2.6T  58% /srv/lustre/OST/crew8-OST0004
> /dev/sdd2             6.3T  4.1T  1.9T  69% /srv/lustre/OST/crew8-OST0005
> /dev/sdi1             6.3T  3.9T  2.2T  65% /srv/lustre/OST/crew8-OST0006
> /dev/sdi2             6.3T  5.0T 1015G  84% /srv/lustre/OST/crew8-OST0007  <----
> /dev/sdj1             6.3T  3.4T  2.7T  56% /srv/lustre/OST/crew8-OST0008
> /dev/sdj2             6.3T  3.3T  2.7T  56% /srv/lustre/OST/crew8-OST0009
> /dev/sdk1             6.3T  3.4T  2.7T  56% /srv/lustre/OST/crew8-OST0010
> /dev/sdk2             6.3T  3.8T  2.2T  64% /srv/lustre/OST/crew8-OST0011

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
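Once such a file has been identified, a low-tech way to rebalance it on a 1.6/1.8 filesystem is to recreate it on an emptier OST and rename it into place. A sketch only: it assumes the file is not in active use, and the name bigfile.dat and target OST index 3 are hypothetical (newer releases also ship an lfs_migrate helper that automates this):

$ lfs setstripe -c 1 -i 3 bigfile.dat.new   # empty file with its single stripe on OST0003
$ cp -p bigfile.dat bigfile.dat.new         # copy the data into the new object
$ mv bigfile.dat.new bigfile.dat            # rename over the original; its old object is freed

The flag spellings here (-c stripe count, -i starting OST index) follow 1.8-era lfs; very old 1.6 releases took positional arguments instead.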
Ms. Megan Larko
2010-Mar-03 19:25 UTC
[Lustre-discuss] Unbalanced OST--for discussion purposes
Thanks to both Brian and Andreas for the timely responses.

Brian posed the question as to whether or not the OSTs were more or less balanced two weeks ago. The answer is that I believe they were. Usually all OSTs report a similar percentage of usage (within 1% to 3% of one another). I believe that is why this new report piqued my curiosity.

Regarding Andreas's remark about individual OST size: yes, I understand that having larger individual OSTs can keep any one OST from becoming so full that the others degrade in performance (per A. Dilger, not B. Murrell). For that reason I personally like the option available in newer Lustre releases (I think 1.8.x and higher) to allow up to 16TB in a single OST slice. I know the previous limit was 8TB per OST slice, as a precaution against data corruption. (I was able to build a larger OST slice with 1.6.7, but I was cautioned that some data might become unreachable and/or corrupted, as Lustre had not at that time been modified to accept the larger partition sizes which the underlying file systems, ext4 and xfs, would accept.) The OST formatted size of 6.3TB fit nicely into the JBOD scheme of evenly-sized partitions.

Thanks,
megan
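If an OST does run nearly full before it can be rebalanced, one documented stopgap is to deactivate that OST's OSC on the MDS, so the MDS stops allocating new objects to it while clients can still read and write existing objects. A sketch; the device number 11 is hypothetical and must be read from the lctl dl listing:

mds# lctl dl | grep osc            # find the device number of the crew8-OST0007 OSC
mds# lctl --device 11 deactivate   # stop new object allocation to that OST

Running "lctl --device 11 activate" re-enables allocation once space has been freed.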