Hi all,

We are running Lustre 1.8.1 on about 800 clients. A few days ago we started hitting a weird problem several times: when writing data, some clients reported

  LustreError: 11-0: an error occurred while communicating with xx.xx.xx.xx at o2ib. The ost_write operation failed with -28

where xx.xx.xx.xx is one of our OSS nodes. As far as we know, errno 28 is "no space left on device", but neither the MDS nor the OSS is anywhere near full. After some time, everything appears to be OK again.

The space used:

# lfs df -h
UUID                   bytes      Used  Available  Use%  Mounted on
lustre-MDT0000_UUID   350.0G      1.1G     328.9G    0%  /home[MDT:0]
lustre-OST0000_UUID     6.2T    263.5G       5.6T    4%  /home[OST:0]
lustre-OST0001_UUID     6.2T    264.3G       5.6T    4%  /home[OST:1]
lustre-OST0002_UUID     5.7T    261.7G       5.2T    4%  /home[OST:2]
lustre-OST0003_UUID     5.4T    206.6G       4.9T    3%  /home[OST:3]
lustre-OST0004_UUID     4.6T    197.1G       4.2T    4%  /home[OST:4]
lustre-OST0005_UUID     4.6T    160.6G       4.2T    3%  /home[OST:5]
lustre-OST0006_UUID     4.6T    300.7G       4.1T    6%  /home[OST:6]
lustre-OST0007_UUID     4.6T    174.1G       4.2T    3%  /home[OST:7]
lustre-OST0008_UUID     6.9T    232.7G       6.4T    3%  /home[OST:8]
lustre-OST0009_UUID     6.9T    237.7G       6.4T    3%  /home[OST:9]
lustre-OST000a_UUID     6.2T    219.9G       5.6T    3%  /home[OST:10]
lustre-OST000b_UUID     6.2T    257.8G       5.6T    4%  /home[OST:11]
lustre-OST000c_UUID     6.2T    784.6G       5.1T   12%  /home[OST:12]
lustre-OST000d_UUID     6.2T    227.2G       5.6T    3%  /home[OST:13]
lustre-OST000e_UUID     5.7T    199.2G       5.2T    3%  /home[OST:14]
lustre-OST000f_UUID     5.4T    221.9G       4.9T    4%  /home[OST:15]
lustre-OST0010_UUID     4.6T    176.4G       4.2T    3%  /home[OST:16]
lustre-OST0011_UUID     4.6T    160.9G       4.2T    3%  /home[OST:17]
lustre-OST0012_UUID     3.1T    118.3G       2.8T    3%  /home[OST:18]
lustre-OST0013_UUID     3.1T     99.0G       2.8T    3%  /home[OST:19]
lustre-OST0014_UUID     6.9T    243.7G       6.4T    3%  /home[OST:20]
lustre-OST0015_UUID     6.9T    273.6G       6.3T    3%  /home[OST:21]
lustre-OST0016_UUID     6.2T    335.6G       5.5T    5%  /home[OST:22]
lustre-OST0017_UUID     6.2T    219.1G       5.6T    3%  /home[OST:23]

The inodes used:

# lfs df -ih
UUID                  Inodes     IUsed     IFree  IUse%  Mounted on
lustre-MDT0000_UUID    89.3M      2.1M     87.2M     2%  /home[MDT:0]
lustre-OST0000_UUID     6.2M     91.3K      6.1M     1%  /home[OST:0]
lustre-OST0001_UUID     6.2M     90.7K      6.1M     1%  /home[OST:1]
lustre-OST0002_UUID     5.7M     83.1K      5.6M     1%  /home[OST:2]
lustre-OST0003_UUID     5.4M     80.1K      5.3M     1%  /home[OST:3]
lustre-OST0004_UUID     4.6M     68.8K      4.6M     1%  /home[OST:4]
lustre-OST0005_UUID     4.6M     69.8K      4.6M     1%  /home[OST:5]
lustre-OST0006_UUID     4.6M     69.4K      4.6M     1%  /home[OST:6]
lustre-OST0007_UUID     4.6M     69.2K      4.6M     1%  /home[OST:7]
lustre-OST0008_UUID     6.9M    101.7K      6.8M     1%  /home[OST:8]
lustre-OST0009_UUID     6.9M    101.3K      6.8M     1%  /home[OST:9]
lustre-OST000a_UUID     6.2M     91.0K      6.1M     1%  /home[OST:10]
lustre-OST000b_UUID     6.2M     90.9K      6.1M     1%  /home[OST:11]
lustre-OST000c_UUID     6.2M     86.1K      6.1M     1%  /home[OST:12]
lustre-OST000d_UUID     6.2M     90.2K      6.1M     1%  /home[OST:13]
lustre-OST000e_UUID     5.7M     83.4K      5.6M     1%  /home[OST:14]
lustre-OST000f_UUID     5.4M     80.4K      5.3M     1%  /home[OST:15]
lustre-OST0010_UUID     4.6M     69.0K      4.6M     1%  /home[OST:16]
lustre-OST0011_UUID     4.6M     69.3K      4.6M     1%  /home[OST:17]
lustre-OST0012_UUID     3.1M     46.7K      3.0M     1%  /home[OST:18]
lustre-OST0013_UUID     3.1M     46.6K      3.0M     1%  /home[OST:19]
lustre-OST0014_UUID     6.9M    101.5K      6.8M     1%  /home[OST:20]
lustre-OST0015_UUID     6.9M    101.3K      6.8M     1%  /home[OST:21]
lustre-OST0016_UUID     6.2M     90.1K      6.1M     1%  /home[OST:22]
lustre-OST0017_UUID     6.2M     90.7K      6.1M     1%  /home[OST:23]

So why did this happen? BTW, neither the OSS nor the MDS logged anything about "no space left". The OS is SLES 10 SP2.
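One thing we wondered about (not sure it is relevant): -28 on an ost_write can apparently also come from the OST grant accounting rather than from the real free space. A sketch of the counters we could compare, assuming the usual 1.8 parameter names (obdfilter.*.tot_granted etc. on the OSS, osc.*.cur_grant_bytes on a client):

  # on one of the OSS nodes: space currently granted / pending / dirty per OST
  # (parameter names assumed from the 1.8 obdfilter /proc interface)
  lctl get_param obdfilter.*.tot_granted obdfilter.*.tot_pending obdfilter.*.tot_dirty

  # on a client: how much write grant each OSC currently holds (assumed name)
  lctl get_param osc.*.cur_grant_bytes

If tot_granted on an OST were close to its free space, that might explain clients getting -28 while lfs df still shows plenty of room, but we have not confirmed this.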