/* Chris */
2008-Aug-21 04:13 UTC
[Lustre-discuss] "No space left" error while Lustre still has free space
Hi all,

I hit a problem while testing lustre-1.6.5.1 on CentOS-5.2. I have four machines (PCs): an MGS co-located with the MDT, OSS-1, OSS-2, and a client (CLT). OSS-1 has two disks, formatted as ost01 (40GB) and ost02 (15GB). OSS-2 has two disks, formatted as ost03 (23GB) and ost04 (5GB).

First, I reformatted the MGS/MDT and mounted it:

[root@MGS ~]# mkfs.lustre --reformat --fsname=testfs --mgs --mdt /dev/hdb
[root@MGS ~]# mount -t lustre /dev/hdb /mnt/mgs

Second, I reformatted the OSTs and mounted them:

[root@OSS-1 ~]# mkfs.lustre --reformat --fsname=testfs --ost --mgsnode=MGS@tcp /dev/hdc
[root@OSS-1 ~]# mkfs.lustre --reformat --fsname=testfs --ost --mgsnode=MGS@tcp /dev/hdd
[root@OSS-1 ~]# mount -t lustre /dev/hdc /mnt/ost01
[root@OSS-1 ~]# mount -t lustre /dev/hdd /mnt/ost02

[root@OSS-2 ~]# mkfs.lustre --reformat --fsname=testfs --ost --mgsnode=MGS@tcp /dev/hdc
[root@OSS-2 ~]# mkfs.lustre --reformat --fsname=testfs --ost --mgsnode=MGS@tcp /dev/hdd
[root@OSS-2 ~]# mount -t lustre /dev/hdc /mnt/ost03
[root@OSS-2 ~]# mount -t lustre /dev/hdd /mnt/ost04

Third, I mounted the Lustre file system on the client:

[root@CLT ~]# mount -t lustre MGS@tcp:/testfs /mnt/lfs
[root@CLT mnt]# df -h
Filesystem                       Capacity  Used  Available  Use%  Mounted on
/dev/mapper/VolGroup00-LogVol00      4.3G  1.9G       2.2G   46%  /
/dev/hda1                             99M   67M        28M   72%  /boot
tmpfs                                252M     0       252M    0%  /dev/shm
MGS@tcp:/testfs                       82G  1.6G        77G    2%  /mnt/lfs

Fourth, I used the "lfs" command to set the stripe parameters on the client:

[root@CLT mnt]# lfs setstripe lfs -s 8m -c -1

Fifth, I used "dd" to test the Lustre file system. It then gives the error "No space left on device" once just ost04 (5GB) gets full.
[root@CLT lfs]# dd if=/dev/zero of=testfile001 bs=128M count=24
24+0 records in
24+0 records out
3221225472 bytes (3.2 GB) copied, 164.585 seconds, 19.6 MB/s
[root@CLT lfs]# dd if=/dev/zero of=testfile002 bs=128M count=24
24+0 records in
24+0 records out
3221225472 bytes (3.2 GB) copied, 164.836 seconds, 19.5 MB/s
[root@CLT lfs]# dd if=/dev/zero of=testfile003 bs=128M count=48
48+0 records in
48+0 records out
6442450944 bytes (6.4 GB) copied, 383.2 seconds, 16.8 MB/s
[root@CLT lfs]# dd if=/dev/zero of=testfile004 bs=128M count=48
dd: writing `testfile004': No space left on device
47+0 records in
46+0 records out
6301048832 bytes (6.3 GB) copied, 418.321 seconds, 15.1 MB/s

[root@CLT lfs]# df -h
Filesystem                       Capacity  Used  Available  Use%  Mounted on
/dev/mapper/VolGroup00-LogVol00      4.3G  1.9G       2.2G   46%  /
/dev/hda1                             99M   67M        28M   72%  /boot
tmpfs                                252M     0       252M    0%  /dev/shm
MGS@tcp:/testfs                       82G   20G        59G   25%  /mnt/lfs

[root@CLT lfs]# lfs df
UUID                  1K-blocks      Used  Available  Use%  Mounted on
testfs-MDT0000_UUID     2752272    127844    2467144    4%  /mnt/lfs[MDT:0]
testfs-OST0000_UUID    41284928   5145080   34042632   12%  /mnt/lfs[OST:0]
testfs-OST0001_UUID    15481840   5134432    9560912   33%  /mnt/lfs[OST:1]
testfs-OST0002_UUID    23738812   5141040   17391848   21%  /mnt/lfs[OST:2]
testfs-OST0003_UUID     5160576   4898364          4   94%  /mnt/lfs[OST:3]
filesystem summary:    85666156  20318916   60995396   23%  /mnt/lfs

I have no idea about this error. Could anyone tell me how to configure Lustre to avoid it? Doesn't Lustre put files on the OSTs that still have free space instead of the full ones?

Regards,

Chris
Brock Palen
2008-Aug-21 04:19 UTC
[Lustre-discuss] "No space left" error while Lustre still has free space
If I understand correctly, when you use `setstripe -c -1` Lustre tries to spread each file's data evenly over all OSTs. Because one of yours gets full, the file can no longer grow. Lustre does not fall back to using fewer stripes; most users ask for more stripes for a reason, and Lustre should not (and does not) silently ignore the requested layout. I don't know how you would work around this; a "use every stripe you can until it's out of space" mode doesn't exist as far as I know.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp@umich.edu
(734) 936-1985

On Aug 21, 2008, at 12:13 AM, /* Chris */ wrote:

> I have no idea about this error.
> Could anyone tell me how to configure Lustre to avoid it?
> Doesn't Lustre put files on the OSTs that still have free space
> instead of the full ones?