Nick Jennings
2009-Feb-28 01:34 UTC
[Lustre-discuss] OST node filling up and aborting write
Hi Everyone,

I have a small Lustre test machine set up to bring myself back up to
speed, as it's been a few years. This is probably a very basic issue,
but I'm not able to find documentation on it (maybe I'm looking for the
wrong thing).

I've got 4 OSTs (each 2 GB in size) on one Lustre file system. I dd a
4 GB file to the filesystem, and after the first OST fills up the write
fails (no space left on device):

# dd of=/mnt/testfs/datafile3 if=/dev/zero bs=1048576 count=4024
dd: writing `/mnt/testfs/testfile3': No space left on device
1710+0 records in
1709+0 records out
1792020480 bytes (1.8 GB) copied, 55.1519 seconds, 32.5 MB/s

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              15G  7.7G  5.9G  57% /
tmpfs                 252M     0  252M   0% /dev/shm
/dev/hda5             4.1G  198M  3.7G   6% /mnt/test/mdt
/dev/hda6             1.9G  1.1G  686M  62% /mnt/test/ost0
192.168.0.149@tcp:/testfs
                      7.4G  4.7G  2.4G  67% /mnt/testfs
/dev/hda7             1.9G  1.8G   68K 100% /mnt/test/ost1
/dev/hda8             1.9G   80M  1.7G   5% /mnt/test/ost2
/dev/hda9             1.9G  1.8G   68K 100% /mnt/test/ost3

I did this 2 times, which is why both ost1 and ost3 are full. As you
can see, ost2 and ost0 still have space.

I initially thought this could be solved by enabling striping, but from
the HowTo (which doesn't say much on the subject, admittedly) I gathered
striping was already enabled (4 MB chunks). So shouldn't these OSTs be
filling up at a relatively uniform rate?

# cat /proc/fs/lustre/lov/testfs-clilov-ca5e0000/stripe*
1
0
4194304
1
[root@andy ~]# cat /proc/fs/lustre/lov/testfs-mdtlov/stripe*
1
0
4194304
1

Thanks for any help,
-Nick
Hi Nick,

Have you tried setting stripecount=-1 ?

Thanks
-Minh

Nick Jennings wrote:
> I've got 4 OSTs (each 2 GB in size) on one Lustre file system. I dd a
> 4 GB file to the filesystem, and after the first OST fills up the
> write fails (no space left on device).
Nick Jennings
2009-Feb-28 13:06 UTC
[Lustre-discuss] OST node filling up and aborting write
Hi Minh,

Yes, stripecount is set to one:

# cat /proc/fs/lustre/lov/*/stripecount
1
1

-Nick

Minh Diep wrote:
> Hi Nick,
>
> Have you tried setting stripecount=-1 ?
>
> Thanks
> -Minh
Nick Jennings
2009-Feb-28 13:36 UTC
[Lustre-discuss] OST node filling up and aborting write
I just re-formatted everything and started from scratch; here is a
start-to-finish account of the process. I'm following the Lustre Mount
Conf doc found here: http://wiki.lustre.org/index.php?title=Mount_Conf

--
Create MDT / MGS
--

# mkfs.lustre --fsname=testfs --mdt --mgs --reformat /dev/hda5
# mount -t lustre /dev/hda5 /mnt/lustre/mdt/
# cat /proc/fs/lustre/devices
  0 UP mgs MGS MGS 5
  1 UP mgc MGC192.168.0.149@tcp c047ce37-72dd-346d-b348-19d50416e195 5
  2 UP mdt MDS MDS_uuid 3
  3 UP lov testfs-mdtlov testfs-mdtlov_UUID 4
  4 UP mds testfs-MDT0000 testfs-MDT0000_UUID 3

--
Format OSTs
--

# mkfs.lustre --fsname=testfs --ost --mgsnode=192.168.0.149@tcp0 --reformat /dev/hda6
# mkfs.lustre --fsname=testfs --ost --mgsnode=192.168.0.149@tcp0 --reformat /dev/hda7
# mkfs.lustre --fsname=testfs --ost --mgsnode=192.168.0.149@tcp0 --reformat /dev/hda8
# mkfs.lustre --fsname=testfs --ost --mgsnode=192.168.0.149@tcp0 --reformat /dev/hda9

--
Mount OSTs
--

# mount -t lustre /dev/hda6 /mnt/lustre/ost0/
# mount -t lustre /dev/hda7 /mnt/lustre/ost1/
# mount -t lustre /dev/hda8 /mnt/lustre/ost2/
# mount -t lustre /dev/hda9 /mnt/lustre/ost3/

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              15G  7.7G  5.9G  57% /
tmpfs                 252M     0  252M   0% /dev/shm
/dev/hda5             4.1G  198M  3.7G   6% /mnt/lustre/mdt
/dev/hda6             1.9G   80M  1.7G   5% /mnt/lustre/ost0
/dev/hda7             1.9G   80M  1.7G   5% /mnt/lustre/ost1
/dev/hda8             1.9G   80M  1.7G   5% /mnt/lustre/ost2
/dev/hda9             1.9G   80M  1.7G   5% /mnt/lustre/ost3

--
Mount Lustre Filesystem
--

# mount -t lustre 192.168.0.149@tcp0:/testfs /mnt/testfs/
# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              15G  7.7G  5.9G  57% /
tmpfs                 252M     0  252M   0% /dev/shm
/dev/hda5             4.1G  198M  3.7G   6% /mnt/lustre/mdt
/dev/hda6             1.9G   80M  1.7G   5% /mnt/lustre/ost0
/dev/hda7             1.9G   80M  1.7G   5% /mnt/lustre/ost1
/dev/hda8             1.9G   80M  1.7G   5% /mnt/lustre/ost2
/dev/hda9             1.9G   80M  1.7G   5% /mnt/lustre/ost3
192.168.0.149@tcp0:/testfs
                      7.4G  317M  6.7G   5% /mnt/testfs

--
Write Test #1
--

# dd if=/dev/zero of=/mnt/testfs/testfile1 bs=4096 count=614400
dd: writing `/mnt/testfs/testfile1': No space left on device
437506+0 records in
437505+0 records out
1792020480 bytes (1.8 GB) copied, 49.9896 seconds, 35.8 MB/s

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              15G  7.7G  5.9G  57% /
tmpfs                 252M     0  252M   0% /dev/shm
/dev/hda5             4.1G  198M  3.7G   6% /mnt/lustre/mdt
/dev/hda6             1.9G  1.8G   68K 100% /mnt/lustre/ost0
/dev/hda7             1.9G   80M  1.7G   5% /mnt/lustre/ost1
/dev/hda8             1.9G   80M  1.7G   5% /mnt/lustre/ost2
/dev/hda9             1.9G   80M  1.7G   5% /mnt/lustre/ost3
192.168.0.149@tcp0:/testfs
                      7.4G  2.0G  5.1G  29% /mnt/testfs

# cat /proc/fs/lustre/lov/testfs-*/stripe*
1
0
1048576
1
1
0
1048576
1

-Nick
Brian J. Murrell
2009-Feb-28 13:36 UTC
[Lustre-discuss] OST node filling up and aborting write
On Sat, 2009-02-28 at 02:34 +0100, Nick Jennings wrote:
> Hi Everyone,

Hi Nick,

> I've got 4 OSTs (each 2 GB in size) on one Lustre file system. I dd a
> 4 GB file to the filesystem, and after the first OST fills up the
> write fails (no space left on device):

Writes do not "cascade" over to another OST when one fills up.

> I initially thought this could be solved by enabling striping, but
> from the HowTo (which doesn't say much on the subject, admittedly) I
> gathered striping was already enabled?

No. By default, stripecount == 1. In order to get a single file onto
multiple OSTs you will need to explicitly set a striping policy, either
on the file you are going to write into or on the directory the file is
in.

b.
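As an illustration of Brian's point, a minimal sketch of explicitly
setting a striping policy on a directory before writing (assuming the
standard lfs client utility and the /mnt/testfs mount point from this
thread; the output file name below is just an example) might look like:

# Stripe all files created under /mnt/testfs from now on across every
# available OST; existing files keep whatever layout they already have.
lfs setstripe -c -1 /mnt/testfs

# A file created afterwards inherits the directory's striping policy,
# so a single large write can spread over all four OSTs instead of
# filling one of them.
dd if=/dev/zero of=/mnt/testfs/striped_testfile bs=1048576 count=4024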
Brian J. Murrell
2009-Feb-28 13:37 UTC
[Lustre-discuss] OST node filling up and aborting write
On Sat, 2009-02-28 at 14:06 +0100, Nick Jennings wrote:
> Hi Minh,
>
> Yes, stripecount is set to one:
>
> # cat /proc/fs/lustre/lov/*/stripecount
> 1
> 1

Notice that Minh asked you to try "-1" (negative one), not one. That
will stripe across all OSTs.

b.
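To confirm whether a given file actually ended up striped across more
than one OST, its layout can be inspected per file (a sketch assuming
the lfs utility and the testfile path from Nick's earlier test):

# Print the stripe count, stripe size, and the OST objects backing the
# file; a file striped across all four OSTs should show an object on
# each of them.
lfs getstripe /mnt/testfs/testfile1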
Brian J. Murrell
2009-Feb-28 13:52 UTC
[Lustre-discuss] OST node filling up and aborting write
On Sat, 2009-02-28 at 14:36 +0100, Nick Jennings wrote:
> I just re-formatted everything and started from scratch; here is a
> start-to-finish account of the process. I'm following the Lustre Mount
> Conf doc found here: http://wiki.lustre.org/index.php?title=Mount_Conf

> # cat /proc/fs/lustre/lov/testfs-*/stripe*
> 1
> 0
> 1048576
> 1
> 1
> 0
> 1048576
> 1

It's obviously not clear from that cat which file each of those values
comes from. Substituting "grep .*" for cat usually solves that.

In any case, I don't see anything in those numbers that indicates
anything other than a "stripe_count" of 1 policy.

Have you read Chapter 25 of the Operations Manual? It covers striping
pretty well.

b.
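Brian's grep trick, spelled out (a sketch assuming the same /proc paths
Nick has been cat-ing; the pattern is quoted so the shell does not
expand it):

# When given multiple files, grep prefixes each match with
# "filename:value", so every stripe setting is labelled with the
# tunable it belongs to instead of printing bare numbers.
grep '.*' /proc/fs/lustre/lov/testfs-*/stripe*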
Nick Jennings
2009-Feb-28 14:22 UTC
[Lustre-discuss] OST node filling up and aborting write
Hi Brian,

(Thanks for pointing out the -1 as opposed to 1, I missed that)

Brian J. Murrell wrote:
> Writes do not "cascade" over to another OST when one fills up.

I see. I guess I have a misunderstanding of the way striping works.

If you set stripesize=1MB and stripecount=-1, then I would assume this
means: split each write process into 1 MB chunks and stripe them across
all OSTs. By "write process" I mean one single file being written to
disk. I've read over Chapter 25 as well, but it doesn't seem to clarify
this for me (I'm probably letting something fly over my head).

> No. By default, stripecount == 1. In order to get a single file onto
> multiple OSTs you will need to explicitly set a striping policy,
> either on the file you are going to write into or on the directory
> the file is in.

Then what is stripecount=-1 used for when it is specified for the
filesystem, and not for a file or a directory? Can you give me an
example?

--
Write Test #2
--

# lctl conf_param testfs-MDT0000.lov.stripecount=-1

/proc/fs/lustre/lov/testfs-clilov-c464c000/stripecount:-1
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripeoffset:0
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripesize:1048576
/proc/fs/lustre/lov/testfs-clilov-c464c000/stripetype:1
/proc/fs/lustre/lov/testfs-mdtlov/stripecount:-1
/proc/fs/lustre/lov/testfs-mdtlov/stripeoffset:0
/proc/fs/lustre/lov/testfs-mdtlov/stripesize:1048576
/proc/fs/lustre/lov/testfs-mdtlov/stripetype:1

# dd if=/dev/zero of=/mnt/testfs/testfile1 bs=4096 count=614400
dd: writing `/mnt/testfs/testfile1': No space left on device
437506+0 records in
437505+0 records out
1792020480 bytes (1.8 GB) copied, 52.5727 seconds, 34.1 MB/s

# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/hda1              15G  7.7G  5.9G  57% /
tmpfs                 252M     0  252M   0% /dev/shm
/dev/hda5             4.1G  198M  3.7G   6% /mnt/lustre/mdt
/dev/hda6             1.9G  1.8G   68K 100% /mnt/lustre/ost0
/dev/hda7             1.9G   80M  1.7G   5% /mnt/lustre/ost1
/dev/hda8             1.9G   80M  1.7G   5% /mnt/lustre/ost2
/dev/hda9             1.9G   80M  1.7G   5% /mnt/lustre/ost3
192.168.0.149@tcp0:/testfs
                      7.4G  2.0G  5.1G  29% /mnt/testfs

Thanks for your help,
-Nick
On 02/28/09 06:22, Nick Jennings wrote:
> Then what is stripecount=-1 used for when it is specified for the
> filesystem, and not for a file or a directory? Can you give me an
> example?

You can only setstripe on a directory, not a file.

You could try this:

1. rm -f /mnt/testfs/testfile1
2. lfs setstripe -c -1 /mnt/testfs
3. dd if=/dev/zero of=/mnt/testfs/testfile1 bs=4096 count=614400
4. df -h

Thanks
-Minh
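As a complement to step 4, per-OST usage can also be checked from the
client side (a sketch assuming the lfs utility shipped with the Lustre
client):

# Report how full each individual OST and the MDT are, as seen through
# the Lustre mount, rather than running df against the server's block
# devices.
lfs df -h /mnt/testfs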
Kevin Van Maren
2009-Feb-28 16:44 UTC
[Lustre-discuss] OST node filling up and aborting write
minh diep wrote:
>>> No. By default, stripecount == 1. In order to get a single file
>>> onto multiple OSTs you will need to explicitly set a striping
>>> policy, either on the file you are going to write into or on the
>>> directory the file is in.
>>
>> Then what is stripecount=-1 used for when it is specified for the
>> filesystem, and not for a file or a directory? Can you give me an
>> example?

"-1" means "all available". So with a 1 MB stripe size, it will write
1 MB to an OST, then write 1 MB to the next, etc., until it runs out of
OSTs. Then it will put the next chunk back on the first OST and repeat.

> You can only setstripe on a directory, not a file.

Not entirely correct: you cannot change the stripe on an _existing_
file, but "lfs setstripe" will create a 0-byte file with the specified
striping (think "touch"). But "lfs setstripe" is normally used on
directories.

Any file or directory being created inherits the stripe info from the
directory it is in. If that value is "0", it uses the system default
(normally 1). For large files, you should set the stripe count to "-1"
or some value > 1 (4 is good with 4 OSTs).

I think we've now beaten this thread to death...

Kevin
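Kevin's inheritance point can be sketched as follows (assuming the lfs
utility; the directory and file names are just examples):

# Give a directory a stripe count of 4, matching the number of OSTs in
# this test setup.
mkdir /mnt/testfs/bigfiles
lfs setstripe -c 4 /mnt/testfs/bigfiles

# Files created inside it inherit that layout automatically...
dd if=/dev/zero of=/mnt/testfs/bigfiles/data bs=1048576 count=1024

# ...which can be verified per file.
lfs getstripe /mnt/testfs/bigfiles/data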
Nick Jennings
2009-Feb-28 16:46 UTC
[Lustre-discuss] OST node filling up and aborting write
minh diep wrote:
> You can only setstripe on a directory, not a file.
>
> You could try this:
> 1. rm -f /mnt/testfs/testfile1
> 2. lfs setstripe -c -1 /mnt/testfs
> 3. dd if=/dev/zero of=/mnt/testfs/testfile1 bs=4096 count=614400
> 4. df -h

Ok, that makes sense in itself, but I'm still confused as to what
effect the stripe settings in /proc/fs/lustre/lov/ have on new files
that don't have special lfs settings.

Thanks,
-Nick