Mag Gam
2008-Oct-12 19:05 UTC
[Lustre-discuss] Getting random "No space left on device (28)"
We have recently added another 1TB to a filesystem: we created a new OST and mounted it. On the clients, lfs df -h shows the new space has been picked up, and lfs df -i shows plenty of free inodes. However, our jobs randomly fail with "No space left on device (28)", and if we resubmit them they work again. Is there anything special we need to do after mounting a new OST?

TIA
Papp Tamas
2008-Oct-12 19:15 UTC
[Lustre-discuss] Getting random "No space left on device (28)"
Mag Gam wrote:
> We have recently added another 1TB to a filesystem: we created a new OST
> and mounted it. [...] However, our jobs randomly fail with "No space left
> on device (28)", and if we resubmit them they work again.
>
> Is there anything special we need to do after mounting a new OST?

I saw this once: one of our OSTs had run out of space. I disabled writes to that OST and everything was fine again.

tamas
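(A hedged sketch of what "disabling writes" to a full OST can look like: on the MDS you can deactivate the OSC device that points at that OST, so the MDS stops allocating new objects there while, as far as I know, existing objects stay accessible. The OST name and the device number below are placeholders; take them from your own lctl dl output.)

# on the MDS: find the OSC device that points at the full OST
lctl dl | grep OST0000
# deactivate it so no new objects are allocated on that OST
# (reactivate it later once the OST has free space again)
lctl --device <devno> deactivate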
Kevin Van Maren
2008-Oct-12 19:24 UTC
[Lustre-discuss] Getting random "No space left on device (28)"
Sounds like one (or more) of your existing OSTs is out of space. OSTs are assigned at file creation time, and Lustre will return an error if it cannot allocate space on the OST for a file you are writing. Do a "df" on your OSS nodes.

Lustre does not re-stripe existing files; you may have to manually move (cp/rm) some files onto the new OST to rebalance the file system. It is a manual process, but you can use "lfs setstripe" to force a specific OST, and "lfs getstripe" to see where a file's storage is allocated.

Kevin
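(A minimal migration sketch for a single file, assuming a client mount at /mnt/lustre, a placeholder file name, and OST index 4 as the target; lfs setstripe option syntax varies a little between Lustre versions, so check "lfs help setstripe" on your system first.)

# see which OST(s) currently hold the file
lfs getstripe /mnt/lustre/data/bigfile
# pre-create a copy striped onto the new OST only
lfs setstripe -c 1 -i 4 /mnt/lustre/data/bigfile.new
# copy the data into the pre-created file, then rename it over the
# original; the object space now lives on OST 4
cp -p /mnt/lustre/data/bigfile /mnt/lustre/data/bigfile.new
mv /mnt/lustre/data/bigfile.new /mnt/lustre/data/bigfile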
Brock Palen
2008-Oct-12 19:36 UTC
[Lustre-discuss] Getting random "No space left on device (28)"
On any client,

lfs df -h

shows the usage of all of your OSTs in one command.

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985
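(A hedged aside: with many OSTs it can help to filter that lfs df output for the nearly full ones. The one-liner below assumes the usual column order of UUID, 1K-blocks, Used, Available, Use%; adjust the 90% threshold as needed.)

lfs df | awk '/OST/ { sub(/%/, "", $5); if ($5 + 0 > 90) print $1, $5 "%" }'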
Mag Gam
2008-Oct-12 20:19 UTC
[Lustre-discuss] Getting random "No space left on device (28)"
Guys:

Thanks. I think that's the problem! Once I started moving files off the old OSTs and rebalancing, everything started to clear up.

Also, does anyone have a script or algorithm to rebalance more quickly? I would prefer an algorithm so I can use rsync to rebalance.

Thanks again.
Hi,

I don't have a script, but if you run the command below on a client it will produce a list of the files that have stripes on a particular OST:

lfs find --recursive --obd lustrefs-OST0000_UUID /mnt/lustre

Substitute lustrefs-OST0000_UUID with your full OST and /mnt/lustre with your client's Lustre mount point. Once you have the list of files you can move some of them to other OSTs.

Cheers,

Wojciech
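(Putting the pieces of this thread together, a hedged sketch of a drain script might look like the following. The OST UUID and mount point are placeholders, the cp/mv approach changes inode numbers and drops hard links, and a plain "read" mishandles unusual file names, so treat it as a starting point rather than a finished tool; rsync -a could stand in for cp -p if preferred.)

#!/bin/sh
FULL_OST=lustrefs-OST0000_UUID   # OST to drain (placeholder)
MNT=/mnt/lustre                  # client mount point (placeholder)

# For every file with a stripe on the full OST, copy it to a new file
# (which the allocator should place on the emptier OSTs) and rename
# the copy over the original.
lfs find --recursive --obd "$FULL_OST" "$MNT" | while read f; do
    tmp="$f.migrate.$$"
    cp -p "$f" "$tmp" && mv "$tmp" "$f"
done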