I have a cluster set up with 1 MDS/MDT and 4 OSS/OSTs under 1.6beta5
(lustre-1.5.95). The OSSes have different sizes and I have set the
striping to start with the largest OSS first and to use all OSSes.

I remember reading that there was a 'free space balancing' in the
striping code for 1.6. Do I need to turn something on to activate this?

As you can see from below, OST2 has 1.3TB available yet I get "out of
space on device" errors when the cluster drops to 1.4TB of total space
available. The balancing (if it is indeed working) seems to be working
for OSTs 0 and 1 but not for 2 and 3.

Any suggestions are appreciated.

Adam D'Auria
Utelisys Communications, BV


lfs > df -h
UUID                 bytes    Used     Available  Use%  Mounted on
spool-MDT0000_UUID   65.2G    4.1G     61.1G      6     /spool[MDT:0]
spool-OST0000_UUID   2.2T     2.1T     161.3G     92    /spool[OST:0]
spool-OST0001_UUID   2.7T     2.5T     160.9G     94    /spool[OST:1]
spool-OST0002_UUID   3.6T     2.3T     1.3T       64    /spool[OST:2]
spool-OST0003_UUID   2.7T     2.3T     380.7G     86    /spool[OST:3]

filesystem summary:  11.2T    9.2T     1.9T       82    /spool

--------------------
lfs > getstripe /spool
OBDS:
0: spool-OST0000_UUID ACTIVE
1: spool-OST0001_UUID ACTIVE
2: spool-OST0002_UUID ACTIVE
3: spool-OST0003_UUID ACTIVE
/spool
default stripe_count: -1  stripe_size: 1048576  stripe_offset: 2
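(For reference, the default layout shown in the getstripe output above -- 1 MB
stripes, starting OST index 2, stripe over all OSTs -- was set with something
along these lines; this is a sketch using the 1.6-style positional setstripe
arguments (stripe size, starting OST index, stripe count), so the exact
invocation may have differed:)

    # 1 MB stripes, start on OST index 2 (the largest OST), stripe over all OSTs
    lfs setstripe /spool 1048576 2 -1

    # confirm the filesystem-wide default
    lfs getstripe /spool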
Nathaniel Rutman
2006-Nov-20 10:42 UTC
[Lustre-discuss] 1.6beta5 balancing of space troubles
Here's my post from 10/24:

----
Oops. The free-space stripe weighting (what we call "stripe QOS"),
although present in all the 1.6 betas, was inadvertently set to give a
priority of "0" to the free space (versus trying to place the stripes
"widely" -- nicely distributed across OSSes and OSTs to maximize network
balancing).

This priority can be adjusted via the proc file
/proc/fs/lustre/lov/lustre-mdtlov/qos_prio_free

The default in the future will be 90%. You can set this permanently on
existing betas with this command on the MGS:
lctl conf_param <fsname>-MDT0000.lov.qos_prio_free=90

Note that setting the priority to 100% just means that OSS distribution
doesn't count in the weighting, but the stripe assignment is still done
via a weighting -- if OST2 has twice as much free space as OST1, it will
be twice as likely to be used, but is _NOT_ guaranteed to be used.

Also note that stripe QOS doesn't kick in until two OSTs are imbalanced
by more than 20%. Until then, a faster round-robin stripe allocator is
used. (The new round-robin order also maximizes network balancing.)
-----

One other thing to note is that if your files have more than 1 stripe,
the stripes _will not_ all be placed on the same OST, so you might still
see ENOSPC. There are other conditions that generate an ENOSPC too -- if
you have lots of clients with active grants, you might run out of grant
space without actually being out of disk space.

Adam D'Auria wrote:
> I remember reading that there was a 'free space balancing' in the
> striping code for 1.6. Do I need to turn something on to activate this?
[...]
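For the filesystem above (fsname "spool"), a rough sketch of checking and
applying this, assuming the proc path follows the <fsname>-mdtlov pattern
from the example above; exact paths and commands may differ between betas:

    # on the MDS: check the current free-space priority (the affected betas report 0)
    cat /proc/fs/lustre/lov/spool-mdtlov/qos_prio_free

    # on the MDS: raise it for the running system
    echo 90 > /proc/fs/lustre/lov/spool-mdtlov/qos_prio_free

    # on the MGS: make the setting permanent
    lctl conf_param spool-MDT0000.lov.qos_prio_free=90

And since the stripes of a multi-stripe file cannot all land on OST2, dropping
the default to a single stripe is one way to let new files use that free space,
if single-stripe performance is acceptable (again a sketch, 1.6 positional
syntax: stripe size, starting OST index, stripe count):

    # 1 MB stripe size, let the MDS pick the starting OST (-1), one stripe per file
    lfs setstripe /spool 1048576 -1 1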