Daniel Kobras
2009-Oct-15 16:46 UTC
[Lustre-discuss] Expected number of clients during MDS reconnect.
Hi!

When initiating an MDS failover on one of our systems, we see the new active MDS expecting more clients to recover than were actually connected before.

# cat /proc/fs/lustre/mds/lustrefs-MDT0000/recovery_status
status: COMPLETE
recovery_start: 1255622509
recovery_duration: 300
delayed_clients: 0/651
completed_clients: 260/651
replayed_requests: 4
last_transno: 4112137365

Here, 260 is indeed the correct number of active clients:

# ls -d1 /proc/fs/lustre/mds/lustrefs-MDT0000/exports/*@* | wc -l
260
# cat /proc/fs/lustre/mds/lustrefs-MDT0000/num_exports
261

I'm not sure what causes the off-by-one between num_exports and the number of entries in the exports subdirectory, but the difference doesn't look severe. I do wonder about the expected number of 651 clients, though. When recovery has finished on the MDS, Lustre seems to evict those surplus clients correctly, as the syslog reports

Lustre: lustrefs-MDT0000: Recovery period over after 5:00, of 651 clients 260 recovered and 391 were evicted.

but the MDT apparently still keeps a record of them and expects them back during the next recovery cycle. This means that we currently always have to wait out the full recovery period, even though all active clients have already reconnected.

We've seen this behaviour with MDSes running 1.6.7.2 and 1.8.1; the clients run a mixture of versions between 1.6.6 and 1.8.1. During the lifetime of the system, we've only decommissioned a small number of machines running Lustre clients, so the difference between the current and expected client numbers must have developed by some other means.

Does anyone know how the MDT calculates the number of expected clients? Is there a way to make Lustre dump a list of the NIDs of the surplus clients it evicts after the recovery phase? And above all, is there a way to convince the MDT of the true number of clients (preferably one that doesn't involve the writeconf dance ;-)?

Regards,

Daniel.
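
For anyone who wants to run the same check on their own system, the comparison described above can be scripted. What follows is only a minimal sketch in Python and not part of the original post. It assumes that recovery_status consists of plain "key: value" lines in the format quoted above, and it works on a copy of that quoted output rather than reading /proc. The names parse_recovery_status, client_counts and SAMPLE are made up for this illustration.

```python
def parse_recovery_status(text):
    """Parse 'key: value' lines of a recovery_status dump into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


def client_counts(fields):
    """Return (completed, expected) taken from the completed_clients field."""
    completed, expected = fields["completed_clients"].split("/")
    return int(completed), int(expected)


# Copy of the output quoted in the post; on a live MDS the text would
# instead be read from the recovery_status file under /proc.
SAMPLE = """\
status: COMPLETE
recovery_start: 1255622509
recovery_duration: 300
delayed_clients: 0/651
completed_clients: 260/651
replayed_requests: 4
last_transno: 4112137365
"""

if __name__ == "__main__":
    completed, expected = client_counts(parse_recovery_status(SAMPLE))
    surplus = expected - completed
    print(f"{completed} of {expected} expected clients recovered, "
          f"{surplus} unaccounted for")
```

With the figures quoted above, the difference between expected and completed clients matches the 391 evictions reported in the syslog line.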