braam@clusterfs.com
2007-May-05 12:07 UTC
[Lustre-devel] [Bug 12418] Evictions taking too long
Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the following link: https://bugzilla.lustre.org/show_bug.cgi?id=12418

I think that this is to be expected: with one IOR still running, eviction processing time suffers. Adaptive timeouts will not help at all.

There is a case to be made for a policy that prioritizes eviction requests over IO requests. This will cause a small delay in IO requests from the surviving IOR, and it will keep the server from processing useless IO that is still in progress from the dying IOR.

There is a solution to this that is perhaps not difficult to implement. When eviction RPCs begin processing they raise a flag, and they lower it when they are done. Multiple eviction threads can all raise and lower the flag, so it behaves as a counter. All IO processing threads check the same flag both before network bulk transfer and before disk IO (both can cause delays). If the flag is nonzero (>0), the IO threads wait until it is lowered to 0 and then simply proceed, without holding the flag themselves or anything like that.

Our future plans call for server-driven quality of service to give requests from different sources different priorities. This is a very simple example of that.
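
For illustration only, the flag scheme described above could be sketched roughly as follows. This is a user-space Python sketch, not Lustre code: the names EvictionGate, raise_flag, lower_flag, wait_until_clear and handle_io are all hypothetical, and a real server would use its own kernel locking and wait-queue primitives instead of threading.Condition.

```python
import threading


class EvictionGate:
    """Hypothetical counting flag: raised by eviction threads, waited on by IO threads."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0  # number of evictions currently in progress

    def raise_flag(self):
        # Called by an eviction thread when it starts processing.
        with self._cond:
            self._active += 1

    def lower_flag(self):
        # Called by an eviction thread when it is done.
        with self._cond:
            self._active -= 1
            if self._active == 0:
                # Last eviction finished: release every waiting IO thread.
                self._cond.notify_all()

    def wait_until_clear(self):
        # IO threads only wait here; they never hold the flag themselves.
        with self._cond:
            while self._active > 0:
                self._cond.wait()

    def active(self):
        with self._cond:
            return self._active


def handle_io(gate, log, name):
    """Hypothetical IO request handler with the two checkpoints from the text."""
    gate.wait_until_clear()      # checkpoint before network bulk transfer
    log.append((name, "bulk"))   # stand-in for the bulk transfer
    gate.wait_until_clear()      # checkpoint before disk IO
    log.append((name, "disk"))   # stand-in for the disk IO
```

An eviction handler in this sketch would call gate.raise_flag() on entry and gate.lower_flag() in a finally block on exit. Because IO threads proceed without taking the flag, an eviction that starts while an IO request is between checkpoints only delays that request at its next checkpoint.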