I've got an odd situation that I can't seem to fix.

My setup is Lustre 1.8.8-wc1 clients on RHEL 6 talking to 1.8.6 servers on RHEL 5.

My compute nodes have 64 GB of memory and I have a use case where an application has very low memory usage and needs to access a few thousand files in Lustre that range from 10 to 50 MB. The files are subject to some reuse and it would be advantageous to cache as much of the data as possible. The default cache for this configuration would be 48 GB on the client, as that is 75% of memory. However, the client never caches more than about 40 GB of data according to /proc/meminfo.

Even if I tune the cached memory to 64 GB, the amount of cache in use never goes past 40 GB. My current setting is as follows:

# lctl get_param llite.*.max_cached_mb
llite.olympus-ffff8804069da800.max_cached_mb=64000

I've also played with some of the VM tunables, like turning vfs_cache_pressure down to 10:

# vm.vfs_cache_pressure = 10

In no case do I see more than about 35 GB of cache being used. To do some more testing, I created a bunch (40) of 2 GB files in Lustre and then copied them to /dev/null on the client. While doing this I ran the fincore tool from http://code.google.com/p/linux-ftools/ to see whether each file was still in cache. Once about 40 GB of cache was used, the kernel started to drop files from the cache even though there was no memory pressure on the system.

If I do the same test with files local to the system, I can fill the cache to about 61 GB before files start getting dropped.

Is there some other Lustre tunable on the client that I can twiddle with to make more use of the local memory cache?

Thanks

Tim Carlson
Director, PNNL Institutional Computing
timothy.carlson-MIjBx5DB8Ok@public.gmane.org
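For anyone who wants to repeat the test, it boils down to something like the sketch below. The mount point /lustre/scratch is a placeholder, and the fincore binary from linux-ftools may be installed under a different name (e.g. linux-fincore) depending on how it was built.

#!/bin/bash
# Rough sketch of the cache-fill test described above.
DIR=/lustre/scratch/cachetest   # placeholder Lustre mount point, adjust to taste
NFILES=40

mkdir -p $DIR

# Create the 2 GB test files on Lustre.
for i in $(seq 1 $NFILES); do
    dd if=/dev/zero of=$DIR/file.$i bs=1M count=2048
done

# Start from a cold cache.
sync
echo 3 > /proc/sys/vm/drop_caches

# Read the files back and watch how much survives in the page cache.
for i in $(seq 1 $NFILES); do
    dd if=$DIR/file.$i of=/dev/null bs=1M
    fincore $DIR/file.$i            # may be installed as linux-fincore
    grep ^Cached: /proc/meminfo     # total page cache currently in use
done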
On 2013-09-24, at 14:15, "Carlson, Timothy S" <Timothy.Carlson-MIjBx5DB8Ok@public.gmane.org> wrote:
> My setup is Lustre 1.8.8-wc1 clients on RHEL 6 talking to 1.8.6 servers on RHEL 5.
> [...]
> Is there some other Lustre tunable on the client that I can twiddle with to make more use of the local memory cache?

This might relate to the number of DLM locks cached on the client. If the locks get cancelled for some reason (e.g. memory pressure on the server, old age) then the pages covered by the locks will also be dropped.

You could try disabling the lock LRU and specifying some large static number of locks (for testing; I wouldn't leave this set for production systems with large numbers of clients):

lctl set_param ldlm.namespaces.*.lru_size=10000

To reset it to dynamic DLM LRU size management, set a value of "0".

Cheers, Andreas
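For completeness, the full sequence on a test client might look like the lines below; the lock_count parameter name is quoted from memory, so treat it as an assumption.

# Pin a large static lock LRU on every namespace (testing only).
lctl set_param ldlm.namespaces.*.lru_size=10000

# See how many DLM locks each namespace is actually holding
# (lock_count is assumed to exist under ldlm.namespaces.*).
lctl get_param ldlm.namespaces.*.lock_count

# When finished, return to dynamic LRU sizing.
lctl set_param ldlm.namespaces.*.lru_size=0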
> -----Original Message-----
> From: Dilger, Andreas [mailto:andreas.dilger-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org]
> Sent: Wednesday, September 25, 2013 10:03 AM
> To: Carlson, Timothy S
> Cc: lustre-discuss-aLEFhgZF4x6X6Mz3xDxJMA@public.gmane.org; hpdd-discuss-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org
> Subject: Re: [Lustre-discuss] Can't increase effective client read cache
>
> You could try disabling the lock LRU and specifying some large static number of
> locks (for testing; I wouldn't leave this set for production systems with large
> numbers of clients):
>
> lctl set_param ldlm.namespaces.*.lru_size=10000
>
> To reset it to dynamic DLM LRU size management, set a value of "0".
>
> Cheers, Andreas

I gave that a try but it didn't seem to help. On my generic example of copying 2G files to /dev/null, the lru_size of all the OSTs is under 10, except for 3 that I have permanently marked as inactive (they are at 3200) and the MDS, which is at 143.

Here I dd'ed 20 2G files into /dev/null but only about 30GB is still in cache.

# lctl get_param ldlm.namespaces.*.lru_size | awk -F\= '{print $2}' | sort -n | uniq -c
     29 0
    129 1
     73 2
     32 3
      4 4
      1 5
      2 9
      1 423
      3 3200

Any other thoughts on parameters to twiddle?

Thanks!

Tim
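In case it is useful to anyone repeating this, the same data can be listed per namespace instead of aggregated, and the lru_size override can be applied to just the OSC namespaces; the *-osc-* glob below is an assumption about how the client names its namespaces, so check against the actual entries first.

# Per-namespace view, largest lru_size last.
lctl get_param ldlm.namespaces.*.lru_size | sort -t= -k2 -n

# Apply the static LRU only to the OSC namespaces (glob is assumed).
lctl set_param ldlm.namespaces.*-osc-*.lru_size=10000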
On 09/24/2013 02:14 PM, Carlson, Timothy S wrote:
> Even if I tune the cached memory to 64GB the amount of cache in use never goes past 40GB.
> [...]
> Is there some other Lustre tunable on the client that I can twiddle with to make more use of the local memory cache?

Tim,

Another set of kernel sysctls might be in play here. Have you looked at these?

vm.dirty_background_ratio
vm.dirty_ratio
vm.dirty_background_bytes
vm.dirty_bytes

Those control at what number of bytes, or what percentage of memory, the kernel flushes the buffer cache.

Hope this helps,
Nathan
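A quick sketch of checking and adjusting those, with placeholder values only; note that the *_bytes and *_ratio forms are alternatives, so setting one clears the other.

# Show the current writeback thresholds.
sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_background_bytes vm.dirty_bytes

# Example only (placeholder values), applied until the next reboot.
sysctl -w vm.dirty_background_ratio=20
sysctl -w vm.dirty_ratio=60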
Sitting in the PDSW workshop here at SC and watched my colleague Evan Felix try to repeat my problem on a Lustre 2.1 setup. Looks like the problem does not exist in that configuration, and Lustre caches exactly as much on the client side as you configure. So the problem is likely limited to 1.8 clients. Oh well. Time to plan an upgrade.

From my phone so short.

Tim

> On Oct 4, 2013, at 9:53 AM, "Nathan Dauchy" <nathan.dauchy-32lpuo7BZBA@public.gmane.org> wrote:
>
> Another set of kernel sysctls might be in play here. Have you looked at these?
>
> vm.dirty_background_ratio
> vm.dirty_ratio
> vm.dirty_background_bytes
> vm.dirty_bytes
> [...]