Hi,
The cluster was made idle over the weekend to look at the Lustre
RAM consumption issue. The RAM used during yesterday's rsync is still not
freed up. Here is the output from free:
             total       used       free     shared    buffers     cached
Mem:       4041880    3958744      83136          0     876132     144276
-/+ buffers/cache:    2938336    1103544
Swap:      4096564        240    4096324
Looking at vmstat -m, there is something odd: ext3_inode_cache and
dentry_cache seem to be the biggest occupants of RAM, while
ldiskfs_inode_cache is comparatively smaller.
-
Cache Num Total Size Pages
ll_fmd_cache 0 0 56 69
osc_quota_info 0 0 32 119
lustre_dquot_cache 0 0 144 27
fsfilt_ldiskfs_fcb 0 0 56 69
ldiskfs_inode_cache 430199 440044 920 4
ldiskfs_xattr 0 0 88 45
ldiskfs_prealloc_space 14 38 104 38
ll_file_data 0 0 128 31
lustre_inode_cache 0 0 896 4
lov_oinfo 0 0 256 15
ll_qunit_cache 0 0 72 54
ldlm_locks 10509 12005 512 7
ldlm_resources 10291 11325 256 15
ll_import_cache 0 0 440 9
ll_obdo_cache 0 0 208 19
ll_obd_dev_cache 40 40 5328 1
fib6_nodes 11 61 64 61
ip6_dst_cache 16 24 320 12
ndisc_cache 1 15 256 15
rawv6_sock 10 12 1024 4
udpv6_sock 1 4 1024 4
tcpv6_sock 3 4 1728 4
rpc_buffers 8 8 2048 2
rpc_tasks 8 12 320 12
rpc_inode_cache 6 8 832 4
msi_cache 4 4 5760 1
ip_fib_alias 10 119 32 119
ip_fib_hash 10 61 64 61
dm_tio 0 0 24 156
dm_io 0 0 40 96
dm-bvec-(256) 0 0 4096 1
dm-bvec-128 0 0 2048 2
dm-bvec-64 0 0 1024 4
dm-bvec-16 0 0 256 15
dm-bvec-4 0 0 64 61
dm-bvec-1 0 0 16 225
dm-bio 0 0 128 31
uhci_urb_priv 2 45 88 45
ext3_inode_cache 1636505 1636556 856 4
ext3_xattr 0 0 88 45
journal_handle 8 81 48 81
journal_head 460 855 88 45
revoke_table 38 225 16 225
revoke_record 0 0 32 119
scsi_cmd_cache 2 14 512 7
unix_sock 105 155 768 5
ip_mrt_cache 0 0 128 31
tcp_tw_bucket 0 0 192 20
tcp_bind_bucket 14 238 32 119
tcp_open_request 0 0 128 31
inet_peer_cache 0 0 128 31
secpath_cache 0 0 192 20
xfrm_dst_cache 0 0 384 10
ip_dst_cache 40 80 384 10
arp_cache 16 30 256 15
raw_sock 9 9 832 9
udp_sock 14 45 832 9
tcp_sock 56 60 1536 5
flow_cache 0 0 128 31
mqueue_inode_cache 1 4 896 4
relayfs_inode_cache 0 0 592 13
isofs_inode_cache 0 0 632 6
hugetlbfs_inode_cache 1 6 624 6
ext2_inode_cache 0 0 752 5
ext2_xattr 0 0 88 45
dquot 0 0 224 17
eventpoll_pwq 3 54 72 54
eventpoll_epi 3 20 192 20
kioctx 0 0 384 10
kiocb 0 0 256 15
dnotify_cache 2 96 40 96
fasync_cache 1 156 24 156
shmem_inode_cache 376 405 816 5
posix_timers_cache 0 0 184 21
uid_cache 5 62 128 31
sgpool-256 32 32 8192 1
sgpool-128 32 32 4096 1
sgpool-64 32 32 2048 2
sgpool-32 32 32 1024 4
sgpool-16 32 32 512 8
sgpool-8 32 45 256 15
cfq_pool 66 207 56 69
crq_pool 64 324 72 54
deadline_drq 0 0 96 41
as_arq 0 0 112 35
blkdev_ioc 364 476 32 119
blkdev_queue 33 81 856 9
blkdev_requests 64 120 264 15
biovec-(256) 256 256 4096 1
biovec-128 256 256 2048 2
biovec-64 256 256 1024 4
biovec-16 256 270 256 15
biovec-4 256 305 64 61
biovec-1 256 450 16 225
bio 256 279 128 31
file_lock_cache 3 75 160 25
sock_inode_cache 209 220 704 5
skbuff_head_cache 16443 22008 320 12
sock 6 12 640 6
proc_inode_cache 2670 2670 616 6
sigqueue 40 230 168 23
radix_tree_node 68531 68880 536 7
bdev_cache 45 60 832 4
mnt_cache 60 80 192 20
inode_cache 927 1176 584 7
dentry_cache 1349923 1361216 240 16
filp 717 924 320 12
names_cache 3 3 4096 1
avc_node 12 648 72 54
key_jar 10 60 192 20
idr_layer_cache 110 133 528 7
buffer_head 230970 393300 88 45
mm_struct 47 105 1152 7
vm_area_struct 1573 2904 176 22
fs_cache 422 549 64 61
files_cache 58 171 832 9
signal_cache 529 585 256 15
sighand_cache 522 528 2112 3
task_struct 550 554 2000 2
anon_vma 601 1404 24 156
shared_policy_node 0 0 56 69
numa_policy 82 675 16 225
size-131072(DMA) 0 0 131072 1
size-131072 12 12 131072 1
size-65536(DMA) 0 0 65536 1
size-65536 205 205 65536 1
size-32768(DMA) 0 0 32768 1
size-32768 0 0 32768 1
size-16384(DMA) 0 0 16384 1
size-16384 936 936 16384 1
size-8192(DMA) 0 0 8192 1
size-8192 4911 4911 8192 1
size-4096(DMA) 0 0 4096 1
size-4096 676 676 4096 1
size-2048(DMA) 0 0 2048 2
size-2048 8753 8782 2048 2
size-1620(DMA) 0 0 1664 4
size-1620 86 104 1664 4
size-1024(DMA) 0 0 1024 4
size-1024 15228 15900 1024 4
size-512(DMA) 0 0 512 8
size-512 1189 2752 512 8
size-256(DMA) 0 0 256 15
size-256 10235 10560 256 15
size-128(DMA) 0 0 128 31
size-128 200934 211916 128 31
size-64(DMA) 0 0 64 61
size-64 712970 735416 64 61
size-32(DMA) 0 0 32 119
size-32 2338 94486 32 119
kmem_cache 210 210 256 15
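A side note for anyone reading along: since vmstat -m reports the Total
object count and the per-object Size in bytes, a rough way to rank these
caches by memory footprint is simply to multiply the two columns. A
minimal sketch, assuming the standard five-column layout shown above:

  vmstat -m | awk 'NR>1 {printf "%-24s %10.1f MB\n", $1, $3*$4/1048576}' | sort -k2 -rn | head

On the numbers above that puts ext3_inode_cache at roughly
1636556 x 856 B ~ 1.3 GB and dentry_cache at roughly 1361216 x 240 B ~
0.3 GB, which is why the non-Lustre caches stand out here.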
On the second OSS, here is the vmstat output. Again, dentry_cache,
ldiskfs_inode_cache, and ext3_inode_cache seem to be the biggest users
of RAM.
ll_fmd_cache 0 0 56 69
ldiskfs_inode_cache 987664 987668 920 4
lustre_inode_cache 0 0 896 4
ll_qunit_cache 0 0 72 54
ll_import_cache 0 0 440 9
ll_obdo_cache 0 0 208 19
ll_obd_dev_cache 10 10 5328 1
ip6_dst_cache 16 24 320 12
ndisc_cache 1 15 256 15
rpc_inode_cache 6 8 832 4
msi_cache 4 4 5760 1
ext3_inode_cache 392316 392328 856 4
scsi_cmd_cache 41 42 512 7
ip_mrt_cache 0 0 128 31
inet_peer_cache 0 0 128 31
secpath_cache 0 0 192 20
xfrm_dst_cache 0 0 384 10
ip_dst_cache 39 80 384 10
arp_cache 16 30 256 15
flow_cache 0 0 128 31
mqueue_inode_cache 1 4 896 4
relayfs_inode_cache 0 0 592 13
isofs_inode_cache 0 0 632 6
hugetlbfs_inode_cache 1 6 624 6
ext2_inode_cache 0 0 752 5
dnotify_cache 2 96 40 96
fasync_cache 1 156 24 156
shmem_inode_cache 370 400 816 5
posix_timers_cache 0 0 184 21
uid_cache 7 31 128 31
file_lock_cache 7 75 160 25
sock_inode_cache 216 235 704 5
skbuff_head_cache 16500 21768 320 12
proc_inode_cache 2260 2262 616 6
bdev_cache 56 56 832 4
mnt_cache 46 60 192 20
inode_cache 944 1218 584 7
dentry_cache 1387440 1387440 240 16
names_cache 10 10 4096 1
idr_layer_cache 91 98 528 7
fs_cache 366 549 64 61
files_cache 69 153 832 9
signal_cache 462 585 256 15
sighand_cache 453 453 2112 3
kmem_cache 180 180 256 15
Is there a way to flush out the cache so that the RAM is freed up?
The same issue is reported here - http://lkml.org/lkml/2006/8/3/376
But both OSS run CentOS 4 with a 2.6.9 kernel, so drop_caches doesn't
seem to be available in /proc. Is there anything in /proc, as explained in
http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/ref-guide/s1-proc-directories.html
that can force the kernel to flush out the dentry_cache and
ext3_inode_cache when the rsync is over and the cache is not needed anymore?
Thanks very much.
Regards
Balagopal
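A side note on the /proc question above, for later readers: the
drop_caches interface only appeared in kernel 2.6.16, which is why it is
missing on a CentOS 4 / 2.6.9 kernel. On 2.6.9 the closest tunable is
probably vm.vfs_cache_pressure, which only biases reclaim towards
dentries and inodes rather than forcing a flush. Roughly (the value 200
is just an illustrative choice):

  # 2.6.16 and later only (not present on CentOS 4 / 2.6.9):
  sync
  echo 2 > /proc/sys/vm/drop_caches    # drop dentries and inodes
  echo 3 > /proc/sys/vm/drop_caches    # drop pagecache, dentries and inodes

  # closest knob on 2.6.9: prefer reclaiming dentry/inode caches under pressure
  sysctl -w vm.vfs_cache_pressure=200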
On Dec 23, 2007 18:01 -0400, Balagopal Pillai wrote:
> The cluster was made idle over the weekend to look at the Lustre
> RAM consumption issue. The RAM used during yesterday's rsync is still not
> freed up. Here is the output from free
>
>              total       used       free     shared    buffers     cached
> Mem:       4041880    3958744      83136          0     876132     144276
> -/+ buffers/cache:    2938336    1103544
> Swap:      4096564        240    4096324

Note that this is normal behaviour for Linux. RAM that is unused provides
no value, so all available RAM is used for cache until something else
needs to use this memory.

> Looking at vmstat -m, there is something odd: ext3_inode_cache and
> dentry_cache seem to be the biggest occupants of RAM, while
> ldiskfs_inode_cache is comparatively smaller.
>
> Cache                       Num    Total   Size   Pages
> ldiskfs_inode_cache      430199   440044    920       4
> ldlm_locks                10509    12005    512       7
> ldlm_resources            10291    11325    256      15
> buffer_head              230970   393300     88      45
> ext3_inode_cache        1636505  1636556    856       4
> dentry_cache            1349923  1361216    240      16

This is odd, because Lustre doesn't use ext3 at all. It uses ldiskfs
(which is ext3 renamed + patches), so it is some non-Lustre filesystem
usage which is consuming most of your memory.

> Is there anything in /proc, as explained in
> http://www.redhat.com/docs/manuals/enterprise/RHEL-4-Manual/ref-guide/s1-proc-directories.html
> that can force the kernel to flush out the dentry_cache and
> ext3_inode_cache when the rsync is over and the cache is not needed anymore?
> Thanks very much.

Only to unmount and remount the filesystem, on the server. On Lustre
clients there is a mechanism to flush Lustre cache, but that doesn't
help you here.

Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
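To put numbers on that with the free output quoted above: the
"-/+ buffers/cache" line already backs the reclaimable page cache out of
the totals,

  used: 2938336 KB = 3958744 - 876132 (buffers) - 144276 (cached)
  free: 1103544 KB =   83136 + 876132 (buffers) + 144276 (cached)

so roughly 1 GB is immediately available even before any dentry/inode
slab gets reclaimed.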
If you're really interested in tracking memory utilization, collectl -
see http://collectl.sourceforge.net/ - when run as a daemon will
collect/log all slab data once a minute, and you can change the frequency
to anything you like. You can then later play it back and see exactly
what is happening over time. As another approach you can run it
interactively, and if you specify the -oS switch you'll only see changes
as they occur. Including the 'T' will time stamp them, as in the example
below:

[root@cag-dl380-01 root]# collectl -sY -oST -i:1
# SLAB DETAIL
#                           <-----------Objects----------><---------Slab Allocation------>
# Name                       InUse   Bytes   Alloc   Bytes   InUse   Bytes   Total   Bytes
11:02:02 size-512              146   74752     208  106496      21   86016      26  106496
11:02:07 sigqueue              319   42108     319   42108      11   45056      11   45056
11:02:07 size-512              208  106496     208  106496      26  106496      26  106496

Since this isn't a lustre system there isn't a whole lot of activity...

-mark
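For what it's worth, a minimal sketch of the record-and-playback workflow
Mark describes (the directory and raw file name below are only
placeholders; collectl writes raw files named hostname-yyyymmdd-hhmmss.raw.gz):

  # record slab detail (plus the default subsystems) to raw files
  collectl -sY -f /var/log/collectl &

  # later: play a recorded file back, showing only slab changes with timestamps
  collectl -p /var/log/collectl/oss1-20071223-000100.raw.gz -sY -oST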
Hi Andreas,
Thanks. The two OSS also export two ext3 volumes each via NFS,
which are used to back up the 4 smaller Lustre volumes. One possibility,
as you mentioned, is that the memory consumption is not Lustre related
but ext3 related, since the destination ext3 volumes also come from the
same OSS servers, but are mounted over NFS on the Lustre client that does
the rsync.
        I upgraded the RAM on both OSS this morning from 4 GB to 8 GB
and hope that is enough for both Lustre operations and the rsync backup.

Regards
Balagopal
Thanks Mark. This looks handy. I was about to put a cron job with vmstat
in place to see how the memory utilization progresses with the early
morning rsync. Since I put another 4 GB in both OSS this morning,
hopefully that should be enough for their operation.

Regards
Balagopal
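For completeness, the kind of quick cron job being described might look
something like this (the log path and the five-minute interval are
arbitrary choices):

  # /etc/cron.d/slab-snapshot: append a timestamped vmstat -m snapshot every 5 minutes
  */5 * * * * root (date; vmstat -m) >> /var/log/slab-snapshot.log 2>&1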
In my opinion there are a couple of problems with cron jobs that do
monitoring. On the positive side they're quick and easy, but on the
downside you have extra work to do if you want timestamps, and then
there's the issue of all the other potential system metrics you're
missing out on. The neat thing about collectl is it essentially does it
all! In the case of lustre that means if you run it with the defaults
you'll get cpu, memory, network, and more in addition to the slab data.
However, if you really want to get crazy, you can get the performance by
ost and even the rpc stats. The one negative with collectl is that while
it can do a lot, that translates into a lot of options, which can be
confusing at first.

-mark
Hi Andreas,
Here is the current memory usage after doubling the RAM on both
OSS yesterday. The rsync completed successfully last night, but almost
5.4 GB of RAM is used up!
             total       used       free     shared    buffers     cached
Mem:       8166408    8094468      71940          0    2597688      48124
-/+ buffers/cache:    5448656    2717752
Swap:      4096564        224    4096340
Here is the vmstat -m. This time ldiskfs_inode_cache is the
biggest occupant; ext3_inode_cache is smaller and dentry_cache is still
quite big. If this is ext3 related, I can get around the problem by
exporting the backup volume from the OSS via iSCSI and mounting it on the
Lustre client with an iSCSI initiator. The nodes have 16 GB and that
should be enough for all the caches. But ldiskfs_inode_cache is also
becoming quite big. The only difference between last time and this time
is that I have re-enabled all the needed rsyncs, with one copy of the
data going to an NFS-mounted ext3 volume and another copy to another big
Lustre volume. That could explain ldiskfs_inode_cache growing so much
this time. The current vmstat -m output from both OSS is pasted below -
1st OSS + MDS -
Cache Num Total Size Pages
ll_fmd_cache 0 0 56 69
osc_quota_info 0 0 32 119
lustre_dquot_cache 0 0 144 27
fsfilt_ldiskfs_fcb 0 0 56 69
ldiskfs_inode_cache 3969899 3969960 920 4
ldiskfs_xattr 0 0 88 45
ldiskfs_prealloc_space 5536 5662 104 38
ll_file_data 0 0 128 31
lustre_inode_cache 0 0 896 4
lov_oinfo 0 0 256 15
ll_qunit_cache 0 0 72 54
ldlm_locks 86258 110698 512 7
ldlm_resources 85847 103725 256 15
ll_import_cache 0 0 440 9
ll_obdo_cache 0 0 208 19
ll_obd_dev_cache 40 40 5328 1
fib6_nodes 11 61 64 61
ip6_dst_cache 16 24 320 12
ndisc_cache 1 15 256 15
rawv6_sock 10 12 1024 4
udpv6_sock 1 4 1024 4
tcpv6_sock 3 4 1728 4
rpc_buffers 8 8 2048 2
rpc_tasks 8 12 320 12
rpc_inode_cache 6 8 832 4
msi_cache 4 4 5760 1
ip_fib_alias 10 119 32 119
ip_fib_hash 10 61 64 61
dm_tio 0 0 24 156
dm_io 0 0 40 96
dm-bvec-(256) 0 0 4096 1
dm-bvec-128 0 0 2048 2
dm-bvec-64 0 0 1024 4
dm-bvec-16 0 0 256 15
dm-bvec-4 0 0 64 61
dm-bvec-1 0 0 16 225
dm-bio 0 0 128 31
uhci_urb_priv 2 45 88 45
ext3_inode_cache 6104 20520 856 4
ext3_xattr 0 0 88 45
journal_handle 20 81 48 81
journal_head 482 2610 88 45
revoke_table 38 225 16 225
revoke_record 0 0 32 119
scsi_cmd_cache 7 7 512 7
unix_sock 103 150 768 5
ip_mrt_cache 0 0 128 31
tcp_tw_bucket 0 0 192 20
tcp_bind_bucket 14 119 32 119
tcp_open_request 0 0 128 31
inet_peer_cache 0 0 128 31
secpath_cache 0 0 192 20
xfrm_dst_cache 0 0 384 10
ip_dst_cache 38 90 384 10
arp_cache 16 30 256 15
raw_sock 9 9 832 9
udp_sock 14 54 832 9
tcp_sock 56 60 1536 5
flow_cache 0 0 128 31
mqueue_inode_cache 1 4 896 4
relayfs_inode_cache 0 0 592 13
isofs_inode_cache 0 0 632 6
hugetlbfs_inode_cache 1 6 624 6
ext2_inode_cache 0 0 752 5
ext2_xattr 0 0 88 45
dquot 0 0 224 17
eventpoll_pwq 3 54 72 54
eventpoll_epi 3 20 192 20
kioctx 0 0 384 10
kiocb 0 0 256 15
dnotify_cache 2 96 40 96
fasync_cache 1 156 24 156
shmem_inode_cache 379 415 816 5
posix_timers_cache 0 0 184 21
uid_cache 5 31 128 31
sgpool-256 32 32 8192 1
sgpool-128 32 32 4096 1
sgpool-64 32 32 2048 2
sgpool-32 36 36 1024 4
sgpool-16 32 32 512 8
sgpool-8 45 45 256 15
cfq_pool 98 207 56 69
crq_pool 80 324 72 54
deadline_drq 0 0 96 41
as_arq 0 0 112 35
blkdev_ioc 360 476 32 119
blkdev_queue 33 63 856 9
blkdev_requests 80 120 264 15
biovec-(256) 256 256 4096 1
biovec-128 256 256 2048 2
biovec-64 256 256 1024 4
biovec-16 256 270 256 15
biovec-4 256 305 64 61
biovec-1 332 450 16 225
bio 310 310 128 31
file_lock_cache 3 75 160 25
sock_inode_cache 207 210 704 5
skbuff_head_cache 16465 21900 320 12
sock 6 12 640 6
proc_inode_cache 2637 2658 616 6
sigqueue 45 46 168 23
radix_tree_node 182213 186375 536 7
bdev_cache 52 52 832 4
mnt_cache 60 100 192 20
inode_cache 917 1239 584 7
dentry_cache 2880362 2882112 240 16
filp 731 816 320 12
names_cache 4 5 4096 1
avc_node 12 432 72 54
key_jar 10 40 192 20
idr_layer_cache 111 119 528 7
buffer_head 650238 742680 88 45
mm_struct 45 112 1152 7
vm_area_struct 1626 2904 176 22
fs_cache 427 549 64 61
files_cache 48 126 832 9
signal_cache 534 615 256 15
sighand_cache 530 543 2112 3
task_struct 555 560 2000 2
anon_vma 679 1248 24 156
shared_policy_node 0 0 56 69
numa_policy 82 675 16 225
size-131072(DMA) 0 0 131072 1
size-131072 12 12 131072 1
size-65536(DMA) 0 0 65536 1
size-65536 229 229 65536 1
size-32768(DMA) 0 0 32768 1
size-32768 0 0 32768 1
size-16384(DMA) 0 0 16384 1
size-16384 1286 1286 16384 1
size-8192(DMA) 0 0 8192 1
size-8192 4884 4884 8192 1
size-4096(DMA) 0 0 4096 1
size-4096 744 786 4096 1
size-2048(DMA) 0 0 2048 2
size-2048 9114 9120 2048 2
size-1620(DMA) 0 0 1664 4
size-1620 86 100 1664 4
size-1024(DMA) 0 0 1024 4
size-1024 15217 16132 1024 4
size-512(DMA) 0 0 512 8
size-512 1213 2752 512 8
size-256(DMA) 0 0 256 15
size-256 10441 11310 256 15
size-128(DMA) 0 0 128 31
size-128 205487 218488 128 31
size-64(DMA) 0 0 64 61
size-64 777658 891088 64 61
size-32(DMA) 0 0 32 119
size-32 43033 86632 32 119
kmem_cache 225 225 256 15
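Working the same back-of-envelope numbers (Total x Size) for the table
above:

  ldiskfs_inode_cache   3969960 x 920 B  ~ 3.4 GiB
  dentry_cache          2882112 x 240 B  ~ 0.64 GiB
  buffer_head            742680 x  88 B  ~ 0.06 GiB

so the ldiskfs inode and dentry slabs alone account for roughly 4 GB of
the ~5.4 GB reported as used after buffers/cache, which matches
ldiskfs_inode_cache now being the biggest occupant.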
             total       used       free     shared    buffers     cached
Mem:       8166340    5462540    2703800          0    1515664     448516
-/+ buffers/cache:    3498360    4667980
Swap:      4096440          0    4096440
[root@lustre2 ~]# vmstat -m
Cache Num Total Size Pages
ll_fmd_cache 0 0 56 69
fsfilt_ldiskfs_fcb 4 69 56 69
ldiskfs_inode_cache 1971539 1971548 920 4
ldiskfs_xattr 0 0 88 45
ldiskfs_prealloc_space 9090 9120 104 38
ll_file_data 0 0 128 31
lustre_inode_cache 0 0 896 4
lov_oinfo 0 0 256 15
ll_qunit_cache 0 0 72 54
ldlm_locks 228 1253 512 7
ldlm_resources 226 2235 256 15
ll_import_cache 0 0 440 9
ll_obdo_cache 0 0 208 19
ll_obd_dev_cache 10 10 5328 1
fib6_nodes 11 61 64 61
ip6_dst_cache 16 24 320 12
ndisc_cache 1 15 256 15
rawv6_sock 10 12 1024 4
udpv6_sock 1 4 1024 4
tcpv6_sock 3 4 1728 4
rpc_buffers 8 8 2048 2
rpc_tasks 8 12 320 12
rpc_inode_cache 6 8 832 4
msi_cache 4 4 5760 1
ip_fib_alias 10 119 32 119
ip_fib_hash 10 61 64 61
dm_tio 0 0 24 156
dm_io 0 0 40 96
dm-bvec-(256) 0 0 4096 1
dm-bvec-128 0 0 2048 2
dm-bvec-64 0 0 1024 4
dm-bvec-16 0 0 256 15
dm-bvec-4 0 0 64 61
dm-bvec-1 0 0 16 225
dm-bio 0 0 128 31
uhci_urb_priv 2 90 88 45
ext3_inode_cache 393257 393260 856 4
ext3_xattr 0 0 88 45
journal_handle 8 81 48 81
journal_head 653 2295 88 45
revoke_table 24 225 16 225
revoke_record 0 0 32 119
scsi_cmd_cache 10 49 512 7
unix_sock 106 150 768 5
ip_mrt_cache 0 0 128 31
tcp_tw_bucket 0 0 192 20
tcp_bind_bucket 17 119 32 119
tcp_open_request 0 0 128 31
inet_peer_cache 0 0 128 31
secpath_cache 0 0 192 20
xfrm_dst_cache 0 0 384 10
ip_dst_cache 38 80 384 10
arp_cache 16 30 256 15
raw_sock 9 9 832 9
udp_sock 15 36 832 9
tcp_sock 56 65 1536 5
flow_cache 0 0 128 31
mqueue_inode_cache 1 4 896 4
relayfs_inode_cache 0 0 592 13
isofs_inode_cache 0 0 632 6
hugetlbfs_inode_cache 1 6 624 6
ext2_inode_cache 0 0 752 5
ext2_xattr 0 0 88 45
dquot 0 0 224 17
eventpoll_pwq 3 54 72 54
eventpoll_epi 3 20 192 20
kioctx 0 0 384 10
kiocb 0 0 256 15
dnotify_cache 2 96 40 96
fasync_cache 1 156 24 156
shmem_inode_cache 369 390 816 5
posix_timers_cache 0 0 184 21
uid_cache 6 62 128 31
sgpool-256 32 32 8192 1
sgpool-128 32 32 4096 1
sgpool-64 32 32 2048 2
sgpool-32 32 32 1024 4
sgpool-16 33 40 512 8
sgpool-8 45 90 256 15
cfq_pool 95 207 56 69
crq_pool 87 216 72 54
deadline_drq 0 0 96 41
as_arq 0 0 112 35
blkdev_ioc 300 357 32 119
blkdev_queue 35 72 856 9
blkdev_requests 95 135 264 15
biovec-(256) 256 256 4096 1
biovec-128 256 256 2048 2
biovec-64 256 256 1024 4
biovec-16 256 270 256 15
biovec-4 256 305 64 61
biovec-1 324 450 16 225
bio 305 372 128 31
file_lock_cache 3 50 160 25
sock_inode_cache 211 230 704 5
skbuff_head_cache 16556 21324 320 12
sock 6 12 640 6
proc_inode_cache 2361 2364 616 6
sigqueue 33 46 168 23
radix_tree_node 212941 212954 536 7
bdev_cache 45 56 832 4
mnt_cache 46 60 192 20
inode_cache 2730 2779 584 7
dentry_cache 2373901 2374016 240 16
filp 718 804 320 12
names_cache 5 10 4096 1
avc_node 12 378 72 54
key_jar 12 40 192 20
idr_layer_cache 88 91 528 7
buffer_head 387120 387180 88 45
mm_struct 52 119 1152 7
vm_area_struct 1707 2706 176 22
fs_cache 349 488 64 61
files_cache 47 153 832 9
signal_cache 452 630 256 15
sighand_cache 448 465 2112 3
task_struct 476 492 2000 2
anon_vma 665 1248 24 156
shared_policy_node 0 0 56 69
numa_policy 82 450 16 225
size-131072(DMA) 0 0 131072 1
size-131072 12 12 131072 1
size-65536(DMA) 0 0 65536 1
size-65536 126 126 65536 1
size-32768(DMA) 0 0 32768 1
size-32768 0 0 32768 1
size-16384(DMA) 0 0 16384 1
size-16384 1210 1210 16384 1
size-8192(DMA) 0 0 8192 1
size-8192 2615 2616 8192 1
size-4096(DMA) 0 0 4096 1
size-4096 488 496 4096 1
size-2048(DMA) 0 0 2048 2
size-2048 9050 9102 2048 2
size-1620(DMA) 0 0 1664 4
size-1620 88 108 1664 4
size-1024(DMA) 0 0 1024 4
size-1024 13138 14816 1024 4
size-512(DMA) 0 0 512 8
size-512 854 2752 512 8
size-256(DMA) 0 0 256 15
size-256 6495 7770 256 15
size-128(DMA) 0 0 128 31
size-128 198380 198586 128 31
size-64(DMA) 0 0 64 61
size-64 20477 36478 64 61
size-32(DMA) 0 0 32 119
size-32 43283 50932 32 119
kmem_cache 180 180 256 15
The collectl stats from during the rsync are available at
http://cluster.mathstat.dal.ca/lustre2-20071225-000104.raw.gz
They show the cache getting built up after 4 am. Thanks very much for any
recommendations and help. We still have a bit of headroom in the available
RAM; I hope these caches don't continue to build every day and crash the
OSS again.

Regards
Balagopal