Andreas Kurz
2012-Jul-18 20:16 UTC
[Gluster-users] glusterfs process eats memory until OOM kills it
Hi,

I'm running GlusterFS 3.2.6 in AWS on CentOS 6.2 in a distributed/replicated setup:

Type: Distributed-Replicate
Status: Started
Number of Bricks: 4 x 2 = 8

Whenever geo-replication is activated, the corresponding glusterfs process on the server starts eating memory (and using a lot of CPU) until the OOM killer strikes back.

This happens once users start changing files via the glusterfs mount and gsyncd starts crawling through the directory tree looking for changes. Network traffic between the servers is quite high, typically 10Mbit/s; the vast majority are lookup and getxattr requests from the server running geo-replication.

I also created a state dump (5MB bzip2 archive) of this glusterfs process while it was using about 9GB; if that is needed for debugging I can upload it somewhere (Bugzilla?). Dropping dentries and inodes reclaims about 1GB.

Any ideas? A bug? Any recommended tunings, maybe a gsyncd option? I changed these values:

performance.stat-prefetch: off
performance.quick-read: off
performance.cache-refresh-timeout: 1
performance.read-ahead: off
geo-replication.indexing: on
nfs.disable: on
network.ping-timeout: 10
performance.cache-size: 1073741824

Regards,
Andreas
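The values listed are per-volume options applied with the gluster CLI; a minimal sketch, assuming a placeholder volume name "myvol" and the SIGUSR1-based statedump mechanism of the 3.2 series:

    # Apply the listed options (geo-replication.indexing is normally
    # switched on by gsyncd itself when geo-replication is started).
    gluster volume set myvol performance.stat-prefetch off
    gluster volume set myvol performance.quick-read off
    gluster volume set myvol performance.cache-refresh-timeout 1
    gluster volume set myvol performance.read-ahead off
    gluster volume set myvol nfs.disable on
    gluster volume set myvol network.ping-timeout 10
    gluster volume set myvol performance.cache-size 1073741824
    gluster volume info myvol   # changed options appear under "Options Reconfigured"

    # Reclaim dentries and inodes (the step that frees roughly 1GB here).
    sync && echo 2 > /proc/sys/vm/drop_caches

    # Generate a statedump of the leaking process; on 3.2 this is done by
    # sending SIGUSR1, and the dump is written under /tmp by default.
    kill -USR1 <pid-of-glusterfs-process>

The volume name and PID are placeholders; the statedump path can vary by build.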
Vijay Bellur
2012-Jul-19 06:57 UTC
[Gluster-users] glusterfs process eats memory until OOM kills it
On 07/19/2012 01:46 AM, Andreas Kurz wrote:
> Hi,
>
> I'm running GlusterFS 3.2.6 in AWS on CentOS 6.2 in a
> distributed/replicated setup:
>
> Type: Distributed-Replicate
> Status: Started
> Number of Bricks: 4 x 2 = 8
>
> Whenever geo-replication is activated, the corresponding glusterfs
> process on the server starts eating memory (and using a lot of CPU)
> until the OOM killer strikes back.
>
> This happens once users start changing files via the glusterfs mount
> and gsyncd starts crawling through the directory tree looking for
> changes. Network traffic between the servers is quite high, typically
> 10Mbit/s; the vast majority are lookup and getxattr requests from the
> server running geo-replication.
>
> I also created a state dump (5MB bzip2 archive) of this glusterfs
> process while it was using about 9GB; if that is needed for debugging
> I can upload it somewhere (Bugzilla?). Dropping dentries and inodes
> reclaims about 1GB.
>
> Any ideas? A bug? Any recommended tunings, maybe a gsyncd option? I
> changed these values:
>
> performance.stat-prefetch: off
> performance.quick-read: off
> performance.cache-refresh-timeout: 1
> performance.read-ahead: off
> geo-replication.indexing: on
> nfs.disable: on
> network.ping-timeout: 10
> performance.cache-size: 1073741824

Can you also try with performance.io-cache set to off? If that doesn't show any improvement, please raise a bug and attach the statedump to it.

Thanks,
Vijay
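The suggested change is again a single per-volume option; a minimal sketch with the same placeholder volume name:

    # Disable the io-cache translator for the volume and verify the setting.
    gluster volume set myvol performance.io-cache off
    gluster volume info myvol

If memory growth stops after this, io-cache is the likely culprit; otherwise the statedump attached to a bug report is the next step, as suggested above.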