Vladimir Sharun
2005-Oct-23  00:43 UTC
kmem_malloc(4096): kmem_map too small: 536870912 total allocated
We have a 2x Opteron server with 2 GB RAM under extensive disk load. Every week or two it suddenly hangs with "kmem_malloc(4096): kmem_map too small 335bla-bla allocated". I looked in the Handbook and put vm.kmem_size_max="536870912" into /boot/loader.conf. Today it hung again with the new setting. Are there any other solutions?

        # sysctl -a | grep kmem
        vm.kmem_size: 536870912
        vm.kmem_size_max: 536870912
        vm.kmem_size_scale: 3

Only vm.kmem_size_max is set in loader.conf; vm.kmem_size is not. We're running FreeBSD 6.0-BETA5 #0: Wed Sep 28 16:54:33 EEST 2005 in i386 mode. The same happened with 5.3/5.4 and NetBSD 2.0 on this machine.
Kris Kennaway
2005-Oct-23  01:15 UTC
kmem_malloc(4096): kmem_map too small: 536870912 total allocated
On Sun, Oct 23, 2005 at 10:43:42AM +0300, Vladimir Sharun wrote:
> We have 2xOpteron/2Gb RAM server with extensive disk load. Every week or two
> it suddenly hangs with "kmem_malloc(4096): kmem_map too small 335bla-bla allocated".
> I look onto handbook and put vm.kmem_size_max="536870912" onto /boot/loader.conf.
> Today was the same with the new parameters. Is there any other solutions ?

If that's not enough, try making it larger.

Kris
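
The tunable is given in bytes, so a larger value has to be worked out by hand. A minimal sketch of that arithmetic follows; the helper names mb_to_bytes and kmem_loader_line are made up for illustration, and 768 is only an example figure. Whether an i386 kernel can actually map that much depends on its kernel address space configuration, which this thread does not cover.

```shell
#!/bin/sh
# Convert a size in megabytes to bytes, the unit used by vm.kmem_size_max.
mb_to_bytes() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print a /boot/loader.conf line for a size given in MB.
# (Hypothetical helper; it only formats the line, it does not edit any file.)
kmem_loader_line() {
    printf 'vm.kmem_size_max="%s"\n' "$(mb_to_bytes "$1")"
}

# Example (not run here):
#   kmem_loader_line 768
```

For reference, mb_to_bytes 512 gives 536870912, the value already in use above.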
Vladimir Sharun
2005-Oct-24  00:57 UTC
kmem_malloc(4096): kmem_map too small: 536870912 total allocated
Leak in lockf, confirmed:
        lockf 80448  2516K       -   709113  32,64
        lockf 80450  2516K       -   709155  32,64
        lockf 80453  2516K       -   709199  32,64
        lockf 80452  2516K       -   709207  32,64
        lockf 80455  2516K       -   709226  32,64
        lockf 80455  2516K       -   709236  32,64
        lockf 80459  2516K       -   709250  32,64
        lockf 80461  2516K       -   709280  32,64
        lockf 80466  2516K       -   709317  32,64
        lockf 80474  2516K       -   709376  32,64
        lockf 80475  2516K       -   709396  32,64
        lockf 80477  2517K       -   709427  32,64
        lockf 80481  2517K       -   709445  32,64
        lockf 80482  2517K       -   709472  32,64
        lockf 80484  2517K       -   709488  32,64
        lockf 80490  2517K       -   709547  32,64
        lockf 80498  2517K       -   709578  32,64
        lockf 80505  2518K       -   709615  32,64
        lockf 80507  2518K       -   709647  32,64
        lockf 80510  2518K       -   709700  32,64
        lockf 80518  2518K       -   709747  32,64
        lockf 80531  2518K       -   709865  32,64
        lockf 80540  2519K       -   709940  32,64
        lockf 80561  2519K       -   710078  32,64
        lockf 80590  2520K       -   710263  32,64
        lockf 80611  2521K       -   710419  32,64
        lockf 80623  2521K       -   710512  32,64
        lockf 80625  2521K       -   710530  32,64
        lockf 80637  2522K       -   710596  32,64
        lockf 80638  2521K       -   710643  32,64
        lockf 80641  2522K       -   710681  32,64
        lockf 80656  2522K       -   710769  32,64
        lockf 80658  2522K       -   710803  32,64
        lockf 80666  2522K       -   710859  32,64
        lockf 80672  2523K       -   710899  32,64
        lockf 80675  2523K       -   710930  32,64
(output from while true; do vmstat -m | grep lockf; sleep 1 ; done)
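
To put a number on the growth, the in-use count (second column) can be compared between two samples. This is a rough sketch only: the function names inuse_of and growth_rate are invented here, and it assumes the column layout shown in the listing above.

```shell
#!/bin/sh
# Print the in-use count (second column) for one malloc type.
# Reads "vmstat -m" style lines on stdin; $1 is the type name, e.g. lockf.
inuse_of() {
    awk -v t="$1" '$1 == t { print $2; exit }'
}

# Allocations gained per second between two samples (integer division).
# $1 = first in-use count, $2 = second in-use count, $3 = seconds between them
growth_rate() {
    echo $(( ($2 - $1) / $3 ))
}

# Example on a live system (not run here):
#   a=$(vmstat -m | inuse_of lockf); sleep 60
#   b=$(vmstat -m | inuse_of lockf); growth_rate "$a" "$b" 60
```

With the first and last samples above (80448 and 80675, roughly 35 seconds apart), growth_rate 80448 80675 35 prints 6, so about six allocations per second were not being freed.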
Vladimir Sharun
2005-Oct-25  04:57 UTC
kmem_malloc(4096): kmem_map too small: 536870912 total allocated
I found the source of the leak: if exim accesses ANY configuration or text file over NFS, memory leaks. The more often exim is called, the quicker the system dies.

My main problem now is to build a near-realtime NFS-to-local mirroring solution for around 20 files (up to 1 MB altogether). Is there anything in ports for this?

The next question is for Philip Hazel: any comments on why this happens?
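
For a file set this small, a polling loop may be enough. The following is a minimal sketch, not a ports recommendation: mirror_once is a made-up name, the paths in the example are placeholders, and it assumes all files sit directly in one directory. It is only as "realtime" as the polling interval.

```shell
#!/bin/sh
# mirror_once SRC_DIR DST_DIR
# Copy each regular file in SRC_DIR whose content differs from (or is
# missing in) DST_DIR. Subdirectories are ignored.
mirror_once() {
    m_src=$1
    m_dst=$2
    for m_f in "$m_src"/*; do
        [ -f "$m_f" ] || continue
        m_name=${m_f##*/}
        # cmp -s exits non-zero when the files differ or the target is missing.
        if ! cmp -s "$m_f" "$m_dst/$m_name" 2>/dev/null; then
            # Copy to a temporary name first, then rename, so a reader
            # never sees a half-written file.
            cp "$m_f" "$m_dst/.$m_name.tmp.$$" &&
                mv "$m_dst/.$m_name.tmp.$$" "$m_dst/$m_name"
        fi
    done
}

# Example loop (placeholder paths, not run here):
#   while :; do mirror_once /nfs/mailconf /var/local/mailconf; sleep 5; done
```

The idea is that exim would then be pointed at the local copies, so that it no longer opens files on the NFS mount itself.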
Gleb Smirnoff
2005-Oct-25  08:01 UTC
kmem_malloc(4096): kmem_map too small: 536870912 total allocated
Vladimir,
  please confirm that the attached patch fixes your problem. The patch is relative
to the src/sys tree.
  Kris, Christian, please review it. Thanks.
-- 
Totus tuus, Glebius.
GLEBIUS-RIPN GLEB-RIPE
-------------- next part --------------
Index: nfsclient/nfs_lock.c
===================================================================
RCS file: /home/ncvs/src/sys/nfsclient/nfs_lock.c,v
retrieving revision 1.40
diff -u -r1.40 nfs_lock.c
--- nfsclient/nfs_lock.c	6 Dec 2004 08:31:32 -0000	1.40
+++ nfsclient/nfs_lock.c	25 Oct 2005 14:51:11 -0000
@@ -62,9 +62,13 @@
 #include <nfsclient/nfs_lock.h>
 #include <nfsclient/nlminfo.h>
 
+extern void (*nlminfo_release_p)(struct proc *p);
+
 MALLOC_DEFINE(M_NFSLOCK, "NFS lock", "NFS lock request");
+MALLOC_DEFINE(M_NLMINFO, "nlminfo", "NFS lock process structure");
 
 static int nfslockdans(struct thread *td, struct lockd_ans *ansp);
+static void nlminfo_release(struct proc *p);
 /*
  * --------------------------------------------------------------------
  * A miniature device driver which the userland uses to talk to us.
@@ -194,6 +198,7 @@
 			printf("nfslock: pseudo-device\n");
 		mtx_init(&nfslock_mtx, "nfslock", NULL, MTX_DEF);
 		TAILQ_INIT(&nfslock_list);
+		nlminfo_release_p = nlminfo_release;
 		nfslock_dev = make_dev(&nfslock_cdevsw, 0,
 		    UID_ROOT, GID_KMEM, 0600, _PATH_NFSLCKDEV);
 		return (0);
@@ -259,7 +264,7 @@
 	 */
 	if (p->p_nlminfo == NULL) {
 		MALLOC(p->p_nlminfo, struct nlminfo *,
-			sizeof(struct nlminfo), M_LOCKF, M_WAITOK | M_ZERO);
+			sizeof(struct nlminfo), M_NLMINFO, M_WAITOK | M_ZERO);
 		p->p_nlminfo->pid_start = p->p_stats->p_start;
 		timevaladd(&p->p_nlminfo->pid_start, &boottime);
 	}
@@ -381,3 +386,12 @@
 	return (0);
 }
 
+/*
+ * Free nlminfo attached to process.
+ */
+void
+nlminfo_release(struct proc *p)
+{
+	free(p->p_nlminfo, M_NLMINFO);
+	p->p_nlminfo = NULL;
+}
Index: nfsclient/nlminfo.h
===================================================================
RCS file: /home/ncvs/src/sys/nfsclient/nlminfo.h,v
retrieving revision 1.2
diff -u -r1.2 nlminfo.h
--- nfsclient/nlminfo.h	18 Sep 2001 23:31:53 -0000	1.2
+++ nfsclient/nlminfo.h	25 Oct 2005 14:40:30 -0000
@@ -40,5 +40,3 @@
 	int		getlk_pid;
         struct  timeval pid_start;      /* process starting time */
 };
-
-extern void nlminfo_release(struct proc *p);
Index: kern/kern_exit.c
===================================================================
RCS file: /home/ncvs/src/sys/kern/kern_exit.c,v
retrieving revision 1.268
diff -u -r1.268 kern_exit.c
--- kern/kern_exit.c	23 Oct 2005 12:19:08 -0000	1.268
+++ kern/kern_exit.c	25 Oct 2005 14:45:35 -0000
@@ -82,6 +82,9 @@
 /* Required to be non-static for SysVR4 emulator */
 MALLOC_DEFINE(M_ZOMBIE, "zombie", "zombie proc status");
 
+/* Hook for NFS teardown procedure. */
+void (*nlminfo_release_p)(struct proc *p);
+
 /*
  * exit --
  *	Death of process.
@@ -234,6 +237,12 @@
 	funsetownlst(&p->p_sigiolst);
 
 	/*
+	 * If this process has an nlminfo data area (for lockd), release it
+	 */
+	if (nlminfo_release_p != NULL && p->p_nlminfo != NULL)
+		(*nlminfo_release_p)(p);
+
+	/*
 	 * Close open files and release open-file table.
 	 * This may block!
 	 */
Index: sys/lockf.h
===================================================================
RCS file: /home/ncvs/src/sys/sys/lockf.h,v
retrieving revision 1.18
diff -u -r1.18 lockf.h
--- sys/lockf.h	25 Jan 2005 10:15:25 -0000	1.18
+++ sys/lockf.h	25 Oct 2005 14:51:28 -0000
@@ -40,10 +40,6 @@
 
 struct vop_advlock_args;
 
-#ifdef MALLOC_DECLARE
-MALLOC_DECLARE(M_LOCKF);
-#endif
-
 /*
  * The lockf structure is a kernel structure which contains the information
  * associated with a byte range lock.  The lockf structures are linked into