Gavin Mu
2015-Dec-05 05:09 UTC
application coredump behavior differences between FreeBSD 7.0 and FreeBSD 10.1
Hi, kib,

Please see my testing on FreeBSD 7.0.

freebsd7# sysctl kern.ipc.shmall
kern.ipc.shmall: 819200
freebsd7# sysctl kern.ipc.shmmax
kern.ipc.shmmax: 3355443200
freebsd7# uname -a
FreeBSD freebsd7.localdomain 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 10:35:36 UTC 2008 root at driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

Testing code:

freebsd7# cat tt.c
#include <stdio.h>
#include <stdlib.h>
#include <machine/param.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int
main(int argc, char **argv)
{
    char **p;
    int size;
    int i;
    char *c = NULL;
    int shmid;
    void *shm_handle;

    size = atoi(argv[1]);
    printf("will alloc %dGB\n", size);

    shmid = shmget(100, size * 1024 * 1024 * 1024, 0644 | IPC_CREAT);
    if (shmid == -1) {
        printf("shmid = %d\n", shmid);
    }

    shm_handle = shmat(shmid, NULL, 0);
    if (shm_handle == -1) {
        printf("null shm_handle\n");
    }

    *c = 0;
    return 0;
}

freebsd7# ./a.out 1
will alloc 1GB
Segmentation fault (core dumped)

While a.out is running, RES stays at 2024K without increasing:

last pid:   735;  load averages:  0.00,  0.01,  0.03    up 0+00:15:11  04:43:35
25 processes:  1 running, 24 sleeping
CPU states:  0.0% user,  0.0% nice, 22.6% system,  0.8% interrupt, 76.7% idle
Mem: 13M Active, 6380K Inact, 52M Wired, 32K Cache, 39M Buf, 910M Free
Swap: 2015M Total, 2015M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
  734 root        1 -16    0  1027M  2024K wdrain   0:02 13.27% a.out

But when the same code runs on FreeBSD 10.1, RES keeps increasing up to 1GB. From my testing, if the memory is allocated by malloc(), then RES keeps increasing on both 7.0 and 10.1.
Only sysv_shm on 7.0 behaves differently. I have checked the coredump() code but did not find any clue why it is different.

Regards,
Gavin Mu

------------------ Original ------------------
From: "Konstantin Belousov";<kostikbel at gmail.com>;
Date: Fri, Dec 4, 2015 05:45 PM
To: "Gavin Mu"<gavin.mu at qq.com>;
Cc: "freebsd-stable"<freebsd-stable at freebsd.org>;
Subject: Re: application coredump behavior differences between FreeBSD 7.0 and FreeBSD 10.1

On Fri, Dec 04, 2015 at 09:35:54AM +0800, Gavin Mu wrote:
> Hi,
>
> We have an application running on old FreeBSD 7.0, and we are upgrading the base system to FreeBSD 10.1. The application uses sysv_shm and allocates a lot of shared memory, though most of the time only part of the allocated memory is used, i.e. large SIZE and small RES in the /usr/bin/top view.
>
> When the application dumps core, the core dump file is large. On FreeBSD 7.0 it uses only a little more memory to do the core dump, but on FreeBSD 10.1 it seems all of the shared memory is touched, which uses a lot of physical memory (RES in /usr/bin/top output increases very much) and causes memory drain.
>
> I have been debugging but cannot find any clue yet. Could someone provide some pointers to where the issue happens? Thanks.

Both stable/7 and latest HEAD do read the whole mapped segment to write
the coredump. This behaviour has not changed, probably since the
introduction of ELF support into FreeBSD. And how otherwise could the
coredump file contain the content of the mapped segments?

What did change in FreeBSD 10 in this regard is a fix for a deadlock which
could occur in some scenarios, including coredumping. In stable/7,
the page instantiation or swap-in for pages accessed by the core write
was done while owning several VFS locks. This sometimes caused deadlock.
In stable/10 the deadlock avoidance code is enabled by default, and
when the kernel detects the possibility of the deadlock, it switches to
reading carefully in small chunks.

Still, this does not explain the effect that you describe. In fact, I
am more suspicious of the claim that stable/7 did not increase the RSS of
the dumping process or did not access the whole mapped shared segment
than of the claim that there is a regression in stable/10.
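
A note on the tt.c test program in this message: the expression `size * 1024 * 1024 * 1024` is evaluated in `int`, so it overflows for an argument of 2 or more, and shmat() reports failure as `(void *)-1` rather than as -1 or NULL. The following is a minimal sketch of the same setup with those two points addressed. It is not the code that was run in this thread, and the helper names `gib_to_bytes` and `attach_segment` are made up for illustration.

```c
#define _DEFAULT_SOURCE  /* glibc only: keeps SysV IPC visible under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/*
 * Hypothetical helper: convert a size in GiB to bytes using size_t
 * arithmetic. Returns 0 on success, -1 if gib is not positive or the
 * result would not fit in size_t.
 */
int
gib_to_bytes(int gib, size_t *bytes)
{
    assert(bytes != NULL);
    if (gib <= 0 || (size_t)gib > (SIZE_MAX >> 30))
        return (-1);
    *bytes = (size_t)gib << 30;
    return (0);
}

/*
 * Hypothetical helper: create (if needed) and attach a SysV shared
 * memory segment. Returns the mapped address, or NULL after reporting
 * which call failed.
 */
void *
attach_segment(key_t key, size_t bytes)
{
    int shmid;
    void *addr;

    shmid = shmget(key, bytes, 0644 | IPC_CREAT);
    if (shmid == -1) {
        perror("shmget");
        return (NULL);
    }
    addr = shmat(shmid, NULL, 0);
    if (addr == (void *)-1) {   /* shmat() signals failure with (void *)-1 */
        perror("shmat");
        return (NULL);
    }
    return (addr);
}
```

In main(), something like `if (gib_to_bytes(atoi(argv[1]), &bytes) == 0) p = attach_segment(100, bytes);` would stand in for the two original calls, with the deliberate `*c = 0` fault left in place to trigger the core dump.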
Konstantin Belousov
2015-Dec-05 14:24 UTC
application coredump behavior differences between FreeBSD 7.0 and FreeBSD 10.1
On Sat, Dec 05, 2015 at 01:09:31PM +0800, Gavin Mu wrote:
> Hi, kib,
>
> Please see my testing on FreeBSD 7.0.
> freebsd7# sysctl kern.ipc.shmall
> kern.ipc.shmall: 819200
> freebsd7# sysctl kern.ipc.shmmax
> kern.ipc.shmmax: 3355443200
> freebsd7# uname -a
> FreeBSD freebsd7.localdomain 7.0-RELEASE FreeBSD 7.0-RELEASE #0: Sun Feb 24 10:35:36 UTC 2008 root at driscoll.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64
>
> Testing code:
> freebsd7# cat tt.c
> #include <stdio.h>
> #include <stdlib.h>
> #include <machine/param.h>
> #include <sys/types.h>
> #include <sys/ipc.h>
> #include <sys/shm.h>
>
> int
> main(int argc, char **argv)
> {
>     char **p;
>     int size;
>     int i;
>     char *c = NULL;
>     int shmid;
>     void *shm_handle;
>
>     size = atoi(argv[1]);
>     printf("will alloc %dGB\n", size);
>
>     shmid = shmget(100, size * 1024 * 1024 * 1024, 0644 | IPC_CREAT);
>     if (shmid == -1) {
>         printf("shmid = %d\n", shmid);
>     }
>
>     shm_handle = shmat(shmid, NULL, 0);

(shm_handle is not a handle).

>     if (shm_handle == -1) {
>         printf("null shm_handle\n");
>     }
>

What if you add
    madvise(shm_handle, size, MADV_SEQUENTIAL);
there? Does the 10.x behaviour become similar to that of 7.x?

>
>     *c = 0;
>     return 0;
> }
>
> freebsd7# ./a.out 1
> will alloc 1GB
> Segmentation fault (core dumped)
>
> While a.out is running, RES stays at 2024K without increasing:
>
> last pid:   735;  load averages:  0.00,  0.01,  0.03    up 0+00:15:11  04:43:35
> 25 processes:  1 running, 24 sleeping
> CPU states:  0.0% user,  0.0% nice, 22.6% system,  0.8% interrupt, 76.7% idle
> Mem: 13M Active, 6380K Inact, 52M Wired, 32K Cache, 39M Buf, 910M Free
> Swap: 2015M Total, 2015M Free
>
>   PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
>   734 root        1 -16    0  1027M  2024K wdrain   0:02 13.27% a.out
>
> But when the same code runs on FreeBSD 10.1, RES keeps increasing up to 1GB. From my testing, if the memory is allocated by malloc(), then RES keeps increasing on both 7.0 and 10.1.
> Only sysv_shm on 7.0 behaves differently. I have checked the coredump() code but did not find any clue why it is different.
>
> Regards,
> Gavin Mu
>
> ------------------ Original ------------------
> From: "Konstantin Belousov";<kostikbel at gmail.com>;
> Date: Fri, Dec 4, 2015 05:45 PM
> To: "Gavin Mu"<gavin.mu at qq.com>;
> Cc: "freebsd-stable"<freebsd-stable at freebsd.org>;
> Subject: Re: application coredump behavior differences between FreeBSD 7.0 and FreeBSD 10.1
>
> On Fri, Dec 04, 2015 at 09:35:54AM +0800, Gavin Mu wrote:
> > Hi,
> >
> > We have an application running on old FreeBSD 7.0, and we are upgrading the base system to FreeBSD 10.1. The application uses sysv_shm and allocates a lot of shared memory, though most of the time only part of the allocated memory is used, i.e. large SIZE and small RES in the /usr/bin/top view.
> >
> > When the application dumps core, the core dump file is large. On FreeBSD 7.0 it uses only a little more memory to do the core dump, but on FreeBSD 10.1 it seems all of the shared memory is touched, which uses a lot of physical memory (RES in /usr/bin/top output increases very much) and causes memory drain.
> >
> > I have been debugging but cannot find any clue yet. Could someone provide some pointers to where the issue happens? Thanks.
>
> Both stable/7 and latest HEAD do read the whole mapped segment to write
> the coredump. This behaviour has not changed, probably since the
> introduction of ELF support into FreeBSD. And how otherwise could the
> coredump file contain the content of the mapped segments?
>
> What did change in FreeBSD 10 in this regard is a fix for a deadlock which
> could occur in some scenarios, including coredumping. In stable/7,
> the page instantiation or swap-in for pages accessed by the core write
> was done while owning several VFS locks. This sometimes caused deadlock.
> In stable/10 the deadlock avoidance code is enabled by default, and
> when the kernel detects the possibility of the deadlock, it switches to
> reading carefully in small chunks.
>
> Still, this does not explain the effect that you describe. In fact, I
> am more suspicious of the claim that stable/7 did not increase the RSS of
> the dumping process or did not access the whole mapped shared segment
> than of the claim that there is a regression in stable/10.
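
The madvise() experiment suggested in this reply could be tried with a small wrapper like the one below. This is a sketch only: `advise_sequential` is a hypothetical helper name, and it assumes the length passed in is the segment size in bytes (madvise() takes a byte count, whereas `size` in tt.c holds the number of gigabytes). Whether the hint changes core dump behaviour on 10.x is the open question in this thread, so no result is implied.

```c
#define _DEFAULT_SOURCE  /* glibc only: exposes madvise() under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: hint that [addr, addr + len) will be accessed
 * sequentially. addr is expected to be page-aligned, as an address
 * returned by shmat(..., NULL, ...) or mmap() is.
 * Returns 0 on success, -1 on failure (errno is set by madvise()).
 */
int
advise_sequential(void *addr, size_t len)
{
    assert(addr != NULL);
    if (madvise(addr, len, MADV_SEQUENTIAL) == -1) {
        perror("madvise");
        return (-1);
    }
    return (0);
}
```

In tt.c this would be called right after a successful shmat(), passing shm_handle and the same byte count that was given to shmget().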