Gavin Mu
2015-Dec-04 01:35 UTC
application coredump behavior differences between FreeBSD 7.0 and FreeBSD 10.1
Hi,

We have an application running on old FreeBSD 7.0, and we are upgrading the base system to FreeBSD 10.1. The application uses sysv_shm and allocates a lot of shared memory, though most of the time only a part of the allocated memory is actually used, i.e. large SIZE and small RES in /usr/bin/top output.

When the application dumps core, the core file is large. On FreeBSD 7.0 the process uses only a little additional memory while dumping core, but on FreeBSD 10.1 it seems all the shared memory is touched: the process uses a lot of physical memory (RES in /usr/bin/top output grows very much) and causes memory exhaustion.

I have been debugging but cannot find any clue yet. Could someone point out where the issue might happen? Thanks.

Regards,
Gavin Mu
Konstantin Belousov
2015-Dec-04 09:45 UTC
application coredump behavior differences between FreeBSD 7.0 and FreeBSD 10.1
On Fri, Dec 04, 2015 at 09:35:54AM +0800, Gavin Mu wrote:
> Hi,
>
> We have an application running on old FreeBSD 7.0, and we are upgrading the base system to FreeBSD 10.1. The application uses sysv_shm and allocates a lot of shared memory, though most of the time only a part of the allocated memory is actually used, i.e. large SIZE and small RES in /usr/bin/top output.
>
> When the application dumps core, the core file is large. On FreeBSD 7.0 the process uses only a little additional memory while dumping core, but on FreeBSD 10.1 it seems all the shared memory is touched: the process uses a lot of physical memory (RES in /usr/bin/top output grows very much) and causes memory exhaustion.
>
> I have been debugging but cannot find any clue yet. Could someone point out where the issue might happen? Thanks.

Both stable/7 and the latest HEAD read the whole mapped segment to write the core dump. This behaviour has not changed since, probably, the introduction of ELF support into FreeBSD. And how otherwise could the core file contain the content of the mapped segments?

What changed in FreeBSD 10 in this regard is a fix for a deadlock which could occur in some scenarios, including core dumping. In stable/7, the page instantiation or swap-in for pages accessed by the core write was done while holding several VFS locks, which sometimes caused a deadlock. In stable/10 the deadlock avoidance code is enabled by default: when the kernel detects the possibility of a deadlock, it switches to carefully reading in small chunks.

Still, this does not explain the effect that you describe. In fact, I am more suspicious of the claim that stable/7 did not increase the RSS of the dumping process, or did not access the whole mapped shared segment, than of the claim that there is a regression in stable/10.