I have a file server that keeps panicking with an mbuf cluster count in the 17 Quadrillion range (2^64 - 2). I am pretty sure it's a buffer underflow.

I have opened a PR, but I haven't had any movement on it. This happened while I was running 9.1-RELEASE as well.

Here is the PR:

http://www.freebsd.org/cgi/query-pr.cgi?pr=183424

And I have uploaded the crash dumps here:

http://www.wallbridge.net/crash/

If anyone has any ideas, I would be grateful as this is a production box and it's really impacting us.

shawn
Hi,

How about adding

    options INVARIANTS

to the kernel config, then compiling and rebooting?

-adrian
Shawn Wallbridge wrote this message on Tue, Oct 29, 2013 at 21:37 -0700:
> [...]
> And I have uploaded the crash dumps here:
>
> http://www.wallbridge.net/crash/
>
> If anyone has any ideas, I would be grateful as this is a production box and it's really impacting us.

If you could push the kernel.symbols file there too, that would be great...

-- 
John-Mark Gurney                Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."
Shawn Wallbridge wrote this message on Tue, Oct 29, 2013 at 21:37 -0700:
> I have a file server that keeps panicking with an mbuf cluster count in the 17 Quadrillion range (2^64 - 2). I am pretty sure it's a buffer underflow.

Ok, after tracking some stuff down, I do not think it has anything to do w/ mbufs, as the underlying stats appear to be correct...

The problem is that the mbuf clusters line takes into account the fact that some clusters might still be associated w/ packets (from usr.bin/netstat/mbuf.c):

    printf("%ju/%ju/%ju/%ju mbuf clusters in use "
        "(current/cache/total/max)\n",
        cluster_count - packet_free,
        cluster_free + packet_free,
        cluster_count + cluster_free,
        cluster_limit);

Notice how current is cluster_count - packet_free instead of something like cluster_count - cluster_free...

And I just printed your values from vmcore.6, and apparently packet_count is 0, while packet_free is 5215... cluster_count is 2049, cluster_free is 1997... Since packet_free is larger than cluster_count, the unsigned subtraction wraps around to a value just below 2^64. And because packet is a secondary zone of mbufs, things apparently get confused...

So I wouldn't go down this road anymore... This looks like a simple race/accounting error in the stats reporting...

> I have opened a PR, but I haven't had any movement on it. This happened while I was running 9.1-RELEASE as well.
>
> Here is the PR:
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=183424
>
> And I have uploaded the crash dumps here:
>
> http://www.wallbridge.net/crash/
>
> If anyone has any ideas, I would be grateful as this is a production box and it's really impacting us.

Have you done a full fsck on the fs to make sure that there isn't any corruption on the disk that keeps popping up? I do realize that it will take a LONG time to fsck...

Sadly, your last three cores (all on 9.2-R) are for different inodes...

Could you tell me the path and filename of inodes 3226539015, 3224134148 and 3343904256? It could help us track down which app is causing this and let us reproduce it...

To find an inode on the fs use:

    find <fs> -inum <inum>

so:

    find <fs> -inum 3226539015 -or -inum 3224134148 -or -inum 3343904256

will do it in one pass so it won't take so long...

Thanks.

-- 
John-Mark Gurney                Voice: +1 415 225 5579

"All that I will do, has been done, All that I have, has not."