Hi all,

we are experiencing repeated crashes on a Dell PowerEdge 2950 (rev 1 or 2).

FreeBSD release is 6.2-RELEASE-p5, amd64. 2x quad-core Xeon and 8 GB of RAM.

MySQL version is 5.0.41 with the following configuration settings:

set-variable = key_buffer=768M
set-variable = table_cache=800
set-variable = sort_buffer=24M
set-variable = myisam_sort_buffer_size=256M
set-variable = record_buffer=16M
set-variable = max_allowed_packet=10M
set-variable = thread_stack=128K
set-variable = join_buffer=512M
set-variable = max_heap_table_size=256M
set-variable = max_connections=300
set-variable = tmp_table_size=384M
set-variable = query_cache_size=402653184
set-variable = query_cache_limit=134217728
set-variable = read_rnd_buffer_size=10M
set-variable = ft_min_word_len=1
pid-file = /var/db/mysqld.pid
tmpdir = /var/tmp
ft_stopword_file = ''
set-variable = thread_cache_size=80
set-variable = myisam_stats_method=nulls_equal

The system is crashing repeatedly, and from the graphs we collect on the box
I can see that every time before the crash there is intensive usage of
InnoDB-related resources. I collected several vmcore dumps, and what I have
been able to extract follows.

I'm not sure how closely the InnoDB usage is related to the crash, but I'm
fairly sure that it is what triggers it.

I've looked through CVS and the recent releases to see whether anything
related to my crash has been changed lately, but I did not find anything
specific, so I'm wondering whether anybody else has seen this kind of problem
before I proceed to a blind upgrade or some other blind fix.

$ sudo kgdb /usr/obj/usr/src/sys/PE2950/kernel.debug vmcore.2
Password:
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so:
Undefined symbol "ps_pglobal_lookup"]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "amd64-marcel-freebsd".

Unread portion of the kernel message buffer:

Fatal trap 12: page fault while in kernel mode
cpuid = 5; apic id = 05
fault virtual address   = 0x100166887ad
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0xffffffff803fa290
stack pointer           = 0x10:0xffffffffba0a9980
frame pointer           = 0x10:0x2
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 1038 (mysqld)
trap number             = 12
panic: page fault
cpuid = 5
Uptime: 1d4h37m54s
Dumping 8191 MB (3 chunks)
  chunk 0: 1MB (156 pages) ... ok
  chunk 1: 3327MB (851624 pages) 3311 3295 3279 3263 3247 3231 3215 3199
3183 3167 3151 3135 3119 3103 3087 3071 3055 3039 3023 3007 2991 2975 2959
2943 2927 2911 2895 2879 2863 2847 2831 2815 2799 2783 2767 2751 2735 2719
2703 2687 2671 2655 2639 2623 2607 2591 2575 2559 2543 2527 2511 2495 2479
2463 2447 2431 2415 2399 2383 2367 2351 2335 2319 2303 2287 2271 2255 2239
2223 2207 2191 2175 2159 2143 2127 2111 2095 2079 2063 2047 2031 2015 1999
1983 1967 1951 1935 1919 1903 1887 1871 1855 1839 1823 1807 1791 1775 1759
1743 1727 1711 1695 1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519
1503 1487 1471 1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279
1263 1247 1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039
1023 1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751
735 719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447
431 415 399 383 367 351 335 319 303 287 271 255 239 223 207 191 175 159 143
127 111 95 79 63 47 31 15 ... ok
  chunk 2: 4864MB (1245184 pages) 4849 4833 4817 4801 4785 4769 4753 4737
4721 4705 4689 4673 4657 4641 4625 4609 4593 4577 4561 4545 4529 4513 4497
4481 4465 4449 4433 4417 4401 4385 4369 4353 4337 4321 4305 4289 4273 4257
4241 4225 4209 4193 4177 4161 4145 4129 4113 4097 4081 4065 4049 4033 4017
4001 3985 3969 3953 3937 3921 3905 3889 3873 3857 3841 3825 3809 3793 3777
3761 3745 3729 3713 3697 3681 3665 3649 3633 3617 3601 3585 3569 3553 3537
3521 3505 3489 3473 3457 3441 3425 3409 3393 3377 3361 3345 3329 3313 3297
3281 3265 3249 3233 3217 3201 3185 3169 3153 3137 3121 3105 3089 3073 3057
3041 3025 3009 2993 2977 2961 2945 2929 2913 2897 2881 2865 2849 2833 2817
2801 2785 2769 2753 2737 2721 2705 2689 2673 2657 2641 2625 2609 2593 2577
2561 2545 2529 2513 2497 2481 2465 2449 2433 2417 2401 2385 2369 2353 2337
2321 2305 2289 2273 2257 2241 2225 2209 2193 2177 2161 2145 2129 2113 2097
2081 2065 2049 2033 2017 2001 1985 1969 1953 1937 1921 1905 1889 1873 1857
1841 1825 1809 1793 1777 1761 1745 1729 1713 1697 1681 1665 1649 1633 1617
1601 1585 1569 1553 1537 1521 1505 1489 1473 1457 1441 1425 1409 1393 1377
1361 1345 1329 1313 1297 1281 1265 1249 1233 1217 1201 1185 1169 1153 1137
1121 1105 1089 1073 1057 1041 1025 1009 993 977 961 945 929 913 897 881 865
849 833 817 801 785 769 753 737 721 705 689 673 657 641 625 609 593 577 561
545 529 513 497 481 465 449 433 417 401 385 369 353 337 321 305 289 273 257
241 225 209 193 177 161 145 129 113 97 81 65 49 33 17 1

#0  doadump () at pcpu.h:172
172     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:172
#1  0x0000000000000004 in ?? ()
#2  0xffffffff802a7d67 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:409
#3  0xffffffff802a8401 in panic (fmt=0xffffff0036f9c720
    "?\206C?\001?????%\\\001???\200i??")
    at /usr/src/sys/kern/kern_shutdown.c:565
#4  0xffffffff80425f7f in trap_fatal (frame=0xffffff0036f9c720,
    eva=18446742981617878704)
    at /usr/src/sys/amd64/amd64/trap.c:660
#5  0xffffffff8042629f in trap_pfault (frame=0xffffffffba0a98d0, usermode=0)
    at /usr/src/sys/amd64/amd64/trap.c:573
#6  0xffffffff80426553 in trap (frame=
      {tf_rdi = 1099887576672, tf_rsi = 0, tf_rdx = 0, tf_rcx = -1173710312,
      tf_r8 = -1093564261992, tf_r9 = -1173710304, tf_rax = -1173710293,
      tf_rbx = -1093564262000, tf_rbp = 2, tf_r10 = -1098589288672,
      tf_r11 = 435836558, tf_r12 = 1099887576672, tf_r13 = -1093564262000,
      tf_r14 = 435835520, tf_r15 = -1173710312, tf_trapno = 12,
      tf_addr = 1099887577005, tf_flags = -2144607018, tf_err = 0,
      tf_rip = -2143313264, tf_cs = 8, tf_rflags = 66118,
      tf_rsp = -1173710440, tf_ss = 16})
    at /usr/src/sys/amd64/amd64/trap.c:352
#7  0xffffffff8041173b in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:168
#8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
#9  0xffffffff803fdecc in vm_map_lookup (var_map=0xffffffffba0a9a10,
    vaddr=435835520, fault_typea=2 '\002',
    out_entry=0xffffffffba0a9a18, object=0xffffff01627d9998,
    pindex=0xffffffffba0a9a20, out_prot=0xffffffffba0a9a2b "",
    wired=0xffffffffba0a9a2c) at /usr/src/sys/vm/vm_map.c:3074
#10 0xffffffff802b845e in umtx_key_get (td=0xffffff0036f9c720,
    umtx=0x19fa5280, key=0xffffff01627d9990)
    at /usr/src/sys/kern/kern_umtx.c:312
#11 0xffffffff802b8578 in _do_lock (td=0xffffff0036f9c720, umtx=0x19fa5280,
    id=100582, timo=0)
    at /usr/src/sys/kern/kern_umtx.c:362
#12 0xffffffff802b99e9 in _umtx_op (td=0xffffff0036f9c720, uap=0x188e6)
    at /usr/src/sys/kern/kern_umtx.c:545
#13 0xffffffff80426dd1 in syscall (frame=
      {tf_rdi = 435835520, tf_rsi = 0, tf_rdx = 100582, tf_rcx = 0,
      tf_r8 = 0, tf_r9 = 140737452053060, tf_rax = 454, tf_rbx = 100582,
      tf_rbp = 435835520, tf_r10 = 1, tf_r11 = 582, tf_r12 = 9982128,
      tf_r13 = 1024, tf_r14 = 0, tf_r15 = 0, tf_trapno = 12,
      tf_addr = 1387466752, tf_flags = 0, tf_err = 2, tf_rip = 34378206780,
      tf_cs = 43, tf_rflags = 582, tf_rsp = 140737452052808, tf_ss = 35})
    at /usr/src/sys/amd64/amd64/trap.c:792
#14 0xffffffff804118d8 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:270
#15 0x000000080119ce3c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

(kgdb) up 6
#6  0xffffffff80426553 in trap (frame=
      {tf_rdi = 1099887576672, tf_rsi = 0, tf_rdx = 0, tf_rcx = -1173710312,
      tf_r8 = -1093564261992, tf_r9 = -1173710304, tf_rax = -1173710293,
      tf_rbx = -1093564262000, tf_rbp = 2, tf_r10 = -1098589288672,
      tf_r11 = 435836558, tf_r12 = 1099887576672, tf_r13 = -1093564262000,
      tf_r14 = 435835520, tf_r15 = -1173710312, tf_trapno = 12,
      tf_addr = 1099887577005, tf_flags = -2144607018, tf_err = 0,
      tf_rip = -2143313264, tf_cs = 8, tf_rflags = 66118,
      tf_rsp = -1173710440, tf_ss = 16})
    at /usr/src/sys/amd64/amd64/trap.c:352
352             (void) trap_pfault(&frame, FALSE);

(kgdb) up
#7  0xffffffff8041173b in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:168
168             call    trap
Current language:  auto; currently asm
(kgdb) up
#8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
443                     _sx_xunlock(&map->lock, file, line);
Current language:  auto; currently c
(kgdb) up
#9  0xffffffff803fdecc in vm_map_lookup (var_map=0xffffffffba0a9a10,
    vaddr=435835520, fault_typea=2 '\002',
    out_entry=0xffffffffba0a9a18, object=0xffffff01627d9998,
    pindex=0xffffffffba0a9a20, out_prot=0xffffffffba0a9a2b "",
    wired=0xffffffffba0a9a2c) at /usr/src/sys/vm/vm_map.c:3074
3074            vm_map_lock_read(map);
(kgdb) list
3069    RetryLookup:;
3070            /*
3071             * Lookup the faulting address.
3072             */
3073
3074            vm_map_lock_read(map);
3075    #define RETURN(why) \
3076                    { \
3077                    vm_map_unlock_read(map); \
3078                    return (why); \
(kgdb) p map
$1 = 0x10016688660
(kgdb) down
#8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
443                     _sx_xunlock(&map->lock, file, line);
(kgdb) list
438     {
439
440             if (map->system_map)
441                     _mtx_unlock_flags(&map->system_mtx, 0, file, line);
442             else
443                     _sx_xunlock(&map->lock, file, line);
444     }
445
446     void
447     _vm_map_lock_read(vm_map_t map, const char *file, int line)

Thanks,
Francesco Ciocchetti
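A note on reading the trap frame in this post: kgdb prints the saved registers as signed decimals, which makes them hard to match against the hexadecimal values in the panic message. The Python sketch below converts a few of them. The helper name to_hex64 is made up for illustration; the numbers are copied from frame 6 and from the "p map" output.

```python
MASK64 = (1 << 64) - 1


def to_hex64(value):
    """Render a signed 64-bit decimal, as printed by kgdb, as unsigned hex."""
    return hex(value & MASK64)


# Values copied from the trap frame shown in frame 6 of the backtrace.
tf_rip = -2143313264
tf_addr = 1099887577005
tf_rdi = 1099887576672
map_ptr = 0x10016688660  # result of "p map" in frame 9

print(to_hex64(tf_rip))        # 0xffffffff803fa290, the instruction pointer
print(to_hex64(tf_addr))       # 0x100166887ad, the fault virtual address
print(to_hex64(tf_rdi))        # 0x10016688660, same value as map
print(hex(tf_addr - map_ptr))  # 0x14d, offset of the faulting read from map
```

So the faulting read is 0x14d bytes past the map pointer, and that pointer does not look like the other kernel pointers in the trace, which all start with 0xffffff. This is only an observation about the numbers, not a diagnosis.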
Jeremy Chadwick
2008-Feb-04 05:38 UTC
Crashing repeatedly: 6.2-RELEASE-p5 and MySQL 5.0.41
On Mon, Feb 04, 2008 at 12:50:32PM +0000, Primeroz lists wrote:
> we are experiencing repeated crash on a Dell PowerEdge 2950 (rev 1 or 2).
>
> FBSD release is 6.2-RELEASE-p5 , AMD64. 2xXeon QuadCore and 8G of Ram.
> MySQL Version is 5.0.41 with following configuration settings:
> {snip}

There's additional information needed to help with this:

1) Contents of /boot/loader.conf
2) What scheduler you're using in your kernel configuration
3) Your kernel configuration in its entirety, if possible :-)
4) What options you compiled/built mysql50-server with

Chances are your problem is related to process size limits. The most
important part is how you've tuned /boot/loader.conf -- I'm betting you
haven't.

In my experience, mysqld will either segfault or in some cases cause the
entire box to panic when kern.maxdsiz, kern.dfldsiz, and kern.maxssiz are
not adjusted in loader.conf.

This is taken from loader.conf on our production SQL server running
RELENG_6, i386, with 2GB RAM:

# Increase maximum allocatable memory on a process to 1536MB.
# We don't choose 2GB (our amount of RAM) since that would
# exhaust all memory, and result in a kernel panic.  Maximum
# stack size is still set to 128MB.
#
# dfldsiz = Initial data size limit (bytes)
# maxdsiz = Maximum data size limit (bytes)
# dflssiz = Initial stack size limit (bytes)
# maxssiz = Maximum stack size limit (bytes)
#
kern.maxdsiz="1536M"
kern.dfldsiz="1536M"
kern.maxssiz="128M"

Some other comments:

> set-variable = key_buffer=768M
> set-variable = table_cache=800
> set-variable = sort_buffer=24M
> set-variable = myisam_sort_buffer_size=256M
> set-variable = record_buffer=16M
> set-variable = max_allowed_packet=10M
> set-variable = thread_stack=128K
> set-variable = join_buffer=512M
> set-variable = max_heap_table_size=256M
> set-variable = max_connections=300
> set-variable = tmp_table_size=384M
> set-variable = query_cache_size=402653184
> set-variable = query_cache_limit=134217728
> set-variable = read_rnd_buffer_size=10M
> set-variable = ft_min_word_len=1
> pid-file = /var/db/mysqld.pid
> tmpdir = /var/tmp
> ft_stopword_file = ''
> set-variable = thread_cache_size=80
> set-variable = myisam_stats_method=nulls_equal

These tunings seem fairly "random", in that they almost look like someone
just picked arbitrary values rather than reading how they all work together
and how exactly they impact the system. This is a very common (and bad)
habit people have when "tuning" mysql; they just fiddle around.

For comparison, using the same box of ours mentioned above:

set-variable = tmp_table_size=64M
set-variable = max_allowed_packet=32M
set-variable = table_cache=256
set-variable = key_buffer_size=64M
set-variable = join_buffer_size=8M
set-variable = sort_buffer_size=8M
set-variable = read_buffer_size=8M
set-variable = query_cache_size=64M
set-variable = query_cache_limit=32M
set-variable = innodb_buffer_pool_size=512M
set-variable = innodb_additional_mem_pool_size=20M
set-variable = innodb_log_file_size=128M
set-variable = innodb_log_buffer_size=8M

Also, please remove pid-file from your my.cnf -- it serves zero purpose,
and if placed in a location which isn't where rc.d/mysql expects it to be,
could lead to problems when shutting down/starting up the server. The
rc.d/mysql script takes care of this for you, so don't override it.

Finally, I will take a moment to urge you to upgrade that box to RELENG_7.
SCHED_ULE was re-written and specifically tested with mysqld in mind, and
the results are quite impressive. The fact you're using a pair of quad core
CPUs would be reason enough to upgrade. RELENG_6 will soon be on its way
out the door, so it's an inevitable upgrade anyways.

--
| Jeremy Chadwick                                    jdc at parodius.com |
| Parodius Networking                           http://www.parodius.com/ |
| UNIX Systems Administrator                      Mountain View, CA, USA |
| Making life hard for others since 1977.                  PGP: 4BD6C0CB |
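To make the point about the posted tunings concrete, here is a rough Python sketch of the usual rule-of-thumb upper bound on mysqld memory: the global buffers plus max_connections times the per-connection buffers. Treating sort_buffer, record_buffer, read_rnd_buffer_size, join_buffer and thread_stack as per-connection is an assumption of this sketch, and those buffers are only allocated on demand, so this is a worst case rather than a prediction.

```python
MIB = 1024 * 1024

# Global buffers from the posted my.cnf, in bytes.
global_buffers = {
    "key_buffer": 768 * MIB,
    "query_cache_size": 402653184,
}

# Buffers assumed to be allocated per connection (rule of thumb only).
per_connection = {
    "sort_buffer": 24 * MIB,
    "record_buffer": 16 * MIB,
    "read_rnd_buffer_size": 10 * MIB,
    "join_buffer": 512 * MIB,
    "thread_stack": 128 * 1024,
}
max_connections = 300


def worst_case_bytes(global_bufs, per_conn, connections):
    """Global buffers plus every connection using all of its buffers."""
    return sum(global_bufs.values()) + connections * sum(per_conn.values())


total = worst_case_bytes(global_buffers, per_connection, max_connections)
print(f"worst case: {total / MIB:.1f} MiB ({total / (1024 * MIB):.1f} GiB)")
# worst case: 169789.5 MiB (165.8 GiB)
```

Against 8 GB of RAM the bound is dominated by join_buffer=512M multiplied by 300 connections, which is the kind of interaction between settings the reply above is warning about.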
Kostik Belousov
2008-Feb-04 08:19 UTC
Crashing repeatedly: 6.2-RELEASE-p5 and MySQL 5.0.41
On Mon, Feb 04, 2008 at 12:50:32PM +0000, Primeroz lists wrote:
> we are experiencing repeated crash on a Dell PowerEdge 2950 (rev 1 or 2).
>
> FBSD release is 6.2-RELEASE-p5 , AMD64. 2xXeon QuadCore and 8G of Ram.
[snip]
> #8  0xffffffff803fa290 in _vm_map_unlock () at /usr/src/sys/vm/vm_map.c:443
> #9  0xffffffff803fdecc in vm_map_lookup (var_map=0xffffffffba0a9a10,
> vaddr=435835520, fault_typea=2 '\002',
>     out_entry=0xffffffffba0a9a18, object=0xffffff01627d9998,
> pindex=0xffffffffba0a9a20, out_prot=0xffffffffba0a9a2b "",
>     wired=0xffffffffba0a9a2c) at /usr/src/sys/vm/vm_map.c:3074

vm_map.c does not contain a call to vm_map_unlock() at line 3074.

Please rebuild your kernel from scratch. If that does not help, please show
the backtrace from ddb. Also, to speed up the conversation, could you please
run list *(<function>+<offset>) in kgdb for each <function>+<offset> in the
ddb output?

[snip]
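The request above, one kgdb "list" command per ddb frame, is mechanical enough to script. The Python sketch below turns ddb backtrace lines into the corresponding kgdb commands. It assumes the frames look like "function() at function+0xoffset"; the sample trace is hypothetical (function names taken from this thread, offsets invented), and the helper name kgdb_list_commands is made up for illustration.

```python
import re

# Matches the "function+0xoffset" part of a ddb backtrace line.
FRAME_RE = re.compile(r"\bat\s+([A-Za-z_]\w*)\+(0x[0-9a-fA-F]+)")


def kgdb_list_commands(ddb_trace):
    """Return one 'list *(function+offset)' command per ddb frame found."""
    commands = []
    for line in ddb_trace.splitlines():
        match = FRAME_RE.search(line)
        if match:
            func, offset = match.groups()
            commands.append(f"list *({func}+{offset})")
    return commands


# Hypothetical ddb output: the offsets here are invented for the example.
sample = """\
_vm_map_unlock() at _vm_map_unlock+0x10
vm_map_lookup() at vm_map_lookup+0x4c
umtx_key_get() at umtx_key_get+0x5e
"""

for command in kgdb_list_commands(sample):
    print(command)
```

The printed lines can then be pasted into a kgdb session opened on the matching kernel.debug and vmcore.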
> The vm_map.c does not contain a call to the vm_map_unlock() at the
> line 3074.

Mine does: it is revision 1.366.2.3 of vm_map.c in FreeBSD CVS, tag
RELENG_6_2.

> Please, rebuild you kernel from scratch. In case this does not help,
> I ask you to show the backtrace from the ddb. Also, to speed up the
> conversation, could you, please, for each <function>+<offset> from the
> ddb output, do the list *(<function>+<offset>) in the kgdb ?

Working on this, thanks.

FC
Primeroz lists wrote:
> Hi all,
>
> we are experiencing repeated crash on a Dell PowerEdge 2950 (rev 1 or 2).
>
> FBSD release is 6.2-RELEASE-p5 , AMD64. 2xXeon QuadCore and 8G of Ram.
<SNIP>
> $ sudo kgdb /usr/obj/usr/src/sys/PE2950/kernel.debug vmcore.2
<SNIP>

This backtrace will be useless. I rebuilt the kernels on the build servers
the week before last, and not all systems got the updates. You will need to
'make installkernel' and crash the box again to get a usable vmcore...

Pointy hat to me, sorry; I should have told you this before.

Tom
----- "Primeroz lists" <primeroz.lists@googlemail.com> wrote:
> Hi all,
>
> we are experiencing repeated crash on a Dell PowerEdge 2950 (rev 1 or
> 2).
>
> FBSD release is 6.2-RELEASE-p5 , AMD64. 2xXeon QuadCore and 8G of Ram.
>
> MySQL Version is 5.0.41 with following configuration settings:
>
> set-variable = key_buffer=768M
> set-variable = table_cache=800
> set-variable = sort_buffer=24M
> set-variable = myisam_sort_buffer_size=256M
> set-variable = record_buffer=16M
> set-variable = max_allowed_packet=10M
> set-variable = thread_stack=128K
> set-variable = join_buffer=512M
> set-variable = max_heap_table_size=256M
> set-variable = max_connections=300
> set-variable = tmp_table_size=384M
> set-variable = query_cache_size=402653184
> set-variable = query_cache_limit=134217728
> set-variable = read_rnd_buffer_size=10M
> set-variable = ft_min_word_len=1
> pid-file = /var/db/mysqld.pid
> tmpdir = /var/tmp
> ft_stopword_file = ''
> set-variable = thread_cache_size=80
> set-variable = myisam_stats_method=nulls_equal

Also, mysql is not really well tuned.

The query cache is a kludge. It is helpful if you have a stupid application
that issues the same query over and over again, even though the database has
not changed. If you don't have this problem, it just adds overhead, and
quite a lot if it is big. Generally, the query cache should be 20 to 100M at
most, if not disabled. If you have a smart web application (anything using
memcached), the query cache should just be turned off. It will actually be
faster.

You should give as much memory as possible to the database engine, so that
it can cache actual data, not query results.

It is weird that you are apparently using InnoDB heavily, but have only set
various MyISAM values.

Here is something useful:
http://www.joyeur.com/2007/09/25/quick-wins-with-mysql

Tom
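The two observations in this reply (an oversized query cache, and no InnoDB settings despite heavy InnoDB use) can be checked mechanically. The Python sketch below parses "set-variable = name=value" lines and reports both. The function names and the 100M threshold parameter are choices made for this example, taken from the "20 to 100M at most" guidance above, and only a few of the posted lines are included.

```python
import re

UNITS = {"": 1, "K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text):
    """Convert a my.cnf size such as '768M' or '402653184' to bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"not a size: {text!r}")
    return int(match.group(1)) * UNITS[match.group(2)]


def parse_set_variables(cnf_text):
    """Collect 'set-variable = name=value' lines into a dict."""
    settings = {}
    for line in cnf_text.splitlines():
        match = re.match(r"\s*set-variable\s*=\s*(\w+)\s*=\s*(\S+)", line)
        if match:
            settings[match.group(1)] = match.group(2)
    return settings


def review(settings, cache_limit=100 * 1024**2):
    """Return notes for the two issues raised in the reply above."""
    notes = []
    cache = parse_size(settings.get("query_cache_size", "0"))
    if cache > cache_limit:
        size_m = cache // 1024**2
        notes.append(f"query_cache_size is {size_m}M, above the suggested 100M")
    if not any(name.startswith("innodb_") for name in settings):
        notes.append("no innodb_* variables are set")
    return notes


# A subset of the posted configuration.
posted = """
set-variable = key_buffer=768M
set-variable = query_cache_size=402653184
set-variable = query_cache_limit=134217728
set-variable = myisam_sort_buffer_size=256M
"""

for note in review(parse_set_variables(posted)):
    print(note)
```

Run against the full posted my.cnf, the same two notes would be expected, since none of the listed variables starts with innodb_.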