Sebastian Gosenheimer
2010-Apr-26 11:03 UTC
[Xen-users] trouble with xenserver and xfs (soft lockup - CPU#0 stuck for 61s!)
Hi everybody,

I hope I'm on the right mailing list. I'm having some trouble with XFS and Xen. We are running the newest XenServer version with HA on Dell servers and a Dell EqualLogic.

I set up a file server (Debian Lenny, 2.6.29-xs5.5.0.17) with one ext3 partition for the OS and one XFS partition for the data. Two web servers use this file server over NFS.

For the second time now, I have run into the following problem while rsync was running for a backup:

[212521.428003] BUG: soft lockup - CPU#0 stuck for 61s! [rsync:29921]
[212521.428003] Modules linked in: ipv6 nfsd nfs lockd nfs_acl auth_rpcgss sunrpc xenfs xfs exportfs loop evdev xen_netfront pcspkr ext3 jbd mbcache xen_blkfront thermal_sys
[212521.428003]
[212521.428003] Pid: 29921, comm: rsync Tainted: G D (2.6.29-xs5.5.0.17 #1)
[212521.428003] EIP: 0061:[<c02f2221>] EFLAGS: 00000287 CPU: 0
[212521.428003] EIP is at __write_lock_failed+0x9/0x1c
[212521.428003] EAX: e1875cbc EBX: c1351e1a ECX: e2407880 EDX: dd004c50
[212521.428003] ESI: 00000000 EDI: c536fd40 EBP: e1875cbc ESP: c1351dac
[212521.428003] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[212521.428003] CR0: 8005003b CR2: b806e348 CR3: 20d99000 CR4: 00002620
[212521.428003] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[212521.428003] DR6: ffff0ff0 DR7: 00000400
[212521.428003] Call Trace:
[212521.428003] [<c02f23a3>] ? _write_lock+0xe/0xf
[212521.428003] [<e33f7409>] ? xfs_iget+0x328/0x44f [xfs]
[212521.428003] [<e340e925>] ? xfs_lookup+0x69/0x97 [xfs]
[212521.428003] [<e341638b>] ? xfs_vn_lookup+0x36/0x6e [xfs]
[212521.428003] [<c0194ee1>] ? do_lookup+0xa6/0x116
[212521.428003] [<c019584d>] ? __link_path_walk+0x524/0x631
[212521.428003] [<c01057fc>] ? xen_force_evtchn_callback+0xc/0x10
[212521.428003] [<c0195d5c>] ? path_walk+0x4f/0xa3
[212521.428003] [<c0196aa0>] ? do_path_lookup+0x132/0x178
[212521.428003] [<c019732e>] ? getname+0x5e/0xb0
[212521.428003] [<c0197b05>] ? user_path_at+0x37/0x5f
[212521.428003] [<c0198e80>] ? filldir64+0x0/0xc5
[212521.428003] [<c019187f>] ? vfs_lstat_fd+0x12/0x3
[212455.928003] [<c0191912>] ? sys_lstat64+0xf/0x23
[212455.928003] [<c0199124>] ? vfs_readdir+0x7c/0x8c
[212455.928003] [<c0198e80>] ? filldir64+0x0/0xc5
[212455.928003] [<c01991cf>] ? sys_getdents64+0x9b/0xa5
[212455.928003] [<c0107f23>] ? sysenter_past_esp+0x3c/0x62
[212455.928003] [<c0107f5b>] ? sysenter_do_call+0x12/0x2f

I have to force a reboot to bring the VM back online. Are there any known problems with XFS and Xen? I ask because I know of another deployment that is having nearly the same issues with XFS. There are no problems with the ext3 filesystem.

Thank you for your help!

--
Kind regards,
--sg

This e-mail may contain confidential and/or privileged information. If you are not the intended recipient (or have received this e-mail in error) please notify the sender immediately and destroy this e-mail. Any unauthorized copying, disclosure or distribution of the material in this e-mail is strictly forbidden.

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
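For anyone who wants to watch for this kind of report automatically, the header line of the trace can be picked out of a kernel log with a regular expression. The following is only a minimal sketch: the function name `find_soft_lockups` and the dictionary keys are made up for illustration, and the pattern assumes the exact wording of the "BUG: soft lockup" line quoted in this message, which may differ on other kernel versions.

```python
import re

# Pattern for the soft-lockup header line as quoted in the message above.
# Assumption: other kernel versions may word this line differently.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup header found in log_text."""
    reports = []
    for match in LOCKUP_RE.finditer(log_text):
        reports.append({
            "cpu": int(match.group("cpu")),
            "seconds": int(match.group("secs")),
            "comm": match.group("comm"),
            "pid": int(match.group("pid")),
        })
    return reports


# The header line from the trace in this thread.
SAMPLE = "[212521.428003] BUG: soft lockup - CPU#0 stuck for 61s! [rsync:29921]"

if __name__ == "__main__":
    for report in find_soft_lockups(SAMPLE):
        print(report)
```

In practice the input would be the text of the kernel log (for example saved `dmesg` output) rather than a single hard-coded line.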
Yann Cezard
2010-Apr-27 08:51 UTC
Re: [Xen-users] trouble with xenserver and xfs (soft lockup - CPU#0 stuck for 61s!)
Hi Sebastian,

I'm having similar issues with two Lenny domUs (amd64 and i386), both with XFS and quite heavy load (I should say normal load for what they do: database server and file server). They are running the standard Lenny Xen kernel (2.6.26-2-xen-*).

After searching a little, I found this thread:
http://developer.amazonwebservices.com/connect/thread.jspa?threadID=28968&start=0&tstart=0

I'm not sure that it is connected to my (or your) problem, but the domUs I have problems with both have XFS log version 2. I have other domUs with XFS filesystems and quite heavy load (mail servers, basically), but they are still on Etch with XFS log version 1, and I have no problems with them.

So I think I will give the log version workaround a try. Perhaps you could give it a try too?

In any case, if you find out something about this problem, I would be happy to hear from you.

Cheers,
--
Yann Cézard - Server Systems Administrator
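To compare domUs the way Yann describes, one option is to read the log version out of `xfs_info` output, which reports it as `version=N` on the line describing the log section. The sketch below is illustrative only: the sample text is a shortened, hypothetical example of that output (the device name and sizes are not from this thread), and `xfs_log_version` is a made-up helper name.

```python
import re

# Shortened, hypothetical xfs_info-style output. The device name and the
# block counts are invented for illustration and are NOT from the thread.
SAMPLE_XFS_INFO = """\
meta-data=/dev/xvdb1   isize=256    agcount=4, agsize=655360 blks
data     =             bsize=4096   blocks=2621440, imaxpct=25
naming   =version 2    bsize=4096
log      =internal     bsize=4096   blocks=2560, version=2
realtime =none         extsz=4096   blocks=0, rtextents=0
"""


def xfs_log_version(xfs_info_text):
    """Return the log version found in xfs_info-style text, or None."""
    for line in xfs_info_text.splitlines():
        # Only the "log" line carries the log version. The "naming" line
        # also mentions a version, but that one refers to directories.
        if line.startswith("log"):
            match = re.search(r"\bversion=(\d+)", line)
            if match:
                return int(match.group(1))
    return None


if __name__ == "__main__":
    print(xfs_log_version(SAMPLE_XFS_INFO))
```

On a real system the text would come from running `xfs_info` against the mount point of the filesystem in question; the parsing above assumes the layout shown in the sample.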
Sebastian Gosenheimer
2010-Apr-30 12:45 UTC
Re: [Xen-users] trouble with xenserver and xfs (soft lockup - CPU#0 stuck for 61s!)
Hi Yann,

sorry for the late answer. I was busy with other things this week.

For four days now I have been running a new kernel on the file server (2.6.32.12-xs5.5.0.17). Since then everything seems to be fine. Right now I don't know whether there is an XFS or NFS problem in the kernel 2.6.26-2-xen-*. I will now update all our machines that use NFS and/or XFS to a newer kernel.

In my opinion, the thread you sent isn't connected to my problem. Did the workaround work for you, or do you still have problems?

--sg
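When planning such an update across several machines, it can help to list which ones still run a kernel older than the one that has behaved well so far. This is a rough sketch under stated assumptions: the helper names and the host inventory are hypothetical, only the leading dotted number of each `uname -r` string is compared, and the thread does not establish that 2.6.32.12 is a confirmed fix, only that no lockup was seen on it in four days.

```python
import re


def kernel_version_tuple(release):
    """Extract the leading dotted numeric version from a `uname -r` string."""
    match = re.match(r"(\d+(?:\.\d+)*)", release)
    if match is None:
        raise ValueError(f"no numeric version in {release!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def is_older_than(release, baseline):
    """True if release has a lower numeric version than baseline."""
    return kernel_version_tuple(release) < kernel_version_tuple(baseline)


# Kernel that has run without a lockup so far, according to this message.
BASELINE = "2.6.32.12-xs5.5.0.17"

# Hypothetical inventory: the host names are invented; the release strings
# follow the kernel naming mentioned in the thread.
HOSTS = {
    "fileserver-old": "2.6.29-xs5.5.0.17",
    "lenny-domu": "2.6.26-2-xen-amd64",
    "fileserver-new": "2.6.32.12-xs5.5.0.17",
}

if __name__ == "__main__":
    for host, release in sorted(HOSTS.items()):
        if is_older_than(release, BASELINE):
            print(f"{host}: {release} is older than {BASELINE}")
```

The comparison ignores distribution suffixes such as `-xs5.5.0.17` or `-2-xen-amd64`, so two builds of the same upstream version are treated as equal.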