Rudi Ahlers
2010-Jan-28 11:30 UTC
[Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
Hi,

I would like to get some input from people who have used these options for mounting a remote server to a local server. Basically, I need to replicate / backup data from one server to another, but over the internet (i.e. insecure channels).

Currently we have been mounting an SMB share over SSH, but it's got its own set of problems, and I don't know if this is optimal or if I could set up something better. We don't have much control over the remote server, so I couldn't set up a VPN, or iSCSI, or anything else. My options were FTP & SMB.

But I want to move the backups in-house, to save bandwidth and have more control over what we do.

So, with a new CentOS server & 2x1TB HDDs in RAID1 configuration, I can do pretty much whatever I want. The backup server(s) will serve backups for multiple servers in different data centers (possibly in different countries as well, I still need to think about this), so my biggest concern is security.

We mainly use cPanel & DotNetPanel (Windows servers), but also Webmin & Virtualmin, so I need to stick with their native backup procedures and don't really want to use too technical a backup system.

The end users need access to the data 24/7, so having the remote share permanently mounted seems best for this; then our support staff don't need to SSH into the servers and download the backups. With the mount, I can also use rsync backups, so an end user could restore only a single file if need be.

NOW, the question is: which protocol would be best for this? I can only think of SMB, NFS & iSCSI.

The SMB mounts have worked well so far, but it's not as safe, and once the SMB share is mounted, I can't unmount it until the server reboots. This isn't necessarily a bad thing, but sometimes the backup script will mount the share again (I think this is a bug in cPanel) and we end up with 4 or 5 open connections to the remote server.

NFS - last time I looked at it was on V3, which was IMO rather slow & insecure.

iSCSI - this doesn't allow for more than one connection to the same share. Sometimes a user might want to download a backup directly from the backup server via FTP / SSH / a web interface, which I don't think will work. We also sometimes need to restore a backup on a different server (if for example the HDD on the initial server is too full), so this wouldn't be possible.

The remote shares also need to be mounted inside Xen domU's, or directly on CentOS / Windows servers.

What would be my best option for this?

-- 
Kind Regards
Rudi Ahlers
SoftDux
Website: http://www.SoftDux.com
Technical Blog: http://Blog.SoftDux.com
Office: 087 805 9573
Cell: 082 554 7532
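For context, the "SMB share over SSH" setup described above is usually done with an SSH port forward; a rough sketch only, with hostnames, share names and credentials as placeholders rather than anything from this thread:

  # Forward a local port to the remote server's SMB port over SSH
  ssh -f -N -L 1445:127.0.0.1:445 backupuser@backup.example.com

  # Mount the share through the tunnel (mount.cifs accepts a non-standard port)
  mount -t cifs //127.0.0.1/backups /mnt/backup \
      -o port=1445,username=backupuser,password=secret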
Francisco Javier Funes Nieto
2010-Jan-28 11:47 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
Maybe SSHFS? http://fuse.sourceforge.net/sshfs.html

I haven't used it, but it's there! ;-)

J.

2010/1/28 Rudi Ahlers <Rudi@softdux.com>:
> I would like to get some input from people who have used these options
> for mounting a remote server to a local server. Basically, I need to
> replicate / backup data from one server to another, but over the
> internet (i.e. insecure channels).
> [snip]

-- 
Francisco Javier Funes Nieto [esencia@gmail.com]
CANONIGOS
Servicios Informáticos para PYMES.
Cl. Cruz 2, 1º Oficina 7
Tlf: 958.536759 / 661134556
Fax: 958.521354
GRANADA - 18002
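For anyone wanting to try it, a minimal SSHFS session looks roughly like this (hostnames and paths are placeholders); it needs the FUSE kernel module plus the sshfs userspace tool:

  # Mount a remote directory over SSH
  sshfs backupuser@backup.example.com:/srv/backups /mnt/backup

  # ... use /mnt/backup like a local directory ...

  # Unmount when done
  fusermount -u /mnt/backup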
Rudi Ahlers
2010-Jan-28 11:59 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
On Thu, Jan 28, 2010 at 1:47 PM, Francisco Javier Funes Nieto <esencia@gmail.com> wrote:
> Maybe SSHFS? http://fuse.sourceforge.net/sshfs.html
>
> I haven't used it, but it's there! ;-)

Thanks, I did try it, and it needs extra kernel modules to be installed, which is sometimes a problem with the clients' domU's - i.e. when they rebuild the domU's, we need to manually add the extra kernel modules again, so it creates extra load on the support techs.

-- 
Kind Regards
Rudi Ahlers
SoftDux
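The "extra kernel module" in question is FUSE; a quick check and load inside a domU might look like the following sketch (assuming the module ships with the domU kernel; package names vary by distro and are only illustrative here):

  # Is the FUSE module available / loaded?
  lsmod | grep fuse
  modprobe fuse

  # Userspace tools, e.g. on CentOS with EPEL enabled
  yum install fuse fuse-sshfs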
Francisco Javier Funes Nieto
2010-Jan-28 12:06 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
You can use Bacula too (for a complete backup solution), with encrypted communication provided by TLS.

It's a great piece of software! I've worked with it since version 1.38.

http://www.bacula.org
http://bacula.org/5.0.x-manuals/en/main/main/Bacula_TLS_Communications.html

2010/1/28 Rudi Ahlers <Rudi@softdux.com>:
> Thanks, I did try it, and it needs extra kernel modules to be installed,
> which is sometimes a problem with the clients' domU's.
> [snip]

-- 
Francisco Javier Funes Nieto [esencia@gmail.com]
CANONIGOS
Simon Hobson
2010-Jan-28 12:09 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
For Linux machines you have to look hard to beat rsync - it's very efficient and supports encryption and rate limiting. Sadly it's not much help for Windows machines.

Regarding the options you've listed, iSCSI doesn't support mounting a volume from multiple places unless you use a shared filesystem. That means you'd need to either use a shared filesystem, use separate volumes for each client, or use some sort of locking mechanism to prevent multiple mounts. It also doesn't do encryption, so for remote sites you'd have to use a VPN - but I'd suggest doing that anyway, as having a unified network has many advantages.

If you do put a VPN in place, then for Windows stuff it might be worth looking at Microsoft's DPM (Data Protection Manager). I know nothing about it, but the guys at work who deal with the MS stuff have been raving about it like someone's invented sliced bread.

-- 
Simon Hobson

WANTED: "Software CD ROM Kit" for Canon CLBP 360-PS printer (Canon part no RH6-3612, or possibly RH6-3810, or RH6-3610 might do). I've a dead HD and need this CD so I can replace the disk and re-install the printer OS on it. If anyone knows where I might get hold of one I'd be grateful - requests to Canon drew a blank, it's been out of support for years. Alternatively, if anyone has one of these and would let me image their hard disk ...

Visit http://www.magpiesnestpublishing.co.uk/ for books by acclaimed author Gladys Hobson. Novels - poetry - short stories - ideal as Christmas stocking fillers. Some available as e-books.
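A typical rsync-over-SSH invocation with the rate limiting mentioned above would look something like this (hosts and paths are placeholders):

  # Push /srv/data to the backup server over SSH, capped at ~2 MB/s
  rsync -az --delete --bwlimit=2048 -e ssh \
      /srv/data/ backupuser@backup.example.com:/srv/backups/server1/

--bwlimit takes KB/s, -a preserves Unix permissions and ownership, and -z compresses on the wire.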
Rudi Ahlers
2010-Jan-28 13:30 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
On Thu, Jan 28, 2010 at 2:06 PM, Francisco Javier Funes Nieto <esencia@gmail.com> wrote:
> You can use Bacula too (for a complete backup solution), with encrypted
> communication provided by TLS.
>
> It's a great piece of software! I've worked with it since version 1.38.

I don't want to replace the backup system I have on the clients. My question is rather about the type of remote system I back up to, in terms of speed / reliability / security. The server has RAID, and will be rsynced to a 2nd backup server, so that's not my concern either.

-- 
Kind Regards
Rudi Ahlers
SoftDux
Javier Guerra
2010-Jan-28 14:51 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
On Thu, Jan 28, 2010 at 6:30 AM, Rudi Ahlers <Rudi@softdux.com> wrote:
> Basically, I need to replicate / backup data from one server to another,
> but over the internet (i.e. insecure channels).

It would be _really_ hard to find anything better than rsync, both because of safety (it uses ssh by default) and efficiency (it copies only what's needed).

If you need point-in-time snapshots while the servers are running, the simplest way is to do an LVM snapshot, mount it (read-only) and rsync from this to the remote server. Afterwards, simply destroy the snapshot.

-- 
Javier
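As a sketch, the snapshot-then-rsync sequence described above (volume group names, sizes and paths are placeholders):

  # Create a point-in-time snapshot of the data volume
  lvcreate --snapshot --size 2G --name data-snap /dev/vg0/data

  # Mount it read-only and rsync from the frozen view
  mount -o ro /dev/vg0/data-snap /mnt/snap
  rsync -az /mnt/snap/ backupuser@backup.example.com:/srv/backups/server1/

  # Destroy the snapshot afterwards
  umount /mnt/snap
  lvremove -f /dev/vg0/data-snap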
Rudi Ahlers
2010-Jan-28 22:19 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
On Thu, Jan 28, 2010 at 4:51 PM, Javier Guerra <javier@guerrag.com> wrote:
> It would be _really_ hard to find anything better than rsync, both
> because of safety (it uses ssh by default) and efficiency (it copies
> only what's needed).
> [snip]

OK, forget about rsync. Forget about how I get the data onto the other server. WHICH filesystem would be best for this type of operation? SMB, NFS, or iSCSI?

-- 
Kind Regards
Rudi Ahlers
SoftDux
francisco javier funes nieto
2010-Jan-28 22:35 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
SMB, NFS and iSCSI are protocols, not filesystems.

2010/1/28 Rudi Ahlers <Rudi@softdux.com>:
> OK, forget about rsync. Forget about how I get the data onto the other
> server. WHICH filesystem would be best for this type of operation? SMB,
> NFS, or iSCSI?

-- 
Francisco Javier Funes Nieto [esencia@gmail.com]
CANONIGOS
Simon Hobson
2010-Jan-29 09:15 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
Rudi Ahlers wrote:
> OK, forget about rsync. Forget about how I get the data onto the other
> server. WHICH filesystem would be best for this type of operation? SMB,
> NFS, or iSCSI?

As already stated, iSCSI is **NOT** a filesystem or a share of a filesystem - it is a network block device. You CANNOT share an iSCSI volume between multiple guests without running a cluster filesystem. If you use iSCSI, then you need to do one of three things:

1) Use a cluster file system on all the guests
2) Use a separate volume for each guest
3) Come up with some form of locking mechanism to allow one guest at a time to mount the volume

Mounting a volume on two guests without a cluster file system is guaranteed to trash the filesystem on the volume.

As to SMB vs NFS, a lot depends on the filesystem semantics your backup process needs. SMB should support Windows file system semantics/metadata; NFS only supports Unix file system semantics/metadata. If that matters, then the decision is made for you - e.g. if the backup is storing Windows files natively on the backup filesystem, then you'll have to use SMB in order to retain the file metadata.

Also, when comparing (or asking about) file system performance, you need to specify the conditions. Performance is likely to be different between a setup storing individual files (i.e. lots of create, write, close, update-directory operations) and a single large archive setup (i.e. where the backup program creates a big file and streams the backup data into it). I don't personally have any data on this either way.

-- 
Simon Hobson
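To make the "network block device" point concrete, consuming an iSCSI volume with open-iscsi looks roughly like this (target name and portal address are placeholders); the client formats and mounts it like a local disk, which is exactly why two clients mounting it at once will corrupt it:

  # Discover and log in to the target
  iscsiadm -m discovery -t sendtargets -p 192.0.2.10
  iscsiadm -m node -T iqn.2010-01.com.example:backups -p 192.0.2.10 --login

  # The volume then appears as a local disk, e.g. /dev/sdb
  mkfs.ext3 /dev/sdb
  mount /dev/sdb /mnt/backup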
Rudi Ahlers
2010-Jan-29 09:57 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
On Fri, Jan 29, 2010 at 11:15 AM, Simon Hobson <linux@thehobsons.co.uk> wrote:
> As already stated, iSCSI is **NOT** a filesystem or a share of a
> filesystem - it is a network block device. You CANNOT share an iSCSI
> volume between multiple guests without running a cluster filesystem.
> [snip]

Fair enough, but iSCSI is commonly used on NAS devices, which then export whatever filesystem is being used to the host. Which is why I am considering it.

> As to SMB vs NFS, a lot depends on the filesystem semantics your backup
> process needs. SMB should support Windows file system semantics/metadata;
> NFS only supports Unix file system semantics/metadata.

There is a mixture of Windows & Linux data, but would NFS give me better performance for the Linux hosts?

> Also, when comparing (or asking about) file system performance, you need
> to specify the conditions.
> [snip]

Sure, understandable, but this is almost a different subject :) The data that goes on there will be a mix of small files & large files.

-- 
Kind Regards
Rudi Ahlers
SoftDux
Simon Hobson
2010-Jan-29 11:55 UTC
Re: [Xen-users] NFS vs SMB vs iSCSI for remote backup mounts
Rudi Ahlers wrote:
> Fair enough, but iSCSI is commonly used on NAS devices, which then
> export whatever filesystem is being used to the host. Which is why I
> am considering it.

In which case it's not a case of iSCSI vs <something>, it's NAS+iSCSI vs <something>.

> There is a mixture of Windows & Linux data, but would NFS give me
> better performance for the Linux hosts?

Personally, I would choose NFS over SMB for storing Linux files - for the simple reason that the file system semantics are directly compatible with the files being stored. I've no idea how it compares with SMB though - and in any case, again, it's not a case of NFS vs SMB, it's <implementation of NFS> vs <implementation of SMB>. It's entirely possible that vendor A's box does SMB better while vendor B's box does NFS better.

I'd still choose rsync if available for a Linux client.

-- 
Simon Hobson
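For comparison, the two mounts under discussion look like this on a Linux client (server names, export/share names and credentials are placeholders):

  # NFS - keeps Unix permissions/ownership
  mount -t nfs backup.example.com:/srv/backups /mnt/backup

  # SMB/CIFS - keeps Windows metadata
  mount -t cifs //backup.example.com/backups /mnt/backup \
      -o username=backupuser,password=secret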
Dana Rawding
2010-Jan-31 16:23 UTC

Hi all,

I've been experiencing a rash of CPU lockups on a number of domU's recently. It's been happening on two different servers. About a year ago I had this problem every once in a while, but it was not frequent. I was running Ubuntu with Xen 3.1 and 2.6.24-18 back then. I'm now running Xen 3.3 and 2.6.24-26.

What I have noticed is that just prior to the lockups the domU's had high CPU loads. The domU that I have the most problems with is a Zimbra server. My guess is that a rash of spam comes through and CPU loads get high, then the CPUs lock up. Originally I had it running with 1 CPU but have since upped it to 2, then 3 CPUs.

I have been collecting the lockup messages and have posted a few below. Any ideas? Recommendations?

Thanks,
Dana

[138077.172283] ======================
[138075.147398] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:97]
[138075.147411]
[138075.147419] Pid: 97, comm: kswapd0 Tainted: G D (2.6.24-26-xen #1)
[138075.147426] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 0
[138075.147441] EIP is at _spin_lock+0x7/0x10
[138075.147447] EAX: c1da48ec EBX: 00000000 ECX: 220c7000 EDX: 00000000
[138075.147453] ESI: 8b804067 EDI: c1da48ec EBP: 00000f28 ESP: ed707dec
[138075.147459] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[138075.147471] CR0: 8005003b CR2: 080f0010 CR3: 2213b000 CR4: 00000660
[138075.147482] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[138075.147488] DR6: ffff0ff0 DR7: 00000400
[138075.147495] [<c01773cb>] page_check_address+0x1cb/0x3c0
[138075.147514] [<c0119868>] xen_invlpg_mask+0x38/0x40
[138075.147529] [<c017762e>] page_referenced_one+0x6e/0x190
[138075.147541] [<c017875c>] page_referenced+0xec/0x130
[138075.147552] [<c01671cf>] shrink_active_list+0x18f/0x5c0
[138075.147567] [<c016826d>] shrink_zone+0xdd/0x100
[138075.147578] [<c01688cc>] kswapd+0x44c/0x490
[138075.147589] [<c013bb00>] autoremove_wake_function+0x0/0x40
[138075.147603] [<c011e270>] complete+0x40/0x60
[138075.147614] [<c0168480>] kswapd+0x0/0x490
[138075.147625] [<c013b842>] kthread+0x42/0x70
[138075.147635] [<c013b800>] kthread+0x0/0x70
[138075.147646] [<c0105bb7>] kernel_thread_helper+0x7/0x10
[138075.147658] ======================
[138088.987826] BUG: soft lockup - CPU#1 stuck for 11s! [java:23215]
[138088.987841]
[138088.987846] Pid: 23215, comm: java Tainted: G D (2.6.24-26-xen #1)
[138088.987850] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
[138088.987862] EIP is at _spin_lock+0x7/0x10
[138088.987866] EAX: c1da48ec EBX: 00000000 ECX: c1da48e0 EDX: 00000ca8
[138088.987870] ESI: 8b804067 EDI: 00000000 EBP: e20c7ca8 ESP: e226be04
[138088.987873] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[138088.987883] CR0: 80050033 CR2: 940ef020 CR3: 2211f000 CR4: 00000660
[138088.987891] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[138088.987896] DR6: ffff0ff0 DR7: 00000400
[138088.987901] [<c016d88d>] unmap_vmas+0x43d/0xae0
[138088.987922] [<c011959c>] kmap_atomic+0x1c/0x30
[138088.987941] [<c01192fd>] kunmap_atomic+0x3d/0x60
[138088.987957] [<c0173ee8>] vma_adjust+0x1c8/0x440
[138088.987967] [<c0173765>] unmap_region+0x95/0x120
[138088.987975] [<c0174387>] do_munmap+0x147/0x1f0
[138088.987983] [<c0174c90>] mmap_region+0x70/0x450
[138088.987991] [<c01db3b7>] security_file_mmap+0x27/0x30
[138088.988001] [<c0175472>] do_mmap_pgoff+0x312/0x330
[138088.988008] [<c010a02b>] sys_mmap2+0xbb/0xd0
[138088.988016] [<c0105832>] syscall_call+0x7/0xb
[138088.988023] [<c0320000>] svc_accept+0x150/0x410
[138088.988032] ======================

[66916.451144] BUG: soft lockup - CPU#0 stuck for 11s! [java:2758]
[66928.193453] BUG: soft lockup - CPU#1 stuck for 11s! [java:3419]

[336990.703192] BUG: soft lockup - CPU#1 stuck for 11s! [ps:32586]
[336990.703206]
[336990.703214] Pid: 32586, comm: ps Tainted: G D (2.6.24-26-xen #1)
[336990.703221] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 1
[336990.703235] EIP is at _spin_lock+0x7/0x10
[336990.703241] EAX: c1dbc72c EBX: 00000000 ECX: c1dbc720 EDX: 00000007
[336990.703247] ESI: 57b51067 EDI: 00000001 EBP: e2cb93c8 ESP: e2033e4c
[336990.703253] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[336990.703266] CR0: 80050033 CR2: 08079004 CR3: 23651000 CR4: 00000660
[336990.703275] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[336990.703282] DR6: ffff0ff0 DR7: 00000400
[336990.703288] [<c0171646>] handle_mm_fault+0xae6/0x1360
[336990.703307] [<c020e057>] rb_insert_color+0x77/0xe0
[336990.703325] [<c032a27e>] do_page_fault+0x35e/0xe70
[336990.703337] [<c01745d4>] vma_merge+0x144/0x1d0
[336990.703349] [<c0174b75>] do_brk+0x195/0x240
[336990.703362] [<c0175126>] sys_brk+0xb6/0xf0
[336990.703374] [<c0329f20>] do_page_fault+0x0/0xe70
[336990.703384] [<c0328bc5>] error_code+0x35/0x40
[336990.703396] ======================
[337005.938292] BUG: soft lockup - CPU#2 stuck for 11s! [zmlocalconfig:11371]
[337005.938306]
[337005.938312] Pid: 11371, comm: zmlocalconfig Tainted: G D (2.6.24-26-xen #1)
[337005.938318] EIP: 0061:[<c03286e7>] EFLAGS: 00000286 CPU: 2
[337005.938330] EIP is at _spin_lock+0x7/0x10
[337005.938335] EAX: ec64a870 EBX: ec64a870 ECX: 00000002 EDX: ec64a871
[337005.938339] ESI: 00000000 EDI: c03fe800 EBP: c1261e38 ESP: c1261d7c
[337005.938343] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0069
[337005.938357] CR0: 8005003b CR2: 08128000 CR3: 25d8e000 CR4: 00000660
[337005.938364] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[337005.938370] DR6: ffff0ff0 DR7: 00000400
[337005.938376] [<c01771f0>] page_lock_anon_vma+0x20/0x30
[337005.938391] [<c01786fd>] page_referenced+0x8d/0x130
[337005.938401] [<c01671cf>] shrink_active_list+0x18f/0x5c0
[337005.938411] [<c0164286>] get_dirty_limits+0x16/0x200
[337005.938421] [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
[337005.938435] [<c016826d>] shrink_zone+0xdd/0x100
[337005.938444] [<c0168d72>] try_to_free_pages+0x152/0x250
[337005.938453] [<c0162fcb>] __alloc_pages+0x14b/0x390
[337005.938463] [<c01855c5>] do_sync_read+0xd5/0x120
[337005.938475] [<c0163247>] __get_free_pages+0x37/0x50
[337005.938483] [<c0124496>] copy_process+0xa6/0x1210
[337005.938493] [<c0197c34>] d_alloc+0x114/0x1a0
[337005.938503] [<c0125830>] do_fork+0x40/0x260
[337005.938511] [<c0210f00>] copy_to_user+0x30/0x60
[337005.938523] [<c0103226>] sys_clone+0x36/0x40
[337005.938530] [<c0105832>] syscall_call+0x7/0xb
[337005.938542] ======================
[336990.803889] BUG: soft lockup - CPU#0 stuck for 11s! [kswapd0:103]
[336990.803907]
[336990.803915] Pid: 103, comm: kswapd0 Tainted: G D (2.6.24-26-xen #1)
[336990.803922] EIP: 0061:[<c03286ea>] EFLAGS: 00000286 CPU: 0
[336990.803940] EIP is at _spin_lock+0xa/0x10
[336990.803948] EAX: c1dbc86c EBX: 00000000 ECX: 22cc3000 EDX: 00000000
[336990.803955] ESI: 57b47067 EDI: c1dbc86c EBP: 00000ff0 ESP: ed725dec
[336990.803961] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0069
[336990.803976] CR0: 8005003b CR2: b791e6d9 CR3: 23e3b000 CR4: 00000660
[336990.803986] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[336990.803992] DR6: ffff0ff0 DR7: 00000400
[336990.804001] [<c01773cb>] page_check_address+0x1cb/0x3c0
[336990.804026] [<c017762e>] page_referenced_one+0x6e/0x190
[336990.804039] [<c017875c>] page_referenced+0xec/0x130
[336990.804049] [<c01671cf>] shrink_active_list+0x18f/0x5c0
[336990.804064] [<c0210556>] memmove+0x36/0x40
[336990.804079] [<c0164286>] get_dirty_limits+0x16/0x200
[336990.804089] [<c0139857>] call_rcu+0x97/0xa0
[336990.804102] [<ee04b38e>] mb_cache_shrink_fn+0x1e/0x100 [mbcache]
[336990.804120] [<c016826d>] shrink_zone+0xdd/0x100
[336990.804132] [<c01688cc>] kswapd+0x44c/0x490
[336990.804145] [<c013bb00>] autoremove_wake_function+0x0/0x40
[336990.804160] [<c011e270>] complete+0x40/0x60
[336990.804172] [<c0168480>] kswapd+0x0/0x490
[336990.804183] [<c013b842>] kthread+0x42/0x70
[336990.804194] [<c013b800>] kthread+0x0/0x70
[336990.804206] [<c0105bb7>] kernel_thread_helper+0x7/0x10
[336990.804218] ======================
On Sun, Jan 31, 2010 at 11:23 PM, Dana Rawding <dana@twc-inc.net> wrote:
> I've been experiencing a rash of CPU lockups on a number of domU's
> recently. It's been happening on two different servers. About a year ago
> I had this problem every once in a while, but it was not frequent. I was
> running Ubuntu with Xen 3.1 and 2.6.24-18 back then. I'm now running Xen
> 3.3 and 2.6.24-26.

Since you're using Ubuntu's kernel, the Ubuntu way would be to report this bug on bugs.ubuntu.com and wait until they come up with a fix :P

> I have been collecting the lockup messages and have posted a few below.
> Any ideas? Recommendations?

You might want to try using a newer kernel. Both the vanilla kernel and Suse's Xen kernel should work for domU. See http://wiki.xensource.com/xenwiki/XenDom0Kernels.

-- 
Fajar
On Sun, Jan 31, 2010 at 11:23:36AM -0500, Dana Rawding wrote:
> Hi all,
>
> I've been experiencing a rash of CPU lockups on a number of domU's
> recently. It's been happening on two different servers.
> [snip]
>
> I have been collecting the lockup messages and have posted a few below.
> Any ideas? Recommendations?

Please check this wiki page:
http://wiki.xensource.com/xenwiki/XenBestPractices

Are all those OK on your setup? After those I'd upgrade the dom0 kernel, since Ubuntu's 2.6.24 is known to be buggy.

-- 
Pasi

> [quoted soft-lockup traces snipped]
I have this problem too. Xen 3.3.1, Debian Lenny. The load average on the server goes up to 10-15, all domUs freeze, and I can't do anything. Please test - I fixed this problem with:

xm sched-credit -d 0 -w 512

[787717.425090] BUG: soft lockup - CPU#0 stuck for 61s! [watchdog/0:5]
[787717.425090] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables tun bridge ipv6 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc loop joydev igb psmouse pcspkr i2c_i801 serio_raw button i2c_core evdev dca ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock sd_mod ide_pci_generic ide_core ehci_hcd uhci_hcd 3w_9xxx scsi_mod thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[787717.432148] CPU 0:
[787717.432148] Modules linked in: xt_tcpudp xt_physdev iptable_filter ip_tables x_tables tun bridge ipv6 nfsd auth_rpcgss exportfs nfs lockd nfs_acl sunrpc loop joydev igb psmouse pcspkr i2c_i801 serio_raw button i2c_core evdev dca ext3 jbd mbcache dm_mirror dm_log dm_snapshot dm_mod sg sr_mod cdrom ata_generic usbhid hid ff_memless ata_piix libata dock sd_mod ide_pci_generic ide_core ehci_hcd uhci_hcd 3w_9xxx scsi_mod thermal processor fan thermal_sys [last unloaded: scsi_wait_scan]
[787717.436173] Pid: 5, comm: watchdog/0 Not tainted 2.6.26-1-xen-amd64 #1
[787717.436173] RIP: e030:[<ffffffff8025ed13>] [<ffffffff8025ed13>] watchdog+0xbe/0x1cf
[787717.436173] RSP: e02b:ffff880bce0d9ef0 EFLAGS: 00000207
[787717.436173] RAX: 0000000000000001 RBX: ffff880bcb4e5400 RCX: 0002cc64939f91fe
[787717.436173] RDX: ffff880081656000 RSI: ffffffff804fe460 RDI: ffffffff8053a000
[787717.436173] RBP: ffff880bcb4e5400 R08: ffff880001be3040 R09: ffff880bce0d9e30
[787717.436173] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000399
[787717.436173] R13: 00000000000b3192 R14: 0000000000000000 R15: 0000000000000000
[787717.436173] FS: 00007f0cfbb3e6e0(0000) GS:ffffffff80539000(0000) knlGS:0000000000000000
[787717.436173] CS: e033 DS: 0000 ES: 0000
[787717.436173] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[787717.436173] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[787717.436173]
[787717.436173] Call Trace:
[787717.436173] [<ffffffff8025ec55>] ? watchdog+0x0/0x1cf
[787717.436173] [<ffffffff8023f56b>] ? kthread+0x47/0x74
[787717.436173] [<ffffffff8022839f>] ? schedule_tail+0x27/0x5c
[787717.436173] [<ffffffff8020be28>] ? child_rip+0xa/0x12
[787717.436173] [<ffffffff8023f524>] ? kthread+0x0/0x74
[787717.436173] [<ffffffff8020be1e>] ? child_rip+0x0/0x12
[787717.436173]

2010/1/31 Dana Rawding <dana@twc-inc.net>
> Hi all,
>
> I've been experiencing a rash of CPU lockups on a number of domU's
> recently. It's been happening on two different servers.
> [snip]
>
> I have been collecting the lockup messages and have posted a few below.
> Any ideas? Recommendations?
>
> [quoted soft-lockup traces snipped]

-- 
Best Regards,
alex.faq8@gmail.com
On Feb 1, 2010, at 2:27 AM, Pasi Kärkkäinen wrote:
> Please check this wiki page:
> http://wiki.xensource.com/xenwiki/XenBestPractices

I have 1.5 GB RAM dedicated to the dom0's. It's probably more RAM than necessary. Is there a suggestion as to what this number should be?

The sched-credit weight was the default 256. I have upped it to 512 per the best practices and Alex's suggestion. I'm hoping this calms things down. If not, I plan to try a different kernel.

Thanks to everyone for the suggestions.

Dana
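For reference, the two tweaks discussed in this thread end up looking roughly like this (a sketch only; the memory figure is the 1.5 GB mentioned above, and credit-scheduler weights are relative to the default of 256):

  # GRUB: dedicate memory to dom0 on the Xen boot line
  kernel /boot/xen.gz dom0_mem=1536M

  # Give dom0 a higher scheduler weight so it still gets CPU time under load
  xm sched-credit -d Domain-0 -w 512

  # Verify the current weight/cap
  xm sched-credit -d Domain-0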