Ray Barnes
2009-May-28 12:14 UTC
[Xen-devel] Xen 3.3.1 latest / starting guests causes Oops in tapdisk?
Hi all. Under CentOS 5.3 with a very common set of hardware and software (I have several of these machines, running several types of guests, all under 3.3.1 or 3.4), I'm hitting an Oops in the tapdisk process after I start particular guests. Please forgive me in advance, as I don't have serial on this box so I can't fully capture the console text. I do have what syslog spits out to the terminal while running 'xm create -c':

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: Oops: 0000 [#1]

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: SMP

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: CPU: 1

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: EIP is at get_user_pages+0x6b/0x460

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: eax: 00000004 ebx: 00000040 ecx: 040a44fb edx: ebed4800

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: esi: eaeb3cd4 edi: 00000000 ebp: b7e01000 esp: ebe81d14

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: ds: 007b es: 007b ss: 0069

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: Process tapdisk (pid: 7163, ti=ebe80000 task=eb55c3f0 task.ti=ebe80000)

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: Stack: c011d499 00000000 eb0b7ac0 eb55c3f0 00000003 00000000 00000022 00000001

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: ebe81d58 c011da48 00000040 00000001 00000000 ec340200 c01a0792 00000001

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: 00000001 00000000 ec3402b8 00000000 00000000 00001000 00bbfd00 00000000

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: Call Trace:

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: Code: ea e8 8a 34 00 00 85 c0 89 c6 0f 84 11 02 00 00 8b 48 18 f7 c1 00 00 00 04 74 75 8b 50 50 89 e8 2b 46 04 c1 e8 0c c1 e0 02 03 02 <8b> 00 85 c0 74 5f 8b 54 24 48 85 d2 74 1c 89 c2 8b 4c 24 48 8b

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: EIP: [<c01655db>] get_user_pages+0x6b/0x460 SS:ESP 0069:ebe81d14

I'll also attach a screenshot of what I get from the server console. domU config is as follows:

kernel = "/home/vmlinuz-2.6.18-4-xen-686"
ramdisk = "/home/initrd.img-2.6.18-4-xen-686"
memory = 499
name = "jal"
vif = ['mac=00:16:3e:3d:a0:31, vifname=jal, script=vif-jal']
disk = [ 'tap:aio:/home/vps/jal.img,sda1,w' ]
root = "/dev/sda1 ro"
extra = "4"
vcpus=1
on_reboot = 'restart'
on_crash = 'restart'
builder = 'linux'

Problem is present using CentOS 5.x guests using the Cent kernel also. The aforementioned domU is Debian 4 though. I've tried this under 3.3.1-release and the latest 3.3.1 from xen-3.3-testing as of two days ago with the same result. I set maximum loop devices at 255 (was 8), no change. Also, after the Oops is triggered, I can no longer start domUs (even good ones which would otherwise not exhibit the problem), until after a reboot. Is this a Xen problem, or should I be looking at the OS or something else?

-Ray

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xensource.com
http://lists.xensource.com/xen-devel
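Editorial aside: the syslogd broadcast wrappers make the oops text hard to read or to paste into a bug report. A minimal Python sketch for stripping them is below. It assumes each kernel line is preceded by a "Message from syslogd@ ... ..." line and starts with "<hostname> kernel:", as in the report above; the function name strip_syslog_wrappers is hypothetical and not part of any Xen or syslog tooling.

```python
import re

# Assumption: wrapper lines look like
#   "Message from syslogd@ at Wed May 27 09:03:41 2009 ..."
WRAPPER = re.compile(r"^Message from syslogd@.* \.\.\.$")
# Assumption: payload lines start with "<hostname> kernel:"
PREFIX = re.compile(r"^\S+ kernel: ?")


def strip_syslog_wrappers(text):
    """Return only the kernel message text, one message per line."""
    lines = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and the syslogd broadcast headers.
        if not line or WRAPPER.match(line):
            continue
        # Drop the "<hostname> kernel:" prefix, keep the rest verbatim.
        lines.append(PREFIX.sub("", line, count=1))
    return "\n".join(lines)


sample = """Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: Oops: 0000 [#1]

Message from syslogd@ at Wed May 27 09:03:41 2009 ...
vpsbox2 kernel: EIP is at get_user_pages+0x6b/0x460"""

print(strip_syslog_wrappers(sample))
```

Only emergency-level messages are broadcast to terminals this way, so the cleaned text is likely still incomplete compared with a full console capture.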
Keir Fraser
2009-May-28 12:27 UTC
Re: [Xen-devel] Xen 3.3.1 latest / starting guests causes Oops in tapdisk?
On 28/05/2009 13:14, "Ray Barnes" <tical.net@gmail.com> wrote:

> Problem is present using CentOS 5.x guests using the Cent kernel also. The
> aforementioned domU is Debian 4 though. I've tried this under 3.3.1-release
> and the latest 3.3.1 from xen-3.3-testing as of two days ago with the same
> result. I set maximum loop devices at 255 (was 8), no change. Also, after the
> Oops is triggered, I can no longer start domUs (even good ones which would
> otherwise not exhibit the problem), until after a reboot. Is this a Xen
> problem, or should I be looking at the OS or something else?

Probably it's a dom0 kernel issue, in the blktap device driver.

 -- Keir
Ray Barnes
2009-May-28 13:55 UTC
Re: [Xen-devel] Xen 3.3.1 latest / starting guests causes Oops in tapdisk?
Thanks Keir. I've been using the supplied 2.6.18-8 kernels from your source repository throughout this issue. I just tried 3.4.0-release and the latest 3.4.0 from a pull this morning, no change. Perhaps there's some way (other than abandoning blktap-backed guests) that I can work around it?

-Ray

On Thu, May 28, 2009 at 8:27 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:

> On 28/05/2009 13:14, "Ray Barnes" <tical.net@gmail.com> wrote:
>
> > Problem is present using CentOS 5.x guests using the Cent kernel also. The
> > aforementioned domU is Debian 4 though. I've tried this under 3.3.1-release
> > and the latest 3.3.1 from xen-3.3-testing as of two days ago with the same
> > result. I set maximum loop devices at 255 (was 8), no change. Also, after the
> > Oops is triggered, I can no longer start domUs (even good ones which would
> > otherwise not exhibit the problem), until after a reboot. Is this a Xen
> > problem, or should I be looking at the OS or something else?
>
> Probably it's a dom0 kernel issue, in the blktap device driver.
>
>  -- Keir
Keir Fraser
2009-May-28 15:30 UTC
Re: [Xen-devel] Xen 3.3.1 latest / starting guests causes Oops in tapdisk?
It might be interesting to try xen-unstable, which should now be switched over to blktap2 by default. Quite apart from being interested as to how well that works right now, I also imagine you can get quicker responses from the developers of that. If it works, blktap2 would port over to 3.4 quite easily. Probably 3.3 too.

 -- Keir

On 28/05/2009 14:55, "Ray Barnes" <tical.net@gmail.com> wrote:

> Thanks Keir. I've been using the supplied 2.6.18-8 kernels from your source
> repository throughout this issue. I just tried 3.4.0-release and the latest
> 3.4.0 from a pull this morning, no change. Perhaps there's some way (other
> than abandoning blktap-backed guests) that I can work around it?
>
> -Ray
>
> On Thu, May 28, 2009 at 8:27 AM, Keir Fraser <keir.fraser@eu.citrix.com>
> wrote:
>> On 28/05/2009 13:14, "Ray Barnes" <tical.net@gmail.com> wrote:
>>
>>> Problem is present using CentOS 5.x guests using the Cent kernel also. The
>>> aforementioned domU is Debian 4 though. I've tried this under 3.3.1-release
>>> and the latest 3.3.1 from xen-3.3-testing as of two days ago with the same
>>> result. I set maximum loop devices at 255 (was 8), no change. Also, after
>>> the Oops is triggered, I can no longer start domUs (even good ones which
>>> would otherwise not exhibit the problem), until after a reboot. Is this a
>>> Xen problem, or should I be looking at the OS or something else?
>>
>> Probably it's a dom0 kernel issue, in the blktap device driver.
>>
>>  -- Keir
Ray Barnes
2009-May-28 16:19 UTC
Re: [Xen-devel] Xen 3.3.1 latest / starting guests causes Oops in tapdisk?
Thanks, I'll switch to xen-unstable if I have problems with the workaround, which is (regrettably) to use file: instead of tap:aio: as the blkdev in the domU config files.

-Ray

On Thu, May 28, 2009 at 11:30 AM, Keir Fraser <keir.fraser@eu.citrix.com> wrote:

> It might be interesting to try xen-unstable which should now be switched
> over to blktap2 by default. Quite apart from being interested as to how well
> that works right now, I also imagine you can get quicker responses from the
> developers of that. If it works, blktap2 would port over to 3.4 quite
> easily. Probably 3.3 too.
>
>  -- Keir
>
> On 28/05/2009 14:55, "Ray Barnes" <tical.net@gmail.com> wrote:
>
> > Thanks Keir. I've been using the supplied 2.6.18-8 kernels from your source
> > repository throughout this issue. I just tried 3.4.0-release and the latest
> > 3.4.0 from a pull this morning, no change. Perhaps there's some way (other
> > than abandoning blktap-backed guests) that I can work around it?
> >
> > -Ray
> >
> > On Thu, May 28, 2009 at 8:27 AM, Keir Fraser <keir.fraser@eu.citrix.com>
> > wrote:
> >> On 28/05/2009 13:14, "Ray Barnes" <tical.net@gmail.com> wrote:
> >>
> >>> Problem is present using CentOS 5.x guests using the Cent kernel also. The
> >>> aforementioned domU is Debian 4 though. I've tried this under
> >>> 3.3.1-release and the latest 3.3.1 from xen-3.3-testing as of two days ago
> >>> with the same result. I set maximum loop devices at 255 (was 8), no
> >>> change. Also, after the Oops is triggered, I can no longer start domUs
> >>> (even good ones which would otherwise not exhibit the problem), until
> >>> after a reboot. Is this a Xen problem, or should I be looking at the OS or
> >>> something else?
> >>
> >> Probably it's a dom0 kernel issue, in the blktap device driver.
> >>
> >>  -- Keir
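Editorial aside: the workaround Ray describes amounts to changing the backend prefix in each domU's disk line from tap:aio: to file:. Since xm config files use Python syntax, a small sketch of that rewrite is shown here. The helper name to_file_backend is hypothetical and not part of Xen; the disk string is the one from the config posted at the start of the thread.

```python
def to_file_backend(disk_spec):
    """Swap a blktap 'tap:aio:' prefix for the loopback 'file:' prefix.

    Specs that use any other backend are returned unchanged.
    """
    prefix = "tap:aio:"
    if disk_spec.startswith(prefix):
        return "file:" + disk_spec[len(prefix):]
    return disk_spec


# Disk line from the domU config in the original report.
disk = ['tap:aio:/home/vps/jal.img,sda1,w']

# Same image, device name, and mode; only the backend prefix changes.
disk = [to_file_backend(d) for d in disk]
print(disk)
```

The file: backend goes through loop devices in dom0, so the loop device limit mentioned in the first message may matter once many guests use this workaround.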