I played a bit with qemu-0.8.2-solaris and kqemu on an amd64 Solaris box.

I've noticed that qemu-system-x86_64 isn't able to boot a 64-bit Solaris x86
guest, and falls back to booting the 32-bit kernel.

Solaris x86 expects that bit 17 is set in the cpuid feature flags, otherwise
it doesn't boot the amd64 64-bit kernel.

I've used the following qemu patch, and S-x86 booting under
qemu-system-x86_64 now tries to boot the 64-bit kernel as the default kernel.

diff -ru ../qemu-0.8.2-solaris-orig/target-i386/cpu.h ./target-i386/cpu.h
--- ../qemu-0.8.2-solaris-orig/target-i386/cpu.h 2006-07-22 19:23:34.000000000 +0200
+++ ./target-i386/cpu.h 2006-09-23 17:20:45.368457164 +0200
@@ -258,6 +258,7 @@
 #define CPUID_MCA (1 << 14)
 #define CPUID_CMOV (1 << 15)
 #define CPUID_PAT (1 << 16)
+#define CPUID_PSE36 (1 << 17)
 #define CPUID_CLFLUSH (1 << 19)
 /* ... */
 #define CPUID_MMX (1 << 23)
diff -ru ../qemu-0.8.2-solaris-orig/target-i386/helper2.c ./target-i386/helper2.c
--- ../qemu-0.8.2-solaris-orig/target-i386/helper2.c 2006-07-22 19:23:34.000000000 +0200
+++ ./target-i386/helper2.c 2006-09-23 17:21:21.288480562 +0200
@@ -135,6 +135,8 @@
     /* these features are needed for Win64 and aren't fully implemented */
     env->cpuid_features |= CPUID_MTRR | CPUID_CLFLUSH | CPUID_MCA;
+    /* this feature is needed for Solaris and isn't fully implemented */
+    env->cpuid_features |= CPUID_PSE36;
 #endif
 }
 cpu_reset(env);

But there are a couple of problems:

1. When kqemu is used, user-level S-x86 guest processes randomly
   crash with SIGSEGV. This seems to happen when we return from kernel
   to user level using the "iret" instruction; qemu runs "helper_iret_protected()"
   which calls "helper_ret_protected()". In "helper_ret_protected()" qemu
   thinks that the new code segment descriptor in the GDT has the "present"
   bit clear and raises a "0x0b segment not present" exception:

       if (!(e2 & DESC_P_MASK))
           raise_exception_err(EXCP0B_NOSEG, new_cs & 0xfffc);

   I'm not sure why this happens. The 64-bit Solaris x86 kernel does
   play with the "segment present" bits in the GDT when there are
   both 32- and 64-bit user-level processes.

   It seems that this problem doesn't happen when kqemu isn't used.

2. When kqemu isn't used, I've noticed that the 64-bit S-x86 guest
   OS kernel randomly reports "EFAULT" errors for system calls.
   This results in svc.startd / svc.configd startup errors, telling me that
   there were disk errors reading the repository DB. Or fsck errors,
   reporting block read errors, when checking the root filesystem.

   What I found out so far is that this problem seems to happen only
   when the standard C library /lib/libc.so.1 is used.

   When I mount the /usr/lib/libc/libc_hwcap2.so.1 shared C library on
   top of /lib/libc.so.1 the EFAULT errors are gone.

   A qemu problem with the "int $0x91" system calls in /lib/libc.so.1,
   which doesn't happen with the "syscall" system calls from
   libc_hwcap2.so.1?

--
This message posted from opensolaris.org
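For readers unfamiliar with how a single CPUID feature bit is tested, here is a minimal sketch of the bit arithmetic behind the patch above. The macro values for bits 16, 17 and 19 are taken from the patch; the two helper function names are hypothetical and are not part of qemu or Solaris.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the CPUID feature word, as listed in the patch above. */
#define CPUID_PAT     (1u << 16)
#define CPUID_PSE36   (1u << 17)
#define CPUID_CLFLUSH (1u << 19)

/* Hypothetical helper: true if the feature word advertises PSE36 (bit 17). */
bool cpuid_has_pse36(uint32_t features)
{
    return (features & CPUID_PSE36) != 0;
}

/* Hypothetical helper mirroring the patch: OR the PSE36 bit into the word. */
uint32_t cpuid_advertise_pse36(uint32_t features)
{
    return features | CPUID_PSE36;
}
```

A guest that insists on bit 17 would reject a feature word such as `CPUID_PAT | CPUID_CLFLUSH` and accept it once `cpuid_advertise_pse36()` has been applied.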
Jürgen Keil wrote:
> I played a bit with qemu-0.8.2-solaris and kqemu on an amd64 Solaris box.
>
> I've noticed that qemu-system-x86_64 isn't able to boot a 64-bit Solaris x86
> guest, and falls back to booting the 32-bit kernel.

Precisely, this is due to the CPU detection code in multiboot:

6471548 multiboot too restrictive about amd64 capable platforms

> Solaris x86 expects that bit 17 is set in the cpuid feature flags, otherwise
> it doesn't boot the amd64 64-bit kernel.

I see your patch has been applied to CVS already. :) Hopefully Martin
will pick this up for inclusion with his patches in the next source
release for the qemu downloads page.

> But there are a couple of problems:
>
> 1. When kqemu is used, user-level S-x86 guest processes randomly
> crash with SIGSEGV. This seems to happen when we return from kernel
> to user level using the "iret" instruction; qemu runs "helper_iret_protected()"
> which calls "helper_ret_protected()". In "helper_ret_protected()" qemu
> thinks that the new code segment descriptor in the GDT has the "present"
> bit clear and raises a "0x0b segment not present" exception:
>
> if (!(e2 & DESC_P_MASK))
>     raise_exception_err(EXCP0B_NOSEG, new_cs & 0xfffc);
>
> I'm not sure why this happens. The 64-bit Solaris x86 kernel does
> play with the "segment present" bits in the GDT when there are
> both 32- and 64-bit user-level processes.
>
> It seems that this problem doesn't happen when kqemu isn't used.

That is interesting and will warrant more investigation...

> 2. When kqemu isn't used, I've noticed that the 64-bit S-x86 guest
> OS kernel randomly reports "EFAULT" errors for system calls.
> This results in svc.startd / svc.configd startup errors, telling me that
> there were disk errors reading the repository DB. Or fsck errors,
> reporting block read errors, when checking the root filesystem.
>
> What I found out so far is that this problem seems to happen only
> when the standard C library /lib/libc.so.1 is used.
>
> When I mount the /usr/lib/libc/libc_hwcap2.so.1 shared C library on
> top of /lib/libc.so.1 the EFAULT errors are gone.
>
> A qemu problem with the "int $0x91" system calls in /lib/libc.so.1,
> which doesn't happen with the "syscall" system calls from
> libc_hwcap2.so.1?

I've started looking at this. The most likely source of EFAULT errors
seemed to be the copyin of args failing, but that didn't pan out. So now
I'm looking into whether any of the system calls are actually returning
EFAULT (likely due to munged arguments), or whether something is happening in
the post-syscall processing to return EFAULT after a good syscall. The
post-syscall processing isn't performed in the case of a syscall or
sysenter instruction; otherwise there are few differences, since we
don't use the syscall/sysenter argument passing methods but instead
appear to always retrieve the arguments from the user stack.

Another question is why:

moe -32 '/usr/lib/libc/$HWCAP'

is returning the empty string when you run it in 64-bit QEMU, whereas it
returns libc_hwcap2.so.1 on my AMD64 machine. I noticed that isainfo -x
does not claim to have "amd syscalls", so this may be yet another missing
capability bit in the CPUID emulation or somesuch. I haven't
investigated that lead further yet, though.

The int $91 thing will need to be root caused, even if we figure out why
the hwcap lib isn't being used, since /proc uses int $91 syscalls in all
three library versions.

- Eric
> > 2. When kqemu isn't used, I've noticed that the 64-bit S-x86 guest
> > OS kernel randomly reports "EFAULT" errors for system calls.
> > This results in svc.startd / svc.configd startup errors, telling me that
> > there were disk errors reading the repository DB. Or fsck errors,
> > reporting block read errors, when checking the root filesystem.
> >
> > What I found out so far is that this problem seems to happen only
> > when the standard C library /lib/libc.so.1 is used.
> >
> > When I mount the /usr/lib/libc/libc_hwcap2.so.1 shared C library on
> > top of /lib/libc.so.1 the EFAULT errors are gone.
> >
> > A qemu problem with the "int $0x91" system calls in /lib/libc.so.1,
> > which doesn't happen with the "syscall" system calls from
> > libc_hwcap2.so.1?
>
> I've started looking at this. The most likely source of EFAULT errors
> seemed to be the copyin of args failing, but that didn't pan out. So now
> I'm looking into whether any of the system calls are actually returning
> EFAULT (likely due to munged arguments), or whether something is happening in
> the post-syscall processing to return EFAULT after a good syscall. The
> post-syscall processing isn't performed in the case of a syscall or
> sysenter instruction; otherwise there are few differences, since we
> don't use the syscall/sysenter argument passing methods but instead
> appear to always retrieve the arguments from the user stack.

I also made some progress with this issue.

It seems to be the copyin_nowatch() call in copyin_args32() which is
returning an error.

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/intel/ia32/os/syscall.c#copyin_args32

copyin_nowatch() is failing because the user address is above
"kernelbase". The user address is the process's stack pointer plus one
word:

    144 greg32_t *sp = 1 + (greg32_t *)rp->r_sp; /* skip ret addr */

The error is noticed in copyin(), which is called by copyin_nowatch().

I see that the register structure contains

    rp->r_rsp = 0xfffffe80080475a0

which looks like a 32-bit user stack address of 0x80475a0 with some
unexpected bits set in the upper 32 bits of the 64-bit register.

The address r_rsp = 0xfffffe80080475a0 is above kernelbase
(0xfffffd8000000000), so copyin() fails, copyin_nowatch() fails, and
copyin_args32() fails. copyin_args32() is called from syscall_entry(),
which signals the EFAULT error when copyin_args32() has failed.
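The address arithmetic in the message above can be checked with a small sketch. The kernelbase value and the stack pointer are the ones quoted in the message; the function names and the simple "address below kernelbase" test are assumptions made for illustration and are much simpler than the real copyin() path in the Solaris kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* kernelbase as quoted in the message above. */
#define KERNELBASE 0xfffffd8000000000ULL

/* Hypothetical model of "1 + (greg32_t *)rp->r_sp": the first 32-bit
 * argument sits one 32-bit word above the stack pointer, past the
 * return address. */
uint64_t first_arg_addr(uint64_t rsp)
{
    return rsp + sizeof(uint32_t);
}

/* Simplified user-address test: anything at or above kernelbase is
 * rejected. This is an assumption, not the kernel's actual check. */
bool is_user_addr(uint64_t addr)
{
    return addr < KERNELBASE;
}

/* Keep only the low 32 bits, as expected for a 32-bit process. */
uint64_t low32(uint64_t value)
{
    return value & 0xffffffffULL;
}
```

With rsp = 0xfffffe80080475a0 the argument address fails the user-address test, while the same value truncated to 32 bits (0x080475a0) passes it, which matches the observation that only the upper 32 bits are wrong.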
Jürgen Keil
2006-Sep-29 09:33 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
> I've started looking at this. The most likely source of EFAULT errors
> seemed to be the copyin of args failing, but that didn't pan out.

An mdb breakpoint on the set_errno(EFAULT) call in syscall_entry() works for me:

http://cvs.opensolaris.org/source/xref/on/usr/src/uts/intel/ia32/os/syscall.c#240

I'm running qemu like this:

qemu-system-x86_64 -d int -m 512 -localtime -snapshot sol11.img

and qemu's interrupt trace log in /tmp/qemu.log contains:

.... lots of output deleted ...
8216: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08e65 pc=00000000fee08e65 SP=0043:00000000febcbaac EAX=0000000000000036
8217: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08e65 pc=00000000fee08e65 SP=0043:00000000febcbac0 EAX=0000000000000036
8218: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee092e5 pc=00000000fee092e5 SP=0043:00000000febcba3c EAX=00000000000000a1
8220: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee099a5 pc=00000000fee099a5 SP=0043:00000000febcbab0 EAX=000000000000008f
8235: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08de5 pc=00000000fee08de5 SP=0043:fffffe80febcbab0 EAX=0000000000000014 <<<<<<<<
8260: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee09325 pc=00000000fee09325 SP=0043:fffffe80febcba80 EAX=00000000000000a4 <<<<<<<<
8285: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee0ab24 pc=00000000fee0ab24 SP=0043:00000000febcba7c EAX=00000000000000a5
8368: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee092c5 pc=00000000fee092c5 SP=0043:00000000febcba4c EAX=00000000000000a2
8369: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee0ab24 pc=00000000fee0ab24 SP=0043:00000000febcba7c EAX=00000000000000a5
8394: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee0af95 pc=00000000fee0af95 SP=0043:00000000febcba68 EAX=00000000000000e1
8403: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08e65 pc=00000000fee08e65 SP=0043:00000000febcba74 EAX=0000000000000036
8407: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08c35 pc=00000000fee08c35 SP=0043:00000000febcba90 EAX=0000000000000006
8408: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08e65 pc=00000000fee08e65 SP=0043:00000000febcbac0 EAX=0000000000000036
8409: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08c35 pc=00000000fee08c35 SP=0043:00000000febcbac0 EAX=0000000000000006
8414: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee09595 pc=00000000fee09595 SP=0043:00000000febcba64 EAX=000000000000006b
8416: v=91 e=0000 i=1 cpl=3 IP=003b:00000000fee08e65 pc=00000000fee08e65 SP=0043:fffffe80febcbac0 EAX=0000000000000036 <<<<<<<<

The upper 32 bits of the RSP register are usually clear, but sometimes
there is 0xfffffe80 in the upper 32 bits of the stack pointer.

In that log there are actually a few system calls with the strange
stack pointer that didn't fail, but when looking at the EAX
register we see that these were system calls that don't have arguments:

> sysent::print [14]
{
    [14].sy_narg = '\0' <<<<<<<<
    [14].sy_flags = 0x1
    [14].sy_call = 0
    [14].sy_lock = 0
    [14].sy_callc = getpid
}
> sysent::print [a4]
{
    [a4].sy_narg = '\0' <<<<<<<<
    [a4].sy_flags = 0
    [a4].sy_call = 0
    [a4].sy_lock = 0
    [a4].sy_callc = lwp_self
}
> sysent::print [36]
{
    [36].sy_narg = '\003' <<<<<<<<
    [36].sy_flags = 0
    [36].sy_call = 0
    [36].sy_lock = 0
    [36].sy_callc = ioctl
}

The last system call with the strange stack pointer and EAX = 0x36
actually has arguments, and is the first one that fails with EFAULT.
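Spotting the suspicious entries in a long interrupt trace by eye is tedious. As a rough sketch, assuming the "SP=selector:address" field layout seen in the log above, a few lines of C can flag entries whose stack pointer has non-zero upper 32 bits. The function name is hypothetical and this is not part of qemu.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: inspect one trace line of the form shown above.
 * Returns 1 if the SP field has any of its upper 32 bits set,
 * 0 if the upper 32 bits are clear, and -1 if no SP field is found. */
int sp_has_high_bits(const char *line)
{
    const char *field = strstr(line, "SP=");
    unsigned int selector;
    unsigned long long sp;

    if (field == NULL || sscanf(field, "SP=%x:%llx", &selector, &sp) != 2)
        return -1;
    return (sp >> 32) != 0;
}
```

Fed the trace line by line, this would flag entries 8235, 8260 and 8416 from the excerpt above and pass the others.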
Jürgen Keil
2006-Sep-29 12:07 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
> Another question is why:
>
> moe -32 '/usr/lib/libc/$HWCAP'
>
> is returning the empty string when you run it in 64-bit QEMU, whereas it
> returns libc_hwcap2.so.1 on my AMD64 machine. I noticed that isainfo -x
> does not claim to have "amd syscalls", so this may be yet another missing
> capability bit in the CPUID emulation or somesuch. I haven't
> investigated that lead further yet, though.

Sounds reasonable; "file" reports that the AMD_SYSC cpu feature is
needed to use libc_hwcap2.so.1:

% file /usr/lib/libc/libc_hwcap2.so.1
/usr/lib/libc/libc_hwcap2.so.1: ELF 32-bit LSB dynamic lib 80386 Version 1 [SSE2 SSE MMX CMOV AMD_SYSC CX8 FPU], dynamically linked, not stripped, no debugging information available
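The capability matching discussed here boils down to a subset test: a library is usable only if every capability it requires is also reported by the CPU. The sketch below illustrates that idea. The capability names come from the "file" output above, but the bit assignments and the function name are made up for illustration and are not the real Solaris hardware capability values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments, for illustration only. */
#define CAP_FPU      (1u << 0)
#define CAP_CX8      (1u << 1)
#define CAP_CMOV     (1u << 2)
#define CAP_MMX      (1u << 3)
#define CAP_SSE      (1u << 4)
#define CAP_SSE2     (1u << 5)
#define CAP_AMD_SYSC (1u << 6)

/* Capabilities listed by "file" for libc_hwcap2.so.1 in the message above. */
#define LIBC_HWCAP2_REQUIRED \
    (CAP_SSE2 | CAP_SSE | CAP_MMX | CAP_CMOV | CAP_AMD_SYSC | CAP_CX8 | CAP_FPU)

/* Hypothetical helper: true if every required capability is present. */
bool lib_is_usable(uint32_t cpu_caps, uint32_t required)
{
    return (cpu_caps & required) == required;
}
```

Under this model, a CPU that reports everything except AMD_SYSC would not be offered libc_hwcap2.so.1, which is consistent with the guess that a missing capability bit explains the empty moe output.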
Jürgen Keil
2006-Sep-29 17:16 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
> Another question is why:
>
> moe -32 '/usr/lib/libc/$HWCAP'
>
> is returning the empty string when you run it in 64-bit QEMU, whereas it
> returns libc_hwcap2.so.1 on my AMD64 machine. I noticed that isainfo -x
> does not claim to have "amd syscalls", so this may be yet another missing
> capability bit in the CPUID emulation or somesuch. I haven't
> investigated that lead further yet, though.

Now that the EFAULT errors are fixed (seems to be a qemu-system-x86_64
bug with "iret" from 64-bit mode to 32-bit mode), I've tried to reproduce this,
but I failed. I get

# isainfo -v
64-bit amd64 applications
        sse sse fxsr mmx cmov amd_sysc cx8 tsc fpu
32-bit i386 applications
        sse sse fxsr mmx cmov amd_sysc cx8 tsc fpu
# moe -32 '/usr/lib/libc/$HWCAP'
/usr/lib/libc/libc_hwcap2.so.1

I tested that with a snv_40 guest, running inside qemu-system-x86_64, on
a S-x86 on-20060925 host (32-bit, no kqemu).
...
> Now that the EFAULT errors are fixed (seems to be a qemu-system-x86_64
> bug with "iret" from 64-bit mode to 32-bit mode), I've tried to reproduce this,

Nice shooting!! Do you have a patch for the sources when Martin builds
the updated 0.8.2 Solaris packages? Also I'd like to try things out
with the fix to see what other issues might still be lurking.

> # isainfo -v
> 64-bit amd64 applications
>         sse sse fxsr mmx cmov amd_sysc cx8 tsc fpu
> 32-bit i386 applications
>         sse sse fxsr mmx cmov amd_sysc cx8 tsc fpu
> # moe -32 '/usr/lib/libc/$HWCAP'
> /usr/lib/libc/libc_hwcap2.so.1

Makes sense to me based on your earlier email. I traced through last
night and determined all of the capability bits are being asserted by
QEMU (with your patch anyway :)) except for the 3DNOW! and AMD-MMX
extensions, which aren't supported by QEMU anyway, and aren't really
supported by BOCHS. Since libc_hwcap2 doesn't depend on them...

It's certainly possible that the interrupt/syscall bug was causing
failures in the mounting of the alternate libc. I couldn't get my
virtual machine far enough to even get moe to run to see what it would
output last night. The way the scripts are written, if moe got a SEGV it
would result in us not mounting the alternate libc.

- Eric
Jürgen Keil
2006-Oct-02 09:52 UTC
[qemu-discuss] Re: Re: 64-bit Solaris x86 kernel as guest OS
> > # isainfo -v
> > 64-bit amd64 applications
> >         sse sse fxsr mmx cmov amd_sysc cx8 tsc fpu
> > 32-bit i386 applications
> >         sse sse fxsr mmx cmov amd_sysc cx8 tsc fpu
> > # moe -32 '/usr/lib/libc/$HWCAP'
> > /usr/lib/libc/libc_hwcap2.so.1
>
> Makes sense to me based on your earlier email. I traced through last
> night and determined all of the capability bits are being asserted by
> QEMU (with your patch anyway :)) except for the 3DNOW! and AMD-MMX
> extensions, which aren't supported by QEMU anyway, and aren't really
> supported by BOCHS. Since libc_hwcap2 doesn't depend
> on them...

Another possible explanation is that you somehow have booted the
32-bit Solaris x86 kernel (for example during installation: the CD / DVD
boot is 32-bit only, and so is the failsafe boot).
The 32-bit Solaris x86 kernel doesn't support AMD syscall.

> It's certainly possible that the interrupt/syscall bug was causing
> failures in the mounting of the alternate libc. I couldn't get my
> virtual machine far enough to even get moe to run to see what it would
> output last night. The way the scripts are written, if moe got a SEGV it
> would result in us not mounting the alternate libc.

A fix for the SEGV with kqemu should soon appear on qemu-cvs.

=====================================================

The problem is that (in long mode) the AMD "syscall" instruction for
an application process running in 32-bit compatibility mode jumps to
the "LSTAR" address (64-bit kernel syscall entry point) when it should
jump to the "CSTAR" address (32-bit kernel syscall entry point).

=====================================================

Some logging added to the kqemu.c do_syscall() shows this:
long mode is enabled (lma != 0), and we're executing 32-bit compatibility
code (cs64 == 0). But we're jumping to lstar, not cstar.

kqemu: kqemu_cpu_exec: ret=0x300
kqemu: do_syscall, star=003b002800000000 lstar=fffffffffb800f72 cstar=fffffffffb8012c2 lma=4000 cs64=0
kqemu: do_syscall, jump lstar, eip=fffffffffb800f72

Suggested fix:
=============

Determine if we have been executing 64-bit or 32-bit application code
*before* setting the new (64-bit) kernel CS.

(The problem with the original code was that the new kernel CS was
installed first, and only then did it look at (env->hflags & HF_CS64_MASK),
which by that point always has the HF_CS64_MASK bit set.)

diff -ru ../qemu-0.8.2-solaris-orig/kqemu.c ./kqemu.c
--- ../qemu-0.8.2-solaris-orig/kqemu.c 2006-09-13 09:40:58.000000000 +0200
+++ ./kqemu.c 2006-09-30 17:01:47.206149086 +0200
@@ -473,6 +473,8 @@
     selector = (env->star >> 32) & 0xffff;
 #ifdef __x86_64__
     if (env->hflags & HF_LMA_MASK) {
+        int code64 = env->hflags & HF_CS64_MASK;
+
         env->regs[R_ECX] = kenv->next_eip;
         env->regs[11] = env->eflags;

@@ -488,7 +490,7 @@
                                DESC_S_MASK | DESC_W_MASK | DESC_A_MASK);
         env->eflags &= ~env->fmask;
-        if (env->hflags & HF_CS64_MASK)
+        if (code64)
             env->eip = env->lstar;
         else
             env->eip = env->cstar;
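The ordering bug described above can be reproduced in a tiny stand-alone model. This is only a sketch of the logic, not qemu code: the flag bit, the helper that "loads" the kernel code segment, and both function names are hypothetical stand-ins for HF_CS64_MASK and the real segment-loading code. The lstar and cstar values in the usage note are the ones from the kqemu log.

```c
#include <stdint.h>

/* Hypothetical stand-in for HF_CS64_MASK. */
#define MODEL_CS64 (1u << 0)

/* Model of installing the 64-bit kernel code segment: afterwards the
 * "executing 64-bit code" flag is always set. */
uint32_t load_kernel_cs64(uint32_t hflags)
{
    return hflags | MODEL_CS64;
}

/* Original ordering: the flag is tested after the kernel CS is loaded,
 * so lstar is always chosen. */
uint64_t syscall_target_buggy(uint32_t hflags, uint64_t lstar, uint64_t cstar)
{
    hflags = load_kernel_cs64(hflags);
    return (hflags & MODEL_CS64) ? lstar : cstar;
}

/* Fixed ordering: remember the caller's mode first, then load the CS. */
uint64_t syscall_target_fixed(uint32_t hflags, uint64_t lstar, uint64_t cstar)
{
    int code64 = (hflags & MODEL_CS64) != 0;

    hflags = load_kernel_cs64(hflags);
    (void)hflags; /* the new flags no longer influence the choice */
    return code64 ? lstar : cstar;
}
```

Calling both functions with hflags = 0 (a 32-bit compatibility-mode caller), lstar = 0xfffffffffb800f72 and cstar = 0xfffffffffb8012c2 shows the buggy variant picking lstar, as in the log, and the fixed variant picking cstar.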
Jürgen Keil wrote:
> A fix for the SEGV with kqemu should soon appear on qemu-cvs...

This fix works for me.

I have applied all of the patches to get 64-bit OpenSolaris guests
working and re-uploaded the sources and binaries on the downloads page.

I have also removed the comment about 64-bit guests not working from the
guest page.

- Eric
Bernd Schemmer
2006-Oct-02 20:06 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
> I have applied all of the patches to get 64-bit
> OpenSolaris guests
> working and re-uploaded the sources and binaries on
> the downloads page.

There should be an additional version number for the binary builds;
this would make life easier for those who are testing the builds.

regards

Bernd
Carlo Marcelo Arenas Belon
2007-Jan-14 08:00 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
The patch needed for qemu-0.8.2 to be able to run 64-bit Solaris x86 as a guest (not including kqemu) is in the qemu CVS and will be part of qemu-0.8.3 when it is released.

A changeset that includes the modified files (extracted from qemu CVS) and can be applied to a vanilla qemu-0.8.2 can be found at

http://bugs.gentoo.org/attachment.cgi?id=106858

and is attached here too for simplicity.

Carlo
Carlo Marcelo Arenas Belon
2007-Jan-14 08:01 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
A changeset patch from qemu CVS.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: qemu-0.8.2-amd64_solaris.patch
Type: application/octet-stream
Size: 4912 bytes
Desc: not available
URL: <http://mail.opensolaris.org/pipermail/qemu-discuss/attachments/20070114/f59a50cd/attachment.obj>
Martin Bochnig
2007-Jan-14 08:28 UTC
[qemu-discuss] Re: 64-bit Solaris x86 kernel as guest OS
Carlo Marcelo Arenas Belon wrote:
>The patch needed for qemu-0.8.2 to be able to run 64-bit Solaris x86 as a guest (not including kqemu) is in the qemu CVS and will be part of qemu-0.8.3 when it is released.
>
>A changeset that includes the modified files (extracted from qemu CVS) and can be applied to a vanilla qemu-0.8.2 can be found at
>
> http://bugs.gentoo.org/attachment.cgi?id=106858
>
>and is attached here too for simplicity.
>
>Carlo

Juergen Keil (together with Rich Lowe) is the original author of that code.
Didn't you read the qemu-discuss at opensolaris.org archives?

It's already included in
http://opensolaris.org/os/project/qemu/downloads/qemu-0.8.2-solaris_src_20061013fri.tar.bz2
&&
http://opensolaris.org/os/project/qemu/downloads/SUNWqemu-0.8.2_REV_2006.10.18-sol10-i386-opt.pkg.bz2
&&
http://www.martux.org/qemu/RELEASES/sparc/SUNWqemu-0.8.2,REV=2006.10.14-sol8-sparc-opt.pkg.gz .

Plus, for the most part (the Sol_x64 guest support code), in qemu's vanilla CVS.

-Martin Bochnig-