Not only on VMX and in generic code, but also on SVM now: svm_get_io_address() uses the segment base only when the guest is not in long mode - what if outs has an fs/gs override? I''m pretty sure the base address is needed then, which opens the question - does the CPU guarantee a valid (zero) base also for the other segment register, or does this need to be conditionalized? Further, in the same function (and likely elsewhere) the injection of GP faults seems pretty pointless - if either of the two conditions is true, then the CPU itself should have raised a GP fault for the guest already (i.e. execution flow would never get here). Thanks, Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> -----Original Message----- > From: xen-devel-bounces@lists.xensource.com > [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of > Jan Beulich > Sent: 22 November 2006 10:43 > To: xen-devel@lists.xensource.com > Subject: [Xen-devel] more segment/selector handling woes > > Not only on VMX and in generic code, but also on SVM now: > svm_get_io_address() uses the segment base only when the guest > is not in long mode - what if outs has an fs/gs override? I''m pretty > sure the base address is needed then, which opens the question - > does the CPU guarantee a valid (zero) base also for the other > segment register, or does this need to be conditionalized?Good question. I think you''ve found a bug, fs/gs should be taken into consideration in 64-bit mode. The x86-64 architecture "guarantees" that the base is zero in long-mode. More precisely, page 110 in the December 2005 AMD64 PRM Vol 2: * In data-segment descriptors referenced by DS, ES and SS segment registers, the base-address field is ignored,. For the purpose of virtual-address calculations, the base address is treates as if it has a value of zero. * Data sgemnents referenced by FS and GS segmente registers receive special treatment in 64-bit mode. For these segments, the base address field is not ignored, and a non-zero value can be used in a virtual-address calculations. A 64-bit segmnet base addres can be spciefied using model specific registers. See "FS and GS Registers in 64-bit mode" on page 88 for more information. So FS/GS should be considered for VA calculation in the code you refer to. Sorry I missed that when I wrote it.> > Further, in the same function (and likely elsewhere) the injection > of GP faults seems pretty pointless - if either of the two > conditions is true, then the CPU itself should have raised a GP > fault for the guest already (i.e. execution flow would never get > here).INS/OUTS will be checked by the processor for the first access only in the virtualized case, whilst the range is checked on every iteration of the instruction in the "real" processor case. Since we only take one intercept for the first operation and then does as much as possible (up to a page boundary), it''s possible that the code would be faulty and make GP fault on the consecutive accesses. Of course, if you trust the code to be correct, then it''s fine to eliminat the GP fault checking - but I put those in there to make sure that the virtual model is as close to the real processor as possible. They shouldn''t fire very often, I''m sure... ;-) -- Mats> > Thanks, Jan > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>> Not only on VMX and in generic code, but also on SVM now: >> svm_get_io_address() uses the segment base only when the guest >> is not in long mode - what if outs has an fs/gs override? I''m pretty >> sure the base address is needed then, which opens the question - >> does the CPU guarantee a valid (zero) base also for the other >> segment register, or does this need to be conditionalized? > >Good question. I think you''ve found a bug, fs/gs should be taken into >consideration in 64-bit mode. > >The x86-64 architecture "guarantees" that the base is zero in long-mode. > > >More precisely, page 110 in the December 2005 AMD64 PRM Vol 2: >* In data-segment descriptors referenced by DS, ES and SS segment >registers, the base-address field is ignored,. For the purpose of >virtual-address calculations, the base address is treates as if it has a >value of zero.Note the wording ''as if'' - this doesn''t tell me whether the internal base address field (which gets stored to the vmcb) can indeed be relied upon. But obviously the code would be simpler if that was the case in reality (and then perhaps the documentation could be updated accordingly).>> Further, in the same function (and likely elsewhere) the injection >> of GP faults seems pretty pointless - if either of the two >> conditions is true, then the CPU itself should have raised a GP >> fault for the guest already (i.e. execution flow would never get >> here). > >INS/OUTS will be checked by the processor for the first access only in >the virtualized case, whilst the range is checked on every iteration of >the instruction in the "real" processor case. Since we only take one >intercept for the first operation and then does as much as possible (up >to a page boundary), it''s possible that the code would be faulty and >make GP fault on the consecutive accesses. Of course, if you trust the >code to be correct, then it''s fine to eliminat the GP fault checking - >but I put those in there to make sure that the virtual model is as close >to the real processor as possible. They shouldn''t fire very often, I''m >sure... ;-)But that isn''t really done: svm_get_io_address() gets called once from svm_io_instruction(), and then up to a full page is being copied. The next part is copied only after having returned to the guest and having received the next exit. Further, even if the checks were done for each iteration, the present bit check would still be useless, only the limit check then is relevant (and should supposedly be done only when count > 1). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> -----Original Message----- > From: xen-devel-bounces@lists.xensource.com > [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of > Jan Beulich > Sent: 22 November 2006 11:45 > To: Petersson, Mats > Cc: xen-devel@lists.xensource.com > Subject: RE: [Xen-devel] more segment/selector handling woes > > >> Not only on VMX and in generic code, but also on SVM now: > >> svm_get_io_address() uses the segment base only when the guest > >> is not in long mode - what if outs has an fs/gs override? > I''m pretty > >> sure the base address is needed then, which opens the question - > >> does the CPU guarantee a valid (zero) base also for the other > >> segment register, or does this need to be conditionalized? > > > >Good question. I think you''ve found a bug, fs/gs should be taken into > >consideration in 64-bit mode. > > > >The x86-64 architecture "guarantees" that the base is zero > in long-mode. > > > > > >More precisely, page 110 in the December 2005 AMD64 PRM Vol 2: > >* In data-segment descriptors referenced by DS, ES and SS segment > >registers, the base-address field is ignored,. For the purpose of > >virtual-address calculations, the base address is treates as > if it has a > >value of zero. > > Note the wording ''as if'' - this doesn''t tell me whether the > internal base > address field (which gets stored to the vmcb) can indeed be > relied upon. > But obviously the code would be simpler if that was the case > in reality > (and then perhaps the documentation could be updated accordingly). > > >> Further, in the same function (and likely elsewhere) the injection > >> of GP faults seems pretty pointless - if either of the two > >> conditions is true, then the CPU itself should have raised a GP > >> fault for the guest already (i.e. execution flow would never get > >> here). > > > >INS/OUTS will be checked by the processor for the first > access only in > >the virtualized case, whilst the range is checked on every > iteration of > >the instruction in the "real" processor case. Since we only take one > >intercept for the first operation and then does as much as > possible (up > >to a page boundary), it''s possible that the code would be faulty and > >make GP fault on the consecutive accesses. Of course, if you > trust the > >code to be correct, then it''s fine to eliminat the GP fault > checking - > >but I put those in there to make sure that the virtual model > is as close > >to the real processor as possible. They shouldn''t fire very > often, I''m > >sure... ;-) > > But that isn''t really done: svm_get_io_address() gets called once > from svm_io_instruction(), and then up to a full page is being copied. > The next part is copied only after having returned to the guest and > having received the next exit. > Further, even if the checks were done for each iteration, the present > bit check would still be useless, only the limit check then > is relevant > (and should supposedly be done only when count > 1).I agree that the limit check is the "necessary" part of the check. There is no need to check present, as that''s (unless someone is changing the descriptor table - and even so, it wouldn''t really make any difference, as the VMCB wouldn''t get filled with the not-present segment, as the processor would GP-fault FIRST...) already been checked by the processor. Count = 0 .. 1 is a rare case, so checking if count > 1 is probably meaningless.... -- Mats> > Jan > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> -----Original Message----- > From: xen-devel-bounces@lists.xensource.com > [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of > Jan Beulich > Sent: 22 November 2006 11:45 > To: Petersson, Mats > Cc: xen-devel@lists.xensource.com > Subject: RE: [Xen-devel] more segment/selector handling woes > > >> Not only on VMX and in generic code, but also on SVM now: > >> svm_get_io_address() uses the segment base only when the guest > >> is not in long mode - what if outs has an fs/gs override? > I''m pretty > >> sure the base address is needed then, which opens the question - > >> does the CPU guarantee a valid (zero) base also for the other > >> segment register, or does this need to be conditionalized? > > > >Good question. I think you''ve found a bug, fs/gs should be taken into > >consideration in 64-bit mode. > > > >The x86-64 architecture "guarantees" that the base is zero > in long-mode. > > > > > >More precisely, page 110 in the December 2005 AMD64 PRM Vol 2: > >* In data-segment descriptors referenced by DS, ES and SS segment > >registers, the base-address field is ignored,. For the purpose of > >virtual-address calculations, the base address is treates as > if it has a > >value of zero. > > Note the wording ''as if'' - this doesn''t tell me whether the > internal base > address field (which gets stored to the vmcb) can indeed be > relied upon. > But obviously the code would be simpler if that was the case > in reality > (and then perhaps the documentation could be updated accordingly).I believe it would contain whatever is in the [GL]DT... It''s ignored by the processor (treated as zero). So, you''d have to check if it''s GS/FS or not, and then use either 0 or [fg]s.base accordingly. Note that one bit in EFER also allows limits for 64-bit segments, but I think it''s only ever used by VMWare, so it''s probably OK to ignore the limits completely (in 64-bit mode at least). -- Mats _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
>> Note the wording ''as if'' - this doesn''t tell me whether the >> internal base >> address field (which gets stored to the vmcb) can indeed be >> relied upon. >> But obviously the code would be simpler if that was the case >> in reality >> (and then perhaps the documentation could be updated accordingly). > >I believe it would contain whatever is in the [GL]DT... It''s ignored by >the processor (treated as zero). So, you''d have to check if it''s GS/FS >or not, and then use either 0 or [fg]s.base accordingly.Can you verify this with you hardware guys? It would mean that I''d also have to change the implementation of get_segment_base() that I introduced with a patch yesterday.>Note that one bit in EFER also allows limits for 64-bit segments, but I >think it''s only ever used by VMWare, so it''s probably OK to ignore the >limits completely (in 64-bit mode at least).Is this being detailed anywhere? Namely, whether there''s a CPUID feature flag for this (or is it always available), and how one would obtain 64-bit wide limits? I merely can see the flag being defined in the NPT BIOS And Kernel Developer''s Guide (the public Programmer''s Manual doesn''t even know this). Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> -----Original Message----- > From: xen-devel-bounces@lists.xensource.com > [mailto:xen-devel-bounces@lists.xensource.com] On Behalf Of > Jan Beulich > Sent: 22 November 2006 13:08 > To: Petersson, Mats > Cc: xen-devel@lists.xensource.com > Subject: RE: [Xen-devel] more segment/selector handling woes > > >> Note the wording ''as if'' - this doesn''t tell me whether the > >> internal base > >> address field (which gets stored to the vmcb) can indeed be > >> relied upon. > >> But obviously the code would be simpler if that was the case > >> in reality > >> (and then perhaps the documentation could be updated accordingly). > > > >I believe it would contain whatever is in the [GL]DT... It''s > ignored by > >the processor (treated as zero). So, you''d have to check if > it''s GS/FS > >or not, and then use either 0 or [fg]s.base accordingly. > > Can you verify this with you hardware guys? It would mean that I''d > also have to change the implementation of get_segment_base() > that I introduced with a patch yesterday.I''ll check with someone. I''ll hopefully have a reply early afternoon, but no guarantees...> > >Note that one bit in EFER also allows limits for 64-bit > segments, but I > >think it''s only ever used by VMWare, so it''s probably OK to > ignore the > >limits completely (in 64-bit mode at least). > > Is this being detailed anywhere? Namely, whether there''s a CPUID > feature flag for this (or is it always available), and how one would > obtain 64-bit wide limits? I merely can see the flag being defined in > the NPT BIOS And Kernel Developer''s Guide (the public > Programmer''s Manual doesn''t even know this).There''s no 64-bit segment limit, it''s a 32-bit value (actually 20 bit, with an optional 12 bit shift-and-fill-with-ones) just as in 32-bit mode. There''s no other public documentation at the moment. I''ve seen a presentation set that describes it in more detail, I''ll see if I can dig that out [and see if it''s marked "NDA" or not!]. It''s safe to ignore it for now, I should think. -- Mats> > Jan > > _______________________________________________ > Xen-devel mailing list > Xen-devel@lists.xensource.com > http://lists.xensource.com/xen-devel > > >_______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel
> Note the wording ''as if'' - this doesn''t tell me whether the > internal base > address field (which gets stored to the vmcb) can indeed be > relied upon. > But obviously the code would be simpler if that was the case > in reality > (and then perhaps the documentation could be updated accordingly).Indeed, you''re right: You need to check if the processor is in 64-bit mode and if so, return base as zero, as the base value may not be zero when the processor is storing it. This is so that 64-bit OS''s can load a 32-bit segment before jumping to the actual code. I just got this confirmed by Ben Serebrin... -- Mats _______________________________________________ Xen-devel mailing list Xen-devel@lists.xensource.com http://lists.xensource.com/xen-devel