Dominik Klein
2007-Aug-22 07:50 UTC
[Xen-users] Dedicating a physical CPU or Hyperthread to dom0? Strange test results
Hi

I made the following tests with Xen 3.1.0.

I have two 3 GHz Xeon CPUs, which makes 4 hyperthreads:

CPU1 = HT1 + HT2
CPU2 = HT3 + HT4

HT1 is dedicated to dom0. I run two identical PV domains with 1 VCPU each and 400 MB of RAM. The VCPUs are mapped statically to HT3 (dom1) and HT4 (dom2). So nothing runs on CPU1 except dom0; HT2 is not used at all.

dom0 HT1
dom1 HT3
dom2 HT4

I downloaded the Apache source code and compiled it using "time make". All times below are the "real" time reported for the compile.

In this setup, if dom1 compiles while dom2 does nothing (or vice versa), it takes 3 minutes. If both dom1 and dom2 compile at the same time, it takes 3 minutes each, so they do not seem to affect each other. Looks promising!

Now I vcpu-pin dom1 to HT2, so dom0 and dom1 run on CPU1:

dom0 HT1
dom1 HT2
dom2 HT4

Now, if both dom1 and dom2 compile at the same time, it takes 5 (five) minutes each. So there seems to be some kind of effect when something runs on the same physical CPU (not HT!) that dom0 uses. I don't understand this.

But what I find really strange is the following. If I vcpu-pin dom2 to HT3:

dom0 HT1
dom1 HT2
dom2 HT3

and compile in dom1 and dom2 at the same time, it takes 3 minutes again. So what I stated earlier (some kind of effect when something runs on the same physical CPU, not HT, that dom0 uses) cannot "always" be true.

I repeated all of these tests several times. The results are reproducible. There are no extra services in dom0; dom1 and dom2 are set up from the same template (plain Debian etch minimum install), and there are no services (yet) in dom1 and dom2. I also double-checked the numbers in this email for typos.

Is this a known behaviour? Can someone explain it to me?

Is it suggested to dedicate a physical CPU to dom0, or should a hyperthread be sufficient?

What is this like on dual or quad core CPUs? Is the behaviour the same if actual cores instead of HTs are assigned?

Regards
Dominik

Here is some additional information. If you need any more: ask! I'll be happy to supply it.

xm vcpu-list 0
Name        ID  VCPU  CPU  State  Time(s)  CPU Affinity
Domain-0     0     0    0  r--    53484.9  0
Domain-0     0     1    -  --p        4.3  any cpu
Domain-0     0     2    -  --p        2.8  any cpu
Domain-0     0     3    -  --p        3.0  any cpu

CPU details from /proc/cpuinfo:
model name : Intel(R) Xeon(TM) CPU 3.00GHz
stepping : 3
cpu MHz : 2992.712
cache size : 2048 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 5
wp : yes
flags : fpu tsc msr pae mce cx8 apic mtrr mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc up pni monitor ds_cpl cid cx16 xtpr
bogomips : 5988.42

_______________________________________________
Xen-users mailing list
Xen-users@lists.xensource.com
http://lists.xensource.com/xen-users
Javier Guerra Giraldez
2007-Aug-22 10:08 UTC
Re: [Xen-users] Dedicating a physical CPU or Hyperthread to dom0? Strange test results
Dominik Klein wrote:
> CPU1 = HT1 + HT2
> CPU2 = HT3 + HT4

Are you positively sure about this? I think Linux enumerates differently to make it easier to spread load, so it might be

CPU1 = HT1 + HT3
CPU2 = HT2 + HT4

which would mean a totally different interpretation of your experiments.

And, of course, two processes running on different HTs of the same CPU will affect each other's performance. A multithreaded CPU doesn't have any more ALUs, cache, schedulers, etc. than a single-threaded one; it just has another set of state registers, making it easy to switch from one instruction stream to another. The advantage comes from giving the CPU something else to do while one thread stalls because of a cache miss or an inter-instruction dependency.

--
Javier
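To make the two enumerations easier to compare, here is a minimal Python sketch that lists which domains would end up on the same physical CPU in each of the three pinning layouts from the original post. The names (`ADJACENT`, `INTERLEAVED`, `sharing_pairs`) are made up for illustration, and both mappings are assumptions; nothing here queries Xen or the hardware.

```python
from itertools import combinations

# Two candidate HT -> physical CPU mappings. Which one applies under Xen on
# this machine is not established in the thread; both are assumptions.
ADJACENT = {"HT1": "CPU1", "HT2": "CPU1", "HT3": "CPU2", "HT4": "CPU2"}
INTERLEAVED = {"HT1": "CPU1", "HT3": "CPU1", "HT2": "CPU2", "HT4": "CPU2"}

# The three pinning layouts from the original post, each with the reported
# wall-clock compile time (minutes) when dom1 and dom2 build concurrently.
LAYOUTS = [
    ({"dom0": "HT1", "dom1": "HT3", "dom2": "HT4"}, 3),
    ({"dom0": "HT1", "dom1": "HT2", "dom2": "HT4"}, 5),
    ({"dom0": "HT1", "dom1": "HT2", "dom2": "HT3"}, 3),
]


def sharing_pairs(pinning, topology):
    """Return the domain pairs that are pinned to the same physical CPU."""
    pairs = []
    for a, b in combinations(sorted(pinning), 2):
        if topology[pinning[a]] == topology[pinning[b]]:
            pairs.append((a, b))
    return pairs


if __name__ == "__main__":
    for name, topology in (("adjacent", ADJACENT), ("interleaved", INTERLEAVED)):
        print(name)
        for pinning, minutes in LAYOUTS:
            shared = sharing_pairs(pinning, topology)
            print(f"  {pinning} -> shared CPU: {shared}, {minutes} min")
```

Under the interleaved assumption, the second layout is the only one in which the two compiling guests (dom1 and dom2) share a physical CPU, and it is also the only one that took 5 minutes. Under the adjacent assumption, the first layout would have them sharing a CPU, yet it took 3 minutes. That fits the interleaved numbering but does not prove it.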
Dominik Klein
2007-Aug-22 10:18 UTC
Re: [Xen-users] Dedicating a physical CPU or Hyperthread to dom0? Strange test results
Javier Guerra Giraldez wrote:
> Dominik Klein wrote:
>> CPU1 = HT1 + HT2
>> CPU2 = HT3 + HT4
>
> Are you positively sure about this?

Hmm. Actually not. When booting the machine with a non-Xen kernel, I see in /proc/cpuinfo:

processor : 0
...
physical id : 0
...
processor : 1
...
physical id : 0
...
processor : 2
...
physical id : 3
...
processor : 3
...
physical id : 3
...

This would strengthen my assumption. In Xen though, in both dom0 and domU, I cannot see "physical id" in /proc/cpuinfo.

> I think Linux enumerates differently to
> make it easier to spread load, so it might be
>
> CPU1 = HT1 + HT3
> CPU2 = HT2 + HT4
>
> which would mean a totally different interpretation of your experiments.

Correct.

> And, of course, two processes running on different HTs of the same CPU
> will affect each other's performance. A multithreaded CPU doesn't have
> any more ALUs, cache, schedulers, etc. than a single-threaded one; it just
> has another set of state registers, making it easy to switch from one
> instruction stream to another. The advantage comes from giving the CPU
> something else to do while one thread stalls because of a cache miss or
> an inter-instruction dependency.

Good point.

That said, it is unlikely that the enumeration in Xen is the same as in a non-Xen kernel. Maybe a developer can say something about this.

Regards
Dominik
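For reference, the grouping read off /proc/cpuinfo above can be sketched in a few lines of Python. This is only an illustration: `group_by_physical_id` is a hypothetical helper, and `SAMPLE` is an abridged copy of the non-Xen output quoted in the message with the elided fields left out. On a kernel that does not expose "physical id" (as reported for dom0 and domU here), it simply returns an empty mapping.

```python
from collections import defaultdict


def group_by_physical_id(cpuinfo_text):
    """Map each 'physical id' value to the 'processor' numbers that report it."""
    packages = defaultdict(list)
    current = None  # processor number of the stanza being read
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank line or a line without "key : value" form
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "physical id" and current is not None:
            packages[int(value)].append(current)
    return dict(packages)


# Abridged from the non-Xen boot quoted in the message (assumption: only the
# two relevant fields per processor are kept).
SAMPLE = """\
processor : 0
physical id : 0
processor : 1
physical id : 0
processor : 2
physical id : 3
processor : 3
physical id : 3
"""

if __name__ == "__main__":
    print(group_by_physical_id(SAMPLE))
```

On a live non-Xen system the same function could be fed the contents of /proc/cpuinfo read with `open("/proc/cpuinfo").read()`. Whether the resulting numbering carries over to the CPU numbers Xen uses for pinning is exactly the open question in this thread.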